On Jun 9, 2009, at 6:23 AM, [email protected] wrote:

>>
>> Further, what is the most popular unit to handle text in this
>> community - sentence, document, word... ?
>
> I do not know.

All of the above, plus paragraphs, phrases, and any other logical  
textual units you can dream up. Of course, I'm no biologist either - I  
want this for social science research. The more flexible you can make  
it, the more potential uses it will have.

This is actually a substantial problem that my research group has  
encountered as we've slowly been working to build a human-in-the-loop  
NLP-based content analysis system. At the moment, our NLP tools are  
only able to perform analysis at the sentence level, but the "thematic  
units" used for qualitative content analysis coding can vary  
substantially, may overlap, and are not necessarily consistent within  
one piece of analysis. For our uses, the applications of the tools are  
drastically restricted when only one level of text is allowable, but  
transcending the sentence structure is difficult for NLP. For non-NLP  
text mining, it seems that the restrictions imposed by sentence  
structure should not interfere as much.
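
One way to picture units that vary in size and overlap is standoff annotation, where each unit is stored as a pair of character offsets into the text rather than as a fixed split of it. The following is only a minimal sketch under that assumption; the `Unit` class, the `overlapping` helper, the "sentence" and "theme" labels, and the sample text are all hypothetical and not part of any existing Taverna plugin or of the tools mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """A hypothetical text unit stored as character offsets into the source text."""
    kind: str   # illustrative label, e.g. "sentence" or "theme"
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive


def overlapping(units, start, end):
    """Return the units that share at least one character with [start, end)."""
    return [u for u in units if u.start < end and start < u.end]


# Made-up sample text; the offsets below were counted by hand for this string.
text = "Users search often. They rarely click. Results may be poor."

units = [
    Unit("sentence", 0, 19),
    Unit("sentence", 20, 38),
    Unit("sentence", 39, 59),
    Unit("theme", 0, 38),    # spans the first two sentences
    Unit("theme", 25, 59),   # starts mid-sentence and overlaps the theme above
]

# Everything that touches the second sentence, at any level.
for u in overlapping(units, 20, 38):
    print(u.kind, repr(text[u.start:u.end]))
```

Because the units are just offsets, sentence-level output from an NLP tool and hand-coded thematic units can sit side by side over the same text without either one dictating the other's boundaries.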

I would definitely be interested in getting my hands on a text mining  
plugin or service for Taverna. I would immediately be able to do quite  
a few interesting snippets of research that are currently impossible  
for me, starting with analysis of the 3.42 GB CSV of juicy search log  
data that's just gathering virtual dust on my hard drive...

Cheers,

Andrea


Andrea Wiggins
PhD Student, School of Information Studies
Syracuse University

337 Hinds Hall
Syracuse, NY 13244
[email protected]
www.andreawiggins.com


_______________________________________________
taverna-hackers mailing list
[email protected]
Web site: http://www.taverna.org.uk
Mailing lists: http://www.taverna.org.uk/taverna-mailing-lists/
Developers Guide: http://www.mygrid.org.uk/tools/developer-information
