On Jun 9, 2009, at 6:23 AM, [email protected] wrote:
>>
>> Further, what is the most popular unit to handle text in this
>> community - sentence, document, word... ?
>
> I do not know.

All of the above, plus paragraphs, phrases, and any other logical textual units you can dream up. Of course, I'm no biologist either - I want this for social science research. The more flexible you can make it, the more potential uses it will have.

This is actually a substantial problem that my research group has encountered as we've slowly been working to build a human-in-the-loop NLP-based content analysis system. At the moment, our NLP tools are only able to perform analysis at the sentence level, but the "thematic units" used for qualitative content analysis coding can vary substantially, may overlap, and are not necessarily consistent within one piece of analysis. For our uses, the applications of the tools are drastically restricted when only one level of text is allowable, but transcending the sentence structure is difficult for NLP. For non-NLP text mining, it seems that the restrictions imposed by sentence structure should not interfere as much.

I would definitely be interested in getting my hands on a text mining plugin or service for Taverna. I would immediately be able to do quite a few interesting snippets of research that are currently impossible for me, starting with analysis of the 3.42 GB CSV of juicy search log data that's just gathering virtual dust on my hard drive...

Cheers,
Andrea

Andrea Wiggins
PhD Student, School of Information Studies
Syracuse University
337 Hinds Hall
Syracuse, NY 13244
[email protected]
www.andreawiggins.com

_______________________________________________
taverna-hackers mailing list
[email protected]
Web site: http://www.taverna.org.uk
Mailing lists: http://www.taverna.org.uk/taverna-mailing-lists/
Developers Guide: http://www.mygrid.org.uk/tools/developer-information
