Hi, I am trying to embed our text mining system into a Taverna workflow. Since I am not a biologist by any means, I am not sure how text and documents are handled in the Taverna (or biological) community. I have a couple of questions about text handling.
In the Results tab, when the Result Type is "Text", newlines are not shown. Is there any way to display newlines as actual line breaks? Is there also a way to wrap long lines?

Iteration strategy: it seems that iteration is handled using Java objects such as ArrayList, and that the workflow itself is not iterated. Is this understanding correct, or is there a way to run the same workflow repeatedly in a batch-like manner? How do you handle a large document set, e.g. all of the PubMed papers, without excessive memory consumption? If such a batch mode exists, how can I detect the end of the batch?

Further, what is the most common unit for handling text in this community: sentence, document, word, ...?

Any help appreciated!

Thanks,
-Yoshinobu

--
Yoshinobu Kano (Given/Family) [email protected]
Project Research Associate, the University of Tokyo / U-Compare Project Lead
http://www-tsujii.is.s.u-tokyo.ac.jp/
http://u-compare.org/kano/

_______________________________________________
taverna-hackers mailing list
[email protected]
Web site: http://www.taverna.org.uk
Mailing lists: http://www.taverna.org.uk/taverna-mailing-lists/
Developers Guide: http://www.mygrid.org.uk/tools/developer-information
