Yoshinobu Kano wrote:
> Hi,
> 
> I am trying to embed our text mining system into a Taverna workflow.
> Since I am not a biologist in any sense,
> I am not sure how text/documents are handled in the Taverna (or
> biological) community.
> I have a couple of questions regarding text handling.
> 
> 
> In the Results tab, when Result Type is "Text", newlines are not shown.
> Is there any way to display newlines as actual line breaks?
> How about wrapping lines?

What version of Taverna are you running?  In Taverna 1.7.1, 2.0 and 2.1 
beta 1, I get newlines in output text.

> Iteration Strategy.
> It seems that the iteration strategy is handled using Java
> objects like ArrayList, and
> the workflow itself is not iterated.
> Is this understanding correct, or is there any way to iterate the
> same workflow in a batch-like way?

In Taverna 1.7.1 you can feed a list into a workflow port that expects a 
single value.  The workflow will iterate using the same rules as if you 
were passing lists into a service.  So if the workflow has two ports 
that each expect a single value, and you pass [a,b,c] and [1,2], the 
workflow will iterate 6 times, once for a&1, a&2, b&1, b&2, c&1, c&2.
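
As a rough illustration of that pairing rule, here is a small Python sketch. 
It is not Taverna code; the names port_one and port_two are made up for the 
example, and it only shows which combinations of values a cross product over 
the two input lists would produce.

```python
from itertools import product

# Hypothetical values fed to two workflow ports that each expect a single value.
port_one = ["a", "b", "c"]
port_two = [1, 2]

# Cross product: every value of the first list paired with every value of the
# second, with the second list varying fastest.
combinations = list(product(port_one, port_two))

for first, second in combinations:
    print(f"run workflow with {first} & {second}")

print(len(combinations), "iterations")
```

The loop visits the pairs in the order listed above: a&1, a&2, b&1, b&2, 
c&1, c&2.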

In Taverna 2.0 (and 2.1 beta 1) you cannot feed a list into a workflow 
port if the port is expecting a single value.  The input dialog will not 
allow you to specify multiple values.  However, you can alter the depth 
of the workflow port to accept lists.  That will often achieve what you 
want.  You can still, of course, run it with a single value just by 
specifying only one element in the list.

Katy Wolstencroft has raised workflow iteration as an issue that needs 
to be fixed.

> How do you handle a large scale document set, e.g. the whole Pubmed
> papers, to avoid the large memory consumption?

The best way is to design the services so that they take URLs rather 
than the actual documents.
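
A minimal sketch of that idea in Python, assuming a hypothetical text mining 
step that simply counts words. The function names count_words and process_all 
are invented for illustration and are not part of Taverna. The point is that 
only URLs are passed around, and each document is read as a stream and 
discarded before the next one is fetched.

```python
import urllib.request
from typing import Iterable, Iterator, Tuple


def count_words(url: str) -> int:
    """Hypothetical service: fetch one document by URL and count its words."""
    total = 0
    with urllib.request.urlopen(url) as response:
        # Read line by line so the whole document is never held in memory.
        for line in response:
            total += len(line.split())
    return total


def process_all(urls: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Yield (url, word count) pairs one at a time, one document per step."""
    for url in urls:
        yield url, count_words(url)
```

Because process_all is a generator, the caller sees each result as soon as it 
is ready, and memory use stays close to the size of a single line of a single 
document regardless of how many URLs are supplied.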

For Taverna 1.7.1, you can use the data proxy mechanism 
http://www.taverna.org.uk/associated-tools/webservice-data-proxy/

In Taverna 2 the data handling is better, and it is much improved again 
in 2.1 beta 1.

> If such a batch mode exists, how do I detect the end of the batch?

In Taverna 1.7.1, the output from iterations will not appear until all 
the iterations have finished.

In Taverna 2, the values are output as they are produced.  In Taverna 
2.1 beta 1, there is an explicit indication of when the workflow has 
finished.

> Further, what is the most popular unit to handle text in this
> community - sentence, document, word... ?

I do not know.

> Any help appreciated!
> 
> Thanks,
> 
> -Yoshinobu

Alan
