On 06/10/2010 08:27, Richard Holland wrote:
Hello all, had a conversation with Carole here in Hannover yesterday. I had some suggestions for future improvements to Taverna and she said I should post these to the list:
The summary for the "tl;dr" crowd: Some of these things we're going to be doing, some we'd like to do but don't yet know enough, some are ferociously difficult, and some I simply don't have the expertise to talk about. :-) For the interested...
1. Internal parallelisation of workflows. The ability to specify that a particular subsection within a workflow should implicitly divide the input, run several instances in parallel on some kind of back-end grid (LSF, SGE, Condor, or an EC2 approach, etc.), and recombine the output. A subsection could just be one component, or several chained together, or the entire workflow. The current strategy as per App4Andy I believe is just the last one of the three - i.e. the entire workflow.
Automatic parallelization is one of those things that is Known To Be Difficult. If the workflow processor were a call to a SOAP- or REST-based service, autoparallel would turn the workflow into a distributed denial of service attack tool. :-) The only way to make real progress with parallelizing a workflow is to know the specific workflow, the data it is processing, the resources that it relies on, etc. The App4Andy work was a success in large part because we did this first.
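To make the divide/run/recombine idea and the service-overload worry concrete, here is a minimal Python sketch. It is not Taverna code: `split`, `run_subsection`, `scatter_gather` and the `max_parallel` cap are hypothetical names chosen for illustration, and a local thread pool stands in for whatever back-end (LSF, SGE, Condor, EC2) would really run the pieces.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, chunk_size):
    """Divide the input list into consecutive chunks of at most chunk_size."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def run_subsection(chunk):
    """Hypothetical stand-in for one workflow subsection (here: upper-casing)."""
    return [s.upper() for s in chunk]


def scatter_gather(items, chunk_size, max_parallel):
    """Split the input, run at most max_parallel chunks at once, recombine in order."""
    chunks = split(items, chunk_size)
    # Assumption: max_parallel is picked by someone who knows what the
    # underlying service or cluster can tolerate; it is not derived automatically.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # map() yields results in chunk order, so recombining is a plain flatten.
        results = pool.map(run_subsection, chunks)
        return [out for chunk_result in results for out in chunk_result]
```

The explicit cap is the point: the safe degree of parallelism has to come from knowledge of the workflow and the resources it calls, which is exactly the part that is hard to automate.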
3. Reporting functionality. The ability to dynamically generate PDFs or spreadsheets or Word docs etc. based on workflow output via some kind of graphic designer for report templates is really valuable. Knime and Pipeline Pilot both have this feature.
One of the things I want to see is more community involvement in Taverna so that it isn't just a project done in one place. Reporting is one of the areas where I think contributions would be very valuable.
5. Still need a better way of delegating security credentials to the server/grid instances so the workflow can log into things on your behalf.
That's an area where there will be more work done; I plan to have a sketch of a strawman proposal hammered out later this week, including having a key requirement to not be tightly bound to Java clients. (The big problem is that existing solutions for this tend to rely on *everything* using the same auth system; that's just not what happens on the ground.)
6. The killer feature would be the ability to install a plugin on your desktop or grid front-end client and have it propagate to the back-end server or multiple grid instances along with the workflow, in the case that it hasn't already been installed back there. You'd need to think of some way of sandboxing the plugins distributed in this way so that they can only affect the workflows of the user that submitted them.
A seriously neat idea, but *very* difficult to get right.
7. Have the ability (maybe via myExperiment) to log every execution of a workflow including a reference to the input data, the structure of the workflow at the time of the execution, and a reference to the results, provenance, etc. This is very useful for lab notebook concepts and also for reproducing work at a later date.
Another seriously neat idea. I'd need to understand what "every" means to you here (I can think of a few conflicting use-cases ;-)) and there are some issues with where the data is actually stored, but it would fit in with some of the concepts of the Next-Gen Workbench too. I'd rate the probability of this being done as high.
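For illustration only, one possible shape for such an execution log entry is sketched below in Python. Nothing here reflects an actual Taverna or myExperiment format: the `RunRecord` fields, the use of a SHA-256 digest to pin the workflow structure at execution time, and the reference strings are all assumptions. Inputs and results are stored as references, not as data, which sidesteps the question of where the data actually lives.

```python
import hashlib
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class RunRecord:
    """Hypothetical log entry for a single workflow execution."""
    workflow_sha256: str  # digest of the workflow definition as it was run
    input_ref: str        # reference to the input data, not the data itself
    result_ref: str       # reference to results/provenance
    started_at: str       # ISO 8601 timestamp


def make_run_record(workflow_definition, input_ref, result_ref, started_at=None):
    """Build a record; the digest changes whenever the workflow text changes."""
    digest = hashlib.sha256(workflow_definition.encode("utf-8")).hexdigest()
    if started_at is None:
        started_at = datetime.now(timezone.utc).isoformat()
    return RunRecord(digest, input_ref, result_ref, started_at)


def to_log_line(record):
    """Serialise a record as one JSON line suitable for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)
```

Whether "every" execution gets a record, or only those the user chooses to keep, would be a policy decision layered on top of something like this.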
8. Taverna server (App4Andy is a great start by the way) to offer the possibility to upload arbitrary workflows via the desktop client and execute them on the server/grid/cloud, rather than choosing from a predefined selection. On uploading it could make an auto-generated but editable web interface for obtaining workflow input, monitoring progress, downloading results/provenance/etc. The workflow could be a one-off, or could be stored there, and kept private, or shared via user/group notions, etc. This is all in Knime already (except the cloud bit).
We can do some of this now (the auto-interface isn't sophisticated, but it does exist) and other things - the integration with the workbench in particular - are on the technical feature roadmap. This *will* be done.

Donal.
