Hi all,

As some of you might know, at Eagle we use Ensembl Hive to construct pipelines that can be deployed on clusters/clouds. It has essentially the same concepts as Taverna, but configuring it is a nightmare combination of Perl modules and records in database tables.

We want to be able to describe a workflow/pipeline using a formal language (preferably within a GUI so we don't have to go using a text editor to do it) then convert it into Hive config files programmatically. Or, convert it into Taverna - but using the same workflow description document format regardless of which system the pipeline is intended for.

From Stian's email below, I see that the SCUFL format is currently being redesigned to produce SCUFL2. Is it intended to represent generic pipelines or is it Taverna specific?

If it is supposed to be generic, we'd like to offer to help. We'd do the SCUFL2-Hive conversion tool ourselves, but we'd work with you on defining SCUFL2 such that it can satisfactorily represent both Taverna and Hive workflows. We could also put some time towards developing the SCUFL2-building GUI, if you have one planned.

We have some free developer time coming up for a few days in Feb and March, which we could contribute to you (for free) to help develop the SCUFL2 format. A few days (we're talking maybe 2 weeks max) is not much time really on a project like this but we hope it would help in some way.

What do you all think?

cheers,
Richard

On 6 Jan 2010, at 10:47, Stian Soiland-Reyes wrote:

> Hi!
>
> We're working on making the new SCUFL2 workflow language.
>
> This will be a simplification of the current .t2flow serialisation,
> but will also come with an API.
>
> We're basing this new workflow definition language on what we have
> learnt are the best features of Scufl (from Taverna 1) and .t2flow -
> Scufl was quite easy for third party suppliers to generate or parse
> (for instance myExperiment generates the Taverna 1 diagrams from
> scratch using Ruby code that parses the scufl), while .t2flow allowed
> us to specify all the finer grained details possible in the new Taverna 2
> engine - but this also made it a bit too verbose.
>
> These are early days, so we'll figure out what the language should be
> and what the API should look like.
> Paolo Missier has done good work in
> making a proposed UML model [1] of the new language, which we can then
> use as the basis for figuring out the XML serialisation, but also the
> Java beans of the API, and possibly also RDF and JSON versions.
>
> In my spare time I've tried to tie together some simple Java beans
> implementing this UML model, and I've now checked this into
> Subversion [2] - these beans are not complete yet, have no
> integration with Taverna code, and the API can only serialise to RDF
> currently. (To test out Sesame/Elmo annotations on beans).
>
> I might come back with code examples so we can discuss what the API
> should look like. The current tests only build a workflow from
> scratch - also note that scufl2-rdf is not yet connected to scufl2-api
> and can be considered an early version of the scufl2-api. (API wise
> there are pulls in different directions, for instance we want to make
> it easy to inspect a workflow, but also to construct one. If more
> information is needed for inspection, this could make it more tricky
> to construct.)
>
> The API should minimally be able to:
>
> * Work independently, without any Taverna dependencies, runtime or
>   plugin system
> * Load .t2flows
> * Save as .scufl2 (undetermined yet what this format is - most
>   likely XML and/or RDF inside a Research Object .zip)
> * Inspect an existing workflow to tell:
>   a) Processors
>   b) Connection between processor/workflow ports (and conditional links)
>   c) Activities/Services (ie. 'WSDL' method 'fish' from endpoint
>      'http://asdkljasdkjasdkj')
>   d) Annotations
> * Allow modification and creation from scratch of such workflows
>
> The API should be rich enough so that you could use it to generate the
> workflow diagram - ie. what myExperiment does already in Ruby.
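
To make the inspection requirements above concrete, here is a rough Python sketch of the kind of lightweight model they imply: processors, links between ports, activities and annotations, plus a function that turns them into Graphviz DOT text for a diagram. This is not the SCUFL2 API or its UML model; every class, field and function name here (Workflow, Processor, Link, Activity, to_dot) is made up for illustration, and the example.org endpoint is a placeholder.

```python
from dataclasses import dataclass, field


@dataclass
class Activity:
    """What a processor invokes (hypothetical shape, not the SCUFL2 model)."""
    kind: str        # e.g. "WSDL"
    operation: str   # e.g. "fish"
    endpoint: str    # service URL, placeholder in this sketch


@dataclass
class Processor:
    name: str
    activity: Activity
    annotations: dict = field(default_factory=dict)


@dataclass
class Link:
    """A data link between ports, written as 'processor.port' strings."""
    source: str
    sink: str
    conditional: bool = False


@dataclass
class Workflow:
    name: str
    processors: list = field(default_factory=list)
    links: list = field(default_factory=list)


def to_dot(workflow):
    """Render the workflow as Graphviz DOT text, one node per processor."""
    lines = [f'digraph "{workflow.name}" {{']
    for proc in workflow.processors:
        # "\\n" leaves a literal \n in the DOT label, which Graphviz
        # renders as a line break inside the node.
        label = f"{proc.name}\\n{proc.activity.kind}:{proc.activity.operation}"
        lines.append(f'  "{proc.name}" [label="{label}"];')
    for link in workflow.links:
        src = link.source.split(".")[0]
        dst = link.sink.split(".")[0]
        style = " [style=dashed]" if link.conditional else ""
        lines.append(f'  "{src}" -> "{dst}"{style};')
    lines.append("}")
    return "\n".join(lines)


# Build a two-step workflow from scratch, then inspect and draw it.
fetch = Processor(
    "fetch",
    Activity("WSDL", "fish", "http://example.org/fish?wsdl"),
    {"description": "hypothetical lookup step"},
)
report = Processor("report", Activity("Script", "format", ""))
wf = Workflow("demo", [fetch, report], [Link("fetch.out", "report.in")])

dot = to_dot(wf)
print(dot)
```

The diagram function only reads the model and needs no engine, plugin system or service discovery, which is the kind of separation the list above asks for.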
>
>
> Optionally:
> * Load scufl 1 .xml from Taverna 1
> * Save as backwards compatible .t2flow or even scufl 1 if possible
> * Be exposed as a RESTful service
>
>
> However, the API should also be lightweight, so it will not do tasks
> better done by Taverna engine (t2core):
> * Determining if a workflow definition is valid (checking for loops,
>   invalid iteration strategies etc)
> * Performing the actual execution of the workflow
>
> Other tasks are also better suited for the main Taverna code base, as
> they require various plugins or other considerations:
> * Discovering available services/methods
> * Finding input/output ports of a given service definition
> * Determining what configuration can be done for a given service
> * Merging workflows
>
>
> If you talk about a client/server architecture, you can picture these
> (RESTful?) services:
>
> * Taverna engine: execute workflow and manage data/provenance
> * Taverna inspection: check workflow definition validity, calculate depths,
>   etc
> * Taverna service descriptions: Find available services, specify
>   possible service definition, determine ports for service definition
> * Taverna editing: Workbench-type activities, Undo/redo, merge
>   workflows, workflow refactoring
> * Taverna diagram: Generate workflow diagram in various formats and
>   configurations
>
> (The last two of these should be possible to implement using mainly
> the Scufl2 API.)
>
> A client could then use the Scufl2 API and a selection of these
> services - and still be able to implement what would look like the
> current Taverna workbench. The client could be written in a non-Java
> language, and use the Scufl2 serialisation schema/ontology directly
> with the help of whatever XML/RDF/JSON support is available for its
> language - this should give the same functionality but without a few
> convenience methods.
>
>
> We're very interested in hearing about potential use cases for what
> such a SCUFL2 language and API could be used for.
> Feel free to add your comments!
>
>
> [1] http://www.mygrid.org.uk/dev/wiki/display/developer/SCUFL2
> [2] http://taverna.googlecode.com/svn/unsorted/scufl2/trunk/
>
> --
> Stian Soiland-Reyes, myGrid team
> School of Computer Science
> The University of Manchester
>
> _______________________________________________
> taverna-hackers mailing list
> [email protected]
> Web site: http://www.taverna.org.uk
> Mailing lists: http://www.taverna.org.uk/taverna-mailing-lists/
> Developers Guide: http://www.mygrid.org.uk/tools/developer-information

--
Richard Holland, BSc MBCS
Operations and Delivery Director, Eagle Genomics Ltd
T: +44 (0)1223 654481 ext 3 | E: [email protected]
http://www.eaglegenomics.com/
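
As a rough illustration of the "one description, several target systems" idea from the top of this mail, here is a small Python sketch: a single neutral pipeline description with two exporters. Everything in it is an assumption made for the sake of the example. The PIPELINE layout, the step and module names, and both output shapes are invented; neither exporter produces real Hive configuration or real SCUFL2.

```python
# Hypothetical neutral description: named steps plus (from, to) data links.
PIPELINE = {
    "name": "demo",
    "steps": [
        {"id": "align", "runs": "AlignReads"},
        {"id": "call", "runs": "CallVariants"},
    ],
    "links": [("align", "call")],
}


def to_hive_like(pipeline):
    """Illustrative only: one record per step, listing the steps it feeds."""
    flows = {}
    for src, dst in pipeline["links"]:
        flows.setdefault(src, []).append(dst)
    return [
        {"step": s["id"], "module": s["runs"], "flows_to": flows.get(s["id"], [])}
        for s in pipeline["steps"]
    ]


def to_taverna_like(pipeline):
    """Illustrative only: processors and data links, not real SCUFL2."""
    return {
        "workflow": pipeline["name"],
        "processors": [s["id"] for s in pipeline["steps"]],
        "datalinks": [{"from": a, "to": b} for a, b in pipeline["links"]],
    }


hive = to_hive_like(PIPELINE)
taverna = to_taverna_like(PIPELINE)
```

Both exporters read the same description and never touch each other, so a format that can express steps and links generically would let each target system keep its own converter.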
