Re: GSOC: Add CWL support to Taverna

Thilina Manamgoda Tue, 15 Mar 2016 08:18:20 -0700

Hi,
  For TAVERNA-880 (discovery), are these the things I should do?

1) Build a service discovery plugin for CWL.


              1. Implement AbstractConfigurableServiceProvider, which will
look for *.cwl files in a given directory.
              2. Provide ServiceDescriptions which hold the configuration for
CWL when findServiceDescriptionsAsync is called. This configuration
should be in JSON.
              3. This JSON configuration should have the details needed to
build CWLActivity.class, such as how the input and output ports are defined.

2) Testing and documentation



             Am I correct?
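
To make steps 1 to 3 concrete, here is a rough sketch in plain Java. It
does not use the Taverna API at all: the class CwlDiscoverySketch, its
methods and the JSON keys "toolName" and "cwlFile" are hypothetical names
chosen for illustration, and the port details from step 3 are left out
because reading them would need a YAML parser.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, NOT part of Taverna: scans a directory for *.cwl
// files and builds one small JSON configuration string per file found.
class CwlDiscoverySketch {

    // Returns the regular *.cwl files directly inside dir, sorted by path.
    static List<Path> findCwlFiles(Path dir) throws IOException {
        List<Path> found = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*.cwl")) {
            for (Path candidate : stream) {
                if (Files.isRegularFile(candidate)) {
                    found.add(candidate);
                }
            }
        }
        Collections.sort(found);
        return found;
    }

    // Builds an illustrative JSON configuration; the key names are assumptions.
    static String toJsonConfig(Path cwlFile) {
        String fileName = cwlFile.getFileName().toString();
        String toolName = fileName.substring(0, fileName.length() - ".cwl".length());
        return "{\"toolName\": \"" + escape(toolName) + "\", \"cwlFile\": \""
                + escape(cwlFile.toString()) + "\"}";
    }

    // Escapes backslashes and double quotes so the value is a valid JSON string.
    static String escape(String value) {
        return value.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path cwlFile : findCwlFiles(dir)) {
            System.out.println(toJsonConfig(cwlFile));
        }
    }
}
```

In the plugin itself this logic would presumably be called from
findServiceDescriptionsAsync, with one service description per JSON
configuration; that wiring is not shown here.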

Regards,
Thilina.


On Tue, Mar 15, 2016 at 8:24 PM, Thilina Manamgoda <[email protected]>
wrote:

> Hi,
>
>
> On Tue, Mar 15, 2016 at 8:00 PM, Stian Soiland-Reyes <[email protected]>
> wrote:
>
>> On 15 March 2016 at 08:07, Thilina Manamgoda <[email protected]>
>> wrote:
>>
>> > I have gone through the tutorials Service invocation plugin
>> > <http://dev.mygrid.org.uk/wiki/display/developer/Tutorial+-+Service+invocation+plugin>
>> > and Service discovery plugin
>> > <http://dev.mygrid.org.uk/wiki/display/developer/Tutorial+-+Service+discovery+plugin>.
>>
>> Great!  Did you find any issues while doing so?
>>
>>
>> > So in order to bring CWL workflows to Taverna I have to implement a
>> > Service discovery plugin
>> > <http://dev.mygrid.org.uk/wiki/display/developer/Tutorial+-+Service+discovery+plugin>
>> > for CWL, right?
>>
>> Yes, the TAVERNA-880 task is basically to implement a service
>> discovery plugin for CWL.
>>
>> The Service Invocation tutorial would be more relevant for
>> TAVERNA-878.  It could be that to test your UI, to be able to drag a
>> CWL Tool into a workflow, you need to have a "dummy" Activity similar
>> to the one in the tutorial, even if it doesn't actually do any actual
>> invocation when it is run. That is, it would be a placeholder.
>>
>>
>> > 1. SCUFL2 workflows are saved as a workflow bundle document.
>>
>> Correct.. except it's a ZIP file of XML files, not a single document :-)
>>
>>
>> > 2. The Service discovery plugin
>> > <http://dev.mygrid.org.uk/wiki/display/developer/Tutorial+-+Service+discovery+plugin>
>> > Service Description is a Java bean, and it is built using the
>> > corresponding workflow bundle document.
>>
>> It's a Java bean (in the Configuration), but as the tutorial is
>> written for Taverna 2.5 you will later need to update your Service
>> Discovery code for Taverna 3, where you will use the Taverna Language
>> SCUFL2 API - which is a java bean approach to the Workflow Bundle.
>> (those beans are then saved to the Workflow Bundle ZIP file, but that
>> is already handled).
>>
>> The beans are different though: the Taverna 2.5 beans have a
>> Configuration subclass per activity type, e.g. a ToolConfiguration,
>> while in Taverna 3 the Configuration class is not subclassed (instead
>> it declares a type URI), and all the actual configuration is in the
>> linked JSON object, whose content varies per configuration type.
>>
>>
>>
>> > 3. The Activity has the logic of the workflow. For example, let's say
>> > the service is the addition of two numbers. Then the two numbers are
>> > the inputs and the addition is inside the Activity class.
>>
>> Exactly, the Activity is the thing that actually 'happens' in a box in
>> the workflow.  The rest of the workflow is basically connections,
>> iterations and controls.
>>
>>
>> > 4. The Activity class is also configured (built) using the
>> > corresponding workflow bundle document.
>>
>> Yes. When a WorkflowBundle is set to run, the Taverna Engine will
>> select the corresponding Activity subclass based on its type URI,
>> instantiate it, and then configure it with a JsonObject (which exists
>> as a JSON file if the Workflow Bundle is saved as a ZIP file.)
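
To illustrate the lookup described above (select by type URI, instantiate,
then configure with a JSON object), here is a minimal sketch. It is not
Taverna code: SketchActivity, PlaceholderCwlActivity, ActivityRegistrySketch,
the "outputPort" key and the example.org type URI are all made up for
illustration, and a plain Map stands in for the JSON object. The activity
is a placeholder of the "dummy" kind mentioned earlier in this thread: it
invokes nothing and always outputs "Hello".

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical registry, NOT the Taverna engine: maps a type URI to a
// factory, then instantiates and configures the matching activity.
class ActivityRegistrySketch {
    private final Map<String, Supplier<SketchActivity>> factories = new HashMap<>();

    void register(String typeUri, Supplier<SketchActivity> factory) {
        factories.put(typeUri, factory);
    }

    // Select by type URI, instantiate, then configure.
    SketchActivity create(String typeUri, Map<String, Object> json) {
        Supplier<SketchActivity> factory = factories.get(typeUri);
        if (factory == null) {
            throw new IllegalArgumentException("No activity registered for " + typeUri);
        }
        SketchActivity activity = factory.get();
        activity.configure(json);
        return activity;
    }

    public static void main(String[] args) {
        String typeUri = "http://example.org/activity/cwl-placeholder"; // made-up URI
        ActivityRegistrySketch registry = new ActivityRegistrySketch();
        registry.register(typeUri, PlaceholderCwlActivity::new);

        Map<String, Object> json = new HashMap<>();
        json.put("outputPort", "greeting"); // assumed key name
        SketchActivity activity = registry.create(typeUri, json);
        System.out.println(activity.execute(new HashMap<>()));
    }
}

// Hypothetical stand-in for an activity interface.
interface SketchActivity {
    void configure(Map<String, Object> json); // Map stands in for a JSON object

    Map<String, String> execute(Map<String, String> inputs);
}

// Placeholder activity: performs no invocation and always outputs "Hello".
class PlaceholderCwlActivity implements SketchActivity {
    private String outputPort = "out";

    @Override
    public void configure(Map<String, Object> json) {
        Object port = json.get("outputPort");
        if (port != null) {
            outputPort = port.toString();
        }
    }

    @Override
    public Map<String, String> execute(Map<String, String> inputs) {
        Map<String, String> outputs = new HashMap<>();
        outputs.put(outputPort, "Hello");
        return outputs;
    }
}
```

Keeping the configuration as generic key/value data rather than a
per-activity bean class mirrors the Taverna 3 approach described above,
where the type URI alone decides how the JSON is interpreted.
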
>>
>>
>> There are different types of activities depending on what kind of
>> invocation they are doing, e.g. a RESTActivity that can do HTTP calls
>> (the configuration says which URI and headers), the ToolActivity can
>> execute a local or SSH command line (the configuration says which
>> command/host), or the BeanshellActivity can run a Beanshell script
>> (the configuration contains the script). There are thus different
>> configuration types, and a corresponding JSON Schema that says which
>> keys and value types to expect for a given type.
>>
>>
>> The imagined CWLActivity (TAVERNA-878) will be either a new kind of
>> activity, or just an alternative configuration of the ToolActivity -
>> as basically a CWL Tool is a command line that in theory may be
>> executed using the correct "docker run" syntax.   In Taverna 3 it is
>> possible to have different kinds of configuration for the same
>> activity, the ToolActivity could be changed to recognize both its
>> existing "classic" Tool configuration and a new "CWL tool"
>> configuration.
>>
>> If you are interested in this execution logic, then it is probably
>> worth having an early stage investigation in the beginning of your
>> GSOC project to see to what extent the existing execution logic of the
>> ToolActivity can run a docker command line - e.g. experimenting in the
>> 2.5 Workbench and adding a Tool that executes a tool as in the CWL
>> Tool description, and then you would be able to see the configuration
>> mapping in a way.   (But it could be that this reveals that say the
>> data handling of CWL Tools is different to how the Tool activity
>> handles input and output files - in which case a new CWLActivity would
>> be a better approach).
>>
>> As you write up your project proposal now, you should have some rough
>> estimates and time plans. Doing both TAVERNA-878 (activity) and
>> TAVERNA-880 (discovery) could be too much for the short duration of
>> GSOC, so if you are interested in both I would suggest to do one of
>> them only minimally.
>>
>>
>> > 5. When designing a CWL Service discovery plugin
>> > <http://dev.mygrid.org.uk/wiki/display/developer/Tutorial+-+Service+discovery+plugin>
>> > I have to implement a dummy Activity class for CWL.
>>
>>
>> Yes, with a "dummy activity" I mean one that can't actually execute
>> anything, it might just always say "Hello" on the output so that you
>> can see in the workbench that you have added something from your CWL
>> Service Discovery plugin.
>>
>> Also when you do this in Taverna 2.5 you need to have an actual
>> Activity subclass to add to the workflow - this is a problem in 2.5 in
>> that you couldn't build workflows with activities your local Taverna
>> didn't know how to execute. In Taverna 3 the workflow building is done
>> with plain java beans from the Taverna Language API, and those don't
>> know anything about execution, and so there it would be possible to
>> build workflows with activities that only run elsewhere (e.g. build a
>> workflow in Windows even though its activity can only run in Linux).
>>
>>
>> Alan - do you think in Taverna 2.5 phase we could let the CWL
>> Discovery plugin add a DisabledActivity instance and then chuck the
>> configuration JSON inside the XML? Would not then that XML be saved
>> directly to the .t2flow?  It would be a bit cheating.. but then this
>> would be cheating anyway, and also it would mean it would both save
>> from Taverna Workbench 2 and load in Taverna 3. (We would need to add
>> a translator on the Taverna Language side though).
>>
>> > It should have logic to
>> > execute cwl-runner with the corresponding tool and inputs and get the
>> > output.
>>
>>
>> Executing cwl-runner from Taverna (configured with a CWL workflow
>> rather than a CWL tool) could be an interesting thing - that would be
>> a way to include a CWL workflow as a nested workflow in Taverna.
>> However I think that would be a different approach, so I've tracked
>> that as a new Jira task
>> https://issues.apache.org/jira/browse/TAVERNA-938
>>
>> This could be some kind of intermediate approach you could explore if
>> you want, as you can execute any CWL tool by generating a one-step CWL
>> workflow and running cwl-runner. However then the user would need to have
>> cwl-runner AND Taverna installed, so it would be more of an
>> intermediate solution, which would however be great for demonstration
>> purposes - and mean that your GSOC work would be usable without
>> waiting for the other tasks.
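
A small sketch of the invocation side of this intermediate approach,
assuming cwl-runner is on the PATH and is called as
"cwl-runner <description> <job file>". The class CwlRunnerCommandSketch
and the file names are hypothetical, and generating the one-step workflow
itself is not shown.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, NOT part of Taverna: builds and optionally runs
// a cwl-runner command line for a CWL description and a job file.
class CwlRunnerCommandSketch {

    // Builds the argument list: cwl-runner <cwlFile> <jobFile>
    static List<String> buildCommand(String cwlFile, String jobFile) {
        List<String> command = new ArrayList<>();
        command.add("cwl-runner");
        command.add(cwlFile);
        command.add(jobFile);
        return command;
    }

    // Runs the command, forwarding its output to this process, and returns
    // the exit code. Fails with IOException if cwl-runner is not installed.
    static int run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        return process.waitFor();
    }

    public static void main(String[] args) {
        // Only prints the command; call run(...) to actually execute it.
        List<String> command = buildCommand("one-step.cwl", "job.json");
        System.out.println(String.join(" ", command));
    }
}
```

Passing the arguments as a list rather than one shell string avoids
quoting problems when file paths contain spaces.
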
>>
>>
>> Personally I am also going to try to work on the CWL support during
>> this spring/summer - so whatever tasks are not picked by an accepted
>> GSOC student would be something I would try to do - however I wouldn't
>> want the students to have to rely on this arriving in time, as your
>> GSOC evaluation (which determines if you get paid!) should be
>> independent of other concurrent work, including different GSOC
>> students.  That doesn't mean you can't collaborate and discuss
>> solutions on this list - I would hope you do! Just don't build other
>> people's work into your project proposal like a blocker.
>>
>>
>> You are asking the right questions! Feel free to ask if you need help
>> with your project proposal!
>>
>>
>> --
>> Stian Soiland-Reyes
>> Apache Taverna (incubating), Apache Commons RDF (incubating)
>> http://orcid.org/0000-0001-9842-9718
>>
>
>
