Hi Stian,

Thank you very much for the valuable feedback. I have completed the requested changes. Please let me know if you see anything else. I will submit this to GSoC tonight anyway, since I can still change it until the 25th.

On Wed, Mar 23, 2016 at 1:39 PM, Stian Soiland-Reyes <[email protected]> wrote:
> Thanks!
>
> Overall your proposal looks good! Precise and well structured. I like
> the diagram, it shows good understanding.
>
> "Travena" -> "Taverna"
>
> TAVERNA-879 proposes a way to execute a particular step of a Taverna
> workflow as if it was a command line. This could enable any "classic"
> Taverna workflow to be converted to a CWL workflow - not just those
> with the Tool Activity. Imagine a command line tool that takes a
> JSON/YAML file which corresponds to the existing configuration of any
> existing Taverna activity (e.g. R, WSDL, REST, Beanshell) - and runs a
> corresponding one-step workflow with corresponding input and output
> files.
>
> While this might not be an efficient way to do many of the Taverna
> steps in a non-Taverna CWL engine, it could be a nice transition -
> allowing you to start your workflow in Taverna, then save as CWL to
> run on any CWL engine (assuming no fancy iterations were done in the
> Taverna wf), and develop further by editing the CWL, replacing some of
> the Taverna steps with more 'native' tools, e.g. calling "curl" in a
> Docker image instead of a Taverna step that just does a REST call.
>
> TAVERNA-879 is a bit more experimental (and exciting!) and it is hard
> to tell how far you would get, so I would mark your Task 7 as optional
> in your proposal, but keep it in the schedule - thus you have the option
> to free up time at the end if for instance you struggle to capture the
> Docker metadata for task 6.
>
> Under "Deliverables" you say "The entire source code zipped" - I think
> we would prefer to follow the same pattern we used for last year's
> GSOC - where we ask the students to sign the Apache Individual
> Contributor License Agreement https://www.apache.org/licenses/icla.txt
>
> Then you commit your code continually to our git repositories using
> GitHub pull requests.
> (If you don't like GitHub we can do git patches
> by email/Jira) - rather than a big ZIP at the end which we have to
> figure out. This part helps you learn how to interact with an open
> source project - and it teaches us how to interact with third-party
> submissions :)
>
> So I would change deliverable 4 to "Regular GitHub pull requests with
> source code" - we can agree on the repositories later - I guess
> docker-activity would be added to taverna-common-activities - while
> the TAVERNA-879 tool could be added to the taverna-command-line.
>
> As for testing it would be great to start with some example workflows
> which just run Docker with the existing Tool activity - you could
> develop these during your first 4 weeks as a way to get to know
> Taverna. And then we can transition those workflows for the new Docker
> activity in the Testing steps - and they can become a separate
> deliverable.
>
> On 23 March 2016 at 08:54, Nadeesh Dilanga <[email protected]> wrote:
> > Hi Stian et al,
> > Here I have drafted my proposal [1]. Appreciate everyone's feedback on the
> > proposal. Please let me know if this is not aligned with your original
> > expectation from this project, or whether it needs any scope level changes.
> >
> > Apart from that, @TAVERNA-900 can you please clarify the following:
> >
> > "Create a Docker tool for executing Taverna activities (TAVERNA-879) - *this
> > allows any Taverna steps to be used by other CWL engines*"
> >
> > [1] -
> > https://docs.google.com/document/d/1DKYuzr2hA5brQ2xBz_AVQgMWXB5qm6rftbWoGbnbXrg/edit?usp=sharing
> >
> > On Tue, Mar 22, 2016 at 1:42 PM, Nadeesh Dilanga <[email protected]>
> > wrote:
> >
> >> Hi,
> >> Thank you very much for the quick response. I will go through these a bit
> >> more and get back when I meet any roadblocks.
> >>
> >> On Mon, Mar 21, 2016 at 10:15 PM, Stian Soiland-Reyes <[email protected]>
> >> wrote:
> >>
> >>> On 21 March 2016 at 00:51, Nadeesh Dilanga <[email protected]> wrote:
> >>>
> >>> > First of all, apologies for the delayed response. I wanted to give
> >>> > myself a bit more time to understand and go through what Taverna is
> >>> > and what exactly the expected outcome of the project is (tutorials,
> >>> > related slide decks and also YouTube videos were very helpful),
> >>> > because this will be my one and only GSoC proposal and I want it to
> >>> > be perfect!
> >>>
> >>> Thanks! You don't have to do it perfect - just great! :-))
> >>>
> >>> > 1. Taverna is a BPMN-like (but more extensive and more widely scoped
> >>> > in features) workflow engine which has several ways of creating
> >>> > workflows and different interfaces to access them.
> >>>
> >>> While I guess we don't like to be compared with BPMN, I think you are
> >>> correct. :)
> >>>
> >>> > 2. When creating workflows, one major extension point to cater for
> >>> > custom use cases is to plug in/create your own services/service
> >>> > types, which is a great model IMHO. And this project is in fact to
> >>> > write an adapter (an activity plugin, which I believe is the executor
> >>> > of a service invocation) for when someone needs to run something on
> >>> > Docker at some phase of their workflow.
> >>>
> >>> Correct - thus one could have a workflow with multiple tools from
> >>> different docker images.
> >>>
> >>> > If #2 is correct, can you please provide me an example of a use case
> >>> > which led to this project idea, because I feel I may be missing
> >>> > something here. IMHO, even for Docker it will eventually be a service
> >>> > invocation from the workflow front, and what Taverna needs is some
> >>> > activity plugins that are aware of the particular transport protocols.
> >>>
> >>> We already have the Tool activity which allows you to run command line
> >>> tools - however such workflows are hard to share as anyone receiving
> >>> it may not have that tool installed, or in the same version/location.
> >>>
> >>> While approaches like https://www.debian.org/devel/debian-med/ and
> >>> BioLinux have helped towards "How to get it installed" - it then moves
> >>> the requirement to a particular operating system, which in a way is
> >>> worse.
> >>>
> >>> Docker solves the "How to consistently install this tool" problem -
> >>> and even works (almost) seamlessly from OS X and Windows. It adds nice
> >>> reproducibility aspects as you can mark the exact snapshot version of
> >>> the docker image you have used.
> >>>
> >>> There are now also initiatives such as http://bioboxes.org/ (and to a
> >>> certain degree http://bio.tools/) which describe bioinformatics tools
> >>> as Docker images - thus these can in theory be used directly from
> >>> Taverna.
> >>>
> >>> Perhaps part of the project would be to define a use case so we find
> >>> some actual command lines we want to run in a Taverna workflow - e.g.
> >>> to run HMMER for sequence alignment using
> >>> https://hub.docker.com/r/dockerbiotools/hmmer/ using sequences fetched
> >>> from an EBI web service? I am not sure how much of the bioinformatics
> >>> side you would like to get into! :)
> >>>
> >>> > (example: HTTP service hosted in Docker, HTTP activity plugin;
> >>> > Message Broker service hosted in Docker, you need an AMQP/MQTT-like
> >>> > activity plugin)
> >>>
> >>> Yes, but I don't think we want to run many of those kind of services
> >>> from Taverna, I was thinking more of running just command line tools
> >>> that happen to be packaged as Docker images.
> >>>
> >>> > 3. Or is the case to invoke some composite applications that are
> >>> > deployed/installed in Docker, regardless of what the protocols are?
> >>>
> >>> No, this would get a bit more complex, so I would stay away from that
> >>> for the GSOC project - although of course the potential is a very
> >>> interesting motivation as well.
> >>>
> >>> I think this is what I described in
> >>> https://issues.apache.org/jira/browse/TAVERNA-941
> >>>
> >>> > If #3 is correct, what we run in the docker container can be another
> >>> > Taverna workflow. If that is the case your idea on "Save workflow as
> >>> > Docker image" will be a superb addition!
> >>>
> >>> Yes! It should then be possible! But.. why? :) Run with older Taverna
> >>> version?
> >>>
> >>> One interesting thing could be if there's also "Save workflow as
> >>> Docker image" - if such a docker image is added as a Docker image -
> >>> would be to "unwrap" it and show the inner workflow in Taverna.
> >>>
> >>> With Docker there's a big danger of going down the "It's turtles all
> >>> the way down" recursion - hence I tried to scope the GSOC ideas to be
> >>> more concrete about running command line tools.
> >>>
> >>> > So with this, I would like to understand what the Taverna community
> >>> > expects from "Invoking Docker from Taverna" in this GSoC project, so
> >>> > that I can be more specific in my project proposal and make it the
> >>> > best project for this summer for Taverna.
> >>> >
> >>> > On Fri, Mar 18, 2016 at 7:18 AM, Stian Soiland-Reyes <[email protected]>
> >>> > wrote:
> >>> >
> >>> >> On 17 March 2016 at 15:22, alaninmcr <[email protected]> wrote:
> >>> >> >> I found Docker to be an excellent solution for scaling and easy
> >>> >> >> deployment, and obviously a hot topic these days in enterprises
> >>> >> >> that want to implement a micro services based
> >>> >> >> architecture/deployment for low footprint servers/services.
> >>> >> >>
> >>> >> >> I presume the idea behind Docker support for Taverna is NOT from a
> >>> >> >> micro service standpoint, but more like from a packaging and
> >>> >> >> deployment perspective. Please correct me if I am wrong.
> >>> >>
> >>> >> No, you are right in that our current Docker ideas would not be about
> >>> >> creating Taverna (or Taverna workflow) as a micro-service, but to use
> >>> >> Docker for execution.
> >>> >>
> >>> >> A similar aspect could be to use Docker to start up a set of
> >>> >> microservices accompanying the Workflow, and then access them from
> >>> >> Taverna workflow using the existing WSDL and REST activities.
> >>> >> This is something that I am interested in within the
> >>> >> http://bioexcel.eu/ project - but is a bit more architecturally
> >>> >> challenging as it would mean things like dynamic port bindings in the
> >>> >> workflow configuration.
> >>> >>
> >>> >> I've tracked this as https://issues.apache.org/jira/browse/TAVERNA-941
> >>> >> but IMHO it would be a too big task for a GSOC project.
> >>> >>
> >>> >> > There are two separate issues:
> >>> >> >
> >>> >> > https://issues.apache.org/jira/browse/TAVERNA-901 is to allow Taverna
> >>> >> > workflows to include steps that are tools that are inside docker
> >>> >> > containers. That would be deployment of an existing docker.
> >>> >> >
> >>> >> > https://issues.apache.org/jira/browse/TAVERNA-879 is to create docker
> >>> >> > containers for Taverna workflows. That is packaging and (because the
> >>> >> > containers will be part of a CWL workflow) deployment.
> >>> >>
> >>> >> Nadeesh, I've added your interest to
> >>> >>
> >>> >> https://cwiki.apache.org/confluence/display/TAVERNADEV/2016-03+GSOC+2016
> >>> >>
> >>> >> but if you are more interested in packaging for Docker, then perhaps
> >>> >> we could look at the existing Docker wrapping of Taverna Server
> >>> >>
> >>> >> https://hub.docker.com/r/taverna/taverna-server/
> >>> >> https://github.com/taverna-extras/taverna-server-docker
> >>> >>
> >>> >> and consider doing something similar for our command line tools
> >>> >> "executeworkflow" and "tavlang".
> >>> >>
> >>> >> That shouldn't take you too long - so you may want to prototype one of
> >>> >> TAVERNA-901 and TAVERNA-879 as well.
> >>> >>
> >>> >> I know Dmitry used wsdl-generic as a command line tool as in
> >>> >> http://inb.bsc.es/documents/galaxygears/ which could also be
> >>> >> interesting as a Docker container (e.g. for running WSDL services
> >>> >> within a CWL workflow), but I am not sure where the source code for
> >>> >> that is (is that outside Apache, Dmitry?)
> >>> >>
> >>> >> >> If that is the case, can you please clarify what is the current
> >>> >> >> packaging deployment model?
> >>> >>
> >>> >> For Taverna 2.5 we used install4j via Maven to package into an
> >>> >> installer:
> >>> >>
> >>> >> https://github.com/apache/incubator-taverna-commandline/blob/old/taverna-commandline-product-core-20141228/pom.xml#L1712
> >>> >>
> >>> >> That's what made the installers we have at
> >>> >> https://taverna.incubator.apache.org/download/command-line-tool/
> >>> >>
> >>> >> One packaging task we could consider for Taverna 3.0 is to update
> >>> >>
> >>> >> https://github.com/apache/incubator-taverna-commandline/tree/master/taverna-commandline-product
> >>> >>
> >>> >> to use install4j or similar to generate such installers also for
> >>> >> Taverna 3, which has a slightly different folder structure.
> >>> >>
> >>> >> As an open source project we have 5 licenses for Install4j, but we
> >>> >> have not asked the author yet if this is still valid under Apache.
> >>> >> Now that we release under the Apache license instead of LGPL, we
> >>> >> would ironically be allowed to bundle the binary Oracle JRE rather
> >>> >> than having to use the open source OpenJDK builds.
> >>> >>
> >>> >> But I'm afraid such a task would not involve Docker - as I think most
> >>> >> users of Taverna Command line would not have Docker (or even the right
> >>> >> Java version) installed.
> >>> >>
> >>> >> > There is no current mechanism for packaging up something to run a
> >>> >> > specific Taverna workflow. You can run workflows from the command
> >>> >> > line tool or on a Taverna Server.
> >>> >>
> >>> >> Making a recipe for generating Docker images for running a particular
> >>> >> Taverna Workflow could be interesting. We could then have "Save
> >>> >> workflow as Docker image" built into Taverna!
> >>> >>
> >>> >> If you are thinking about such an idea, feel free to suggest it as a
> >>> >> new Jira task!
> >>> >>
> >>> >> Overall - you don't have to pick exactly our ideas - you can be
> >>> >> inspired by them and will have to write your own proposal about what
> >>> >> work you propose to do (which should be reasonably scoped and
> >>> >> scheduled) and say how Apache Taverna would benefit.
> >>> >>
> >>> >> Looking forward to hearing more about your ideas!
> >>> >>
> >>> >> --
> >>> >> Stian Soiland-Reyes
> >>> >> Apache Taverna (incubating), Apache Commons RDF (incubating)
> >>> >> http://orcid.org/0000-0001-9842-9718
> >>> >>
> >>>
> >>> --
> >>> Stian Soiland-Reyes
> >>> Apache Taverna (incubating), Apache Commons RDF (incubating)
> >>> http://orcid.org/0000-0001-9842-9718
> >>>
> >>
> >
>
> --
> Stian Soiland-Reyes
> Apache Taverna (incubating), Apache Commons RDF (incubating)
> http://orcid.org/0000-0001-9842-9718
>
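
For illustration, the core idea discussed above (running a command line tool that happens to be packaged as a Docker image, with the image version pinned for reproducibility) could be sketched roughly as below. This is only a sketch under assumptions, not the Taverna Docker activity: the helper name build_docker_command, the /data mount point, the "latest" tag and the hmmsearch arguments are all made up for the example, and it assumes the image lets you pass the tool name as the command.

```python
import shlex


def build_docker_command(image, tool_args, workdir=None, mount_point="/data"):
    """Build the argv list for running a command line tool inside a Docker image.

    Hypothetical helper for illustration only; it is not part of Taverna.
    """
    # --rm removes the container once the tool has finished.
    command = ["docker", "run", "--rm"]
    if workdir is not None:
        # Bind-mount a host directory and make it the working directory,
        # so the tool can read its input files and write its output files.
        command += ["-v", f"{workdir}:{mount_point}", "-w", mount_point]
    # Pinning the image by tag (or digest) records which version was used.
    command.append(image)
    command += list(tool_args)
    return command


if __name__ == "__main__":
    argv = build_docker_command(
        "dockerbiotools/hmmer:latest",  # image named in the thread; tag is assumed
        ["hmmsearch", "profile.hmm", "sequences.fasta"],  # assumed example arguments
        workdir="/tmp/wf-run",  # assumed host directory holding the input files
    )
    # Print a shell-quoted command line; pass argv to subprocess.run to execute it.
    print(shlex.join(argv))
```

Building the command as a list rather than a single string avoids shell quoting problems when file names contain spaces.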
