Thank you, it looks like it's ready to be submitted!

Everyone, the GSOC submission deadline is tomorrow, so you might want to
give it a go today in case the web falls over!

On 23 Mar 2016 20:36, "Nadeesh Dilanga" <[email protected]> wrote:
> Hi Stian,
> Thank you very much for the valuable feedback. I have completed the
> requested changes. Please let me know if you see anything else.
> I will submit this to GSoC tonight anyway, because I can change it until
> the 25th.
>
> On Wed, Mar 23, 2016 at 1:39 PM, Stian Soiland-Reyes <[email protected]>
> wrote:
>
> > Thanks!
> >
> > Overall your proposal looks good! Precise and well structured. I like
> > the diagram, it shows good understanding.
> >
> > "Travena" -> "Taverna"
> >
> > TAVERNA-879 proposes a way to execute a particular step of a Taverna
> > workflow as if it was a command line. This could enable any "classic"
> > Taverna workflow to be converted to a CWL workflow - not just those
> > with the Tool Activity. Imagine a command line tool that takes a
> > JSON/YAML file which corresponds to the existing configuration of any
> > existing Taverna activity (e.g. R, WSDL, REST, Beanshell) - and runs a
> > corresponding one-step workflow with corresponding input and output
> > files.
> >
> > While this might not be an efficient way to do many of the Taverna
> > steps in a non-Taverna CWL engine, it could be a nice transition -
> > allowing you to start your workflow in Taverna, then save as CWL to
> > run on any CWL engine (assuming no fancy iterations were done in the
> > Taverna wf), and develop further by editing the CWL, replacing some of
> > the Taverna steps with more 'native' tools, e.g. calling "curl" in a
> > Docker image instead of a Taverna step that just does a REST call.
> >
> > TAVERNA-879 is a bit more experimental (and exciting!) and it is hard
> > to tell how far you would get, so I would mark your Task 7 as optional
> > in your proposal, but keep it in the schedule - thus you have the
> > option to free up time at the end if for instance you struggle to
> > capture the Docker metadata for task 6.
> >
> > Under "Deliverables" you say "The entire source code zipped" - I think
> > we would prefer to follow the same pattern we used for last year's
> > GSOC - where we ask the students to sign the Apache Individual
> > Contributor License Agreement https://www.apache.org/licenses/icla.txt
> >
> > Then you commit your code continually to our git repositories using
> > GitHub pull requests (if you don't like GitHub we can do git patches
> > by email/Jira) - rather than a big ZIP at the end which we have to
> > figure out. This part helps you learn how to interact with an open
> > source project - and it teaches us how to interact with third-party
> > submissions :)
> >
> > So I would change deliverable 4 to "Regular GitHub pull requests with
> > source code" - we can agree on the repositories later - I guess
> > docker-activity would be added to taverna-common-activities - while
> > the TAVERNA-879 tool could be added to the taverna-command-line.
> >
> > As for testing it would be great to start with some example workflows
> > which just run Docker with the existing Tool activity - you could
> > develop these during your first 4 weeks as a way to get to know
> > Taverna. And then we can transition those workflows for the new Docker
> > activity in the Testing steps - and they can become a separate
> > deliverable.
> >
> > On 23 March 2016 at 08:54, Nadeesh Dilanga <[email protected]> wrote:
> > > Hi Stian et al,
> > > Here I have drafted my proposal [1]. I would appreciate everyone's
> > > feedback on the proposal. Please let me know if this is not aligned
> > > with your original expectation from this project, or whether it
> > > needs any scope level changes.
> > >
> > > Apart from that, @TAVERNA-900 can you please clarify the following:
> > >
> > > "Create a Docker tool for executing Taverna activities (TAVERNA-879) -
> > > *this allows any Taverna steps to be used by other CWL engines*"
> > >
> > > [1] -
> > > https://docs.google.com/document/d/1DKYuzr2hA5brQ2xBz_AVQgMWXB5qm6rftbWoGbnbXrg/edit?usp=sharing
> > >
> > > On Tue, Mar 22, 2016 at 1:42 PM, Nadeesh Dilanga <[email protected]>
> > > wrote:
> > >
> > >> Hi,
> > >> Thank you very much for the quick response. I will go through these
> > >> a bit more and get back when I meet any roadblocks.
> > >>
> > >> On Mon, Mar 21, 2016 at 10:15 PM, Stian Soiland-Reyes <[email protected]>
> > >> wrote:
> > >>
> > >>> On 21 March 2016 at 00:51, Nadeesh Dilanga <[email protected]> wrote:
> > >>>
> > >>> > First of all, I apologize for the delayed response. I wanted to
> > >>> > give myself a bit more time to understand and go through what
> > >>> > Taverna is and what exactly the expected outcome of the project
> > >>> > is (tutorials and related slide decks and also YouTube videos
> > >>> > were very helpful), because this will be my one and only GSoC
> > >>> > proposal and I want it to be perfect!
> > >>>
> > >>> Thanks! You don't have to do it perfect - just great! :-))
> > >>>
> > >>> > 1. Taverna is a BPMN-like (but more extensive and scoped more
> > >>> > widely in features) workflow engine which has several ways of
> > >>> > creating workflows and different interfaces for accessing them.
> > >>>
> > >>> While I guess we don't like to be compared with BPMN, I think you are
> > >>> correct. :)
> > >>>
> > >>> > 2. When creating workflows, one major extension point to cater
> > >>> > for custom use cases is to plug in/create your own
> > >>> > services/service types, which is a great model IMHO.
> > >>> > And this project is in fact to write an adapter (an activity
> > >>> > plugin, which I believe is the executor of an invocation of a
> > >>> > service) for when someone needs to run something on Docker at
> > >>> > some phase of their workflow.
> > >>>
> > >>> Correct - thus one could have a workflow with multiple tools from
> > >>> different docker images.
> > >>>
> > >>> > If #2 is correct, can you please provide me an example of a use
> > >>> > case which led to this project idea, because I feel I may be
> > >>> > missing something here. IMHO, even for Docker it will eventually
> > >>> > be a service invocation from a workflow front, and what Taverna
> > >>> > needs is some activity plugins that are aware of the particular
> > >>> > transport protocols.
> > >>>
> > >>> We already have the Tool activity which allows you to run command line
> > >>> tools - however such workflows are hard to share as anyone receiving
> > >>> it may not have that tool installed, or in the same version/location.
> > >>>
> > >>> While approaches like https://www.debian.org/devel/debian-med/ and
> > >>> BioLinux have helped towards "How to get it installed" - it then moves
> > >>> the requirement to a particular operating system, which in a way is
> > >>> worse.
> > >>>
> > >>> Docker solves the "How to consistently install this tool" problem -
> > >>> and even works (almost) seamlessly from OS X and Windows. It adds nice
> > >>> reproducibility aspects as you can mark the exact snapshot version of
> > >>> the docker image you have used.
> > >>>
> > >>> There are now also initiatives such as http://bioboxes.org/ (and to a
> > >>> certain degree http://bio.tools/ ) which describe bioinformatics tools
> > >>> as Docker images - thus these can in theory be used directly from
> > >>> Taverna.
> > >>>
> > >>> Perhaps part of the project would be to define a use case so we find
> > >>> some actual command lines we want to run in a Taverna workflow - e.g.
> > >>> to run HMMER for sequence alignment using
> > >>> https://hub.docker.com/r/dockerbiotools/hmmer/ using sequences fetched
> > >>> from an EBI web service? I am not sure how much of the bioinformatics
> > >>> side you would like to get into! :)
> > >>>
> > >>> > (example: HTTP service hosted in Docker, HTTP activity plugin;
> > >>> > Message Broker service hosted in Docker, you need an AMQP- or
> > >>> > MQTT-like activity plugin)
> > >>>
> > >>> Yes, but I don't think we want to run many of those kinds of services
> > >>> from Taverna, I was thinking more of running just command line tools
> > >>> that happen to be packaged as Docker images.
> > >>>
> > >>> > 3. Or is the case to invoke some composite applications that are
> > >>> > deployed/installed in Docker, disregarding what the protocols are?
> > >>>
> > >>> No, this would get a bit more complex, so I would stay away from that
> > >>> for the GSOC project - although of course the potential is a very
> > >>> interesting motivation as well.
> > >>>
> > >>> I think this is what I described in
> > >>> https://issues.apache.org/jira/browse/TAVERNA-941
> > >>>
> > >>> > If #3 is correct, what we run in the docker container can be
> > >>> > another Taverna workflow. If that is the case your idea on "Save
> > >>> > workflow as Docker image" will be a superb addition!
> > >>>
> > >>> Yes! It should then be possible! But.. why? :) Run with older Taverna
> > >>> version?
> > >>>
> > >>> One interesting thing, if there's also "Save workflow as Docker
> > >>> image" - and such a docker image is then added as a Docker image -
> > >>> would be to "unwrap" it and show the inner workflow in Taverna.
> > >>>
> > >>> With Docker there's a big danger of going down the "It's turtles all
> > >>> the way down" recursion - hence I tried to scope the GSOC ideas to be
> > >>> more concrete about running command line tools.
> > >>>
> > >>> > So with this, I would like to understand what the Taverna
> > >>> > community expects from "Invoking Docker from Taverna" in this
> > >>> > GSoC project, so that I can be more specific in my project
> > >>> > proposal and make it the best project for this summer for
> > >>> > Taverna.
> > >>> >
> > >>> > On Fri, Mar 18, 2016 at 7:18 AM, Stian Soiland-Reyes <[email protected]>
> > >>> > wrote:
> > >>> >
> > >>> >> On 17 March 2016 at 15:22, alaninmcr <[email protected]> wrote:
> > >>> >> >> I found Docker to be an excellent solution for scaling and
> > >>> >> >> easy deployment, and obviously a hot topic these days in
> > >>> >> >> enterprises who want to implement micro services based
> > >>> >> >> architecture/deployment for low footprint servers/services.
> > >>> >> >>
> > >>> >> >> I presume the idea behind Docker support for Taverna is NOT
> > >>> >> >> from a micro service standpoint, but more like from a
> > >>> >> >> packaging and deployment perspective. Please correct me if I
> > >>> >> >> am wrong.
> > >>> >>
> > >>> >> No, you are right in that our current Docker ideas would not be about
> > >>> >> creating Taverna (or Taverna workflow) as a micro-service, but to use
> > >>> >> Docker for execution.
> > >>> >>
> > >>> >> A similar aspect could be to use Docker to start up a set of
> > >>> >> microservices accompanying the Workflow, and then access them from
> > >>> >> Taverna workflow using the existing WSDL and REST activities.
> > >>> >> This is something that I am interested in within the
> > >>> >> http://bioexcel.eu/ project - but is a bit more architecturally
> > >>> >> challenging as it would mean things like dynamic port bindings in the
> > >>> >> workflow configuration. It
> > >>> >>
> > >>> >> I've tracked this as https://issues.apache.org/jira/browse/TAVERNA-941
> > >>> >> but IMHO it would be too big a task for a GSOC project.
> > >>> >>
> > >>> >> > There are two separate issues:
> > >>> >> >
> > >>> >> > https://issues.apache.org/jira/browse/TAVERNA-901 is to allow
> > >>> >> > Taverna workflows to include steps that are tools that are
> > >>> >> > inside docker containers. That would be deployment of an
> > >>> >> > existing docker.
> > >>> >> >
> > >>> >> > https://issues.apache.org/jira/browse/TAVERNA-879 is to create
> > >>> >> > docker containers for Taverna workflows. That is packaging and
> > >>> >> > (because the containers will be part of a CWL workflow)
> > >>> >> > deployment.
> > >>> >>
> > >>> >> Nadeesh, I've added your interest to
> > >>> >> https://cwiki.apache.org/confluence/display/TAVERNADEV/2016-03+GSOC+2016
> > >>> >>
> > >>> >> but if you are more interested in packaging for Docker, then perhaps
> > >>> >> we could look at the existing Docker wrapping of Taverna Server
> > >>> >>
> > >>> >> https://hub.docker.com/r/taverna/taverna-server/
> > >>> >> https://github.com/taverna-extras/taverna-server-docker
> > >>> >>
> > >>> >> and consider doing something similar for our command line tools
> > >>> >> "executeworkflow" and "tavlang".
> > >>> >>
> > >>> >> That shouldn't take you too long - so you may want to prototype one of
> > >>> >> TAVERNA-901 and TAVERNA-879 as well.
> > >>> >>
> > >>> >> I know Dmitry used wsdl-generic as a command line tool as in
> > >>> >> http://inb.bsc.es/documents/galaxygears/ which could also be
> > >>> >> interesting as a Docker container (e.g.
> > >>> >> for running WSDL services within a CWL workflow), but I am not sure
> > >>> >> where the source code for that is (is that outside Apache, Dmitry?)
> > >>> >>
> > >>> >> >> If that is the case, can you please clarify what is the current
> > >>> >> >> packaging deployment model ?
> > >>> >>
> > >>> >> For Taverna 2.5 we used install4j via Maven to package into an
> > >>> >> installer:
> > >>> >>
> > >>> >> https://github.com/apache/incubator-taverna-commandline/blob/old/taverna-commandline-product-core-20141228/pom.xml#L1712
> > >>> >>
> > >>> >> That's what made the installers we have at
> > >>> >> https://taverna.incubator.apache.org/download/command-line-tool/
> > >>> >>
> > >>> >> One packaging task we could consider for Taverna 3.0 is to update
> > >>> >> https://github.com/apache/incubator-taverna-commandline/tree/master/taverna-commandline-product
> > >>> >> to use install4j or similar to generate such installers also for
> > >>> >> Taverna 3, which has a slightly different folder structure.
> > >>> >>
> > >>> >> As an open source project we have 5 licenses for Install4j, but we
> > >>> >> have not asked the author yet if this is still valid under Apache.
> > >>> >> Now releasing under Apache license instead of LGPL we would ironically
> > >>> >> now be allowed to bundle the binary Oracle JRE rather than having to
> > >>> >> use the open source OpenJDK builds.
> > >>> >>
> > >>> >> But I'm afraid such a task would not involve Docker - as I think most
> > >>> >> users of Taverna Command line would not have Docker (or even the right
> > >>> >> Java version) installed.
> > >>> >>
> > >>> >> > There is no current mechanism for packaging up something to run a
> > >>> >> > specific Taverna workflow.
> > >>> >> > You can run workflows from the command line tool or on a
> > >>> >> > Taverna Server.
> > >>> >>
> > >>> >> Making a recipe for generating Docker images for running a particular
> > >>> >> Taverna Workflow could be interesting. We could then have "Save
> > >>> >> workflow as Docker image" built into Taverna!
> > >>> >>
> > >>> >> If you are thinking about such an idea, feel free to suggest it as a
> > >>> >> new Jira task!
> > >>> >>
> > >>> >> Overall - you don't have to pick exactly our ideas - you can be
> > >>> >> inspired by them and will have to write your own proposal about what
> > >>> >> work you propose to do (which should be reasonably scoped and
> > >>> >> scheduled) and say how Apache Taverna would benefit.
> > >>> >>
> > >>> >> Looking forward to hear more about your ideas!
> > >>> >>
> > >>> >> --
> > >>> >> Stian Soiland-Reyes
> > >>> >> Apache Taverna (incubating), Apache Commons RDF (incubating)
> > >>> >> http://orcid.org/0000-0001-9842-9718
> > >>>
> > >>> --
> > >>> Stian Soiland-Reyes
> > >>> Apache Taverna (incubating), Apache Commons RDF (incubating)
> > >>> http://orcid.org/0000-0001-9842-9718
> >
> > --
> > Stian Soiland-Reyes
> > Apache Taverna (incubating), Apache Commons RDF (incubating)
> > http://orcid.org/0000-0001-9842-9718
