On 21 March 2016 at 00:51, Nadeesh Dilanga <[email protected]> wrote:
> First of all, apologize for the delayed response. I wanted to give my self
> bit more time to understand and going through what Taverna is and what
> exactly the expected outcome of the project (tutorials and related slide
> decks and also youtube videos were very helpful). Because this will be my
> one and only GSoC proposal and I want it to be perfect!.

Thanks! You don't have to make it perfect - just great! :-))

> 1. Taverna is a BPMN like(but more extensive and scoped more widely in
> features) workflow engine which has several ways of creating work flows and
> different interfaces of access them.

While I guess we don't like to be compared with BPMN, I think you are
correct. :)

> 2. When creating workflows, one major extension point to cater custom use
> cases is, to plug/create your own services/service types which is a great
> model IMHO. And this project is in fact to write an adapter(activity plugin
> which I believe is the executor of an invocation of a service) when some
> one needs to run something on Docker at some phase of his workflow.

Correct - thus one could have a workflow with multiple tools from
different docker images.

> if #2 is correct, can you please provide me an example of an use case which
> led to this project idea, because feels I may be missing something here.
> Because IMHO, even for docker eventually it will be a service invocation
> from a workflow front, and what Taverna needs is some activity plugins that
> are aware of the particular transport protocols.

We already have the Tool activity, which allows you to run command line
tools - however such workflows are hard to share, as anyone receiving
one may not have that tool installed, or not in the same
version/location.

While approaches like https://www.debian.org/devel/debian-med/ and
BioLinux have helped towards "How to get it installed" - it then moves
the requirement to a particular operating system, which in a way is
worse.

Docker solves the "How to consistently install this tool" problem -
and even works (almost) seamlessly from OS X and Windows. It adds nice
reproducibility aspects as you can mark the exact snapshot version of
the docker image you have used.

There are now also initiatives such as http://bioboxes.org/ (and to a
certain degree http://bio.tools/ ) which describe bioinformatics tools
as Docker images - thus these can in theory be used directly from
Taverna.

Perhaps part of the project would be to define a use case so we find
some actual command lines we want to run in a Taverna workflow - e.g.
to run HMMER for sequence alignment using
https://hub.docker.com/r/dockerbiotools/hmmer/ using sequences fetched
from an EBI web service? I am not sure how much of the bioinformatics
side you would like to get into! :)

> (example: http service hosted in Docker, Http activity plugin, Message
> Broker service hosted in Docker, you need AMQP,MQTT like activity plugin)

Yes, but I don't think we want to run many of those kinds of services
from Taverna. I was thinking more of running just command line tools
that happen to be packaged as Docker images.

> 3. Or the case is to invoke some composite applications that
> deployed/installed in Docker disregarding what the protocols are ?

No, this would get a bit more complex, so I would stay away from that
for the GSOC project - although of course the potential is a very
interesting motivation as well. I think this is what I described in
https://issues.apache.org/jira/browse/TAVERNA-941

> if #3 is correct, what we run in the docker container can be another
> Taverna workflow. If that is the case your idea on "Save workflow as Docker
> image" will be a superb addition!.

Yes! It should then be possible! But.. why? :) Run with older Taverna
version?

One interesting thing, if there is also a "Save workflow as Docker
image" feature, would be to "unwrap" such an image when it is added as
a Docker image and show the inner workflow in Taverna.

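To make the "command line tool that happens to be packaged as a Docker
image" case a bit more concrete, here is a rough Python sketch of the
kind of command line such an activity might end up building. It is
only an illustration, not a proposed design: the function name
docker_tool_command, the /data mount point and the input file names
are made up for this example, and it assumes the image has the tool
(here HMMER's hmmscan) on its PATH, which I have not checked for
dockerbiotools/hmmer. The optional digest shows how the exact image
snapshot could be pinned for reproducibility.

```python
import shlex
from pathlib import Path
from typing import List, Optional


def docker_tool_command(
    image: str,
    tool_args: List[str],
    workdir: Path,
    digest: Optional[str] = None,
) -> List[str]:
    """Build the argv for running a command line tool from a Docker image.

    `digest` (for example "sha256:...") pins the exact image snapshot;
    without it, Docker resolves the image name to whatever is current.
    """
    image_ref = f"{image}@{digest}" if digest else image
    return [
        "docker", "run",
        "--rm",                              # remove the container when done
        "-v", f"{workdir.absolute()}:/data",  # share input/output files (assumed mount point)
        "-w", "/data",                       # run the tool inside the shared folder
        image_ref,
        *tool_args,
    ]


if __name__ == "__main__":
    # Hypothetical invocation; the file names are placeholders.
    argv = docker_tool_command(
        "dockerbiotools/hmmer",
        ["hmmscan", "profiles.hmm", "sequences.fasta"],
        Path("."),
    )
    # Print the command instead of running it, so no Docker is needed here.
    print(shlex.join(argv))
```

The function only builds the argument list; actually running it (for
instance with subprocess.run) and wiring the files to workflow ports
would be the real work of the activity.
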
With Docker there's a big danger of going down the "It's turtles all
the way down" recursion - hence I tried to scope the GSOC ideas to be
more concrete about running command line tools.

> So with this, I would like to understand what Taverna community expect
> from "Invoking Docker from Taverna" on this GSoC project. So that I can be
> more specific on my project proposal and make it the best project for this
> summer for Taverna.
>
> On Fri, Mar 18, 2016 at 7:18 AM, Stian Soiland-Reyes <[email protected]>
> wrote:
>
>> On 17 March 2016 at 15:22, alaninmcr <[email protected]> wrote:
>> >> I found Docker as an excellent solution for scaling, easy deployment and
>> >> obviously a hot topic these days in enterprises who want to implement
>> >> micro services based architecture/deployment for low footprint
>> >> servers/services.
>> >>
>> >> I presume the idea behind Docker support for Taverna is NOT from a micro
>> >> service standpoint, but more like from a packaging and deployment
>> >> perspective. Please correct me if I am wrong.
>>
>> No, you are right in that our current Docker ideas would not be about
>> creating Taverna (or Taverna workflow) as a micro-service, but to use
>> Docker for execution.
>>
>> A similar aspect could be to use Docker to start up a set of
>> microservices accompanying the Workflow, and then access them from
>> Taverna workflow using the existing WSDL and REST activities.
>> This is something that I am interested in within the
>> http://bioexcel.eu/ project - but is a bit more architecturally
>> challenging as it would mean things like dynamic port bindings in the
>> workflow configuration. It
>>
>> I've tracked this as https://issues.apache.org/jira/browse/TAVERNA-941
>> but IMHO it would be a too big task for a GSOC project.
>>
>>
>> > There are two separate issues:
>> >
>> > https://issues.apache.org/jira/browse/TAVERNA-901 is to allow Taverna
>> > workflows to include steps that are tools that run inside docker containers.
>> > That would be deployment of an existing docker.
>> >
>> > https://issues.apache.org/jira/browse/TAVERNA-879 is to create docker
>> > containers for Taverna workflows. That is packaging and (because the
>> > containers will be part of a CWL workflow) deployment.
>>
>> Nadeesh, I've added your interest to
>> https://cwiki.apache.org/confluence/display/TAVERNADEV/2016-03+GSOC+2016
>>
>> but if you are more interested in packaging for Docker, then perhaps
>> we could look at the existing Docker wrapping of Taverna Server
>>
>> https://hub.docker.com/r/taverna/taverna-server/
>> https://github.com/taverna-extras/taverna-server-docker
>>
>> and consider doing something similar for our command line tools
>> "executeworkflow" and "tavlang".
>>
>> That shouldn't take you too long - so you may want to prototype one of
>> TAVERNA-901 and TAVERNA-879 as well.
>>
>>
>> I know Dmitry used wsdl-generic as a command line tool as in
>> http://inb.bsc.es/documents/galaxygears/ which could also be
>> interesting as a Docker container (e.g. for running WSDL services
>> within a CWL workflow), but I am not sure where the source code for
>> that is (is that outside Apache, Dmitry?)
>>
>> >> If that is the case, can you please clarify what is the current
>> >> packaging deployment model ?
>>
>> For Taverna 2.5 we used install4j via Maven to package into an installer:
>>
>> https://github.com/apache/incubator-taverna-commandline/blob/old/taverna-commandline-product-core-20141228/pom.xml#L1712
>>
>> That's what made the installers we have at
>> https://taverna.incubator.apache.org/download/command-line-tool/
>>
>> One packaging task we could consider for Taverna 3.0 is to update
>>
>> https://github.com/apache/incubator-taverna-commandline/tree/master/taverna-commandline-product
>> to use install4j or similar to generate such installers also for
>> Taverna 3, which has a slightly different folder structure.
>>
>> As an open source project we have 5 licenses for Install4j, but we
>> have not asked the author yet if this is still valid under Apache.
>> Now releasing under Apache license instead of LGPL we would ironically
>> now be allowed to bundle the binary Oracle JRE rather than having to
>> use the open source OpenJDK builds.
>>
>> But I'm afraid such a task would not involve Docker - as I think most
>> users of Taverna Command line would not have Docker (or even the right
>> Java version) installed.
>>
>>
>> > There is no current mechanism for packaging up something to run a
>> > specific Taverna workflow. You can run workflows from the command line
>> > tool or on a Taverna Server.
>>
>> Making a recipe for generating Docker images for running a particular
>> Taverna Workflow could be interesting. We could then have "Save
>> workflow as Docker image" built into Taverna!
>>
>> If you are thinking about such an idea, feel free to suggest it as a
>> new Jira task!
>>
>>
>> Overall - you don't have to pick exactly our ideas - you can be
>> inspired by them and will have to write your own proposal about what
>> work you propose to do (which should be reasonably scoped and
>> scheduled) and say how Apache Taverna would benefit.
>>
>> Looking forward to hear more about your ideas!
>>
>> --
>> Stian Soiland-Reyes
>> Apache Taverna (incubating), Apache Commons RDF (incubating)
>> http://orcid.org/0000-0001-9842-9718
>>

--
Stian Soiland-Reyes
Apache Taverna (incubating), Apache Commons RDF (incubating)
http://orcid.org/0000-0001-9842-9718
