Re: GSoC 2016 Docker support for Taverna

Nadeesh Dilanga Tue, 22 Mar 2016 10:42:48 -0700

Hi,
Thank you very much for the quick response. I will go through these a bit
more and get back to you when I hit any roadblocks.


On Mon, Mar 21, 2016 at 10:15 PM, Stian Soiland-Reyes <[email protected]>
wrote:

> On 21 March 2016 at 00:51, Nadeesh Dilanga <[email protected]> wrote:
>
> > First of all, I apologize for the delayed response. I wanted to give
> > myself a bit more time to understand and go through what Taverna is and
> > what exactly the expected outcome of the project is (the tutorials, related
> > slide decks and YouTube videos were very helpful), because this will be my
> > one and only GSoC proposal and I want it to be perfect!
>
> Thanks!  It doesn't have to be perfect - just great! :-))
>
> > 1. Taverna is a BPMN-like (but more extensive and more widely scoped in
> > features) workflow engine which has several ways of creating workflows
> > and different interfaces for accessing them.
>
> While I guess we don't like to be compared with BPMN, I think you are
> correct. :)
>
>
> >  2. When creating workflows, one major extension point for catering to
> > custom use cases is to plug in or create your own services/service types,
> > which is a great model IMHO. And this project is in fact to write an
> > adapter (an activity plugin, which I believe is the executor of a service
> > invocation) for when someone needs to run something on Docker at some
> > phase of their workflow.
>
> Correct - thus one could have a workflow with multiple tools from
> different docker images.
>
>
> > If #2 is correct, can you please provide me an example of a use case
> > that led to this project idea, because I feel I may be missing something
> > here. IMHO, even for Docker it will eventually be a service invocation
> > from the workflow front, and what Taverna needs is some activity plugins
> > that are aware of the particular transport protocols.
>
> We already have the Tool activity which allows you to run command line
> tools - however such workflows are hard to share, as anyone receiving
> them may not have that tool installed, or in the same version/location.
>
> While approaches like https://www.debian.org/devel/debian-med/ and
> BioLinux have helped towards "How to get it installed" - it then moves
> the requirement to a particular operating system, which in a way is
> worse.
>
> Docker solves the "How to consistently install this tool" problem -
> and even works (almost) seamlessly from OS X and Windows. It adds nice
> reproducibility aspects as you can mark the exact snapshot version of
> the docker image you have used.
>
>
> There are now also initiatives such as http://bioboxes.org/ (and to a
> certain degree http://bio.tools/ ) which describe bioinformatics tools
> as Docker images - thus these can in theory be used directly from
> Taverna.
>
>
> Perhaps part of the project would be to define a use case so we find
> some actual command lines we want to run in a Taverna workflow - e.g.
> to run HMMER for sequence alignment using
> https://hub.docker.com/r/dockerbiotools/hmmer/ using sequences fetched
> from an EBI web service?  I am not sure how much of the bioinformatics
> side you would like to get into! :)
>
>
>
> > (example: http service hosted in Docker, Http activity plugin, Message
> > Broker service hosted in Docker, you need AMQP,MQTT like activity plugin)
>
> Yes, but I don't think we want to run many of those kind of services
> from Taverna, I was thinking more of running just command line tools
> that happen to be packaged as Docker images.
>
> > 3. Or is the case to invoke some composite applications that are
> > deployed/installed in Docker, regardless of what the protocols are?
>
> No, this would get a bit more complex, so I would stay away from that
> for the GSOC project - although of course the potential is a very
> interesting motivation as well.
>
> I think this is what I described in
> https://issues.apache.org/jira/browse/TAVERNA-941
>
>
> > If #3 is correct, what we run in the Docker container can be another
> > Taverna workflow. If that is the case, your idea of "Save workflow as
> > Docker image" will be a superb addition!
>
> Yes! It should then be possible! But.. why? :)  Run with older Taverna
> version?
>
> One interesting thing, if there's also "Save workflow as Docker
> image", would be to "unwrap" such an image when it is added to a
> workflow as a Docker image, and show the inner workflow in Taverna.
>
> With Docker there's a big danger of going down the "It's turtles all
> the way down" recursion - hence I tried to scope the GSOC ideas to be
> more concrete about running command line tools.
>
>
> >  So with this, I would like to understand what the Taverna community
> > expects from "Invoking Docker from Taverna" in this GSoC project, so that
> > I can be more specific in my project proposal and make it the best
> > project for this summer for Taverna.
> >
> >
> >
> > On Fri, Mar 18, 2016 at 7:18 AM, Stian Soiland-Reyes <[email protected]>
> > wrote:
> >
> >> On 17 March 2016 at 15:22, alaninmcr <[email protected]> wrote:
> >> >> I found Docker to be an excellent solution for scaling and easy
> >> >> deployment, and obviously a hot topic these days in enterprises that
> >> >> want to implement a microservices-based architecture/deployment for
> >> >> low-footprint servers/services.
> >> >>
> >> >> I presume the idea behind Docker support for Taverna is NOT from a
> >> >> microservice standpoint, but more from a packaging and deployment
> >> >> perspective. Please correct me if I am wrong.
> >>
> >> No, you are right in that our current Docker ideas would not be about
> >> creating Taverna (or a Taverna workflow) as a micro-service, but to use
> >> Docker for execution.
> >>
> >> A similar aspect could be to use Docker to start up a set of
> >> microservices accompanying the workflow, and then access them from
> >> the Taverna workflow using the existing WSDL and REST activities.
> >> This is something that I am interested in within the
> >> http://bioexcel.eu/ project - but is a bit more architecturally
> >> challenging as it would mean things like dynamic port bindings in the
> >> workflow configuration. It
> >>
> >> I've tracked this as https://issues.apache.org/jira/browse/TAVERNA-941
> >> but IMHO it would be too big a task for a GSOC project.
> >>
> >>
> >> > There are two separate issues:
> >> >
> >> > https://issues.apache.org/jira/browse/TAVERNA-901 is to allow Taverna
> >> > workflows to include steps that are tools that run inside Docker
> >> > containers. That would be deployment of an existing Docker container.
> >> >
> >> > https://issues.apache.org/jira/browse/TAVERNA-879 is to create docker
> >> > containers for Taverna workflows. That is packaging and (because the
> >> > containers will be part of a CWL workflow) deployment.
> >>
> >> Nadeesh, I've added your interest to
> >>
> >> https://cwiki.apache.org/confluence/display/TAVERNADEV/2016-03+GSOC+2016
> >>
> >> but if you are more interested in packaging for Docker, then perhaps
> >> we could look at the existing Docker wrapping of Taverna Server
> >>
> >> https://hub.docker.com/r/taverna/taverna-server/
> >> https://github.com/taverna-extras/taverna-server-docker
> >>
> >> and consider doing something similar for our command line tools
> >> "executeworkflow" and "tavlang".
> >>
> >> That shouldn't take you too long - so you may want to prototype one of
> >> TAVERNA-901 and TAVERNA-879 as well.
> >>
> >>
> >> I know Dmitry used wsdl-generic as a command line tool as in
> >> http://inb.bsc.es/documents/galaxygears/ which could also be
> >> interesting as a Docker container (e.g. for running WSDL services
> >> within a CWL workflow), but I am not sure where the source code for
> >> that is (is that outside Apache, Dmitry?)
> >>
> >>
> >>
> >>
> >>
> >> >> If that is the case, can you please clarify what the current
> >> >> packaging and deployment model is?
> >>
> >>
> >> For Taverna 2.5 we used install4j via Maven to package into an installer:
> >>
> >>
> >>
> >> https://github.com/apache/incubator-taverna-commandline/blob/old/taverna-commandline-product-core-20141228/pom.xml#L1712
> >>
> >> That's what made the installers we have at
> >> https://taverna.incubator.apache.org/download/command-line-tool/
> >>
> >> One packaging task we could consider for Taverna 3.0 is to update
> >>
> >>
> >> https://github.com/apache/incubator-taverna-commandline/tree/master/taverna-commandline-product
> >> to use install4j or similar to generate such installers also for
> >> Taverna 3, which has a slightly different
> >> folder structure.
> >>
> >> As an open source project we have 5 licenses for Install4j, but we
> >> have not asked the author yet if this is still valid under Apache.
> >> Now that we are releasing under the Apache license instead of LGPL, we
> >> would ironically be allowed to bundle the binary Oracle JRE rather than
> >> having to use the open source OpenJDK builds.
> >>
> >> But I'm afraid such a task would not involve Docker - as I think most
> >> users of Taverna Command line would not have Docker (or even the right
> >> Java version) installed.
> >>
> >>
> >>
> >> > There is no current mechanism for packaging up something to run a
> >> > specific Taverna workflow. You can run workflows from the command line
> >> > tool or on a Taverna Server.
> >>
> >> Making a recipe for generating Docker images for running a particular
> >> Taverna Workflow could be interesting. We could then have "Save
> >> workflow as Docker image" built into Taverna!
> >>
> >> If you are thinking about such an idea, feel free to suggest it as a
> >> new Jira task!
> >>
> >>
> >>
> >> Overall - you don't have to pick exactly our ideas - you can be
> >> inspired by them and will have to write your own proposal about what
> >> work you propose to do (which should be reasonably scoped and
> >> scheduled) and say how Apache Taverna would benefit.
> >>
> >> Looking forward to hearing more about your ideas!
> >>
> >> --
> >> Stian Soiland-Reyes
> >> Apache Taverna (incubating), Apache Commons RDF (incubating)
> >> http://orcid.org/0000-0001-9842-9718
> >>
>
>
>
> --
> Stian Soiland-Reyes
> Apache Taverna (incubating), Apache Commons RDF (incubating)
> http://orcid.org/0000-0001-9842-9718
>
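
To make the "command line tool packaged as a Docker image" idea above a bit
more concrete, here is a rough sketch (not Taverna code, and not how the
Tool activity is implemented) of how such a step might assemble a
"docker run" command line. It pins the image by digest, which is one way to
record the exact snapshot that was used. The function name, the /data mount
point, the working directory, the input file names and the all-zero digest
are hypothetical placeholders; it also assumes the image lets you pass the
tool name (here hmmsearch) as the container command.

```python
import shlex


def docker_tool_command(image, digest, tool_args, workdir_on_host,
                        mount_point="/data"):
    """Build the argv list for running one command line tool from a Docker image.

    The image is referenced as name@sha256:<hex> so that the exact image
    snapshot is recorded, rather than a tag that may move over time.
    """
    if not digest.startswith("sha256:"):
        raise ValueError("expected a digest of the form sha256:<hex>")
    return [
        "docker", "run",
        "--rm",                                    # remove the container afterwards
        "-v", f"{workdir_on_host}:{mount_point}",  # share input/output files
        "-w", mount_point,                         # run the tool in that directory
        f"{image}@{digest}",
        *tool_args,
    ]


# Hypothetical values: the digest is a placeholder, not a real image digest.
EXAMPLE_DIGEST = "sha256:" + "0" * 64

command = docker_tool_command(
    image="dockerbiotools/hmmer",
    digest=EXAMPLE_DIGEST,
    tool_args=["hmmsearch", "profile.hmm", "sequences.fasta"],
    workdir_on_host="/tmp/taverna-run-1",
)

# Show the command as it would be typed in a shell; nothing is executed here.
print(shlex.join(command))
```

The list form can be handed to subprocess.run(command) on a machine with
Docker installed; building it separately from running it keeps the exact
image reference available for recording alongside the workflow run.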
