Re: [Proposal] Taverna workflow
Right - many, many levels of provenance... provenance of source code is not very different from provenance of workflow definitions.

We already did quite a bit of work on that at a social level with our generic workflow sharing platform myExperiment - http://www.myexperiment.org/home - e.g. "Version 4 of this workflow - attributed to this other workflow by someone else". This might be of interest to other workflow-like projects within Apache, as myExperiment can deal with any workflow type.

For instance, my workflow at http://www.myexperiment.org/workflows/3860 attributes http://www.myexperiment.org/workflows/3369 because I have embedded it as a nested workflow. I have therefore also given the original authors credit on my workflow.

Lots of this information can be deduced by inspecting the definitions, looking at hashes and identifiers, etc. (Taverna workflow definitions include a chain of identifiers throughout their evolution - so you can even tell if an earlier, unpublished version of a workflow has been reused).

Provenance of a workflow *execution* is also quite related to, but still quite distinct from, the higher-level provenance of research data and of the scientific analysis it has been going through.

Similarly, the provenance of a command line tool can be on a system level - "Ran for 14 seconds on a Linux host asdkjasd using 1127 MB of memory and these shared libraries" - or on a semantic level, like "Aligned these two biological sequences from mouse and rat".

The big challenge is trying to bind these kinds of provenance together, and to infer one level of provenance from another.

But I am digressing! My apologies to the rest of the list.. but do let me know if you are interested in workflows, provenance, versioning and semantics, and we can put together some kind of interest group.

On 23 October 2014 07:41, Bertrand Delacretaz bdelacre...@apache.org wrote:

Hi, Thanks for the clarifications.

On Thu, Oct 23, 2014 at 4:43 AM, Stian Soiland-Reyes soiland-re...@cs.manchester.ac.uk wrote:

...Provenance exchange - I am thinking in particular if it would be possible to combine our W3C PROV-O provenance support - https://github.com/taverna/taverna-prov (which describes a workflow run) - with exposing service-level provenance...

Ok got it now. We sometimes talk about the provenance of our code, which must be traceable etc. so I was confused why you'd exchange provenance with other projects ;-) All clear now.

-Bertrand

To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org

--
Stian Soiland-Reyes, myGrid team
School of Computer Science
The University of Manchester
http://soiland-reyes.com/stian/work/
http://orcid.org/-0001-9842-9718
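As an illustration of the two levels of provenance described in this message, here is a minimal sketch in Python. It is not Taverna's or taverna-prov's data model and it does not use W3C PROV-O; the class names, the `activity_id` field and the `bind` helper are all hypothetical, and the values are the ones quoted in the message. It only shows the idea of recording a system-level and a semantic-level description of the same tool run and binding them through a shared identifier.

```python
from dataclasses import dataclass


@dataclass
class SystemProvenance:
    """System-level view of one tool run (hypothetical structure)."""
    activity_id: str
    host: str
    runtime_seconds: int
    memory_mb: int


@dataclass
class SemanticProvenance:
    """Semantic-level view of the same run (hypothetical structure)."""
    activity_id: str
    description: str
    inputs: tuple


def bind(system_records, semantic_records):
    """Pair records from both levels that describe the same activity."""
    by_id = {rec.activity_id: rec for rec in semantic_records}
    return [(s, by_id[s.activity_id])
            for s in system_records if s.activity_id in by_id]


# Example values taken from the message; "run-1" is an invented identifier.
system = [SystemProvenance("run-1", "asdkjasd", 14, 1127)]
semantic = [SemanticProvenance("run-1", "Aligned two biological sequences",
                               ("mouse", "rat"))]

for sys_rec, sem_rec in bind(system, semantic):
    print(f"{sem_rec.description} ({', '.join(sem_rec.inputs)}): "
          f"{sys_rec.runtime_seconds}s on {sys_rec.host}, "
          f"{sys_rec.memory_mb} MB")
```

Binding through a shared identifier is the simplest case; inferring one level from the other, which the message calls the big challenge, is not attempted here.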
Re: [Proposal] Taverna workflow
Hi, Thanks for the clarifications. On Thu, Oct 23, 2014 at 4:43 AM, Stian Soiland-Reyes soiland-re...@cs.manchester.ac.uk wrote: ...Provenance exchange - I am thinking in particular if it would be possible to combine our W3C PROV-O provenance support - https://github.com/taverna/taverna-prov (which describes a workflow run) - with exposing service-level provenance... Ok got it now. We sometimes talk about the provenance of our code, which must be traceable etc. so I was confused why you'd exchange provenance with other projects ;-) All clear now. -Bertrand
Re: [Proposal] Taverna workflow
Hi,

I have already voted +1 on the incubation, but I have two nitpicks on the proposal:

...Taverna developer workshop (2014-10-30)..

It's good to advertise here how people can join - I suppose that's http://www.taverna.org.uk/2014/09/08/taverna-open-development-workshop-2014-10-30-2014-10-31/ ?

Provenance exchange with relevant Apache products (e.g. Apache CXF-Taverna-CouchDB)

I have no idea what provenance exchange means, can you clarify?

-Bertrand
Re: [Proposal] Taverna workflow
I would love to join and listen in if the time zone differences don't make it too difficult.

-- Joyce

On Wed, Oct 15, 2014 at 5:45 PM, Mattmann, Chris A (3980) chris.a.mattm...@jpl.nasa.gov wrote:

Great I would be happy to participate and to receive the remote dial in instructions. Thanks Stian!

++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++

-Original Message-
From: Stian Soiland-Reyes soiland-re...@cs.manchester.ac.uk
Reply-To: general@incubator.apache.org general@incubator.apache.org
Date: Wednesday, October 15, 2014 at 4:52 AM
To: general@incubator.apache.org general@incubator.apache.org
Subject: Re: [Proposal] Taverna workflow

Would any of the Taverna mentors be interested in joining the Taverna Development Workshop at the end of the month?

http://taverna2014.eventbrite.co.uk/
http://dev.mygrid.org.uk/wiki/display/developer/Taverna+Open+Development+Workshop

I know most of you are not really based near Manchester - but we are already arranging for remoting in for another participant, so that is always an option if the time-zones allow.
Re: [Proposal] Taverna workflow
On 14/10/14 16:51, Marlon Pierce wrote:

Hi all-- I'm a bit late on this but I would also like to serve as a mentor. I'm a PMC member of Apache Airavata and Apache Rave, and I've also served as a mentor for Apache Stratos.

Marlon

Marlon,

Thank you for the offer - I've added you to the mentor list on the proposal.

Andy
Re: [Proposal] Taverna workflow
On 25 September 2014 17:34, David Nalley da...@gnsa.us wrote:

Can you make sure a comprehensive list of those currently incompatible dependencies is included in your proposal.

I have added them to http://dev.mygrid.org.uk/wiki/display/developer/Third-party+licenses and added references to this from https://wiki.apache.org/incubator/TavernaProposal#External_Dependencies and added complete resolution of the remaining unknown licenses to https://wiki.apache.org/incubator/TavernaProposal#Initial_Goals

Should I move those sections from the Taverna wiki to embed in the proposal? (I think it could be a bit noisy).

See https://github.com/taverna/taverna-build/tree/master/licenses for the complete lists as generated by Maven.

Taverna 2 is licensed as LGPL 2.1, which meant we could use several LGPL libraries like Hibernate and RShell. Hibernate can be replaced by other JPA providers (with some code updates to remove Hibernate-specific calls), while the RShell support would have to be moved out to a separately installable plugin.

Do you have adequate rights to change the license wholesale?

Yes, as we declare under https://wiki.apache.org/incubator/TavernaProposal#Source_and_Intellectual_Property_Submission_Plan the University of Manchester is the copyright holder, and contributors from outside the University have all signed CLAs that allow us to change the license and redelegate copyright.

This sounds like fragmenting your existing community before you really even get started. I believe the ASF is a great place, but I am not convinced it's the best place for everyone.
Yes, I am afraid the AstroTaverna community was not too keen on the perceived fragmentation, but as long as we keep the AstroTaverna developers involved in Apache Taverna (Julián Garrido is included as an Initial Committer), and keep the existing plugin support (with an additional "What do you want" splash screen on first start) it should not be a big change for existing users or indeed for developers.

AstroTaverna is already a separate project - at https://github.com/wf4ever/astrotaverna (in which I am personally also taking part) - and it is only in the recent Taverna Astronomy Edition (introduced in Taverna 2.5) that this plugin was brought into the release. The Astronomy Edition (if it is with a different plugin system in Taverna 3) can be distributed as a non-Apache release by the AstroTaverna community - our build system already has the editions parametrized so that this is not too difficult.

I see two paragraphs here that don't answer the question posed. Where will plugin development happen in an ideal world?

I guess as separate projects, but with a clear invitation to the Apache project so that such developers also get involved in the Apache Taverna community. One way to make such an invitation is to include those plugins in the release (optional on install).

One followup question - what kind of build/CI resources does the University of Manchester provide? What manner of resources do you believe you'll need if the project moves to the ASF?

We tried to detail this under https://wiki.apache.org/incubator/TavernaProposal#Required_Resources

We should be fine with the existing capabilities, as the ASF already has Jenkins, Jira, Confluence, Git and Maven repositories. There would obviously be migration jobs needed for existing documentation, issues etc.
-- Stian Soiland-Reyes, myGrid team School of Computer Science The University of Manchester http://soiland-reyes.com/stian/work/ http://orcid.org/-0001-9842-9718
Re: [Proposal] Taverna workflow
On 25 September 2014 19:19, Suresh Marru sma...@apache.org wrote:

Hi Stian, If you need a mentor, count me in. I actively contribute to Apache Airavata, and will be happy to bring our experiences from a similar journey. In fact Ross queried on the Airavata lists a few years ago about a potential Taverna move to Airavata/Apache (Ross mentioned it further in this thread), good to see it's finally happening. Integrating the plugin community into the Apache project (once it's voted in) seems to be a low-hanging fruit to diversify.

Very interesting to have you in as a mentor - it will be exciting to also look at working together with Airavata.

David's questions are right on. Two things you may want to consider addressing before you call for a vote are: listing of non-Apache-compatible licenses in the proposal and having adequate rights to change the license to Apache V2.

I have added a link to these at the end of https://wiki.apache.org/incubator/TavernaProposal#External_Dependencies

I have added a sentence to make it explicit that we can change the license at the end of https://wiki.apache.org/incubator/TavernaProposal#Source_and_Intellectual_Property_Submission_Plan

Not a blocker for the proposal and voting, but a blocker for importing the code will be to have on file the University-signed CCLA/SGA to donate the code.

The lawyers told me some time ago that they have signed the CCLA - which I needed for a contribution to Apache CXF - but I am not sure where on Apache.org to check for the list of CCLA signatures, as only individual signatures seem to be listed.

--
Stian Soiland-Reyes, myGrid team
School of Computer Science
The University of Manchester
http://soiland-reyes.com/stian/work/
http://orcid.org/-0001-9842-9718
Re: [Proposal] Taverna workflow
Would any of the Taverna mentors be interested in joining the Taverna Development Workshop at the end of the month?

http://taverna2014.eventbrite.co.uk/
http://dev.mygrid.org.uk/wiki/display/developer/Taverna+Open+Development+Workshop

I know most of you are not really based near Manchester - but we are already arranging for remoting in for another participant, so that is always an option if the time-zones allow.

--
Stian Soiland-Reyes, myGrid team
School of Computer Science
The University of Manchester
http://soiland-reyes.com/stian/work/
http://orcid.org/-0001-9842-9718
Re: [Proposal] Taverna workflow
Great I would be happy to participate and to receive the remote dial in instructions. Thanks Stian!

++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++

-Original Message-
From: Stian Soiland-Reyes soiland-re...@cs.manchester.ac.uk
Reply-To: general@incubator.apache.org general@incubator.apache.org
Date: Wednesday, October 15, 2014 at 4:52 AM
To: general@incubator.apache.org general@incubator.apache.org
Subject: Re: [Proposal] Taverna workflow

Would any of the Taverna mentors be interested in joining the Taverna Development Workshop at the end of the month?

http://taverna2014.eventbrite.co.uk/
http://dev.mygrid.org.uk/wiki/display/developer/Taverna+Open+Development+Workshop

I know most of you are not really based near Manchester - but we are already arranging for remoting in for another participant, so that is always an option if the time-zones allow.
Re: [Proposal] Taverna workflow
Hi all-- I'm a bit late on this but I would also like to serve as a mentor. I'm a PMC member of Apache Airavata and Apache Rave, and I've also served as a mentor for Apache Stratos.

Marlon
Re: [Proposal] Taverna workflow
Thanks guys. I added a similarity with Apache OODT http://oodt.apache.org/ to the wiki as well.

++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++

-Original Message-
From: Stian Soiland-Reyes soiland-re...@cs.manchester.ac.uk
Reply-To: general@incubator.apache.org general@incubator.apache.org
Date: Thursday, September 25, 2014 at 9:19 AM
To: general@incubator.apache.org general@incubator.apache.org
Subject: Re: [Proposal] Taverna workflow

Proposal now moved to the Apache wiki: https://wiki.apache.org/incubator/TavernaProposal

I just used copy-paste - so there might be some mistakes introduced - feel free to correct.

I will be away for 2 weeks - but my colleague Shoaib Sufi should have signed up to this list to assist in any question during that period.

On 23 September 2014 13:43, Stian Soiland-Reyes soiland-re...@cs.manchester.ac.uk wrote:

I hereby present the Apache Incubator proposal for the project Taverna.

Also available in rich text in the Taverna wiki (with more hyperlinks!): http://dev.mygrid.org.uk/wiki/display/developer/Taverna+incubator+proposal

(Could someone grant me access to edit the Incubator wiki pages? My wiki username is soilandreyes)

# Abstract

Taverna is an open source and domain-independent suite of tools used to design and execute data-driven workflows.
# Proposal

The Taverna suite includes:

* Taverna Workbench, a Java-based desktop application for graphically composing, editing and executing workflows of distributed web services and local tools
* Taverna Commandline Tool which allows repeated execution of parameterized workflow definitions
* Taverna Server provides a REST and SOAP API for executing workflows
* Taverna Player is a Ruby-based web interface towards the Server, providing a high-level view of workflow executions and their results, and allows further integrations with Ruby on Rails applications.

Taverna can browse and combine different service types, allowing workflows to integrate steps of arbitrary REST and SOAP web services with command line tools (local and SSH), scripts (Beanshell, R, Jython) and finally visualize the results.

The goal of the Taverna suite is to help researchers to access distributed datasets and processing capabilities by the construction of pipelines, and also to simplify the execution of these pipelines in various environments.

The Taverna suite of products is already successful and in wide-use across different domains. The software is currently licensed as LGPL 2.1, with copyright owned by University of Manchester. External contributors have all signed Apache-like CLAs.

# Background

Taverna workflows coordinate inputs and outputs between computational processes and Web Services. The workflow is designed in a graphical interface which shows the workflow as a series of boxes and arrows; representing processes and their data connections.

The different processes in a workflow can be command line tools, REST and WSDL Web Services; which are used for combining steps such as data acquisition, filtering, cleaning, integrating, analysis and visualization. Taverna calls these processes services, as they generally are provided by remote (third-party) servers.
These kinds of computational workflows, also known as pipelines and dataflows, focus on the movement of data rather than the execution order of the underlying processes. Features such as implicit iterations (where an input list of values causes multiple process executions) and parallel invocations (independent processes are executed as soon as their data is available) are intrinsic to a dataflow system, not requiring any particular constructs by the workflow designer.

As a visual programming environment, workflows aid collaboration and reuse. At the highest level, a workflow represents the conceptual level of an analysis, allowing understanding, discussion and communication of the overall analysis protocol. More detail can be revealed and modified for individual steps. At the individual process level, the workflow defines execution specifics such as operations, parameters and command line tools.

Sharing of the workflow definitions allows re-use and re-purposing of the computational analysis. During workflow execution, provenance can be collected from every step, allowing deep inspection of intermediate values for the purpose of debugging and validation.

# Rationale
Re: [Proposal] Taverna workflow
On 25/09/14 19:19, Suresh Marru wrote:

If you need a mentor, count me in. I actively contribute to Apache Airavata, and will be happy to bring our experiences from a similar journey. In fact Ross queried on the Airavata lists a few years ago about a potential Taverna move to Airavata/Apache (Ross mentioned it further in this thread), good to see it's finally happening. Integrating the plugin community into the Apache project (once it's voted in) seems to be a low-hanging fruit to diversify.

On 25/09/14 17:36, Suresh Srinivas wrote:

If you need a volunteer, I am available.

Hi there,

It being Friday, and Stian is about to be away, I've added you both to the mentors list. Taverna has a long history so getting as much experience from mentors will be very valuable.

Thanks
Andy

PS I put Michael in as "Not formally a mentor"
Re: [Proposal] Taverna workflow
Thanks, Andy, and all who responded and volunteered (yay!). I'm sure Shoaib will respond, but as I have been the main Maven and build guy on the project I guess I should also chip in.

I'll try to find time to respond properly to the excellent questions next week, in between my baby duties. Some heads up for now, hyperlink-less as I am on mobile:

Two things came up in several of the responses:

(L)GPL dependencies. We know we have some, and we have a Dependencies wiki page that is linked to from the proposal, but more work is needed with the Maven licensing plugin to double check we have not also got something coming in from transitive dependencies. Due to the occasional use of OSGi repackaging, the licensing seems to have been lost in some of the modules, requiring manual citation (e.g. googling ;). I agree that the definitive list should be in the proposal and it would be something we will work on producing.

The infrastructure we need is listed in the proposal, it is bog standard Confluence, Jira, Jenkins and Maven repository (obviously we would hope for imports for the first two). I have a script for making all the Jenkins jobs based on a listing of GitHub repositories, it could be modified to do the same for the repos once on the Apache git server.

I admit that the scale might be a bit different for our project, but part of moving to the incubator would be to also restructure/merge the repositories (and hence Jenkins jobs) as I have mentioned. (We know the big structure is also scaring potential developers away - part of the reason for the size of the existing structure is the homebrew Maven-based plugin system we used before moving to OSGi)

It would probably make sense to do this merge/restructure outside Apache before transitioning the code base, as on GitHub it is just a click away to make another repo. Perhaps making an alternative GitHub group taverna-incubator as the staging area - would it be possible to do this as part of the early incubation period?
It is a process that could do with community effort for discussion, testing and also in a way give a better feel for what the different modules are and do, so perhaps a nice way to get the ball rolling.

(Why would it take 2-3 days to clone 70 repos anyway..? The make script we have in the taverna-build repo checks out all of them in 3-4 minutes, even dynamically picking up any new repos).
Re: [Proposal] Taverna workflow
Thanks David, makes sense to me, and thank you for explaining.

Sent from my iPhone

On Sep 26, 2014, at 12:59 PM, David Nalley da...@gnsa.us wrote:

On Thu, Sep 25, 2014 at 1:08 PM, Mattmann, Chris A (3980) chris.a.mattm...@jpl.nasa.gov wrote:

Hi Guys,

-Original Message-
From: David Nalley da...@gnsa.us
Reply-To: general@incubator.apache.org general@incubator.apache.org
Date: Thursday, September 25, 2014 9:34 AM
To: general@incubator.apache.org general@incubator.apache.org
Cc: Shoaib Sufi shoaib.s...@manchester.ac.uk
Subject: Re: [Proposal] Taverna workflow

[..snip..]

I see two paragraphs here that don't answer the question posed. Where will plugin development happen in an ideal world?

This is an implementation detail and can be dealt with during Incubation?

One followup question - what kind of build/CI resources does the University of Manchester provide? What manner of resources do you believe you'll need if the project moves to the ASF?

Is it a requirement for projects to provide build/CI resources now to enter the Incubator? I do not believe that it is.

Not trying to obligate them to provide any resources, merely trying to get an idea how it will affect the Foundation's resources and trying to plan ahead. The migration of 85 github repos alone will, depending on complexity, likely consume 1-3 contractor days worth of work. If large CI needs exist, it's best to at least know them up front so we can begin planning. Adding a podling has an inherent cost, and some podlings cost more than others; I'm trying to understand what that will be so it isn't a surprise down the road.

--David
Re: [Proposal] Taverna workflow
Hi Stian, I am also interested in knowing your responses to some of the questions below. Looking through this list's archives you will find that the issue of homogeneous developers comes up every now and then. It's a welcome move from the Taverna team to pursue the ASF as a potential home, but it's important to understand the plans for diversifying core development beyond the University of Manchester. Suresh On Sep 23, 2014, at 1:54 PM, Marlon Pierce marpi...@iu.edu wrote: Thanks, Stian, for submitting a well-developed proposal and for your interest in Apache. I have a few questions: * Can you say more about why you want to take Taverna to the ASF? * What is your strategy for increasing the diversity of your committer base? * Do you have any third party dependencies in the Taverna core that have incompatible licenses (like GPL)? * Would you like developer-contributed plugins to be covered within a future Apache Taverna project? My main goal here is to give the Incubator community a little more background and foster discussion, which will be useful in attracting mentors, so don't worry about right or wrong answers. Marlon
Re: [Proposal] Taverna workflow
On 23 September 2014 18:54, Marlon Pierce marpi...@iu.edu wrote: * Can you say more about why you want to take Taverna to the ASF? As we say in the proposal, one of our goals in moving to the ASF is to make it more obvious that we want to run an open development process. So far we have effectively been leading Taverna from Manchester and have perhaps kept too strong an ownership - so any kind of request would be responded to almost as if it came from a customer; our language would fall into a we vs. them style. This makes the customers happy - of course - but it does not encourage them to contribute to the project themselves. As an example of the us/them impression - see Taverna's own website: http://www.taverna.org.uk/about/ vs http://www.taverna.org.uk/collaborate/collaborations/ Under Apache, all of those should be we. Currently we find it difficult to change this without having a separate foundation. We have looked into creating a standalone Taverna Foundation - but found that it requires a fair bit of legal administration. Apache is well recognized and has all the legal processes sorted out, so it stood out as the most viable candidate. While we have tried to keep things open, with mailing lists, source code, etc. freely available - our working philosophy has still been entrenched in the office environment - strategic decisions about the code being made in a coffee room meeting, for instance. Many of the non-Manchester contributors have mainly been adding features in the form of plugins. Our software has a highly pluggable architecture - so in a way that is also our fault - most of the things people wanted to do could be achieved with a plugin. Those contributors have not been as engaged in maintenance of the core code, the engine and the user interface. Obviously it is also more of a challenge to understand a whole system than just a plugin interface, but still we don't want to keep any artificial or real barriers to such engagement.
So as much as our intended move to the ASF is meant to further encourage others to get engaged, feel ownership of the project, and contribute to the core, we also want to force ourselves in Manchester to truly follow an open process. With Apache Taverna clearly a standalone project, we believe it could be easier for other scientific projects to bring in contributions to Taverna as a part of their research proposals, without a need to include the University of Manchester as a project partner or feeling that they are giving Manchester something for free. * What is your strategy for increasing the diversity of your committer base? We are organizing a developer conference next month in Manchester, which has already generated a lot of interest and registrations. In doing so, we have been personally inviting existing plugin developers and integrators. http://dev.mygrid.org.uk/wiki/display/developer/Taverna+Open+Development+Workshop We in Manchester will want to keep those personal relations active, and will work to encourage engagement from old and new developers - particularly where we have found some integration in the wild whose developers have not signed up to our mailing lists. A move to the Incubator is in a way a good excuse for such recruitment - as it means they should feel they are engaging with and becoming part of the software project as an entity, rather than (as previously) just communicating with a particular research group in Manchester. * Do you have any third party dependencies in the Taverna core that have incompatible licenses (like GPL)? Unfortunately we do have a few of those, yes - the fact that we have to move away from those was one of the things that we discussed a lot in the Taverna community. Taverna 2 is licensed as LGPL 2.1, which meant we could use several LGPL libraries like Hibernate and RShell.
Hibernate can be replaced by other JPA providers (with some code updates to remove Hibernate-specific calls), while the RShell support would have to be moved out to a separately installable plugin. The Astronomy edition of Taverna includes a plugin called AstroTaverna, which is GPL3 due to its inclusion of the Topcat and STILTS dependencies. The AstroTaverna community was therefore a bit sceptical about moving to Apache - but we concluded that they would keep maintaining AstroTaverna as standalone plugins, and that instead of having multiple downloads for different editions, Taverna 3 would move to a Start screen that installs plugins from possibly third-party sites (Eclipse style). http://smtp.iaa.es/pipermail/astrotaverna-users/20140529/thread.html Here luckily our plugin system (OSGi) will help us out - so those bits that truly depend on GPL or LGPL would have to be maintained outside Apache. What we perhaps need to prepare more clearly is exactly which plugins will be in the Apache transfer, and which would stay outside. The Taverna Workbench installers currently include platform-specific binaries of OpenJDK 7, which is
Re: [Proposal] Taverna workflow
In addition to what I said to Marlon, one of the things we want to achieve in the short term is to get non-Manchester developers comfortable with working with the code base. We already have a fair amount of documentation on this - http://dev.mygrid.org.uk/wiki/display/developer/Developers+Guide - but it is still mainly centred around creating plugins. In a way, we have earlier inadvertently pushed people away from the core codebase, because in most cases what they wanted to do could be achieved using the plugin mechanism - which simplifies both development and distribution (you don't need to distribute your own build of Taverna). As I mentioned to Marlon, this has had the unfortunate effect of almost nobody else working with that code base. In the Taverna Development workshop, as mentioned, we have included in the agenda several items on working with the code base, how to create a build, and how to fix a bug. We would want to keep working with Github mirrors, as we have seen what an enormous boost to third-party developer engagement they can be, lowering the barrier for forking, changing, customizing and fixing. However, we recognize that our current large number of git repositories is also effectively a blocker to such engagement. The CLAs of Apache (and Taverna) are likewise such a barrier - but we would keep a similar stance to other Apache projects I've been involved with (Jena), where small contributions are accepted as-is, creating a stepping stone for further engagement that encourages signing of the CLA and a deeper feeling of commitment.
Re: [Proposal] Taverna workflow
These are great answers, and many of them will help to guide the Incubation process. Congrats guys and I for one want to welcome you with my Director hat and my ASF member and ASF guy hats! Cheers and looking forward to it. Cheers, Chris ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 168-519, Mailstop: 168-527 Email: chris.a.mattm...@nasa.gov WWW: http://sunset.usc.edu/~mattmann/ ++ Adjunct Associate Professor, Computer Science Department University of Southern California, Los Angeles, CA 90089 USA ++
Re: [Proposal] Taverna workflow
Proposal now moved to the Apache wiki: https://wiki.apache.org/incubator/TavernaProposal I just used copy-paste - so there might be some mistakes introduced - feel free to correct. I will be away for 2 weeks - but my colleague Shoaib Sufi should have signed up to this list to assist with any questions during that period. On 23 September 2014 13:43, Stian Soiland-Reyes soiland-re...@cs.manchester.ac.uk wrote: I hereby present the Apache Incubator proposal for the project Taverna. Also available in rich text in the Taverna wiki (with more hyperlinks!): http://dev.mygrid.org.uk/wiki/display/developer/Taverna+incubator+proposal (Could someone grant me access to edit the Incubator wiki pages? My wiki username is soilandreyes) # Abstract Taverna is an open source and domain-independent suite of tools used to design and execute data-driven workflows. # Proposal The Taverna suite includes: * Taverna Workbench, a Java-based desktop application for graphically composing, editing and executing workflows of distributed web services and local tools * Taverna Commandline Tool which allows repeated execution of parameterized workflow definitions * Taverna Server provides a REST and SOAP API for executing workflows * Taverna Player is a Ruby-based web interface towards the Server, providing a high-level view of workflow executions and their results, and allows further integrations with Ruby on Rails applications. Taverna can browse and combine different service types, allowing workflows to integrate steps of arbitrary REST and SOAP web services with command line tools (local and SSH), scripts (Beanshell, R, Jython) and finally visualize the results. The goal of the Taverna suite is to help researchers to access distributed datasets and processing capabilities by the construction of pipelines, and also to simplify the execution of these pipelines in various environments. The Taverna suite of products is already successful and in wide use across different domains.
The software is currently licensed under LGPL 2.1, with copyright owned by University of Manchester. External contributors have all signed Apache-like CLAs. # Background Taverna workflows coordinate inputs and outputs between computational processes and Web Services. The workflow is designed in a graphical interface which shows the workflow as a series of boxes and arrows, representing processes and their data connections. The different processes in a workflow can be command line tools, REST and WSDL Web Services, which are used for combining steps such as data acquisition, filtering, cleaning, integrating, analysis and visualization. Taverna calls these processes services, as they generally are provided by remote (third-party) servers. These kinds of computational workflows, also known as pipelines and dataflows, focus on the movement of data rather than the execution order of the underlying processes. Features such as implicit iterations (where an input list of values causes multiple process executions) and parallel invocations (independent processes are executed as soon as their data is available) are intrinsic to a dataflow system, not requiring any particular constructs by the workflow designer. As a visual programming environment, workflows aid collaboration and reuse. At the highest level, a workflow represents the conceptual level of an analysis, allowing understanding, discussion and communication of the overall analysis protocol. More detail can be revealed and modified for individual steps. At the individual process level, the workflow defines execution specifics such as operations, parameters and command line tools. Sharing of the workflow definitions allows re-use and re-purposing of the computational analysis. During workflow execution, provenance can be collected from every step, allowing deep inspection of intermediate values for the purpose of debugging and validation.
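To make the two dataflow features named in the Background section concrete, here is a minimal Python sketch of implicit iteration and parallel invocation. It is only an illustration under stated assumptions: it is not Taverna's engine or API, and the names invoke, fetch_sequence, sequence_length, reverse_sequence and run_workflow are hypothetical stand-ins invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def invoke(service, value):
    """Implicit iteration: a list input causes one invocation per element."""
    if isinstance(value, list):
        return [invoke(service, item) for item in value]
    return service(value)


# Hypothetical stand-ins for remote services (not real Taverna services).
def fetch_sequence(identifier):
    return identifier.upper() + "-SEQ"


def sequence_length(sequence):
    return len(sequence)


def reverse_sequence(sequence):
    return sequence[::-1]


def run_workflow(identifiers):
    # The service is written for a single value; passing a list iterates it.
    sequences = invoke(fetch_sequence, identifiers)
    # The two downstream steps depend only on `sequences`, not on each other,
    # so they can be started as soon as that data is available.
    with ThreadPoolExecutor(max_workers=2) as pool:
        lengths = pool.submit(invoke, sequence_length, sequences)
        reversed_sequences = pool.submit(invoke, reverse_sequence, sequences)
        return lengths.result(), reversed_sequences.result()


if __name__ == "__main__":
    print(run_workflow(["abc", "de"]))
```

The point of the sketch is that the workflow author writes each step for a single value and only wires outputs to inputs; the iteration over the list and the concurrent start of the independent steps come from the runner rather than from explicit loops or ordering in the workflow itself.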
# Rationale There is a strong need to lower the barrier of entry to datasets and computational resources widely available on the Internet, to increase their use by researchers who understand the computational steps needed to produce their results, but who are not necessarily expert programmers. Taverna has already shown its success and popularity in a wide range of scientific disciplines. # Initial Goals * Transition mailing lists to Apache (keep existing subscribers, but invite more) * Taverna developer workshop (2014-10-30) * Prepare git repositories for move: * Update headers/metadata to indicate Apache License 2.0 * Restructure git repositories * Rename Maven groupIds to org.apache.taverna.* * Rename packages to org.apache.taverna.* * Move Github repositories to Apache git * Automated builds in Apache's Jenkins * Update to latest releases of Apache dependencies * Propose updated release testing procedure under Apache * Moved Website and
Re: [Proposal] Taverna workflow
Thanks Stian! ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 168-519, Mailstop: 168-527 Email: chris.a.mattm...@nasa.gov WWW: http://sunset.usc.edu/~mattmann/ ++ Adjunct Associate Professor, Computer Science Department University of Southern California, Los Angeles, CA 90089 USA ++
Re: [Proposal] Taverna workflow
Guys, FYI Mike Joyce isn't a member of the IPMC, so technically he cannot be a mentor for the project: http://people.apache.org/committer-index.html#joyce Mike I would be happy for you to provide your mentorship regardless of the title we just need to update the proposal before the VOTE since procedural it is not correct. CHeers, Chris ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 168-519, Mailstop: 168-527 Email: chris.a.mattm...@nasa.gov WWW: http://sunset.usc.edu/~mattmann/ ++ Adjunct Associate Professor, Computer Science Department University of Southern California, Los Angeles, CA 90089 USA ++ -Original Message- From: Stian Soiland-Reyes soiland-re...@cs.manchester.ac.uk Reply-To: general@incubator.apache.org general@incubator.apache.org Date: Thursday, September 25, 2014 9:19 AM To: general@incubator.apache.org general@incubator.apache.org Subject: Re: [Proposal] Taverna workflow Proposal now moved to the Apache wiki: https://wiki.apache.org/incubator/TavernaProposal I just used copy-paste - so there might be some mistakes introduced - feel free to correct. I will be away for 2 weeks - but my colleague Shoaib Sufi should have signed up to this list to assist in any question during that period. On 23 September 2014 13:43, Stian Soiland-Reyes soiland-re...@cs.manchester.ac.uk wrote: I hereby present the Apache Incubator proposal for the project Taverna. Also available in rich text in the Taverna wiki (with more hyperlinks!): http://dev.mygrid.org.uk/wiki/display/developer/Taverna+incubator+proposa l (Could someone grant me access to edit the Incubator wiki pages? My wiki username is soilandreyes) # Abstract Taverna is an open source and domain-independent suite of tools used to design and execute data-driven workflows. 
# Proposal The Taverna suite includes: * Taverna Workbench, a Java-based desktop application for graphically composing, editing and executing workflows of distributed web services and local tools * Taverna Commandline Tool which allows repeated execution of parameterized workflow definitions * Taverna Server provides a REST and SOAP API for executing workflows * Taverna Player is a Ruby-based web interface towards the Server, providing a high-level view of workflow executions and their results, and allows further integrations with Ruby on Rails applications. Taverna can browse and combine different service types, allowing workflows to integrate steps of arbitrary REST and SOAP web services with command line tools (local and SSH), scripts (Beanshell, R, Jython) and finally visualize the results. The goal of the Taverna suite is to help researchers to access distributed datasets and processing capabilities by the construction of pipelines, and also to simplify the execution of these pipelines in various environments. The Taverna suite of products is already successful and in wide-use across different domains. The software is currently licensed as LGPL 2.1, with copyright owned by University of Manchester. External contributors have all signed Apache-like CLAs. # Background Taverna workflows coordinate inputs and outputs between computational processes and Web Services. The workflow is designed in a graphical interface which shows the workflow as a series of boxes and arrows; representing processes and their data connections. The different processes in a workflow can be command line tools, REST and WSDL Web Services; which are used for combining steps such as data acquisition, filtering, cleaning, integrating, analysis and visualization. Taverna calls these processes services, as they generally are provided by remote (third-party) servers. 
These kinds of computational workflows, also known as pipelines and dataflows, focus on the movement of data rather than the execution order of the underlying processes. Features such as implicit iterations (where an input list of values causes multiple process executions) and parallel invocations (independent processes are executed as soon as their data is available) are intrinsic to a dataflow system, not requiring any particular constructs by the workflow designer. As a visual programming environment, workflows aid collaboration and reuse. At the highest level, a workflow represents the conceptual level of an analysis, allowing understanding, discussion and communication of the overall analysis protocol. More detail can be revealed and modified for individual steps. At the individual process level, the workflow defines execution specifics such as operations, parameters and command line tools. Sharing of the workflow definitions
Re: [Proposal] Taverna workflow
This is an exciting project. If you need a volunteer, I am available. My interests are from my close participation in related projects - Hadoop, Falcon, and Storm. On Tue, Sep 23, 2014 at 5:43 AM, Stian Soiland-Reyes soiland-re...@cs.manchester.ac.uk wrote: I hereby present the Apache Incubator proposal for the project Taverna. Also available in rich text in the Taverna wiki (with more hyperlinks!): http://dev.mygrid.org.uk/wiki/display/developer/Taverna+incubator+proposal (Could someone grant me access to edit the Incubator wiki pages? My wiki username is soilandreyes) # Abstract Taverna is an open source and domain-independent suite of tools used to design and execute data-driven workflows. # Proposal The Taverna suite includes: * Taverna Workbench, a Java-based desktop application for graphically composing, editing and executing workflows of distributed web services and local tools * Taverna Commandline Tool which allows repeated execution of parameterized workflow definitions * Taverna Server provides a REST and SOAP API for executing workflows * Taverna Player is a Ruby-based web interface towards the Server, providing a high-level view of workflow executions and their results, and allows further integrations with Ruby on Rails applications. Taverna can browse and combine different service types, allowing workflows to integrate steps of arbitrary REST and SOAP web services with command line tools (local and SSH), scripts (Beanshell, R, Jython) and finally visualize the results. The goal of the Taverna suite is to help researchers to access distributed datasets and processing capabilities by the construction of pipelines, and also to simplify the execution of these pipelines in various environments. The Taverna suite of products is already successful and in wide-use across different domains. The software is currently licensed as LGPL 2.1, with copyright owned by University of Manchester. External contributors have all signed Apache-like CLAs.
# Background Taverna workflows coordinate inputs and outputs between computational processes and Web Services. The workflow is designed in a graphical interface which shows the workflow as a series of boxes and arrows; representing processes and their data connections. The different processes in a workflow can be command line tools, REST and WSDL Web Services; which are used for combining steps such as data acquisition, filtering, cleaning, integrating, analysis and visualization. Taverna calls these processes services, as they generally are provided by remote (third-party) servers. These kinds of computational workflows, also known as pipelines and dataflows, focus on the movement of data rather than the execution order of the underlying processes. Features such as implicit iterations (where an input list of values causes multiple process executions) and parallel invocations (independent processes are executed as soon as their data is available) are intrinsic to a dataflow system, not requiring any particular constructs by the workflow designer. As a visual programming environment, workflows aid collaboration and reuse. At the highest level, a workflow represents the conceptual level of an analysis, allowing understanding, discussion and communication of the overall analysis protocol. More detail can be revealed and modified for individual steps. At the individual process level, the workflow defines execution specifics such as operations, parameters and command line tools. Sharing of the workflow definitions allows re-use and re-purposing of the computational analysis. During workflow execution, provenance can be collected from every step, allowing deep inspection of intermediate values for the purpose of debugging and validation.
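To make the implicit iteration and parallel invocation described above concrete, here is a minimal Python sketch. It is not Taverna code and does not use any Taverna API; the names `invoke` and `to_upper` are hypothetical, and a thread pool is only one assumed way of running independent invocations concurrently.

```python
from concurrent.futures import ThreadPoolExecutor


def invoke(process, value):
    """Run `process` on `value`, iterating implicitly when `value` is a list.

    Hypothetical helper for illustration only; not part of Taverna.
    """
    if isinstance(value, list):
        # Implicit iteration: a list input triggers one invocation per element.
        # The invocations are independent, so they may run in parallel.
        with ThreadPoolExecutor() as pool:
            # pool.map returns results in the same order as the inputs.
            return list(pool.map(process, value))
    # A single value triggers exactly one invocation.
    return process(value)


def to_upper(sequence):
    # Hypothetical stand-in for a remote service or command line tool.
    return sequence.upper()


print(invoke(to_upper, "acgt"))
print(invoke(to_upper, ["acgt", "ttga", "ccat"]))
```

The point of the sketch is that the caller writes the same `invoke` call for one value or for a list; the iteration and the concurrency come from the dataflow layer, not from a loop written by the workflow designer.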
# Rationale There is a strong need to lower the barrier of entry to datasets and computational resources widely available on the Internet, to increase their use by researchers who understand the computational steps needed to produce their results, but who are not necessarily expert programmers. Taverna has already shown its success and popularity in a wide range of scientific disciplines. # Initial Goals * Transition mailing lists to Apache (keep existing subscribers, but invite more) * Taverna developer workshop (2014-10-30) * Prepare git repositories for move: * Update headers/metadata to indicate Apache License 2.0 * Restructure git repositories * Rename Maven groupIds to org.apache.taverna.* * Rename packages to org.apache.taverna.* * Move Github repositories to Apache git * Automated builds in Apache's Jenkins * Update to latest releases of Apache dependencies * Propose updated release testing procedure under Apache * Moved Website and documentation We intend to only release the current development version Taverna 3.x http://www.taverna.org.uk/developers/work-in-progress/taverna-3/ under
Re: [Proposal] Taverna workflow
* Do you have any third party dependencies in the Taverna core that have incompatible licenses (like GPL)? Unfortunately we do have a few of those, yes - the fact that we have to move away from those was one of the things that we discussed a lot in the Taverna community. Can you make sure a comprehensive list of those currently incompatible dependencies is included in your proposal? Taverna 2 is licensed as LGPL 2.1, which meant we could use several LGPL libraries like Hibernate and RShell. Hibernate can be replaced by other JPA providers (with some code update to remove Hibernate specific calls), while the RShell support would have to be moved out to a separately installable plugin. Do you have adequate rights to change the license wholesale? The Astronomy edition of Taverna includes a plugin called AstroTaverna, which is GPL3 due to its inclusion of the Topcat and STILTS dependencies. The AstroTaverna community was therefore a bit sceptical about moving to Apache - but we concluded that they would keep maintaining AstroTaverna as standalone plugins, and that instead of having multiple downloads for different editions, Taverna 3 would move to a Start screen that installs plugins from possibly third-party sites (Eclipse style). http://smtp.iaa.es/pipermail/astrotaverna-users/20140529/thread.html Here luckily our plugin system (OSGi) will help us out - so those bits that truly depend on GPL or LGPL would have to be maintained outside Apache. What perhaps we need to prepare a bit clearer is exactly which plugins will be in the Apache transfer, and which would stay outside. This sounds like fragmenting your existing community before you really even get started. I believe the ASF is a great place, but I am not convinced it's the best place for everyone. The Taverna Workbench installers currently include platform-specific binaries of OpenJDK 7, which is licensed under GPL 2 with classpath exception.
It is likely that under Apache we could not distribute OpenJDK - but perhaps it would instead be allowed to distribute the normal JDK binaries? (For Taverna 2 we did not distribute the normal JDK as it can be seen as incompatible with GPL, which LGPL can be upgraded to). Do you know of any Apache projects that do this, like perhaps OpenOffice? An alternative is for the installer to download JDK on demand - but would that require the installer itself (currently Install4j) to be replaced? * Would you like developer-contributed plugins to be covered within a future Apache Taverna project? As we've seen, keeping plugin developers on the outside of the project has isolated them from the core development. We would therefore like to encourage any new plugin developers to eventually make their plugin a part of an Apache Taverna project - as we have done historically with successful plugins. Apache's use of CLAs is I must admit a bit of a hindrance to this as opposed to the Github Laissez-faire style - - it has kept myself away from Apache projects earlier when my suggested patch was deemed significant - yet the legal department of the University spent 8 months reviewing that patch and Apache's CLA before finally signing. Yet we consider Taverna to be such a mature project that we want IP and licensing to be done correctly - and as you see our earlier insistence on keeping CLAs for all Taverna 2 development means that we are now in a position to relicense Taverna and change ownership to a foundation like Apache. I see two paragraphs here that don't answer the question posed. Where will plugin development happen in an ideal world? One followup question - what kind of build/CI resources does the University of Manchester provide? What manner of resources do you believe you'll need if the project moves to the ASF? --David - To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org For additional commands, e-mail: general-h...@incubator.apache.org
Re: [Proposal] Taverna workflow
Hi Guys, -Original Message- From: David Nalley da...@gnsa.us Reply-To: general@incubator.apache.org general@incubator.apache.org Date: Thursday, September 25, 2014 9:34 AM To: general@incubator.apache.org general@incubator.apache.org Cc: Shoaib Sufi shoaib.s...@manchester.ac.uk Subject: Re: [Proposal] Taverna workflow [..snip..] I see two paragraphs here that don't answer the question posed. Where will plugin development happen in an ideal world? This is an implementation detail and can be dealt with during Incubation? One followup question - what kind of build/CI resources does the University of Manchester provide? What manner of resources do you believe you'll need if the project moves to the ASF? Is it a requirement for projects to provide build/CI resources now to enter the Incubator? I do not believe that it is. Cheers, Chris - To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org For additional commands, e-mail: general-h...@incubator.apache.org
RE: [Proposal] Taverna workflow
It's a requirement for the ASF to support its projects. Understanding what impact each project coming into the incubator might have is important to allow VP Infra to plan for our growth. David did not ask if Manchester will be donating resources, he asked what do they currently provide and what does the project think they will need from the ASF. For the record, I am familiar with Taverna from a previous life. It is interesting to see this proposal coming to the ASF. The first time this was discussed with the University of Manchester was many years ago. The conversation occurred every couple of years, each time with different people, but never progressed to a proposal. Given the answers in this thread things have changed quite considerably since then. Sent from my Windows Phone From: Mattmann, Chris A (3980) chris.a.mattm...@jpl.nasa.gov Sent: 9/25/2014 10:09 AM To: general@incubator.apache.org Cc: Shoaib Sufi shoaib.s...@manchester.ac.uk Subject: Re: [Proposal] Taverna workflow Hi Guys, -Original Message- From: David Nalley da...@gnsa.us Reply-To: general@incubator.apache.org general@incubator.apache.org Date: Thursday, September 25, 2014 9:34 AM To: general@incubator.apache.org general@incubator.apache.org Cc: Shoaib Sufi shoaib.s...@manchester.ac.uk Subject: Re: [Proposal] Taverna workflow [..snip..] I see two paragraphs here that don't answer the question posed. Where will plugin development happen in an ideal world? This is an implementation detail and can be dealt with during Incubation? One followup question - what kind of build/CI resources does the University of Manchester provide? What manner of resources do you believe you'll need if the project moves to the ASF? Is it a requirement for projects to provide build/CI resources now to enter the Incubator? I do not believe that it is.
Cheers, Chris - To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org For additional commands, e-mail: general-h...@incubator.apache.org
Re: [Proposal] Taverna workflow
We should ask the same questions to projects and I don't see this question of infrastructure asked very often. So I raised it. David of course is doing great, and I just saw some of this as something that can be worked out during incubation. Thanks Sent from my iPhone On Sep 25, 2014, at 10:46 AM, Ross Gardler (MS OPEN TECH) ross.gard...@microsoft.com wrote: It's a requirement for the ASF to support its projects. Understanding what impact each project coming into the incubator might have is important to allow VP Infra to plan for our growth. David did not ask if Manchester will be donating resources, he asked what do they currently provide and what does the project think they will need from the ASF. For the record, I am familiar with Taverna from a previous life. It is interesting to see this proposal coming to the ASF. The first time this was discussed with the University of Manchester was many years ago. The conversation occurred every couple of years, each time with different people, but never progressed to a proposal. Given the answers in this thread things have changed quite considerably since then. Sent from my Windows Phone From: Mattmann, Chris A (3980) chris.a.mattm...@jpl.nasa.gov Sent: 9/25/2014 10:09 AM To: general@incubator.apache.org Cc: Shoaib Sufi shoaib.s...@manchester.ac.uk Subject: Re: [Proposal] Taverna workflow Hi Guys, -Original Message- From: David Nalley da...@gnsa.us Reply-To: general@incubator.apache.org general@incubator.apache.org Date: Thursday, September 25, 2014 9:34 AM To: general@incubator.apache.org general@incubator.apache.org Cc: Shoaib Sufi shoaib.s...@manchester.ac.uk Subject: Re: [Proposal] Taverna workflow [..snip..] I see two paragraphs here that don't answer the question posed. Where will plugin development happen in an ideal world? This is an implementation detail and can be dealt with during Incubation?
One followup question - what kind of build/CI resources does the University of Manchester provide? What manner of resources do you believe you'll need if the project moves to the ASF? Is it a requirement for projects to provide build/CI resources now to enter the Incubator? I do not believe that it is. Cheers, Chris - To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org For additional commands, e-mail: general-h...@incubator.apache.org
Re: [Proposal] Taverna workflow
I think David asked it today because Brett, David and I shared a coffee yesterday and this was one of the topics of conversation. It's timing, that's all. The problem with working it out *during* incubation is that the foundation has already committed to providing the support necessary. We need an indication upon entry. I believe David asked this question in order to start doing this informally while we figure out how to do it in a more consistent and managed way. In summary, I fully agree with your comment "We should ask the same questions to projects", and moving forwards I believe we will do so - starting now. Sent from Surface From: Chris Mattmann chris.a.mattm...@jpl.nasa.gov Sent: Thursday, September 25, 2014 11:03 AM To: general@incubator.apache.org Cc: Shoaib Sufi shoaib.s...@manchester.ac.uk We should ask the same questions to projects and I don't see this question of infrastructure asked very often. So I raised it. David of course is doing great, and I just saw some of this as something that can be worked out during incubation. Thanks Sent from my iPhone On Sep 25, 2014, at 10:46 AM, Ross Gardler (MS OPEN TECH) ross.gard...@microsoft.com wrote: It's a requirement for the ASF to support its projects. Understanding what impact each project coming into the incubator might have is important to allow VP Infra to plan for our growth. David did not ask if Manchester will be donating resources, he asked what do they currently provide and what does the project think they will need from the ASF. For the record, I am familiar with Taverna from a previous life. It is interesting to see this proposal coming to the ASF. The first time this was discussed with the University of Manchester was many years ago. The conversation occurred every couple of years, each time with different people, but never progressed to a proposal. Given the answers in this thread things have changed quite considerably since then.
Sent from my Windows Phone From: Mattmann, Chris A (3980) chris.a.mattm...@jpl.nasa.gov Sent: 9/25/2014 10:09 AM To: general@incubator.apache.org Cc: Shoaib Sufi shoaib.s...@manchester.ac.uk Subject: Re: [Proposal] Taverna workflow Hi Guys, -Original Message- From: David Nalley da...@gnsa.us Reply-To: general@incubator.apache.org general@incubator.apache.org Date: Thursday, September 25, 2014 9:34 AM To: general@incubator.apache.org general@incubator.apache.org Cc: Shoaib Sufi shoaib.s...@manchester.ac.uk Subject: Re: [Proposal] Taverna workflow [..snip..] I see two paragraphs here that don't answer the question posed. Where will plugin development happen in an ideal world? This is an implementation detail and can be dealt with during Incubation? One followup question - what kind of build/CI resources does the University of Manchester provide? What manner of resources do you believe you'll need if the project moves to the ASF? Is it a requirement for projects to provide build/CI resources now to enter the Incubator? I do not believe that it is. Cheers, Chris - To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org For additional commands, e-mail: general-h...@incubator.apache.org
Re: [Proposal] Taverna workflow
Hi Stian, Thank you for the very nice elaborations. I really liked your honest views on where things are and where you want to be. As you might have experienced with Jena and other projects, Apache is about community. It will take significant work to get past the current tone of “us vs. them” to “we”. But you are off to a great start. Also, Taverna, with its long history, will have to be ready for the challenge of building on the positives and overcoming the “Manchester - do it for me” inertia (from a core community building perspective). If you need a mentor, count me in. I actively contribute to Apache Airavata, and will be happy to bring our experiences from a similar journey. In fact, Ross queried on the Airavata lists a few years ago about a potential Taverna move to Airavata/Apache (Ross mentioned it further in this thread); good to see it is finally happening. Integrating the plugin community into the Apache project (once it's voted in) seems to be low-hanging fruit for diversifying. David's questions are right on. Two things you may want to consider addressing before you call for a vote are: listing the dependencies with non-Apache-compatible licenses in the proposal, and having adequate rights to change the license to Apache V2. Not a blocker for the proposal and voting, but a blocker for importing the code, will be having the University-signed CCLA/SGA on file to donate the code. Echoing Chris, a hearty welcome!! Suresh On Sep 25, 2014, at 12:34 PM, David Nalley da...@gnsa.us wrote: * Do you have any third party dependencies in the Taverna core that have incompatible licenses (like GPL)? Unfortunately we do have a few of those, yes - the fact that we have to move away from those was one of the things that we discussed a lot in the Taverna community. Can you make sure a comprehensive list of those currently incompatible dependencies is included in your proposal? Taverna 2 is licensed as LGPL 2.1, which meant we could use several LGPL libraries like Hibernate and RShell.
Hibernate can be replaced by other JPA providers (with some code update to remove Hibernate specific calls), while the RShell support would have to be moved out to a separately installable plugin. Do you have adequate rights to change the license wholesale? The Astronomy edition of Taverna includes a plugin called AstroTaverna, which is GPL3 due to its inclusion of the Topcat and STILTS dependencies. The AstroTaverna community was therefore a bit sceptical about moving to Apache - but we concluded that they would keep maintaining AstroTaverna as standalone plugins, and that instead of having multiple downloads for different editions, Taverna 3 would move to a Start screen that installs plugins from possibly third-party sites (Eclipse style). http://smtp.iaa.es/pipermail/astrotaverna-users/20140529/thread.html Here luckily our plugin system (OSGi) will help us out - so those bits that truly depend on GPL or LGPL would have to be maintained outside Apache. What perhaps we need to prepare a bit clearer is exactly which plugins will be in the Apache transfer, and which would stay outside. This sounds like fragmenting your existing community before you really even get started. I believe the ASF is a great place, but I am not convinced it's the best place for everyone. The Taverna Workbench installers currently include platform-specific binaries of OpenJDK 7, which is licensed under GPL 2 with classpath exception. It is likely that under Apache we could not distribute OpenJDK - but perhaps it would instead be allowed to distribute the normal JDK binaries? (For Taverna 2 we did not distribute the normal JDK as it can be seen as incompatible with GPL, which LGPL can be upgraded to). Do you know of any Apache projects that do this, like perhaps OpenOffice? An alternative is for the installer to download JDK on demand - but would that require the installer itself (currently Install4j) to be replaced?
* Would you like developer-contributed plugins to be covered within a future Apache Taverna project? As we've seen, keeping plugin developers on the outside of the project has isolated them from the core development. We would therefore like to encourage any new plugin developers to eventually make their plugin a part of an Apache Taverna project - as we have done historically with successful plugins. Apache's use of CLAs is I must admit a bit of a hindrance to this as opposed to the Github Laissez-faire style - - it has kept myself away from Apache projects earlier when my suggested patch was deemed significant - yet the legal department of the University spent 8 months reviewing that patch and Apache's CLA before finally signing. Yet we consider Taverna to be such a mature project that we want IP and licensing to be done correctly - and as you see our earlier insistence on keeping CLAs for all Taverna 2 development means that we are now in a position to relicense
Re: [Proposal] Taverna workflow
Thanks a lot - of course we would love to have you as mentors! I have added you both at http://dev.mygrid.org.uk/wiki/display/developer/Taverna+incubator+proposal#Tavernaincubatorproposal-NominatedMentors I'll transfer it to the incubator wiki as soon as I get access. On 23 September 2014 17:29, Michael Joyce mltjo...@gmail.com wrote: +1 this is really great news. Would happily help where I could as a mentor as well. On Tuesday, September 23, 2014, Mattmann, Chris A (3980) chris.a.mattm...@jpl.nasa.gov wrote: WOW that is so awesome guys! Taverna at Apache FTW!! Let me know if you need a mentor, I'm in! :) Cheers, Chris ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 168-519, Mailstop: 168-527 Email: chris.a.mattm...@nasa.gov WWW: http://sunset.usc.edu/~mattmann/ ++ Adjunct Associate Professor, Computer Science Department University of Southern California, Los Angeles, CA 90089 USA ++ -Original Message- From: Stian Soiland-Reyes soiland-re...@cs.manchester.ac.uk Reply-To: general@incubator.apache.org general@incubator.apache.org Date: Tuesday, September 23, 2014 5:43 AM To: general@incubator.apache.org general@incubator.apache.org Cc: List for general discussion and hacking of the Taverna project taverna-hack...@lists.sourceforge.net Subject: [Proposal] Taverna workflow I hereby present the Apache Incubator proposal for the project Taverna. Also available in rich text in the Taverna wiki (with more hyperlinks!): http://dev.mygrid.org.uk/wiki/display/developer/Taverna+incubator+proposal (Could someone grant me access to edit the Incubator wiki pages? My wiki username is soilandreyes) # Abstract Taverna is an open source and domain-independent suite of tools used to design and execute data-driven workflows.
# Proposal The Taverna suite includes: * Taverna Workbench, a Java-based desktop application for graphically composing, editing and executing workflows of distributed web services and local tools * Taverna Commandline Tool which allows repeated execution of parameterized workflow definitions * Taverna Server provides a REST and SOAP API for executing workflows * Taverna Player is a Ruby-based web interface towards the Server, providing a high-level view of workflow executions and their results, and allows further integrations with Ruby on Rails applications. Taverna can browse and combine different service types, allowing workflows to integrate steps of arbitrary REST and SOAP web services with command line tools (local and SSH), scripts (Beanshell, R, Jython) and finally visualize the results. The goal of the Taverna suite is to help researchers to access distributed datasets and processing capabilities by the construction of pipelines, and also to simplify the execution of these pipelines in various environments. The Taverna suite of products is already successful and in wide-use across different domains. The software is currently licensed as LGPL 2.1, with copyright owned by University of Manchester. External contributors have all signed Apache-like CLAs. # Background Taverna workflows coordinate inputs and outputs between computational processes and Web Services. The workflow is designed in a graphical interface which shows the workflow as a series of boxes and arrows; representing processes and their data connections. The different processes in a workflow can be command line tools, REST and WSDL Web Services; which are used for combining steps such as data acquisition, filtering, cleaning, integrating, analysis and visualization. Taverna calls these processes services, as they generally are provided by remote (third-party) servers. 
These kinds of computational workflows, also known as pipelines and dataflows, focus on the movement of data rather than the execution order of the underlying processes. Features such as implicit iterations (where an input list of values causes multiple process executions) and parallel invocations (independent processes are executed as soon as their data is available) are intrinsic to a dataflow system, not requiring any particular constructs by the workflow designer. As a visual programming environment, workflows aid collaboration and reuse. At the highest level, a workflow represents the conceptual level of an analysis, allowing understanding, discussion and communication of the overall analysis protocol. More detail can be revealed and modified for individual steps. At the individual process level, the
Re: [Proposal] Taverna workflow
WOW that is so awesome guys! Taverna at Apache FTW!! Let me know if you need a mentor, I'm in! :) Cheers, Chris ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 168-519, Mailstop: 168-527 Email: chris.a.mattm...@nasa.gov WWW: http://sunset.usc.edu/~mattmann/ ++ Adjunct Associate Professor, Computer Science Department University of Southern California, Los Angeles, CA 90089 USA ++ -Original Message- From: Stian Soiland-Reyes soiland-re...@cs.manchester.ac.uk Reply-To: general@incubator.apache.org general@incubator.apache.org Date: Tuesday, September 23, 2014 5:43 AM To: general@incubator.apache.org general@incubator.apache.org Cc: List for general discussion and hacking of the Taverna project taverna-hack...@lists.sourceforge.net Subject: [Proposal] Taverna workflow I hereby present the Apache Incubator proposal for the project Taverna. Also available in rich text in the Taverna wiki (with more hyperlinks!): http://dev.mygrid.org.uk/wiki/display/developer/Taverna+incubator+proposal (Could someone grant me access to edit the Incubator wiki pages? My wiki username is soilandreyes) # Abstract Taverna is an open source and domain-independent suite of tools used to design and execute data-driven workflows. # Proposal The Taverna suite includes: * Taverna Workbench, a Java-based desktop application for graphically composing, editing and executing workflows of distributed web services and local tools * Taverna Commandline Tool which allows repeated execution of parameterized workflow definitions * Taverna Server provides a REST and SOAP API for executing workflows * Taverna Player is a Ruby-based web interface towards the Server, providing a high-level view of workflow executions and their results, and allows further integrations with Ruby on Rails applications. 
Taverna can browse and combine different service types, allowing workflows to integrate steps of arbitrary REST and SOAP web services with command line tools (local and SSH), scripts (Beanshell, R, Jython) and finally visualize the results. The goal of the Taverna suite is to help researchers to access distributed datasets and processing capabilities by the construction of pipelines, and also to simplify the execution of these pipelines in various environments. The Taverna suite of products is already successful and in wide-use across different domains. The software is currently licensed as LGPL 2.1, with copyright owned by University of Manchester. External contributors have all signed Apache-like CLAs. # Background Taverna workflows coordinate inputs and outputs between computational processes and Web Services. The workflow is designed in a graphical interface which shows the workflow as a series of boxes and arrows; representing processes and their data connections. The different processes in a workflow can be command line tools, REST and WSDL Web Services; which are used for combining steps such as data acquisition, filtering, cleaning, integrating, analysis and visualization. Taverna calls these processes services, as they generally are provided by remote (third-party) servers. These kinds of computational workflows, also known as pipelines and dataflows, focus on the movement of data rather than the execution order of the underlying processes. Features such as implicit iterations (where an input list of values causes multiple process executions) and parallel invocations (independent processes are executed as soon as their data is available) are intrinsic to a dataflow system, not requiring any particular constructs by the workflow designer. As a visual programming environment, workflows aid collaboration and reuse.

At the highest level, a workflow represents the conceptual level of an analysis, allowing understanding, discussion and communication of the overall analysis protocol. More detail can be revealed and modified for individual steps. At the individual process level, the workflow defines execution specifics such as operations, parameters and command line tools.

Sharing of the workflow definitions allows re-use and re-purposing of the computational analysis. During workflow execution, provenance can be collected from every step, allowing deep inspection of intermediate values for the purpose of debugging and validation.

# Rationale

There is a strong need to lower the barrier of entry to datasets and computational resources widely available on the Internet, to increase their use by researchers who understand the computational steps needed to produce their results, but who are not necessarily expert programmers.

Taverna has already shown its success and popularity in a wide range of
Re: [Proposal] Taverna workflow
+1 this is really great news. Would happily help where I could as a mentor as well.

On Tuesday, September 23, 2014, Mattmann, Chris A (3980) chris.a.mattm...@jpl.nasa.gov wrote:

WOW that is so awesome guys! Taverna at Apache FTW!!

Let me know if you need a mentor, I'm in! :)

Cheers,
Chris
Re: [Proposal] Taverna workflow
Thanks, Stian, for submitting a well-developed proposal and for your interest in Apache. I have a few questions:

* Can you say more about why you want to take Taverna to the ASF?
* What is your strategy for increasing the diversity of your committer base?
* Do you have any third party dependencies in the Taverna core that have incompatible licenses (like GPL)?
* Would you like developer-contributed plugins to be covered within a future Apache Taverna project?

My main goal here is to give the Incubator community a little more background and foster discussion, which will be useful in attracting mentors, so don't worry about right or wrong answers.

Marlon

On 9/23/14, 8:43 AM, Stian Soiland-Reyes wrote:

...

Sharing of the workflow definitions allows re-use and re-purposing of the computational analysis. During workflow execution, provenance can be collected from every step, allowing deep inspection of intermediate values for the purpose of debugging and validation.

# Rationale

There is a strong need to lower the barrier of entry to datasets and computational resources widely available on the Internet, to increase their use by researchers who understand the computational steps needed to produce their results, but who are not necessarily expert programmers.

Taverna has already shown its success and popularity in a wide range of scientific disciplines.

# Initial Goals

* Transition mailing lists to Apache (keep existing subscribers, but invite more)
* Taverna developer workshop (2014-10-30)
* Prepare git repositories for move:
  * Update headers/metadata to indicate Apache License 2.0
  * Restructure git repositories
  * Rename Maven groupIds to org.apache.taverna.*
  * Rename packages to
Re: [Proposal] Taverna workflow
On 23 September 2014 13:43, Stian Soiland-Reyes soiland-re...@cs.manchester.ac.uk wrote:

I hereby present the Apache Incubator proposal for the project Taverna.

Also available in rich text in the Taverna wiki (with more hyperlinks!):
http://dev.mygrid.org.uk/wiki/display/developer/Taverna+incubator+proposal

(Could someone grant me access to edit the Incubator wiki pages? My wiki username is soilandreyes)

(this would have been better as a separate e-mail)

Done