Hi Majdi, thank you for your interest in GSOC and Apache Taverna! Apologies for the late reply.

On Thu, 15 Feb 2018 05:08:55 +0100, Majdi Haouech <[email protected]> wrote:

> I have worked with different languages and technologies, to name a few:
> Java, C, C++, Python, JS, Git, OWL, JavaEE, RabbitMQ, Maven.
> I am also contributing to a Mozilla open source project which required
> working with existing code and delivering a full working and completely
> tested contribution.

This is a strong background! It's good that you have experience with OWL, Python and Java. As you might have noticed, the cwltool reference implementation is written in Python, while Taverna is (largely) in Java.

> First, I need to be familiar with the CWL specification in order to be able
> to convert YAML files to Taverna workflows.
> Next, I'll need to do the opposite work by transforming a Taverna workflow
> into YAML files following the CWL specification as well.
> Finally, if there will be enough time, I can contribute to Taverna's Tool
> Activity by completing the TAVERNA-878 issue.
> I understand that the contribution will be to the Taverna Language API and
> any other remarks are welcome.

Yes, this can be a good way to start. I am a bit worried about going for both CWL import and export at the same time - although having some kind of round-trip support would be wonderful!

I think what you are proposing is to keep the translation structural at first; that is, Taverna can import the CWL workflow and get the correct boxes/arrows connected, but would not know how to execute the CWL tools (pending TAVERNA-878).

For instance, you could import CWL tools into the dummy CWL activity we already have at
https://github.com/apache/incubator-taverna-common-activities/tree/cwl-browse/taverna-cwl-activity
where it would just keep a JSON tree of the CWL tool configuration without knowing how to execute it. That should make it easier to do a round-trip save out again.
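
To illustrate what I mean by "structural", here is a minimal sketch in Python. It is not Taverna code and does not use the Taverna Language API: the example workflow, the `structural_import` function and the "boxes"/"arrows" names are all made up for illustration. It assumes the map form of CWL `steps`, `in` and `outputs`, tool descriptions embedded under `run`, and a CWL document already loaded as JSON rather than YAML.

```python
import json

# A made-up two-step CWL workflow, written as a Python dict (the JSON form of
# CWL) so that only the standard library is needed.
CWL_WORKFLOW = {
    "cwlVersion": "v1.0",
    "class": "Workflow",
    "inputs": {"message": "string"},
    "outputs": {"result": {"type": "File", "outputSource": "shout/out"}},
    "steps": {
        "echo": {
            "run": {
                "class": "CommandLineTool",
                "baseCommand": "echo",
                "inputs": {"text": {"type": "string", "inputBinding": {"position": 1}}},
                "outputs": {"out": {"type": "stdout"}},
            },
            "in": {"text": "message"},
            "out": ["out"],
        },
        "shout": {
            "run": {
                "class": "CommandLineTool",
                "baseCommand": ["tr", "a-z", "A-Z"],
                "stdin": "$(inputs.text.path)",
                "inputs": {"text": {"type": "File"}},
                "outputs": {"out": {"type": "stdout"}},
            },
            "in": {"text": {"source": "echo/out"}},
            "out": ["out"],
        },
    },
}


def structural_import(cwl):
    """Return the boxes (steps) and arrows (links) of a CWL workflow.

    Hypothetical helper: only the map form of "steps", "in" and "outputs"
    is handled; the list form that CWL also permits is ignored here.
    """
    boxes = {}
    arrows = []
    for step_name, step in cwl["steps"].items():
        # Keep the tool description as an opaque JSON tree; nothing here
        # tries to interpret or execute it.
        boxes[step_name] = {"kind": "cwl-placeholder", "config": step["run"]}
        for port, source in step["in"].items():
            if isinstance(source, dict):  # long form: {"source": "..."}
                source = source["source"]
            arrows.append((source, f"{step_name}/{port}"))
    # Workflow outputs are wired from a step output via "outputSource".
    for port, spec in cwl["outputs"].items():
        arrows.append((spec["outputSource"], port))
    return {"boxes": boxes, "arrows": arrows}


model = structural_import(CWL_WORKFLOW)
for source, sink in model["arrows"]:
    print(f"{source} -> {sink}")

# The untouched tool config survives a JSON round trip unchanged.
saved = json.loads(json.dumps(model["boxes"]["echo"]["config"]))
assert saved == CWL_WORKFLOW["steps"]["echo"]["run"]
```

Because the tool configuration is stored untouched, writing it back out later is mostly a matter of serialising the same tree again, which is why this approach should help with round-tripping.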

TAVERNA-878 could easily also be the most time-consuming part of the effort, so it might be wise not to stake everything on it. You could have a go at just simple CWL command line tools (e.g. just arguments and stdin/stdout, without files), which should be easier to map to the existing Tool activity.

Export of Taverna workflows to CWL can seem challenging because in Taverna we have many types of activities which are not supported by other CWL engines:
https://taverna.incubator.apache.org/javadoc/taverna-common-activities/

So again it could be a structural export, where the CWL side just has "TavernaActivity" and includes the Taverna Activity JSON for the opposite round-trip, but would not be executable on other CWL engines. (They should however work on http://view.commonwl.org/)

If towards the end you then want to move beyond structural/skeleton export of Taverna workflows into CWL, you can progress to exporting Taverna's Command Tool activity as CWL command line tools. Again, there might be features on the Taverna side that don't map exactly to CWL, which you can declare as not supported.

> I am used to work with different technologies and dealing with different
> specifications.

Great, yes, that is perhaps also the greatest challenge of this particular GSOC project, in that you will be integrating between several existing technologies and specifications!

> I am also used to work with existing code and extending it.
> Also, I am really motivated for contributing to an Apache project and this
> opportunity is perfect for me to dive right in with the right support.

Thank you, we very much welcome your effort!

What I think you could do next is to have a go with the different technologies, see the tutorials - and then start building a proposal (Google Docs?) where you list the background material and start listing your task breakdown.

Taverna Language has some (but not much) documentation at
https://github.com/apache/incubator-taverna-language/
and
https://taverna.incubator.apache.org/javadoc/taverna-language/org/apache/taverna/scufl2/api/package-summary.html

The GSOC proposal is due mid-March, so there is time to develop it now - but it's good if you keep in touch with us at dev@taverna so we can review it before the deadline; we want GSOC to be successful as well!

-- 
Stian Soiland-Reyes
The University of Manchester
http://www.esciencelab.org.uk/
http://orcid.org/0000-0001-9842-9718
