Peter -

My plans for pre-GCC workflow work are sort of outlined in this issue: https://github.com/galaxyproject/planemo/issues/408 (I want an abstract for GCC and BOSC like "Planemo – A Scientific Workflow SDK").
I've been doing most of my work out of this branch: https://github.com/galaxyproject/galaxy/compare/dev...common-workflow-language:cwl. It has my work in progress on CWL support, on collection operations (rejected once from Galaxy here https://github.com/galaxyproject/galaxy/pull/1313 - these are so important that I'm going to take another stab at pushing them into Galaxy), and on expression tools that produce values which will hopefully tie back into workflows as connections for non-data parameters - both as Galaxy-native entities and CWL-based entities.

There have been some completely valid complaints about the background workflow scheduling being slow and buggy. These will need to be fixed by 16.04, since all workflows will be executed this way as of then. I also hope to take another pass at subworkflows - better tracking of sources, allowing upgrading of subworkflow steps, and fixing glaring bugs like https://github.com/galaxyproject/galaxy/issues/1739.

Peter C. mentioned splitting and joining files into/from collections in workflows based on the datatype methods (so hooking into parallelism). I have some initial WIP on this here https://github.com/jmchilton/galaxy/commit/c4d93acdb3b0f89b970b7c3d17b965be8ab3ba30 as part of this branch https://github.com/jmchilton/galaxy/tree/split_merge_collections. I spent a couple of hours on it, and I think if I spent a day or two on it I'd have a usable prototype to hack on - I don't remember hitting any big hurdles. (So the answer to your last question is a definitive yes.)

Sam started a bunch of work on completely replacing the workflow form with an API-driven one here https://github.com/galaxyproject/galaxy/pull/1249. I know he hopes to have that done in 16.04 - it will allow us to delete a bunch of paths through the workflow code and should allow future developments to be made more rapidly.
It will also ensure everything is coming through the API - which means Galaxy's test coverage of workflow functionality will be much higher (given the depth of our workflow API tests).

I'm happy to have a hangout to discuss this more. I consider the planemo issue something of a roadmap for what I want to work on in the first half of 2016 - but I might get pulled away or told the project has other priorities.

As for scheduling workflows instead of jobs - this is intriguing and would probably be needed to get streaming working well in Galaxy. So I would say I want to work on it someday, but I probably won't get to it in 2016. If others want to hack on it, that is fantastic, but it is also a difficult feat. (At least scheduling out and optimizing pieces of the workflow is. Kyle Ellrott, Dannon, and I had some interesting ideas about scheduling whole workflows on local Galaxy instances running on a cluster and just collecting the outputs. That would be significantly more doable, given that I sort of sculpted the changes made to backgrounding workflows to preserve things for doing that - though the work left is probably still a hard task.)

Hope this helps.

-John

On Mon, Feb 22, 2016 at 7:57 AM, Peter van Heusden <p...@sanbi.ac.za> wrote:
> Hi there
>
> I see from the PR landing in Galaxy and the comments on things like issue
> #1701 (https://github.com/galaxyproject/galaxy/issues/1701) that there's
> lots of work happening on the workflow side of Galaxy. This is an area of
> interest at SANBI too, so we'd like to coordinate development efforts as
> much as possible. To this end:
>
> 1) Are there forks to track so we can see what new code is landing?
> 2) Is there a roadmap for workflow work or perhaps can we have a Hangout to
> talk about this?
> 3) Specifically in terms of workflows and parallelisation: are there any
> plans to work on running workflows as opposed to just generating lots of
> jobs? I know this is a major change to how Galaxy works - it would mean
> something like submitting a workflow specification to a job runner that is
> located on the cluster, and then returning the results of workflow
> execution.
> 4) Currently parallelisation in Galaxy is supported using two mechanisms:
> collections and dataset splitters/tasks. Are there plans on extending and
> harmonising Galaxy's parallelisation capabilities?
>
> Thanks,
> Peter
>
> ___________________________________________________________
> Please keep all replies on the list by using "reply all"
> in your mail client. To manage your subscriptions to this
> and other Galaxy lists, please use the interface at:
> https://lists.galaxyproject.org/
>
> To search Galaxy mailing lists use the unified search at:
> http://galaxyproject.org/search/mailinglists/