Re: [galaxy-dev] Again: Variable inputs in a workflow

2013-12-11 Thread John Chilton
In a former position I wrote extensions to Galaxy that allow workflows
such as these:

https://bitbucket.org/galaxy/galaxy-central/pull-request/116/multiple-file-datasets-implementation/diff
https://bitbucket.org/msiappdev/galaxy-extras/commits/all
http://bit.ly/beyond-proteomics

so believe me, I understand this is important. I was recently hired by
the Galaxy team and am now working on a more palatable variant of
these ideas. More palatable also means more complex and more work to
implement, unfortunately, but I am working on it.

In the meantime, have you considered driving these kinds of workflows
via the API? I think the Refinery Platform
(https://github.com/parklab/refinery-platform), for instance, targets
the Galaxy API and rewrites workflows at runtime to handle variable
numbers of inputs, mitigating Galaxy's limitations. This would require
some custom development and some mechanism outside of the traditional
Galaxy channels to launch the workflows, and it is probably not worth
the effort unless you are talking about a small pool of fixed workflows.
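
To make the "rewrite at runtime" idea concrete, here is a minimal sketch in
Python. It is not Refinery's code and it does not use Galaxy's real workflow
format: the template layout (per_file_steps, branch_output, merge_step), the
function name expand_workflow and the tool names are all made up for
illustration. It only shows the shape of the trick: copy the per-file steps
once per input and wire every copy into a single merge step.

```python
import copy

# Hypothetical, simplified workflow template (NOT Galaxy's .ga schema).
# Tool names are placeholders.
TEMPLATE = {
    "per_file_steps": [
        {"id": "input", "tool": "data_input", "inputs": []},
        {"id": "qc", "tool": "fastqc", "inputs": ["input"]},
        {"id": "trim", "tool": "quality_trimmer", "inputs": ["input"]},
    ],
    "branch_output": "qc",  # step whose output feeds the merge step
    "merge_step": {"id": "merge", "tool": "concatenate", "inputs": []},
}


def expand_workflow(template, n_inputs):
    """Return a dict of steps with one copy of the per-file steps per input,
    plus one merge step connected to every copy of the branch output."""
    steps = {}
    merge_sources = []
    next_id = 0
    for _ in range(n_inputs):
        # Assign fresh numeric ids to this branch's copies of the steps.
        id_map = {}
        for step in template["per_file_steps"]:
            id_map[step["id"]] = next_id
            next_id += 1
        # Copy each step and rewrite its connections to the new ids.
        for step in template["per_file_steps"]:
            new_step = copy.deepcopy(step)
            new_step["id"] = id_map[step["id"]]
            new_step["inputs"] = [id_map[src] for src in step["inputs"]]
            steps[new_step["id"]] = new_step
        merge_sources.append(id_map[template["branch_output"]])
    # A single merge step collects the output of every branch.
    merge = copy.deepcopy(template["merge_step"])
    merge["id"] = next_id
    merge["inputs"] = merge_sources
    steps[next_id] = merge
    return steps


if __name__ == "__main__":
    for step_id, step in sorted(expand_workflow(TEMPLATE, 3).items()):
        print(step_id, step["tool"], step["inputs"])
```

A real client would then have to translate such a structure into whatever
the Galaxy API accepts and upload it before running it, which is the custom
development mentioned above.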

-John

On Tue, Dec 10, 2013 at 4:05 AM, Preussner, Jens wrote:
> Dear galaxy-dev’s,
>
> about a year ago you discussed a variable number of inputs into a workflow
> (see http://lists.bx.psu.edu/pipermail/galaxy-dev/2012-November/012012.html
> and
> http://thread.gmane.org/gmane.science.biology.galaxy.devel/4502/focus=4502,
> for example). We’re interested in something simple, like a number of
> fastq files that have to undergo automated quality control and trimming. A
> workflow would describe the steps that need to be done per file (like fastqc,
> quality trimmer and so on), and in the end a tabular dataset would summarize
> statistics on all files. Of course, we could prebuild workflows for
> 2, 3, 4, 5, ..., n input files and join the output, but it would be much
> cooler to have it variable. Since it’s possible to start many instances of a
> workflow (i.e. one instance per file), this would be a good starting point.
> But how would one combine the outputs of those instances? Does anyone out
> there have experience with such setups? Which files are involved in starting
> many instances of a workflow? Any other ideas or suggestions on how to go
> from here? Thanks a lot for any input!
>
> Best,
> Jens
>
> Max-Planck-Institute for Heart and Lung Research
> Bioinformatics Service - FGI
> Ludwigstraße 43
> 61231 Bad Nauheim
>
> Phone. +49 6032 705 1765
> Mail. jens.preuss...@mpi-bn.mpg.de

___
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/


Re: [galaxy-dev] Again: Variable inputs in a workflow

2013-12-10 Thread Eric Rasche

Galaxy developers,

I would be particularly interested in seeing this implemented as flow
control blocks. Having such blocks in workflows would vastly expand the
capabilities. Ideally there would be something like

- for (i=0 to max)
- foreach (file in files)
- while
- if (could pass off "testing" to some external bit of code/another
  programme which would return 1/0)
- switch case

This would allow you to run a copy of the workflow for each of Jens'
files, and then run another foreach loop on the outputs of the first
to concatenate them down to a single file (or something like that to
combine them).
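
As a rough illustration of that foreach-then-concatenate pattern outside of
Galaxy, here is a small Python sketch. The function names (summarize_fastq,
foreach, concatenate) and the summary columns are my own invention, not an
existing Galaxy feature, and the parser assumes plain four-line FASTQ
records with no line wrapping.

```python
def summarize_fastq(name, text):
    """Per-file step: return one tab-separated row of statistics.

    Assumes each record is exactly four lines, so sequences sit on
    lines 2, 6, 10, ... of the file.
    """
    lines = text.strip().splitlines()
    seqs = lines[1::4]
    n_reads = len(seqs)
    mean_len = sum(len(s) for s in seqs) / n_reads if n_reads else 0.0
    return f"{name}\t{n_reads}\t{mean_len:.1f}"


def foreach(files, step):
    """First loop: run the per-file step once per input file."""
    return [step(name, text) for name, text in files.items()]


def concatenate(rows, header="file\treads\tmean_length"):
    """Second loop: join the per-file rows into one tabular summary."""
    return "\n".join([header] + rows)


if __name__ == "__main__":
    files = {
        "a.fastq": "@r1\nACGT\n+\nIIII\n@r2\nACGTAC\n+\nIIIIII\n",
        "b.fastq": "@r1\nAC\n+\nII\n",
    }
    print(concatenate(foreach(files, summarize_fastq)))
```

The point is only that the number of files never appears in the workflow
definition itself: the first loop fans out, the second collects.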

Just my two cents.

Cheers,
Eric

On 12/10/2013 04:05 AM, Preussner, Jens wrote:
> Dear galaxy-dev’s,
>
> about a year ago you discussed a variable number of inputs into a workflow
> (see http://lists.bx.psu.edu/pipermail/galaxy-dev/2012-November/012012.html
> and
> http://thread.gmane.org/gmane.science.biology.galaxy.devel/4502/focus=4502,
> for example). We’re interested in something simple, like a number of
> fastq files that have to undergo automated quality control and trimming. A
> workflow would describe the steps that need to be done per file (like fastqc,
> quality trimmer and so on), and in the end a tabular dataset would summarize
> statistics on all files. Of course, we could prebuild workflows for
> 2, 3, 4, 5, ..., n input files and join the output, but it would be much
> cooler to have it variable. Since it’s possible to start many instances of a
> workflow (i.e. one instance per file), this would be a good starting point.
> But how would one combine the outputs of those instances? Does anyone out
> there have experience with such setups? Which files are involved in starting
> many instances of a workflow? Any other ideas or suggestions on how to go
> from here? Thanks a lot for any input!
>
> Best,
> Jens
>
> Max-Planck-Institute for Heart and Lung Research
> Bioinformatics Service - FGI
> Ludwigstraße 43
> 61231 Bad Nauheim
>
> Phone. +49 6032 705 1765
> Mail. jens.preuss...@mpi-bn.mpg.de

-- 
Eric Rasche
Programmer II
Center for Phage Technology
Texas A&M University
College Station, TX 77843
404-692-2048
e...@tamu.edu
rasche.e...@yandex.ru

[galaxy-dev] Again: Variable inputs in a workflow

2013-12-10 Thread Preussner, Jens
Dear galaxy-dev's,

about a year ago you discussed a variable number of inputs into a workflow
(see http://lists.bx.psu.edu/pipermail/galaxy-dev/2012-November/012012.html and
http://thread.gmane.org/gmane.science.biology.galaxy.devel/4502/focus=4502, for
example). We're interested in something simple, like a number of fastq
files that have to undergo automated quality control and trimming. A workflow
would describe the steps that need to be done per file (like fastqc, quality
trimmer and so on), and in the end a tabular dataset would summarize statistics
on all files. Of course, we could prebuild workflows for 2, 3, 4, 5, ..., n
input files and join the output, but it would be much cooler to have it
variable. Since it's possible to start many instances of a workflow (i.e. one
instance per file), this would be a good starting point. But how would one
combine the outputs of those instances? Does anyone out there have experience
with such setups? Which files are involved in starting many instances of a
workflow? Any other ideas or suggestions on how to go from here? Thanks a lot
for any input!

Best,
Jens

Max-Planck-Institute for Heart and Lung Research
Bioinformatics Service - FGI
Ludwigstraße 43
61231 Bad Nauheim

Phone. +49 6032 705 1765
Mail. jens.preuss...@mpi-bn.mpg.de
