Re: [galaxy-dev] Number of outputs = number of inputs

2013-07-30 Thread Shafer, Christina
My question is related to this post 
http://dev.list.galaxyproject.org/Number-of-outputs-number-of-inputs-td4656644.html
 in the archive. As recommended, I referred to 
http://wiki.galaxyproject.org/Admin/Tools/Multiple%20Output%20Files#Number_of_Output_datasets_cannot_be_determined_until_tool_run
 for some guidance on how to produce multiple output files, specifically one 
set of six files for each repetition of the repeat tag. The example 
provided isn't very informative, however, because it doesn't show what the 
example_tool.sh command script does, so I can't make my Perl wrapper script do 
the same thing. If I have six files in total, do I specify only one in the 
outputs tag and have the Perl wrapper name the remaining five according to 
the scheme? Or does Galaxy name the remaining files automatically? Is there a 
Perl equivalent of the $__new_file_path__ variable that I can use, or is this 
the literal resolved file path (e.g. /opt/galaxy/... etc)?

Right now, when my tool runs and the user has added two entries (so the repeat 
has two iterations), my Perl wrapper is called twice (good), but on the second 
call Galaxy uses the same output filenames as on the first, so the first set 
of output files is overwritten by the second execution (bad). How do I make 
sure that each repeat iteration results in a unique set of output files?

Thanks for your help!

Christina Shafer, Ph.D
Regenerative Biology Laboratory
Morgridge Institute for Research
___
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/

Re: [galaxy-dev] Number of outputs = number of inputs

2013-07-30 Thread Peter Cock
The normal expectation is the Galaxy wrapper calls the tool ONCE only,
with a more complex command line including multiple file names from
the repeat.

If I recall correctly from your last email, you are using a trick with
semicolons to embed multiple shell commands in the command tag, one
for each repetition of the repeat tag.

I suggest you rework your Perl script so that it is called once and
does all the work.

Peter

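A minimal sketch of the call-once design Peter suggests, written in Python rather than Perl purely for illustration. It assumes the command tag passes one --input argument per repeat entry plus a single --out-dir; those option names, the "setN_" prefix and the six suffixes are hypothetical and not taken from the original tool.

```python
import argparse
import os


def plan_outputs(input_paths, out_dir, suffixes):
    """Map every input to its own list of output paths."""
    plan = []
    for index, path in enumerate(input_paths, start=1):
        # The 1-based position keeps names unique even when two
        # inputs share the same basename.
        outputs = [os.path.join(out_dir, f"set{index}_{suffix}")
                   for suffix in suffixes]
        plan.append((path, outputs))
    return plan


def main(argv=None):
    parser = argparse.ArgumentParser(
        description="Handle all repeat entries in a single invocation")
    parser.add_argument("--out-dir", required=True)
    # action="append" collects every --input into one list.
    parser.add_argument("--input", action="append", required=True,
                        dest="inputs")
    args = parser.parse_args(argv)
    # Hypothetical stand-ins for the six files produced per entry.
    suffixes = ["a.txt", "b.txt", "c.txt", "d.txt", "e.txt", "f.txt"]
    for path, outputs in plan_outputs(args.inputs, args.out_dir, suffixes):
        for out in outputs:
            print(f"{path} -> {out}")
```

Calling main(["--out-dir", "out", "--input", "x.fa", "--input", "y.fa"]) lists twelve distinct output paths, six per input. Because the wrapper sees every entry in one call, it can number the sets itself and nothing is overwritten.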

Re: [galaxy-dev] Number of outputs = number of inputs

2012-10-17 Thread John Chilton
Like most days, JJ very politely pointed out this morning that I am wrong.
You can have variable numbers of outputs at runtime; see the last section
("Number of Output datasets cannot be determined until tool run") of this
page:

http://wiki.g2.bx.psu.edu/Admin/Tools/Multiple%20Output%20Files

Sorry about that.

-John

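A hedged sketch of the naming scheme that wiki section describes, as best it can be recalled here: extra outputs are written into the directory Galaxy substitutes for $__new_file_path__ on the command line, with names of the form primary_<output id>_<designation>_visible_<ext>. Treat the exact pattern as an assumption and check it against the wiki page; the helper name extra_output_path is hypothetical. Since Galaxy fills in the path before the command runs, the wrapper would receive it as an ordinary argument, whatever language it is written in.

```python
import os


def extra_output_path(new_file_path, output_id, designation, ext):
    """Build the path for one additional output of a Galaxy job.

    new_file_path: directory passed on the command line by Galaxy
    output_id:     id of the one output declared in the outputs tag
    designation:   label distinguishing this extra file
    ext:           Galaxy datatype extension, e.g. "txt"
    """
    if "_" in designation:
        # Underscores separate the fields of the file name, so an
        # underscore inside the designation would make it ambiguous.
        raise ValueError("designation must not contain underscores")
    name = f"primary_{output_id}_{designation}_visible_{ext}"
    return os.path.join(new_file_path, name)
```

Under that assumption, a wrapper would declare a single output in the tool config, write the first file there, and place each remaining file at the path returned by this helper.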

[galaxy-dev] Number of outputs = number of inputs

2012-10-16 Thread Sascha Kastens
Hi all!

I have a tool which takes one or more input files. For each input file one
output is created, i.e. 1 input file - 1 output file, 2 input files - 2
output files, etc.

What is the best way to handle this? I used the directions for handling
multiple output files where the 'Number of Output datasets cannot be
determined until tool run', which in my opinion is a bit inappropriate.
BTW: The input files are added via the repeat tag, so maybe there is a
similar thing for outputs?

Thanks in advance!

Cheers,
Sascha

Re: [galaxy-dev] Number of outputs = number of inputs

2012-10-16 Thread John Chilton
I don't believe this is possible in Galaxy right now. Are the outputs
independent or is information from all inputs used to produce all
outputs? If they are independent, you can create a workflow containing
just your tool with 1 input and 1 output and use the batch workflow
mode to run it on multiple files and get multiple outputs. This is not
a beautiful solution but it gets the job done in some cases.

Another thing to look at might be the discussion we are having on the
thread "pass more information on a dataset merge". We have a fork (it's
all work from Jorrit Boekel) of Galaxy that creates composite
datatypes for each explicitly defined type that can hold collections
of a single type.

https://bitbucket.org/galaxyp/galaxy-central-homogeneous-composite-datatypes/compare

This would hopefully let you declare that you can accept a collection
of whatever your input type is and produce a collection of whatever
your output is. There are lots of downsides to this approach: it is not
fully implemented, it is not included in Galaxy proper, and your outputs
would be wrapped up in a composite datatype, so they wouldn't be easily
processable by downstream tools. It would be good to have additional
people hacking on it though :)

-John


John Chilton
Senior Software Developer
University of Minnesota Supercomputing Institute
Office: 612-625-0917
Cell: 612-226-9223
Bitbucket: https://bitbucket.org/jmchilton
Github: https://github.com/jmchilton
Web: http://jmchilton.net


Re: [galaxy-dev] Number of outputs = number of inputs

2012-10-16 Thread Alex.Khassapov
I tried the galaxy-central-homogeneous-composite-datatypes fork and it works 
great. I have a similar problem, where the number of output files varies, and 
it seems that your approach might work for output files as well (not only 
inputs). I'm currently trying to work out how to implement it; any help is 
appreciated.

Alex
