Re: [galaxy-dev] Number of outputs = number of inputs
My question is related to this post in the archive: http://dev.list.galaxyproject.org/Number-of-outputs-number-of-inputs-td4656644.html

As recommended, I referred to http://wiki.galaxyproject.org/Admin/Tools/Multiple%20Output%20Files#Number_of_Output_datasets_cannot_be_determined_until_tool_run for guidance on how to produce multiple output files, specifically one set of six files for each instance of the repeat tag. The example provided isn't very informative, however: it doesn't show what the example_tool.sh command script is doing, so I can't make my Perl wrapper script do the same thing.

If I have six files in total, do I specify only one in the outputs tag and have the Perl wrapper name the remaining five according to the scheme? Or does Galaxy name the remaining files automatically? Is there a Perl equivalent to the $__new_file_path__ variable that I can use, or is this the literal resolved file path (e.g. /opt/galaxy/... etc)?

Right now, when a user has two repeat entries, my Perl wrapper is called twice (good), but on the second call Galaxy uses the same output filenames as on the first, so the second execution overwrites the entire first set of output files (bad). How do I make sure that each repeat iteration results in a unique set of output files?

Thanks for your help!

Christina Shafer, Ph.D
Regenerative Biology Laboratory
Morgridge Institute for Research

___
Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
Re: [galaxy-dev] Number of outputs = number of inputs
On Tue, Jul 30, 2013 at 4:29 PM, Shafer, Christina csha...@morgridgeinstitute.org wrote:
> [...]
> Right now, upon execution of my tool, if a user has two entries (so number of times repeat has executed is twice), my perl wrapper is called twice (good), but on the second call, Galaxy uses the same output filenames as the first run such that the first set of output files are all overwritten by the second execution (bad). How do I make sure that each repeat iteration results in a unique set of output files?

The normal expectation is that the Galaxy wrapper calls the tool ONCE only, with a more complex command line including multiple file names from the repeat. If I recall correctly from your last email, you are using a trick with semicolons to embed multiple shell commands in the command tag, one for each repetition of the repeat tag.

I suggest you rework your Perl script so that it is designed to be called once for all the work.

Peter
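A minimal sketch of the single-call design suggested above, written in Python rather than Perl only to keep it short; the same structure should carry over to a Perl wrapper. Everything named here is hypothetical: the script name `wrapper.py`, the functions `process_all` and `output_name`, and the argument order. The file-name pattern is what I recall from the wiki section cited in this thread and should be treated as an assumption to check against your Galaxy version. As far as I know, Galaxy substitutes `$__new_file_path__` in the command template before the command runs, so the script simply receives an ordinary directory path as an argument.

```python
#!/usr/bin/env python
"""Sketch: handle every repeat entry in ONE invocation, one uniquely named output per input."""
import os
import sys


def output_name(output_id, designation, ext):
    # Assumed discovery pattern: primary_<id>_<designation>_visible_<ext>.
    # Underscores separate the fields, so keep them out of the designation.
    safe = designation.replace("_", "-")
    return "primary_%s_%s_visible_%s" % (output_id, safe, ext)


def process_all(input_paths, new_file_path, output_id, ext="txt"):
    """Write one output per input into new_file_path and return the paths written."""
    written = []
    for index, in_path in enumerate(input_paths, start=1):
        stem = os.path.splitext(os.path.basename(in_path))[0]
        # The running index keeps names unique even when two inputs share a basename.
        designation = "entry%d-%s" % (index, stem)
        out_path = os.path.join(new_file_path, output_name(output_id, designation, ext))
        with open(in_path) as src, open(out_path, "w") as dst:
            for line in src:
                dst.write(line.upper())  # placeholder for the real per-input work
        written.append(out_path)
    return written


if __name__ == "__main__" and len(sys.argv) > 3:
    # Hypothetical invocation: wrapper.py <new_file_path> <output_id> <input> [<input> ...]
    for path in process_all(sys.argv[3:], sys.argv[1], sys.argv[2]):
        print(path)
```

Because the loop over inputs lives inside the script, the position of each repeat entry can be built into the file name, which is what prevents the second entry from overwriting the files of the first.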
Re: [galaxy-dev] Number of outputs = number of inputs
Like most days, JJ very politely pointed out that I was wrong this morning. You can have a variable number of outputs at runtime; see the last section ("Number of Output datasets cannot be determined until tool run") of this page: http://wiki.g2.bx.psu.edu/Admin/Tools/Multiple%20Output%20Files

Sorry about that.

-John

On Tue, Oct 16, 2012 at 8:48 AM, John Chilton chil0...@umn.edu wrote:
> I don't believe this is possible in Galaxy right now.
> [...]
[galaxy-dev] Number of outputs = number of inputs
Hi all!

I have a tool which takes one or more input files. For each input file one output is created, i.e. 1 input file - 1 output file, 2 input files - 2 output files, etc.

What is the best way to handle this? I used the directions for handling multiple output files for the case where the "Number of Output datasets cannot be determined until tool run", which in my opinion is a bit inappropriate.

BTW: The input files are added via the repeat tag, so maybe there is a similar thing for outputs?

Thanks in advance!

Cheers,
Sascha
Re: [galaxy-dev] Number of outputs = number of inputs
I don't believe this is possible in Galaxy right now. Are the outputs independent, or is information from all inputs used to produce all outputs? If they are independent, you can create a workflow containing just your tool with 1 input and 1 output and use the batch workflow mode to run it on multiple files and get multiple outputs. This is not a beautiful solution, but it gets the job done in some cases.

Another thing to look at might be the discussion we are having on the thread "pass more information on a dataset merge". We have a fork of Galaxy (it's all work from Jorrit Boekel) that creates composite datatypes for each explicitly defined type; each composite can hold a collection of a single type.

https://bitbucket.org/galaxyp/galaxy-central-homogeneous-composite-datatypes/compare

This would hopefully let you declare that you can accept a collection of whatever your input type is and produce a collection of whatever your output is. There are lots of downsides to this approach: it is not fully implemented, it is not included in Galaxy proper, and your outputs would be wrapped up in a composite datatype, so they wouldn't be easily processable by downstream tools. It would be good to have additional people hacking on it though :)

-John

John Chilton
Senior Software Developer
University of Minnesota Supercomputing Institute
Office: 612-625-0917
Cell: 612-226-9223
Bitbucket: https://bitbucket.org/jmchilton
Github: https://github.com/jmchilton
Web: http://jmchilton.net

On Tue, Oct 16, 2012 at 7:13 AM, Sascha Kastens s.kast...@gatc-biotech.com wrote:
> I have a tool which takes one or more input files. For each input file one output is created [...] What is the best way to handle this?
> [...]
Re: [galaxy-dev] Number of outputs = number of inputs
I tried the galaxy-central-homogeneous-composite-datatypes fork, and it works great. I have a similar problem, where the number of output files varies. It seems that your approach might work for output files as well (not only inputs). I'm currently trying to work out how to implement it; any help is appreciated.

Alex

-Original Message-
From: galaxy-dev-boun...@lists.bx.psu.edu [mailto:galaxy-dev-boun...@lists.bx.psu.edu] On Behalf Of John Chilton
Sent: Wednesday, 17 October 2012 12:49 AM
To: Sascha Kastens
Cc: galaxy-dev@lists.bx.psu.edu
Subject: Re: [galaxy-dev] Number of outputs = number of inputs

> I don't believe this is possible in Galaxy right now.
> [...]
> We have a fork (its all work from Jorrit Boekel) of galaxy that creates composite datatypes for each explicitly defined type that can hold collections of a single type.
> [...]