For the list's sake - I think we figured this out on IRC, and it had to
do with having two versions of Galaxy installed on the same machine.
Alexander - let me know if this issue is not resolved.


On Mon, Jun 15, 2015 at 4:02 PM, Alexander Vowinkel wrote:
> Thank you for these detailed descriptions!
> I already have a follow-up question.
> I'm working on Galaxy Cloudman:
>> Galaxy is at revision: 93cda3eb81 (master branch) from 11 Jun 2015
> But I can only find "Build dataset pair|list", not "List of Dataset Pairs"
> like in the video. In what version is that implemented?
> Best,
> Alexander
> 2015-06-15 10:17 GMT-05:00 John Chilton:
>> On Wed, Jun 10, 2015 at 4:04 PM, Alexander Vowinkel wrote:
>> > Hi Folks,
>> >
>> > thank you so far for the previous help. I got much further.
>> > Now I'm stuck with data collections.
>> >
>> > Because this is quite a list, I would also appreciate answers to only
>> > some of my questions ;)
>> >
>> > I have two issues:
>> > A) manual definition of data collections (any type) by user and/or admin
>> > B) definition of data collections as input/output of a tool and inside a
>> > workflow
>> >
>> >
>> > A) manual
>> > Basically I would like to create
>> > i) a list of fastq files (unpaired)
>> > ii) a paired set of two fastq files
>> > iii) a list of pairs, each made of two paired fastq files
>> >
>> > How can I do that?
>> > By using the web app? As user? As admin?
>> > By working via ssh on the server?
>> So each of these got much easier/more robust with the most recent release.
>> From the user perspective - for any of these options you will want to
>> load the fastq files into a history, open the manage multiple datasets
>> option, select the datasets, and then choose the collection type from the
>> menu. Each will cause a widget to pop up allowing you to group the datasets
>> (into a list, a pair, or a list of pairs, depending on your selection).
>> The most complicated option is the list of pairs - this option is
>> demonstrated in the first video in Anton's recent NGS 101 -
>> Reference-based RNA-seq series.
>> For all user-centric scenarios - you will need to get the plain
>> datasets into a history first. FTP upload for instance doesn't support
>> creating collections directly - you can import datasets and then
>> create them. Likewise - data libraries do not currently support
>> dataset collections. I believe there are Trello cards for both of
>> these issues.
>> For admins - there is a dataset collection API - I can point you at
>> examples if you want - but this doesn't seem to be your interest.
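
A minimal sketch of what a collection-creation request body could look like for the three cases in A (list, pair, list of pairs). This only builds the JSON; it does not talk to a server. The field names ("collection_type", "element_identifiers", "src", "new_collection") follow my understanding of the dataset collection API and should be checked against your Galaxy version. The dataset ids and sample names are made up.

```python
import json


def hda(identifier, dataset_id):
    """One collection element pointing at an existing history dataset."""
    return {"name": identifier, "src": "hda", "id": dataset_id}


def list_payload(name, datasets):
    """i) flat list; `datasets` maps element identifier -> dataset id."""
    return {
        "type": "dataset_collection",
        "collection_type": "list",
        "name": name,
        "element_identifiers": [hda(i, d) for i, d in datasets.items()],
    }


def paired_payload(name, forward_id, reverse_id):
    """ii) a single pair of forward/reverse reads."""
    return {
        "type": "dataset_collection",
        "collection_type": "paired",
        "name": name,
        "element_identifiers": [
            hda("forward", forward_id),
            hda("reverse", reverse_id),
        ],
    }


def list_of_pairs_payload(name, pairs):
    """iii) list of pairs; `pairs` maps sample -> (forward id, reverse id)."""
    return {
        "type": "dataset_collection",
        "collection_type": "list:paired",
        "name": name,
        "element_identifiers": [
            {
                "name": sample,
                "src": "new_collection",  # nested collection built in place
                "collection_type": "paired",
                "element_identifiers": [
                    hda("forward", fwd),
                    hda("reverse", rev),
                ],
            }
            for sample, (fwd, rev) in pairs.items()
        ],
    }


# Hypothetical dataset ids, for illustration only.
payload = list_of_pairs_payload(
    "my reads",
    {"sample1": ("id_1f", "id_1r"), "sample2": ("id_2f", "id_2r")},
)
print(json.dumps(payload, indent=2))
```

As far as I know, a body like this would be POSTed to the contents of the target history, but treat the endpoint and authentication details as something to confirm in the API docs.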
>> >
>> >
>> > B) in tool/workflow
>> > Here I also have different approaches I would like to realize:
>> > i) use a collection as input for a tool
>> > ii) create a collection as output of a tool
>> > ii.1) from known # of output parameters
>> > ii.2) from unknown # of output parameters
>> >
>> > For these things I was trying to find some tools in the toolshed to see
>> > how they do it, but I couldn't quite adapt them.
>> I would look in the following directory instead of the tool shed -
>> These are the tools used to drive the testing of the collections
>> implementation and contain some very stripped down examples of what is
>> possible.
>> >
>> > i) use a collection as input for a tool
>> > this is well documented - realizable by type="data_collection" and the
>> > collection_type attribute.
>> > Unfortunately I can't test this because I haven't been able to create a
>> > collection so far ;) - see A
>> Indeed :). Good examples here are the tools in the RNA-seq
>> pipeline - Tophat, Bowtie2, etc.
>> >
>> > ii) create a collection as output of a tool
>> > Here it gets blurry for me.
>> So one can get very far without ever explicitly creating a collection
>> as a tool output. I contend that most of the time - if you have a list of
>> bam files and you want to create another list of bam files - you just want
>> to map some operation over them. This is demonstrated in that RNA-seq
>> outline - and talked about in a more theoretical way in my GCC talk
>> from last year.
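
To make the "map some operation over them" idea concrete, here is a toy model in plain Python. It is not Galaxy code: the dictionary layout, `map_over`, and `fake_sort` are all invented for illustration. The point is only that the tool itself takes and returns a single dataset, and the output list keeps the element identifiers of the input list.

```python
def map_over(collection, operation):
    """Apply a one-dataset operation to every element of a list collection.

    The result has the same collection type and the same element
    identifiers, in the same order, as the input.
    """
    return {
        "collection_type": collection["collection_type"],
        "elements": [
            {"identifier": el["identifier"], "dataset": operation(el["dataset"])}
            for el in collection["elements"]
        ],
    }


def fake_sort(bam_name):
    """Stand-in for a tool that takes one bam file and produces one bam file."""
    return bam_name.replace(".bam", ".sorted.bam")


bams = {
    "collection_type": "list",
    "elements": [
        {"identifier": "sample1", "dataset": "sample1.bam"},
        {"identifier": "sample2", "dataset": "sample2.bam"},
    ],
}

sorted_bams = map_over(bams, fake_sort)
for el in sorted_bams["elements"]:
    print(el["identifier"], el["dataset"])
```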
>> There are definitely cases when you want to explicitly create
>> collections though - the current best documentation on this is going
>> to be the pull request that added them - not the implementation but
>> the description which actually lays out these same categories and how
>> to handle them with explicit complete examples.
>> Hopefully this helps - please follow up with additional questions as
>> you have them. I am keen to see more developers leveraging dataset
>> collections.
>> Thanks a bunch.
>> -John
>> >
>> > ii.1) from known # of output parameters
>> > Here I didn't find a tool. I just thought it might be a simpler case
>> > than ii.2 and good for understanding the concept.
>> > I would be glad if someone could explain the way(s) to do this.
>> >
>> > ii.2) from unknown # of output parameters
>> > For this I found barcode splitter tools (also from devteam) that take
>> > different approaches.
>> > But their output (defined in xml) is only a report file.
>> > The output files seem to be fed into the history.
>> > And here I don't know how to get my hands on these files when I want to
>> > feed them into the next step of a workflow.
>> >
>> > Help highly appreciated!
>> >
>> > Thanks!
>> > Alexander
>> >