I’m toying around a little in galaxy-dist with the dataset collections feature.
Since I know this is work in progress, I was wondering about some things I
haven’t really found online.
It seems to work really well to run a tool on a list of datasets, and a new job
is run for each list item. But when I want to reduce to a smaller number of
list items, I understand I need to write some sort of merge tool myself,
dependent on the data (all proteomics data here currently). This works well for
reducing a dataset to a single file, but I am not sure about how to reduce to a
new smaller collection. In the tool I’m writing, I let the user choose the size
of the collection.
Is there some way to tell Galaxy dynamically how many outputs to expect AND put
them in a collection? Something like:
<output type="data_collection" amount_of_files="3"/>
where 3 would also be set by the user in a param.
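To make the question concrete, here is a rough sketch of the grouping step such a merge tool might perform before writing its outputs. It is plain Python, not Galaxy API code; the function name partition and the file names are hypothetical, and the actual merging of each group would depend on the data format.

```python
def partition(items, n_groups):
    """Split items into n_groups contiguous groups of near-equal size."""
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")
    # Never create more groups than there are items (but at least one).
    n_groups = min(n_groups, len(items)) or 1
    base, extra = divmod(len(items), n_groups)
    groups = []
    start = 0
    for i in range(n_groups):
        # The first `extra` groups get one additional item.
        size = base + (1 if i < extra else 0)
        groups.append(items[start:start + size])
        start += size
    return groups


# Hypothetical input list of seven files reduced to three groups.
inputs = ["f1.tsv", "f2.tsv", "f3.tsv", "f4.tsv", "f5.tsv", "f6.tsv", "f7.tsv"]
for index, group in enumerate(partition(inputs, 3)):
    print(index, group)
```

Each resulting group would be merged into one output file, so the number of outputs equals the number of groups, which is exactly the value I would like to hand to Galaxy.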
Also, when running with two or more lists as input, is there some sort of
correlation between the lists? It seems to take the files in dataset number
order, so I am just checking.
By the way, thanks very much John and everyone else involved in collections for
doing and pushing this stuff. If there are smaller issues I can help with, I’d
be thrilled. Can’t stress enough how much this feature means for Galaxy
adoption in our lab and possibly our field.
Proteomics systems developer
BILS / Lehtiö lab
Scilifelab Stockholm, Sweden