Re: [galaxy-dev] Implementing dataset collections
On Mon, Mar 4, 2013 at 3:16 PM, John Chilton wrote:
> Hi Alex and Peter,
>
> I appreciate the comments, but I am not going to keep talking about
> this issue here, I consider the issue closed. The Galaxy team has
> spoken clearly on this topic. They have accepted dozens of my pull
> requests and I am very happy about all of them, and I need them to
> accept dozens (if not hundreds) more in the future so I cannot afford
> to continue to burn political capital by talking about this in this
> forum.
>
> Alex, I plan to support some version of multiple file datasets for
> at least the next two years. I will put the code here when it is ready
> - https://bitbucket.org/msiappdev/galaxy-extras. Peter, I have
> responded to your questions here -
> https://groups.google.com/forum/?fromgroups#!forum/galaxy-extras.
>
> Thanks,
> -John

Thanks John,

Hopefully I will get a better insight into the internals during some of
the developer orientated GCC2013 training sessions - right now I know
that I only have a partial grasp of the compound datatypes.

Regards,

Peter
___
Please keep all replies on the list by using "reply all" in your mail client.
To manage your subscriptions to this and other Galaxy lists, please use the interface at:
http://lists.bx.psu.edu/
Re: [galaxy-dev] Implementing dataset collections
Hi Alex and Peter,

I appreciate the comments, but I am not going to keep talking about
this issue here, I consider the issue closed. The Galaxy team has
spoken clearly on this topic. They have accepted dozens of my pull
requests and I am very happy about all of them, and I need them to
accept dozens (if not hundreds) more in the future so I cannot afford
to continue to burn political capital by talking about this in this
forum.

Alex, I plan to support some version of multiple file datasets for
at least the next two years. I will put the code here when it is ready
- https://bitbucket.org/msiappdev/galaxy-extras. Peter, I have
responded to your questions here -
https://groups.google.com/forum/?fromgroups#!forum/galaxy-extras.

Thanks,
-John

On Mon, Mar 4, 2013 at 4:49 AM, Peter Cock wrote:
> Hi all,
>
> I've retitled this from "Composite datatype output for Cuffdiff" to
> "Implementing dataset collections" to try and link into Jeremy's thread.
> References:
> http://lists.bx.psu.edu/pipermail/galaxy-dev/2013-February/013574.html
> http://lists.bx.psu.edu/pipermail/galaxy-dev/2013-March/013634.html
> http://lists.bx.psu.edu/pipermail/galaxy-dev/2013-March/013636.html
>
> On Mon, Mar 4, 2013 at 4:08 AM, wrote:
>>> Hi John,
>>>
>>> Are you saying that "composite multiple file dataset" isn't required
>>> and won't be implemented?
>>>
>>> We are using your implementation of multifiles dataset ("m:xxx" type)
>>> and hope that eventually it will be pushed into main Galaxy
>>> implementation.
>>>
>>> As we are using Galaxy for CT reconstruction tools, where input
>>> and output can consist of a couple thousand files, other options
>>> are not feasible, i.e. grouping datasets.
>>>
>>> -Alex
>
> On Mon, Mar 4, 2013 at 5:42 AM, John Chilton wrote:
>> Hi Alex,
>>
>> Thanks for the comments. The galaxy team has made it clear here and
>> to me privately that this will NOT be included in the Galaxy main code
>> base. I hope and am confident that they will make grouping datasets
>> work, hopefully even to thousands of files.
>>
>> I do not believe the two ideas are mutually exclusive and I will be
>> maintaining a fork of galaxy-central with these additions, I will set
>> this up this week hopefully. I will do my best to respond to support
>> requests and make multiple file datasets and composite types in
>> general as robust as possible, keep up with Galaxy updates, etc.
>> Obviously, it is risky to let a code base drift so far from galaxy
>> main's however and you, me, and others who might want to use them will
>> have to carefully weigh the risks when determining if multiple file
>> datasets are worth the headache.
>>
>> Thanks for all your help and inputs. I am sorry this did not turn
>> out differently, I feel I have really failed here.
>>
>> -John
>
> Hi John,
>
> Does your multiple file system work on composite datatypes? For
> example, given a 'blastdbn' file (several files making up a BLAST
> nucleotide database) could we use 'm:blastdbn' to refer to several
> BLAST databases? Is this at least possible in principle based
> on your work so far?
>
> How different is your m:xxx grouping of multiple files of a given
> datatype from the dataset collections idea as outlined by
> Jeremy Goecks here?
>
> http://lists.bx.psu.edu/pipermail/galaxy-dev/2013-February/013574.html
> https://trello.com/c/325AXIEr
>
> Jeremy's idea mentions the special case of a unique dataset
> collection object for paired-end reads - well if that was defined
> using the existing core Galaxy functionality as a 'pairedfastq'
> composite datatype (made of two FASTQ files), then could we
> use 'm:pairedfastq' (with your enhancement) for a whole bunch
> of paired FASTQ files? Nice? :)
>
> Regards,
>
> Peter
Re: [galaxy-dev] Implementing dataset collections
Hi all,

I've retitled this from "Composite datatype output for Cuffdiff" to
"Implementing dataset collections" to try and link into Jeremy's thread.
References:
http://lists.bx.psu.edu/pipermail/galaxy-dev/2013-February/013574.html
http://lists.bx.psu.edu/pipermail/galaxy-dev/2013-March/013634.html
http://lists.bx.psu.edu/pipermail/galaxy-dev/2013-March/013636.html

On Mon, Mar 4, 2013 at 4:08 AM, wrote:
>> Hi John,
>>
>> Are you saying that "composite multiple file dataset" isn't required
>> and won't be implemented?
>>
>> We are using your implementation of multifiles dataset ("m:xxx" type)
>> and hope that eventually it will be pushed into main Galaxy
>> implementation.
>>
>> As we are using Galaxy for CT reconstruction tools, where input
>> and output can consist of a couple thousand files, other options
>> are not feasible, i.e. grouping datasets.
>>
>> -Alex

On Mon, Mar 4, 2013 at 5:42 AM, John Chilton wrote:
> Hi Alex,
>
> Thanks for the comments. The galaxy team has made it clear here and
> to me privately that this will NOT be included in the Galaxy main code
> base. I hope and am confident that they will make grouping datasets
> work, hopefully even to thousands of files.
>
> I do not believe the two ideas are mutually exclusive and I will be
> maintaining a fork of galaxy-central with these additions, I will set
> this up this week hopefully. I will do my best to respond to support
> requests and make multiple file datasets and composite types in
> general as robust as possible, keep up with Galaxy updates, etc.
> Obviously, it is risky to let a code base drift so far from galaxy
> main's however and you, me, and others who might want to use them will
> have to carefully weigh the risks when determining if multiple file
> datasets are worth the headache.
>
> Thanks for all your help and inputs. I am sorry this did not turn
> out differently, I feel I have really failed here.
>
> -John

Hi John,

Does your multiple file system work on composite datatypes? For
example, given a 'blastdbn' file (several files making up a BLAST
nucleotide database) could we use 'm:blastdbn' to refer to several
BLAST databases? Is this at least possible in principle based
on your work so far?

How different is your m:xxx grouping of multiple files of a given
datatype from the dataset collections idea as outlined by
Jeremy Goecks here?

http://lists.bx.psu.edu/pipermail/galaxy-dev/2013-February/013574.html
https://trello.com/c/325AXIEr

Jeremy's idea mentions the special case of a unique dataset
collection object for paired-end reads - well if that was defined
using the existing core Galaxy functionality as a 'pairedfastq'
composite datatype (made of two FASTQ files), then could we
use 'm:pairedfastq' (with your enhancement) for a whole bunch
of paired FASTQ files? Nice? :)

Regards,

Peter