Re: [galaxy-dev] Implementing dataset collections

2013-03-04 Thread Peter Cock
Hi all,

I've retitled this from "Composite datatype output for Cuffdiff" to
"Implementing dataset collections" to try and link into Jeremy's thread.
References:
http://lists.bx.psu.edu/pipermail/galaxy-dev/2013-February/013574.html
http://lists.bx.psu.edu/pipermail/galaxy-dev/2013-March/013634.html
http://lists.bx.psu.edu/pipermail/galaxy-dev/2013-March/013636.html

On Mon, Mar 4, 2013 at 4:08 AM,  alex.khassa...@csiro.au wrote:
 Hi John,

 Are you saying that a composite multiple-file dataset isn't required
 and won't be implemented?

 We are using your implementation of multi-file datasets (m:xxx type)
 and hope that eventually it will be pushed into the main Galaxy
 implementation.

 As we are using Galaxy for CT reconstruction tools, where input
 and output can consist of a couple of thousand files, other options
 such as grouping datasets are not feasible.

 -Alex

On Mon, Mar 4, 2013 at 5:42 AM, John Chilton chil...@msi.umn.edu wrote:
 Hi Alex,

   Thanks for the comments. The Galaxy team has made it clear here and
 to me privately that this will NOT be included in the Galaxy main code
 base. I hope and am confident that they will make grouping datasets
 work, hopefully even for thousands of files.

   I do not believe the two ideas are mutually exclusive, and I will be
 maintaining a fork of galaxy-central with these additions; I hope to
 set this up this week. I will do my best to respond to support
 requests, make multiple file datasets and composite types in
 general as robust as possible, keep up with Galaxy updates, etc.
 Obviously, it is risky to let a code base drift so far from Galaxy
 main's, however, and you, I, and others who might want to use them will
 have to carefully weigh the risks when determining whether multiple file
 datasets are worth the headache.

   Thanks for all your help and input. I am sorry this did not turn
 out differently; I feel I have really failed here.

 -John

Hi John,

Does your multiple file system work on composite datatypes? For
example, given a 'blastdbn' file (several files making up a BLAST
nucleotide database) could we use 'm:blastdbn' to refer to several
BLAST databases? Is this at least possible in principle based
on your work so far?

How different is your m:xxx grouping of multiple files of a given
datatype from the dataset collections idea as outlined by
Jeremy Goecks here?

http://lists.bx.psu.edu/pipermail/galaxy-dev/2013-February/013574.html
https://trello.com/c/325AXIEr

Jeremy's idea mentions the special case of a unique dataset
collection object for paired-end reads. If that were defined
using the existing core Galaxy functionality as a 'pairedfastq'
composite datatype (made of two FASTQ files), then could we
use 'm:pairedfastq' (with your enhancement) for a whole bunch
of paired FASTQ files? Nice? :)

Regards,

Peter
___
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/


Re: [galaxy-dev] Implementing dataset collections

2013-03-04 Thread John Chilton
Hi Alex and Peter,

  I appreciate the comments, but I am not going to keep talking about
this issue here; I consider the issue closed. The Galaxy team has
spoken clearly on this topic. They have accepted dozens of my pull
requests and I am very happy about all of them, and I need them to
accept dozens (if not hundreds) more in the future so I cannot afford
to continue to burn political capital by talking about this in this
forum.

  Alex, I plan to support some version of multiple file datasets for
at least the next two years. I will put the code here when it is ready
- https://bitbucket.org/msiappdev/galaxy-extras. Peter, I have
responded to your questions here -
https://groups.google.com/forum/?fromgroups#!forum/galaxy-extras.

Thanks,
-John



Re: [galaxy-dev] Implementing dataset collections

2013-03-04 Thread Peter Cock
Thanks John,

Hopefully I will get a better insight into the internals during some
of the developer-oriented GCC2013 training sessions; right now
I know that I only have a partial grasp of the compound datatypes.

Regards,

Peter