Re: [galaxy-dev] using Galaxy for map/reduce

2011-08-26 Thread Edward Kirton
> Not intending to hijack the thread, but in response to John's comment -- I, too, made a general solution for embarrassingly parallel problems

Re: [galaxy-dev] using Galaxy for map/reduce

2011-08-26 Thread Duddy, John
Not intending to hijack the thread, but in response to John's comment -- I, too, made a general solution for embarrassingly parallel problems but instead of splitting the large files on disk, I just u

Re: [galaxy-dev] using Galaxy for map/reduce

2011-08-26 Thread Edward Kirton
Not intending to hijack the thread, but in response to John's comment -- I, too, made a general solution for embarrassingly parallel problems, but instead of splitting the large files on disk, I just use seek to move the file pointer so each task can grab its part. On Tue, Aug 2, 2011 at 10:54 AM,
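
A minimal sketch of the seek-based idea described above, assuming line-oriented input where each task should process whole lines. The function names byte_ranges and read_part are hypothetical and are not taken from the poster's implementation. Each task owns the lines that begin inside its byte range, so no line is read twice or dropped.

```python
def byte_ranges(file_size, n_tasks):
    """Split [0, file_size) into n_tasks contiguous (start, end) byte ranges."""
    step = file_size // n_tasks
    bounds = [i * step for i in range(n_tasks)] + [file_size]
    return list(zip(bounds[:-1], bounds[1:]))


def read_part(path, start, end):
    """Return the lines that begin at a byte offset in [start, end).

    Assumption: records are newline-terminated lines. A line that
    straddles a boundary belongs to the task in whose range it starts.
    """
    lines = []
    with open(path, "rb") as fh:
        if start > 0:
            # Step back one byte and read to the next newline. This leaves
            # the file pointer at the first line start at or after `start`.
            fh.seek(start - 1)
            fh.readline()
        while fh.tell() < end:
            line = fh.readline()
            if not line:  # end of file
                break
            lines.append(line)
    return lines
```

A task scheduler would call byte_ranges(os.path.getsize(path), n_tasks) once and hand each (start, end) pair to a separate task, which then calls read_part on the shared input file. No intermediate split files are written to disk.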

Re: [galaxy-dev] using Galaxy for map/reduce

2011-08-02 Thread Duddy, John
I did something similar, but implemented as an evolution of the original "basic" parallelism (see BWA), that:
- Moved the splitting of input files into the datatype classes
- Allowed any number of inputs to be split, as long as they were the same datatype (so they were mutually consistent - think

Re: [galaxy-dev] using Galaxy for map/reduce

2011-08-02 Thread Andrew Straw
On 08/02/2011 06:43 PM, James Taylor wrote:
> On Aug 2, 2011, at 10:12 AM, Andrew Straw wrote:
>
>> 1) My first specific problem is that loading many datasets (e.g. 250)
>> into history causes the javascript running locally within a browser to
>> be extremely slow.
> What browser are you using? Pr

Re: [galaxy-dev] using Galaxy for map/reduce

2011-08-02 Thread James Taylor
On Aug 2, 2011, at 10:12 AM, Andrew Straw wrote:
> 1) My first specific problem is that loading many datasets (e.g. 250)
> into history causes the javascript running locally within a browser to
> be extremely slow.
What browser are you using?
> 2) My second specific problem is that applying a

Re: [galaxy-dev] using Galaxy for map/reduce

2011-08-02 Thread Ravi Madduri
Hi, I really like this proposal. We faced some of the issues you describe below when we tried to use Galaxy with high-throughput computing techniques (using Condor) to sequence close to 500 genomes (an embarrassingly parallel problem). We leveraged (hacked) the dataset construct but i

Re: [galaxy-dev] using Galaxy for map/reduce

2011-08-02 Thread Peter Cock
On Tue, Aug 2, 2011 at 3:12 PM, Andrew Straw wrote:
> ...
>
> My proposal for a general solution, and what I'd be interested in
> feedback on, is an idea of a "dataset container" (this is just a working
> name). It would look and act much like a dataset in the history, but
> would in fact be a log