Re: [Bioc-devel] Trying to reduce the memory overhead when using mclapply

2013-11-14 Thread Leonardo Collado Torres
Hi Martin, Thank you for the links, they contain a lot of useful information! I am trying to understand more about mclapply mainly because of two cases. 1) I have a large DataFrame which I use because of the low memory footprint and because the data is well behaved for compression using Rle's.

Re: [Bioc-devel] Trying to reduce the memory overhead when using mclapply

2013-11-14 Thread Leonardo Collado Torres
Hello Ryan, Thank you for looking at the example and even forking the gist. I have updated the example with your approach and also calculated that it uses 6.794G RAM when starting from scratch using 20 cores and thus beating the other 3 approaches under that scenario. Also note that just generati

Re: [Bioc-devel] BiocParallel: flattening iteration

2013-11-14 Thread Ryan C. Thompson
Just a note: the foreach package has solved this by providing a "nesting" operator, which effectively converts multiple nested foreach loops into one big one: http://cran.r-project.org/web/packages/foreach/vignettes/nested.pdf On Thu 14 Nov 2013 09:24:29 AM PST, Michael Lawrence wrote: I like

Re: [Bioc-devel] BiocParallel: flattening iteration

2013-11-14 Thread Michael Lawrence
I like the general idea of having iterators; was just checking out the itertools package after not having looked at it for a while. I could see having a BiocIterators package, and a bpiterate(iterator, FUN, ..., BPPARAM). My suggestion was simpler though. Right now, bpmapply runs a single job per i

Re: [Bioc-devel] Subsetting an RleList object

2013-11-14 Thread Michael Lawrence
Saw the fix: could this be considered a bug in the methods package? It seems callNextMethod gets confused with .local(). That said, I like explicit argument passing. On Tue, Oct 29, 2013 at 2:15 PM, Hervé Pagès wrote: > Hi Thomas, > > This is addressed in IRanges 1.20.4 (release) and 1.21.7 (de

Re: [Bioc-devel] BiocParallel: flattening iteration

2013-11-14 Thread Michel Lang
We use a design iterator in BatchExperiments::makeDesign for a cartesian product. I found an old version of designIterator (cf. <https://github.com/tudo-r/BatchExperiments/blob/master/R/designs.R>) without the optional data.frame input, which is easier to read: <https://gist.github.com/mllg/7469844>.

Re: [Bioc-devel] Trying to reduce the memory overhead when using mclapply

2013-11-14 Thread Martin Morgan
On 11/14/2013 12:13 AM, Leonardo Collado Torres wrote: Dear BioC developers, I am trying to understand how to use mclapply() without blowing up the memory usage and need some help. My use case is splitting a large IRanges::DataFrame() into chunks, and feeding these chunks to mclapply(). Let's say

Re: [Bioc-devel] BiocParallel: flattening iteration

2013-11-14 Thread Michael Lawrence
Something could go into BatchJobs, but it would be nice to have abstract support for it at the level of BiocParallel. On Thu, Nov 14, 2013 at 6:32 AM, Vincent Carey wrote: > Streamer package has DAGTeam/DAGParam components that I believe are > relevant. > An abstraction of the reduction plan for

Re: [Bioc-devel] BiocParallel: flattening iteration

2013-11-14 Thread Vincent Carey
Streamer package has DAGTeam/DAGParam components that I believe are relevant. An abstraction of the reduction plan for a parallelized task would seem to have a natural home in BatchJobs. On Thu, Nov 14, 2013 at 8:15 AM, Michael Lawrence wrote: > Hi guys, > > We often need to iterate over the ca

[Bioc-devel] BiocParallel: flattening iteration

2013-11-14 Thread Michael Lawrence
Hi guys, We often need to iterate over the cartesian product of two dimensions, like sample X chromosome. This is preferable to nested iteration, which is complicated. I've been using expand.grid and bpmapply for this, but it seems like this could be made easier. Like bpmapply could gain a CARTESI
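A minimal sketch of the expand.grid approach mentioned in this message, assuming made-up sample and chromosome names and a placeholder worker function (process_pair is hypothetical, not from the thread). It uses base R's mapply so it runs without extra packages; BiocParallel's bpmapply accepts the same FUN-plus-vectors arguments and could presumably be swapped in.

```r
# Hypothetical inputs: these names are invented for illustration only.
samples <- c("s1", "s2")
chroms  <- c("chr1", "chr2", "chr3")

# One row per (sample, chromosome) pair, i.e. the cartesian product.
# stringsAsFactors = FALSE keeps the columns as plain character vectors.
grid <- expand.grid(sample = samples, chrom = chroms,
                    stringsAsFactors = FALSE)

# Placeholder for the real per-pair computation (an assumption).
process_pair <- function(sample, chrom) paste(sample, chrom, sep = ":")

# A single flat loop over all pairs instead of two nested loops.
res <- mapply(process_pair, grid$sample, grid$chrom, USE.NAMES = FALSE)
```

Because the grid is an ordinary data.frame, the flat result can be matched back to its sample and chromosome by row position.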

Re: [Bioc-devel] tab completion for library()

2013-11-14 Thread Deepayan Sarkar
On Wed, Nov 13, 2013 at 10:55 PM, Martin Morgan wrote: > On 11/13/2013 09:17 AM, Tim Triche, Jr. wrote: >> >> This seems like what I'm looking for, but it doesn't do what I'd expect: >> >> R> rc.options(ipck=TRUE) >> R> rc.options()$ipck >> [1] TRUE >> R> require(Biostr\t >> # nothing happens >> >

Re: [Bioc-devel] Trying to reduce the memory overhead when using mclapply

2013-11-14 Thread Ryan
To minimize the additional memory used by mclapply, remember that mclapply works by forking processes, and the advantage of this is that as long as an object is not modified in either the parent or child, they will share the memory for that object, which effectively means that a child process
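One way to lean on the fork-time sharing described in this message, sketched under assumptions: the object `big`, the four-way index split, and the per-chunk sum are all invented for illustration, and the snippet above is cut off, so this is not necessarily the approach proposed in the thread. It assumes a Unix-alike, since mclapply cannot fork on Windows.

```r
library(parallel)

# Hypothetical large object, created once in the parent process.
big <- runif(1e6)

# Split the indices into chunks rather than splitting the data itself,
# so only small integer vectors are handed to each child.
idx <- split(seq_along(big), rep(1:4, length.out = length(big)))

# Each forked child only reads `big`; it never modifies it, so the
# child and parent can keep sharing the memory that holds it.
# Subsetting big[i] copies just that chunk inside the child.
res <- mclapply(idx, function(i) sum(big[i]), mc.cores = 2)

total <- sum(unlist(res))
```

The design choice here is to pass indices and subset inside the worker, which avoids materializing every chunk in the parent before the fork.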

[Bioc-devel] Trying to reduce the memory overhead when using mclapply

2013-11-14 Thread Leonardo Collado Torres
Dear BioC developers, I am trying to understand how to use mclapply() without blowing up the memory usage and need some help. My use case is splitting a large IRanges::DataFrame() into chunks, and feeding these chunks to mclapply(). Let's say that I am using n cores and that the operation I am doin