On Wed, Dec 1, 2010 at 10:00 PM, Dario Strbenac <[email protected]> wrote:

> Hello,
>
> I have a snippet of code that takes a GRangesList (in my case, of length 8,
> around about 10 million reads in each GRanges object) and a vector the same
> length which explains what type of experiment each element is. The snippet
> combines elements that are of the same type.
>
> Initially, once the GRangesList is loaded in, R is using 2 GB of RAM but
> then when I run my snippet, R's RAM usage hovers at 22GB - 24GB for many
> minutes. When I gc(), it drops back to 12GB.
>
> Could I be doing something more efficiently ?
>
> # readsIPs : the GRangesList
> # exptTypes : a vector like c("MeDIP", "MeDIP", "H3K27", "H3K27") that is
> #             the same length as readsIPs.
>
> typeCounts <- table(exptTypes)
> if(any(typeCounts > 1))
> {
>        oldOrder <- unique(exptTypes)
>        repTypes <- names(typeCounts)[typeCounts > 1]
>        pooledIPs <- lapply(repTypes, function(repType)
>        {
>                whichReps <- which(exptTypes == repType)
>                unlist(readsIPs[whichReps])
>        })
>        names(pooledIPs) <- repTypes
>        uniqueIdxs <- exptTypes %in% names(typeCounts)[typeCounts == 1]
>        uniqueIPs <- readsIPs[uniqueIdxs]
>        names(uniqueIPs) <- exptTypes[uniqueIdxs]
>        readsIPs <- c(GRangesList(pooledIPs), uniqueIPs)[oldOrder]
> }
>
>
Something like:

split(unlist(readsIPs), rep(exptTypes, elementLengths(readsIPs)))

should do all that in a single line. It should also be fairly efficient,
given the internal compressed representation of GRangesList. Some excess
copying is unavoidable with S4, though. At least this line leaves you with
less to clean up.
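
To make the idiom concrete, here is a minimal sketch in base R only. A plain
list of integer vectors stands in for the GRangesList, and lengths() stands in
for elementLengths(); the toy data and the names readLabels and pooled are
made up for illustration. It also shows one way to keep the original order of
the experiment types, since split() on a bare character vector returns the
groups in alphabetical order.

```r
# Toy stand-in (assumption): a plain list of integer vectors plays the role
# of the GRangesList, with one experiment type per list element.
readsIPs  <- list(1:3, 4:5, 6:9, 10L)
exptTypes <- c("MeDIP", "H3K27", "MeDIP", "H3K27")

# One group label per individual read: each type is repeated by the length
# of its list element.
readLabels <- rep(exptTypes, lengths(readsIPs))

# A factor with explicit levels keeps the first-seen order of the types;
# without it, split() would order the groups alphabetically.
readLabels <- factor(readLabels, levels = unique(exptTypes))

# Flatten once, then regroup by type.
pooled <- split(unlist(readsIPs), readLabels)
```

With a real GRangesList the same pattern should apply, replacing lengths()
with the element-length accessor of the installed GenomicRanges version.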

--------------------------------------
> Dario Strbenac
> Research Assistant
> Cancer Epigenetics
> Garvan Institute of Medical Research
> Darlinghurst NSW 2010
> Australia
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> [email protected]
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>


