On Fri, Aug 23, 2013 at 8:41 AM, Valerie Obenchain <voben...@fhcrc.org> wrote:
> Hi Michael,
>
> Martin and I have been discussing this. In addition to the fix you
> suggest, what do you think of changing the default to compress=TRUE for
> the RleList constructor? RleList is the only one of the AtomicLists with
> default FALSE. Was there a reason for this when it was first implemented?
>

I'm guessing Patrick did that because we always used Rles for coverage, and
RleList for per-chromosome coverage. Also, there might be some overhead in
that Rle runs in the unlistData can cross list elements.

About my fix, the only downside would be if the range widths were much
larger than the size of the vector, e.g., a highly compressed Rle selected
with chromosome-size ranges. In that case as.integer(ir), which holds one
index per selected position, is big compared to the data. Otherwise, it's
way faster.

> Val
>
> On 08/22/2013 07:34 PM, Maintainer wrote:
>
>> Hi,
>>
>> SimpleLists are slow in this situation, basically because the underlying
>> seqselect is slow, due to this loop:
>>
>> x <- do.call(c, lapply(seq_len(length(ir)), function(i)
>>                 window(x,
>>                   start = start(ir)[i], width = width(ir)[i])))
>>
>> Am I missing something or could this become a simple x[as.integer(ir)]?
>>
>> In the meantime, using CompressedLists is the way to go. So for an
>> RleList, you need to pass compress=TRUE to the constructor.
>>
>>
>> On Wed, Aug 21, 2013 at 8:30 AM, Ou, Jianhong <jianhong...@umassmed.edu> wrote:
>>
>> Hi,
>>
>> When I use a big GRangesList, I find it becomes very slow when the
>> metadata contain an AtomicList, e.g.
>>
>> > grll <- GRanges(seqnames="chr1", ranges=IRanges(start=1:500, width=2), someInfo=rep(RleList("*"), 500))
>> > grr <- split(grll, 1:500)
>> > grl <- as.list(grr)
>> > system.time(grl <- grl[500:1])
>>    user  system elapsed
>>       0       0       0
>> > system.time(grr <- grr[500:1])
>>    user  system elapsed
>>   1.622   0.013   1.635
>> > grll <- GRanges(seqnames="chr1", ranges=IRanges(start=1:500, width=2))
>> > grr <- split(grll, 1:500)
>> > grl <- as.list(grr)
>> > system.time(grl <- grl[500:1])
>>    user  system elapsed
>>       0       0       0
>> > system.time(grr <- grr[500:1])
>>    user  system elapsed
>>   0.029   0.001   0.030
>> > sessionInfo()
>> R Under development (unstable) (2013-07-23 r63392)
>> Platform: x86_64-apple-darwin12.4.0 (64-bit)
>>
>> locale:
>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>
>> attached base packages:
>> [1] parallel  stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>> [1] GenomicRanges_1.13.36 XVector_0.1.0 IRanges_1.19.24 BiocGenerics_0.7.3
>>
>> loaded via a namespace (and not attached):
>> [1] stats4_3.1.0 tools_3.1.0
>>
>> Is there any method to improve this?
>>
>> Yours sincerely,
>>
>> Jianhong Ou
>>
>> LRB 670A
>> Program in Gene Function and Expression
>> 364 Plantation Street Worcester,
>> MA 01605
>>
>> _______________________________________________
>> Bioc-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
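
For reference, the idea behind the proposed seqselect change can be sketched in base R without any Bioconductor classes. This is only an illustration, not the IRanges implementation: `starts` and `widths` are hypothetical plain integer vectors standing in for start(ir) and width(ir), `select_loop()` mimics the per-range loop quoted above, and `expand_ranges()` is a made-up helper playing the role of as.integer(ir).

```r
# Toy data (assumed for illustration only).
x <- c(10L, 20L, 30L, 40L, 50L, 60L)
starts <- c(2L, 5L, 1L)   # stand-in for start(ir)
widths <- c(2L, 1L, 3L)   # stand-in for width(ir)

# Loop version: extract one window per range, then concatenate the pieces.
select_loop <- function(x, starts, widths) {
  pieces <- lapply(seq_along(starts), function(i) {
    x[seq.int(starts[i], length.out = widths[i])]
  })
  do.call(c, pieces)
}

# Hypothetical helper: expand the ranges into one index per selected position.
# The result has sum(widths) elements.
expand_ranges <- function(starts, widths) {
  rep.int(starts, widths) + sequence(widths) - 1L
}

# Vectorized version: build the index vector once and subset once.
select_vectorized <- function(x, starts, widths) {
  x[expand_ranges(starts, widths)]
}

# Both approaches select the same elements in the same order.
stopifnot(identical(select_loop(x, starts, widths),
                    select_vectorized(x, starts, widths)))
```

The index vector built by `expand_ranges()` has `sum(widths)` entries, which is the cost mentioned above: with chromosome-size ranges over a highly compressed Rle, that vector can be much larger than the data it indexes.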