Awesome. If Hector is finished cleaning up, I'd be glad to merge it.

Michael

On Sat, Jul 6, 2013 at 6:18 PM, Kasper Daniel Hansen <kasperdanielhan...@gmail.com> wrote:

> A little late, I can report that this speeds up my "many seqlevels"
> problem, by 3 orders of magnitude.
>
> library(IRanges, lib.loc = "library")
> library(GenomicRanges, lib.loc = "library")
> library(BSgenome.Amellifera.BeeBase.assembly4)
> Un <- Amellifera$GroupUn
> gr <- GRanges(seqnames = names(Un),
>               ranges = IRanges(start = 1, width = width(Un)))
>
> ## gr has a length of 9244, but each interval is in a new seqname.
> ## this makes traditional findOverlaps extremely slow
>
> system.time({
>     findOverlaps(gr, gr)
> }) ## roughly 240 secs
>
> system.time({
>     grF <- as(gr, "GIntervalTree")
> })
> system.time({
>     findOverlaps(grF, grF)
> }) ## roughly 0.1 secs
>
> ## speedup (for this example): 2400x fold !!!
>
> Kasper
>
>
> On Thu, May 30, 2013 at 6:51 AM, Hector Corrada Bravo <hcorr...@umiacs.umd.edu> wrote:
>
>> Great. I already have unit tests there for IntervalForest and
>> GIntervalTree.
>> Hector
>>
>>
>> On Wed, May 29, 2013 at 8:31 PM, Vincent Carey <st...@channing.harvard.edu> wrote:
>>
>> > Fine with me, as long as he is acquainted with the build/test before
>> > commit practices that we are supposed to follow. Breaking IRanges can
>> > have severe repercussions.
>> >
>> > On Wed, May 29, 2013 at 6:36 PM, Michael Lawrence <lawrence.mich...@gene.com> wrote:
>> >
>> > > Would it be feasible/acceptable to give Hector permission to commit?
>> > >
>> > > Michael
>> > >
>> > >
>> > > On Wed, May 29, 2013 at 2:12 PM, Hector Corrada Bravo <hcorr...@gmail.com> wrote:
>> > >
>> > > > That's great! There's some cleaning up to do there how should we do
>> > > > this post-merge?
>> > > >
>> > > >
>> > > > On Wed, May 29, 2013 at 4:19 PM, Valerie Obenchain <voben...@fhcrc.org> wrote:
>> > > >
>> > > >> Hi Hector, Michael,
>> > > >>
>> > > >> This sounds great. Bringing these into svn is fine with us.
>> > > >> Michael, do you want to merge these in?
>> > > >>
>> > > >> Val
>> > > >>
>> > > >> On 05/24/2013 07:30 AM, Hector Corrada Bravo wrote:
>> > > >> > Thanks Michael,
>> > > >> >
>> > > >> > It has made significant difference for our visualization
>> > > >> > project. I would like to merge this into svn asap. Can I get a
>> > > >> > ruling from the rest of the core group? Please let me know
>> > > >> > if/when/how to proceed.
>> > > >> >
>> > > >> > Cheers,
>> > > >> > Hector
>> > > >> >
>> > > >> >
>> > > >> > On Wed, May 22, 2013 at 1:00 PM, Michael Lawrence <lawrence.mich...@gene.com> wrote:
>> > > >> >
>> > > >> >> *Added bioc-devel; hope you don't mind*
>> > > >> >>
>> > > >> >> Hector,
>> > > >> >>
>> > > >> >> This is great stuff. The overall design is on the right track.
>> > > >> >> As you said, there's a bit of cleaning to do, but I think we
>> > > >> >> should merge this into svn and work the rest out from there.
>> > > >> >> This will really benefit performance, especially for
>> > > >> >> visualization. Of course, I can't speak for the others.
>> > > >> >>
>> > > >> >> Michael
>> > > >> >>
>> > > >> >>
>> > > >> >> On Tue, May 21, 2013 at 11:52 AM, Hector Corrada Bravo <hcorr...@umiacs.umd.edu> wrote:
>> > > >> >>
>> > > >> >>> Since the semester is over I finally finished this...
>> > > >> >>>
>> > > >> >>> Recall that I wanted a persistent set of IntervalTrees for
>> > > >> >>> GRanges objects for repeated querying. (The application is this:
>> > > >> >>> http://epiviz.cbcb.umd.edu/help/?page_id=62 which I hope to get
>> > > >> >>> out soon). Folding this into IRanges and GenomicRanges would
>> > > >> >>> make our life easier come installation time.
>> > > >> >>>
>> > > >> >>> I've implemented class 'IntervalForest' within IRanges following
>> > > >> >>> Michael's suggestion of storing this as an array of rbTree on
>> > > >> >>> the C side. I've implemented findOverlaps that operates with
>> > > >> >>> this array in C. There is code duplication in IntervalTree.c
>> > > >> >>> that could be reduced but that's if this makes it into the
>> > > >> >>> package.
>> > > >> >>>
>> > > >> >>> I've also implemented a 'GIntervalTree' that uses
>> > > >> >>> 'IntervalForest' underneath.
>> > > >> >>> findOverlaps-GenomicRanges-GIntervalTree-method is implemented
>> > > >> >>> for this class. I didn't touch the existing
>> > > >> >>> findOverlaps-GenomicRanges-GenomicRanges-method.
>> > > >> >>>
>> > > >> >>> You can pull these here:
>> > > >> >>> http://github.com/hcorrada/IRanges
>> > > >> >>> http://github.com/hcorrada/GenomicRanges
>> > > >> >>>
>> > > >> >>> These track the devel branch of the two packages. Let me know
>> > > >> >>> the best way to propagate to svn if you guys want this. It needs
>> > > >> >>> documentation, but I'll add that once implementation is settled.
>> > > >> >>>
>> > > >> >>> Kasper, I'm not sure if this would help with the 'too many
>> > > >> >>> seqlevels' problem but I'd be curious to know if you try it.
>> > > >> >>>
>> > > >> >>> Cheers,
>> > > >> >>> Hector
>> > > >> >>>
>> > > >> >
>> > > >> > [[alternative HTML version deleted]]
>> > > >> >
>> > > >> > _______________________________________________
>> > > >> > Bioc-devel@r-project.org mailing list
>> > > >> > https://stat.ethz.ch/mailman/listinfo/bioc-devel