Making IntervalTree chromosome-aware would also be a great addition for organisms with many sequences, like bee (an incomplete genome, so tens of thousands of sequences). It does not matter for human, but findOverlaps is excruciatingly slow for bee. I have a couple of posts on this in the archive.
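To make the per-sequence idea concrete, here is a minimal sketch. It is written in plain Python rather than R so that it does not imply anything about IRanges or GenomicRanges internals. The names SeqIndex, GenomicIndex and find_overlaps are hypothetical, and a sorted list with bisection stands in for a real interval tree. The only point it illustrates is that a query looks up one small index by sequence name instead of touching ranges on every scaffold.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


class SeqIndex:
    """Overlap index for the ranges on one sequence (closed intervals).

    Hypothetical stand-in for a per-sequence interval tree.
    """

    def __init__(self, ranges):
        # ranges: iterable of (start, end, original_position) tuples
        self.items = sorted(ranges)
        self.starts = [s for s, _, _ in self.items]
        # Running maximum of the end coordinates. It is non-decreasing,
        # so it can be bisected to skip ranges that end too early.
        self.max_end = []
        running = float("-inf")
        for _, e, _ in self.items:
            running = max(running, e)
            self.max_end.append(running)

    def query(self, qstart, qend):
        lo = bisect_left(self.max_end, qstart)  # items before lo end before qstart
        hi = bisect_right(self.starts, qend)    # items from hi on start after qend
        return [pos for _, e, pos in self.items[lo:hi] if e >= qstart]


class GenomicIndex:
    """A hash of per-sequence indexes, keyed by sequence name."""

    def __init__(self, seqnames, starts, ends):
        by_seq = defaultdict(list)
        for pos, (name, s, e) in enumerate(zip(seqnames, starts, ends)):
            by_seq[name].append((s, e, pos))
        self.trees = {name: SeqIndex(r) for name, r in by_seq.items()}

    def find_overlaps(self, seqname, qstart, qend):
        # Only the index for the queried sequence is consulted.
        tree = self.trees.get(seqname)
        if tree is None:
            return []
        return sorted(tree.query(qstart, qend))


# Made-up example data: four ranges spread over three scaffolds.
index = GenomicIndex(
    seqnames=["scaffold1", "scaffold2", "scaffold1", "scaffold3"],
    starts=[1, 5, 10, 1],
    ends=[8, 20, 15, 3],
)
print(index.find_overlaps("scaffold1", 7, 12))  # prints [0, 2]
```

With tens of thousands of scaffolds, most per-sequence indexes hold only a handful of ranges, so each query stays cheap no matter how many sequences the genome has.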
I am predicting this will be the case in the future for most non-model organisms; finishing a genome is expensive and time-consuming.

Kasper

On Wed, Apr 3, 2013 at 12:45 PM, Michael Lawrence <lawrence.mich...@gene.com> wrote:

> Some ideas:
>
> - Turn the IntervalTree into a list/array of nodes that can be
> subset/reordered with shallow copying (just copy the pointers to the
> nodes), and the index would be secondary. The index in the array could be
> stored in each node, for lookup during overlap queries. Right now, as far
> as I can tell, GIntervalTree will get confused if the user reorders, e.g.,
> via [.
>
> - Make IntervalTree aware of the sequence/chromosome, e.g., have a hash of
> trees, which is trivial since seqnames is already a factor.
>
> Michael
>
> On Wed, Apr 3, 2013 at 9:29 AM, Hector Corrada Bravo <hcorr...@gmail.com> wrote:
>
> > Yep, I didn't comment on that, but I agree that abstracting how
> > GRanges stores ranges would make this more elegant. Right now
> > ranges(GRanges) is specified to be of IRanges class instead of the
> > abstract Ranges class.
> >
> > If it were the latter then GIntervalTree can be a subclass of
> > GenomicRanges, in a similar way that IntervalTree is a subclass of
> > Ranges.
> >
> > On Wed, Apr 3, 2013 at 12:23 PM, Michael Lawrence
> > <lawrence.mich...@gene.com> wrote:
> > > Hi Hector,
> > >
> > > That's interesting, thanks for passing this along. I'm still wishing that
> > > somehow GRanges itself could abstract the way it stores ranges. I know that
> > > Herve/Patrick had some reasons for depending specifically on GRanges. One
> > > reason was probably convenience at the C level, but it wouldn't be hard to
> > > create a Ranges abstraction at the C level, as well.
> > >
> > > Michael
> > >
> > > On Tue, Apr 2, 2013 at 5:40 PM, Hector Corrada Bravo <hcorr...@gmail.com> wrote:
> > >>
> > >> Hello bioc-develers,
> > >>
> > >> I'm writing an application where lots of findOverlaps calls are made on
> > >> static GRanges objects. For IRanges we can create persistent
> > >> IntervalTree objects that would serve the multiple overlap query
> > >> use-case. There is no equivalent for GenomicRanges objects, so I'm
> > >> proposing an implementation for this.
> > >>
> > >> Please check
> > >> http://github.com/hcorrada/GenomicIntervalTree
> > >>
> > >> There's a first cut implementation there you can test by installing
> > >> this skeleton package. E.g.,
> > >>
> > >> > library(devtools)
> > >> > install_github("GenomicIntervalTree", username="hcorrada", subdir="pkg")
> > >> > library(GenomicIntervalTree)
> > >>
> > >> Let me know what you think.
> > >>
> > >> Cheers,
> > >> Hector
> > >>
> > >> _______________________________________________
> > >> Bioc-devel@r-project.org mailing list
> > >> https://stat.ethz.ch/mailman/listinfo/bioc-devel