Making IntervalTree chromosome-aware would also be a great addition for
organisms with many sequences, like bee (due to an incomplete genome;
10,000s of sequences).  It does not matter for human, but findOverlaps is
excruciatingly slow for bee.  I have a couple of posts on this in the
archive.
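
To make the per-sequence idea concrete, here is a rough sketch in base R of
what such an index could look like. This is not how IntervalTree or
findOverlaps are implemented: build_index and query_index are made-up names,
and a plain vector scan stands in for the tree. The only point is that a
query touches the ranges of one sequence rather than all of them.

```r
# Hypothetical helpers, base R only (no IRanges/GenomicRanges calls).

# Build one small index per sequence name.
build_index <- function(seqnames, start, end) {
  # Group the row indices of the ranges by the sequence they live on.
  groups <- split(seq_along(start), as.character(seqnames))
  lapply(groups, function(i) {
    list(id = i, start = start[i], end = end[i])
  })
}

# Return the row indices of the ranges on `seqname` overlapping [qstart, qend].
query_index <- function(index, seqname, qstart, qend) {
  node <- index[[seqname]]
  if (is.null(node)) return(integer(0))  # no ranges on this sequence
  # Closed intervals overlap when each one starts no later than the other ends.
  hits <- node$start <= qend & node$end >= qstart
  sort(node$id[hits])
}

seqnames <- c("chr1", "chr1", "chr2", "scaffold_9")
start    <- c(1, 50, 10, 5)
end      <- c(20, 80, 30, 15)

idx <- build_index(seqnames, start, end)
query_index(idx, "chr1", 15, 60)   # rows 1 and 2
query_index(idx, "chr2", 31, 40)   # no hits
```

In a real implementation each per-sequence entry would hold a tree rather
than plain vectors, and the lookup by sequence name would need to be cheap
(for example via the factor codes of seqnames).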

I am predicting this will be the case in the future for most non-model
organisms; finishing a genome is expensive and time consuming.

Kasper


On Wed, Apr 3, 2013 at 12:45 PM, Michael Lawrence <lawrence.mich...@gene.com> wrote:

> Some ideas:
>
> - Turn the IntervalTree into a list/array of nodes that can be
> subset/reordered with shallow copying (just copy the pointers to the
> nodes), and the index would be secondary. The index in the array could be
> stored in each node, for lookup during overlap queries. Right now, as far
> as I can tell, GIntervalTree will get confused if the user reorders e.g.
> via [.
>
> - Make IntervalTree aware of the sequence/chromosome, e.g., have a hash of
> trees, which is trivial since seqnames is already a factor.
>
> Michael
>
>
>
> On Wed, Apr 3, 2013 at 9:29 AM, Hector Corrada Bravo <hcorr...@gmail.com> wrote:
>
> > Yep, I didn't comment on that, but I agree that abstracting how
> > GRanges stores ranges would make this more elegant. Right now
> > ranges(GRanges) is specified to be of IRanges class instead of the
> > abstract Ranges class.
> >
> > If it were the latter then GIntervalTree can be a subclass of
> > GenomicRanges, in a similar way that IntervalTree is a subclass of
> > Ranges.
> >
> > On Wed, Apr 3, 2013 at 12:23 PM, Michael Lawrence
> > <lawrence.mich...@gene.com> wrote:
> > > Hi Hector,
> > >
> > > > That's interesting, thanks for passing this along. I'm still wishing that
> > > > somehow GRanges itself could abstract the way it stores ranges. I know that
> > > > Herve/Patrick had some reasons for depending specifically on IRanges. One
> > > > reason was probably convenience at the C level, but it wouldn't be hard to
> > > > create a Ranges abstraction at the C level, as well.
> > >
> > > Michael
> > >
> > >
> > >
> > > > On Tue, Apr 2, 2013 at 5:40 PM, Hector Corrada Bravo <hcorr...@gmail.com>
> > > > wrote:
> > >>
> > >> Hello bioc-develers,
> > >>
> > > >> I'm writing an application where lots of findOverlaps calls are made on
> > >> static GRanges objects. For IRanges we can create persistent
> > >> IntervalTree objects that would serve the multiple overlap query
> > >> use-case. There is no equivalent for GenomicRanges objects, so I'm
> > >> proposing an implementation for this.
> > >>
> > >> Please check
> > >> http://github.com/hcorrada/GenomicIntervalTree
> > >>
> > >> There's a first cut implementation there you can test by installing
> > > >> this skeleton package. E.g.,
> > >>
> > >> > library(devtools)
> > > >> > install_github("GenomicIntervalTree", username="hcorrada", subdir="pkg")
> > >> > library(GenomicIntervalTree)
> > >>
> > >> Let me know what you think.
> > >>
> > >> Cheers,
> > >> Hector
> > >>
> > >> _______________________________________________
> > >> Bioc-devel@r-project.org mailing list
> > >> https://stat.ethz.ch/mailman/listinfo/bioc-devel
> > >
> > >
> >
>
>         [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioc-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>

        [[alternative HTML version deleted]]

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Reply via email to