Awesome. If Hector is finished cleaning up, I'd be glad to merge it.

Michael


On Sat, Jul 6, 2013 at 6:18 PM, Kasper Daniel Hansen <
kasperdanielhan...@gmail.com> wrote:

> A little late, but I can report that this speeds up my "many seqlevels"
> problem by 3 orders of magnitude.
>
> library(IRanges, lib.loc = "library")
> library(GenomicRanges, lib.loc = "library")
> library(BSgenome.Amellifera.BeeBase.assembly4)
> Un <- Amellifera$GroupUn
> gr <- GRanges(seqnames = names(Un),
>               ranges = IRanges(start = 1, width = width(Un)))
>
> ## gr has a length of 9244, but each interval is in a new seqname.
> ## this makes traditional findOverlaps extremely slow
>
> system.time({
>     findOverlaps(gr, gr)
> })  ## roughly 240 secs
>
> system.time({
>     grF <- as(gr, "GIntervalTree")
> })
> system.time({
>     findOverlaps(grF, grF)
> }) ## roughly 0.1 secs
>
> ## speedup (for this example): roughly 2400-fold!!!
>
> Kasper
>
>
> On Thu, May 30, 2013 at 6:51 AM, Hector Corrada Bravo <
> hcorr...@umiacs.umd.edu> wrote:
>
>> Great. I already have unit tests there for IntervalForest and
>> GIntervalTree.
>> Hector
>>
>>
>> On Wed, May 29, 2013 at 8:31 PM, Vincent Carey
>> <st...@channing.harvard.edu> wrote:
>>
>> > Fine with me, as long as he is acquainted with the
>> > build/test-before-commit practices that we are supposed to follow.
>> > Breaking IRanges can have severe repercussions.
>> >
>> > On Wed, May 29, 2013 at 6:36 PM, Michael Lawrence <
>> > lawrence.mich...@gene.com> wrote:
>> >
>> > > Would it be feasible/acceptable to give Hector permission to commit?
>> > >
>> > > Michael
>> > >
>> > >
>> > > On Wed, May 29, 2013 at 2:12 PM, Hector Corrada Bravo <
>> > > hcorr...@gmail.com> wrote:
>> > >
>> > > > That's great! There's some cleaning up to do there; how should we
>> > > > do this, post-merge?
>> > > >
>> > > >
>> > > > On Wed, May 29, 2013 at 4:19 PM, Valerie Obenchain <
>> > > > voben...@fhcrc.org> wrote:
>> > > >
>> > > >> Hi Hector, Michael,
>> > > >>
>> > > >> This sounds great. Bringing these into svn is fine with us.
>> > > >> Michael, do you want to merge these in?
>> > > >>
>> > > >> Val
>> > > >>
>> > > >> On 05/24/2013 07:30 AM, Hector Corrada Bravo wrote:
>> > > >> > Thanks Michael,
>> > > >> >
>> > > >> > It has made a significant difference for our visualization
>> > > >> > project. I would like to merge this into svn asap. Can I get a
>> > > >> > ruling from the rest of the core group? Please let me know
>> > > >> > if/when/how to proceed.
>> > > >> >
>> > > >> > Cheers,
>> > > >> > Hector
>> > > >> >
>> > > >> >
>> > > >> > On Wed, May 22, 2013 at 1:00 PM, Michael Lawrence <
>> > > >> > lawrence.mich...@gene.com> wrote:
>> > > >> >
>> > > >> >> *Added bioc-devel; hope you don't mind*
>> > > >> >>
>> > > >> >> Hector,
>> > > >> >>
>> > > >> >> This is great stuff. The overall design is on the right track.
>> > > >> >> As you said, there's a bit of cleaning to do, but I think we
>> > > >> >> should merge this into svn and work the rest out from there.
>> > > >> >> This will really benefit performance, especially for
>> > > >> >> visualization. Of course, I can't speak for the others.
>> > > >> >>
>> > > >> >> Michael
>> > > >> >>
>> > > >> >>
>> > > >> >>
>> > > >> >> On Tue, May 21, 2013 at 11:52 AM, Hector Corrada Bravo <
>> > > >> >> hcorr...@umiacs.umd.edu> wrote:
>> > > >> >>
>> > > >> >>> Since the semester is over I finally finished this...
>> > > >> >>>
>> > > >> >>> Recall that I wanted a persistent set of IntervalTrees for
>> > > >> >>> GRanges objects for repeated querying. (The application is this:
>> > > >> >>> http://epiviz.cbcb.umd.edu/help/?page_id=62 which I hope to get
>> > > >> >>> out soon). Folding this into IRanges and GenomicRanges would
>> > > >> >>> make our life easier come installation time.
>> > > >> >>>
>> > > >> >>> I've implemented class 'IntervalForest' within IRanges following
>> > > >> >>> Michael's suggestion of storing this as an array of rbTree on
>> > > >> >>> the C side. I've implemented findOverlaps that operates with
>> > > >> >>> this array in C. There is code duplication in IntervalTree.c
>> > > >> >>> that could be reduced, but that's for later, if this makes it
>> > > >> >>> into the package.
>> > > >> >>>
>> > > >> >>> I've also implemented a 'GIntervalTree' that uses
>> > > >> >>> 'IntervalForest' underneath.
>> > > >> >>> findOverlaps-GenomicRanges-GIntervalTree-method is implemented
>> > > >> >>> for this class. I didn't touch the existing
>> > > >> >>> findOverlaps-GenomicRanges-GenomicRanges-method.
>> > > >> >>>
>> > > >> >>> You can pull these here:
>> > > >> >>> http://github.com/hcorrada/IRanges
>> > > >> >>> http://github.com/hcorrada/GenomicRanges
>> > > >> >>>
>> > > >> >>> These track the devel branch of the two packages. Let me know
>> > > >> >>> the best way to propagate to svn if you guys want this. It needs
>> > > >> >>> documentation, but I'll add that once implementation is settled.
>> > > >> >>>
>> > > >> >>> Kasper, I'm not sure if this would help with the 'too many
>> > > >> >>> seqlevels' problem but I'd be curious to know if you try it.
>> > > >> >>>
>> > > >> >>> Cheers,
>> > > >> >>> Hector
>> > > >> >>>
>> > > >> >>
>> > > >> >>
>> > > >> >
>> > > >> > [[alternative HTML version deleted]]
>> > > >> >
>> > > >> > _______________________________________________
>> > > >> > Bioc-devel@r-project.org mailing list
>> > > >> > https://stat.ethz.ch/mailman/listinfo/bioc-devel
>> > > >> >
>> > > >>
>> > > >
>> > > >
>> > >
>> > >
>> >
>> >
>>
>>
>
>

