On Fri, Sep 3, 2010 at 3:07 PM, Chris Seidel <[email protected]> wrote:

> Did anything ever get resolved in terms of assigning chromosome lengths
> to a GRanges object when it contains alignments that run off the
> chromosome ends? The message below was the last of the original thread
> that I could find.
>
> I'm currently having the problem of reading solexa export files into a
> GRanges object, and then sometimes having an error while setting the
> chromosome lengths if the object has a few reads that are past the
> boundary. The only solution I see is to somehow toss out the offending
> reads - which means I have to write a complicated function to loop
> through all reads and check them against the chromosome length - so I
> was just wondering since Ivan brought this problem up back in April, if
> a solution was ever reached. (or if anyone knows of an efficient way to
> address the problem).
>


One somewhat efficient way, assuming seqlens is a vector of chromosome
lengths named by chromosome:

end(ranges(gr)) <- pmin(end(ranges(gr)), seqlens[as.character(seqnames(gr))])

Would be nice if GRanges had a "restrict" method.
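
For what it's worth, here is a minimal base-R sketch of both options (clipping
the reads, or tossing them out) without an explicit loop. The toy vectors
chrom, read_start and read_end are hypothetical stand-ins for what
seqnames(gr), start(gr) and end(gr) would give you, and seqlens is assumed to
be a named vector of chromosome lengths; none of this comes from the package
itself.

```r
# Hypothetical toy data: chromosome lengths as a named vector, and reads as
# parallel vectors (stand-ins for seqnames(gr), start(gr) and end(gr)).
seqlens    <- c(chr1 = 1000L, chr2 = 500L)
chrom      <- c("chr2", "chr1", "chr2", "chr1")
read_start <- c(480L, 10L, 100L, 990L)
read_end   <- c(515L, 45L, 135L, 1025L)

# Look up each read's chromosome length by name. Converting to character
# avoids indexing by factor codes, which would depend on level order.
limit <- unname(seqlens[as.character(chrom)])

# Option 1: clip reads that run past the chromosome end.
clipped_end <- pmin(read_end, limit)

# Option 2: flag reads that lie fully inside the chromosome, so the
# offending ones can be dropped with a single logical subset.
keep <- read_start >= 1L & read_end <= limit
```

With a real GRanges, the logical vector from option 2 could presumably be
used directly as gr[keep] before setting the chromosome lengths.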

Michael


> -Chris
>
> > -----Original Message-----
> > From: [email protected]
> > [mailto:[email protected]] On Behalf
> > Of Patrick Aboyoun
> > Sent: Tuesday, April 27, 2010 12:39 PM
> > To: Sean Davis
> > Cc: [email protected]
> > Subject: Re: [Bioc-sig-seq] GRanges, failure assigning
> > chromosome lengths
> >
> >
> > Sean and Ivan,
> > Thanks for the insight. I'll look at devising a compromise within the
> > existing framework. I need to explore the various methods for GRanges
> > object to better understand the impact of a compromise. We
> > started with
> > the simplest interpretation of limit bounds because it simplifies the
> > code. For example, we need to establish the rules for coverage or
> > findOverlaps when the DNA is circular or the alignment runs
> > off the end
> > of a linear chromosome.
> >
> >
> > Patrick
> >
> >
> > On 4/27/10 8:05 AM, Sean Davis wrote:
> > > On Tue, Apr 27, 2010 at 10:51 AM, Ivan
> > Gregoretti<[email protected]>
> > > wrote:
> > >
> > >> Good morning Sean and everybody,
> > >>
> > >>
> > >>> Actually, the edge case is general as alignments, even on linear
> > >>> chromosomes, may extend beyond the end of the chromosome,
> > I believe.
> > >>> In the best case, these alignments are clipped (in CIGAR
> > terms), but
> > >>> I don't know that all aligners are doing that appropriately.
> > >>>
> > >>> Sean
> > >>>
> > >> So, you would rather go for an overriding switch than an
> > >> infrastructure overhaul?
> > >>
> > >> I ask this because GRanges is an exceptionally convenient
> > format for
> > >> ChIP-seqers and Patrick is trying to make a decision to
> > make it work
> > >> for real world data.
> > >>
> > > I guess that I mean to say that the two issues of aligning
> > off the end
> > > of the chromosome and handling circular genomes are related but
> > > separate issues.  An override seems quite reasonable for
> > dealing with
> > > the former.  Until aligners or common formats (BAM/SAM)
> > deal with the
> > > latter, it will be difficult to deal appropriately with circular
> > > genomes, so an override is probably a fine compromise.
> > >
> > > Sean
> > >
> > >
> > >
> > >> And yes indeed: aligners do align a little bit past the boundaries
> > >> even for linear chromosomes. Thanks for pointing that out!
> > >>
> > >> Ivan
> > >>
> > >>
> >
> > _______________________________________________
> > Bioc-sig-sequencing mailing list
> > [email protected]
> > https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
> >
> >
