On Fri, Sep 3, 2010 at 3:53 PM, Michael Lawrence <[email protected]> wrote:
>
>
> On Fri, Sep 3, 2010 at 3:07 PM, Chris Seidel <[email protected]> wrote:
>
>> Did anything ever get resolved in terms of assigning chromosome lengths
>> to a GRanges object when it contains alignments that run off the
>> chromosome ends? The message below was the last of the original thread
>> that I could find.
>>
>> I'm currently having the problem of reading solexa export files into a
>> GRanges object, and then sometimes having an error while setting the
>> chromosome lengths if the object has a few reads that are past the
>> boundary. The only solution I see is to somehow toss out the offending
>> reads - which means I have to write a complicated function to loop
>> through all reads and check them against the chromosome length - so I
>> was just wondering since Ivan brought this problem up back in April, if
>> a solution was ever reached. (or if anyone knows of an efficient way to
>> address the problem).
>>
>
>
> One somewhat efficient way:
>
> end(ranges(gr)) <- pmin(end(ranges(gr)), seqlens[seqnames(gr)])
>
> Would be nice if GRanges had a "restrict" method.
>
>

Or if you want to toss them out, rather than trim them:

gr <- gr[end(ranges(gr)) <= seqlens[seqnames(gr)]]

> Michael
>
>
>> -Chris
>>
>> > -----Original Message-----
>> > From: [email protected]
>> > [mailto:[email protected]] On Behalf
>> > Of Patrick Aboyoun
>> > Sent: Tuesday, April 27, 2010 12:39 PM
>> > To: Sean Davis
>> > Cc: [email protected]
>> > Subject: Re: [Bioc-sig-seq] GRanges, failure assigning
>> > chromosome lengths
>> >
>> >
>> > Sean and Ivan,
>> > Thanks for the insight. I'll look at devising a compromise within the
>> > existing framework. I need to explore the various methods for GRanges
>> > object to better understand the impact of a compromise. We started
>> > with the simplest interpretation of limit bounds because it
>> > simplifies the code. For example, we need to establish the rules for
>> > coverage or findOverlaps when the DNA is circular or the alignment
>> > runs off the end of a linear chromosome.
>> >
>> >
>> > Patrick
>> >
>> >
>> > On 4/27/10 8:05 AM, Sean Davis wrote:
>> > > On Tue, Apr 27, 2010 at 10:51 AM, Ivan Gregoretti <[email protected]>
>> > > wrote:
>> > >
>> > >> Good morning Sean and everybody,
>> > >>
>> > >>
>> > >>> Actually, the edge case is general as alignments, even on linear
>> > >>> chromosomes, may extend beyond the end of the chromosome, I believe.
>> > >>> In the best case, these alignments are clipped (in CIGAR terms), but
>> > >>> I don't know that all aligners are doing that appropriately.
>> > >>>
>> > >>> Sean
>> > >>>
>> > >> So, you rather go for an overriding switch rather than infrastructure
>> > >> overhaul?
>> > >>
>> > >> I ask this because GRanges is an exceptionally convenient format for
>> > >> ChIP-seqers and Patrick is trying to make a decision to make it work
>> > >> for real world data.
>> > >>
>> > > I guess that I mean to say that the two issues of aligning off the end
>> > > of the chromosome and handling circular genomes are related but
>> > > separate issues. An override seems quite reasonable for dealing with
>> > > the former. Until aligners or common formats (BAM/SAM) deal with the
>> > > latter, it will be difficult to deal appropriately with circular
>> > > genomes, so an override is probably a fine compromise.
>> > >
>> > > Sean
>> > >
>> > >
>> > >
>> > >> And yes indeed: aligners do align a little bit past the boundaries
>> > >> even for linear chromosomes. Thanks for pointing that out!
>> > >>
>> > >> Ivan
>> > >>
>> > >>
>> >
>> > _______________________________________________
>> > Bioc-sig-sequencing mailing list
>> > [email protected]
>> > https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>> >
>> >
>>
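
For anyone reading this thread later, here is a minimal sketch of the two approaches from Michael's reply (trim the overhanging read ends, or toss those reads out), written with plain R vectors so it runs without Bioconductor. The names chrom, read_start, read_end and limit, and all the toy values, are assumptions made up for illustration and do not come from the thread. The lookup uses character names; with a real GRanges object that would presumably mean as.character(seqnames(gr)), on the assumption that indexing by name is safer than indexing by a factor.

```r
# Toy stand-ins for the GRanges pieces (all values are hypothetical).
# With a real GRanges `gr`, `chrom` would correspond to
# as.character(seqnames(gr)) and `read_end` to end(gr).
seqlens    <- c(chr1 = 1000L, chr2 = 500L)       # named chromosome lengths
chrom      <- c("chr1", "chr1", "chr2", "chr2")  # chromosome of each read
read_start <- c(10L, 980L, 100L, 490L)
read_end   <- c(45L, 1015L, 135L, 525L)          # reads 2 and 4 run off the end

# Look up each read's chromosome length by name.
limit <- unname(seqlens[chrom])

# Option 1: trim overhanging ends back to the chromosome length.
trimmed_end <- pmin(read_end, limit)

# Option 2: drop reads whose end lies past the chromosome length.
keep       <- read_end <= limit
kept_start <- read_start[keep]
kept_end   <- read_end[keep]
```

Trimming keeps every read but shortens the overhanging ones; filtering keeps only the reads that lie fully inside their chromosome. Both are vectorized, so no explicit loop over reads is needed.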
