On Fri, Sep 3, 2010 at 3:07 PM, Chris Seidel <[email protected]> wrote:
> Did anything ever get resolved in terms of assigning chromosome lengths
> to a GRanges object when it contains alignments that run off the
> chromosome ends? The message below was the last of the original thread
> that I could find.
>
> I'm currently having the problem of reading solexa export files into a
> GRanges object, and then sometimes having an error while setting the
> chromosome lengths if the object has a few reads that are past the
> boundary. The only solution I see is to somehow toss out the offending
> reads - which means I have to write a complicated function to loop
> through all reads and check them against the chromosome length - so I
> was just wondering since Ivan brought this problem up back in April, if
> a solution was ever reached. (or if anyone knows of an efficient way to
> address the problem).
>

One somewhat efficient way:

end(ranges(gr)) <- pmin(end(ranges(gr)), seqlens[seqnames(gr)])

Would be nice if GRanges had a "restrict" method.

Michael

> -Chris
>
> > -----Original Message-----
> > From: [email protected]
> > [mailto:[email protected]] On Behalf
> > Of Patrick Aboyoun
> > Sent: Tuesday, April 27, 2010 12:39 PM
> > To: Sean Davis
> > Cc: [email protected]
> > Subject: Re: [Bioc-sig-seq] GRanges, failure assigning
> > chromosome lengths
> >
> >
> > Sean and Ivan,
> > Thanks for the insight. I'll look at devising a compromise within the
> > existing framework. I need to explore the various methods for GRanges
> > object to better understand the impact of a compromise. We started with
> > the simplest interpretation of limit bounds because it simplifies the
> > code. For example, we need to establish the rules for coverage or
> > findOverlaps when the DNA is circular or the alignment runs off the end
> > of a linear chromosome.
> >
> >
> > Patrick
> >
> >
> > On 4/27/10 8:05 AM, Sean Davis wrote:
> > > On Tue, Apr 27, 2010 at 10:51 AM, Ivan Gregoretti<[email protected]>
> > > wrote:
> > >
> > >> Good morning Sean and everybody,
> > >>
> > >>
> > >>> Actually, the edge case is general as alignments, even on linear
> > >>> chromosomes, may extend beyond the end of the chromosome, I believe.
> > >>> In the best case, these alignments are clipped (in CIGAR terms), but
> > >>> I don't know that all aligners are doing that appropriately.
> > >>>
> > >>> Sean
> > >>>
> > >> So, you rather go for an overriding switch rather than infrastructure
> > >> overhaul?
> > >>
> > >> I ask this because GRanges is an exceptionally convenient format for
> > >> ChIP-seqers and Patrick is trying to make a decision to make it work
> > >> for real world data.
> > >>
> > > I guess that I mean to say that the two issues of aligning off the end
> > > of the chromosome and handling circular genomes are related but
> > > separate issues. An override seems quite reasonable for dealing with
> > > the former. Until aligners or common formats (BAM/SAM) deal with the
> > > latter, it will be difficult to deal appropriately with circular
> > > genomes, so an override is probably a fine compromise.
> > >
> > > Sean
> > >
> > >
> > >
> > >> And yes indeed: aligners do align a little bit past the boundaries
> > >> even for linear chromosomes. Thanks for pointing that out!
> > >>
> > >> Ivan
> > >>
> > >>
> >
> > _______________________________________________
> > Bioc-sig-sequencing mailing list
> > [email protected]
> > https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
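
For anyone trying the clamping one-liner from the reply above, here is a minimal base-R sketch of the same idea on toy data. It is not GRanges code: the data frame `reads`, the lengths in `seqlens`, and all coordinates are made up for illustration. The one thing it tries to show is that the length lookup should go by chromosome name. Indexing a named vector with a factor uses the integer codes, so it silently picks the wrong lengths when `seqlens` is not in the same order as the factor levels. On a real GRanges object the analogous step would presumably be `seqlens[as.character(seqnames(gr))]`, but that is an assumption I have not checked against any particular GenomicRanges version.

```r
# Hypothetical toy data standing in for a GRanges object (plain vectors only).
# Note that seqlens is deliberately NOT in the same order as the factor levels.
seqlens <- c(chr2 = 500L, chr1 = 1000L)
reads <- data.frame(
  seqnames = factor(c("chr1", "chr1", "chr2"), levels = c("chr1", "chr2")),
  start    = c(10L, 990L, 480L),
  end      = c(45L, 1025L, 515L)
)

# Look the lengths up by name. Indexing with the factor itself would use its
# integer codes (1, 1, 2) and return the chr2 length for the chr1 reads.
limit <- unname(seqlens[as.character(reads$seqnames)])

# Flag reads whose end runs past the chromosome end.
overhang <- reads$end > limit

# Option 1: clamp the ends, as in the one-liner; in-bounds reads are unchanged.
clamped <- reads
clamped$end <- pmin(reads$end, limit)

# Option 2: drop the offending reads instead, with no explicit loop.
kept <- reads[!overhang, ]
```

Clamping only the end assumes every read starts inside the chromosome. A read that starts past the end would come out with a negative width, so it may be worth checking `start` against `limit` as well.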
