Was the problem of assigning chromosome lengths to a GRanges object that contains alignments running off the chromosome ends ever resolved? The message below is the last of the original thread that I could find.
I'm currently reading Solexa export files into a GRanges object, and I sometimes get an error when setting the chromosome lengths because the object contains a few reads that extend past the chromosome boundary. The only solution I see is to toss out the offending reads, which means writing a complicated function that loops through all reads and checks each one against its chromosome length. Since Ivan brought this problem up back in April, I was wondering whether a solution was ever reached, or whether anyone knows of an efficient way to address the problem.

-Chris

> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Patrick Aboyoun
> Sent: Tuesday, April 27, 2010 12:39 PM
> To: Sean Davis
> Cc: [email protected]
> Subject: Re: [Bioc-sig-seq] GRanges, failure assigning chromosome lengths
>
> Sean and Ivan,
> Thanks for the insight. I'll look at devising a compromise within the
> existing framework. I need to explore the various methods for GRanges
> object to better understand the impact of a compromise. We started with
> the simplest interpretation of limit bounds because it simplifies the
> code. For example, we need to establish the rules for coverage or
> findOverlaps when the DNA is circular or the alignment runs off the end
> of a linear chromosome.
>
> Patrick
>
> On 4/27/10 8:05 AM, Sean Davis wrote:
> > On Tue, Apr 27, 2010 at 10:51 AM, Ivan Gregoretti<[email protected]> wrote:
> >
> >> Good morning Sean and everybody,
> >>
> >>> Actually, the edge case is general as alignments, even on linear
> >>> chromosomes, may extend beyond the end of the chromosome, I believe.
> >>> In the best case, these alignments are clipped (in CIGAR terms), but
> >>> I don't know that all aligners are doing that appropriately.
> >>>
> >>> Sean
> >>>
> >> So, you rather go for an overriding switch rather than infrastructure
> >> overhaul?
> >>
> >> I ask this because GRanges is an exceptionally convenient format for
> >> ChIP-seqers and Patrick is trying to make a decision to make it work
> >> for real world data.
> >>
> > I guess that I mean to say that the two issues of aligning off the end
> > of the chromosome and handling circular genomes are related but
> > separate issues. An override seems quite reasonable for dealing with
> > the former. Until aligners or common formats (BAM/SAM) deal with the
> > latter, it will be difficult to deal appropriately with circular
> > genomes, so an override is probably a fine compromise.
> >
> > Sean
> >
> >> And yes indeed: aligners do align a little bit past the boundaries
> >> even for linear chromosomes. Thanks for pointing that out!
> >>
> >> Ivan
> >>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> [email protected]
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
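
On the workaround of tossing out reads that run past a chromosome end: the check itself is only one comparison per read against a lookup table of chromosome lengths, so it need not be complicated. Below is a minimal sketch of that idea in plain Python rather than R, using nothing from Bioconductor. The names `Read`, `split_by_bounds` and `chrom_lengths` are hypothetical, the lengths and coordinates are made up, and dropping reads on chromosomes missing from the table is an assumption of this sketch, not a decision from the thread.

```python
from typing import Dict, List, NamedTuple, Tuple


class Read(NamedTuple):
    """One aligned read; coordinates are assumed 1-based and inclusive."""
    chrom: str
    start: int
    end: int


def split_by_bounds(
    reads: List[Read], chrom_lengths: Dict[str, int]
) -> Tuple[List[Read], List[Read]]:
    """Partition reads into (kept, dropped).

    A read is kept only if its chromosome has a known length and the read
    lies entirely within positions 1..length. Reads on chromosomes absent
    from chrom_lengths are dropped (an assumption of this sketch).
    """
    kept: List[Read] = []
    dropped: List[Read] = []
    for read in reads:
        length = chrom_lengths.get(read.chrom)
        in_bounds = (
            length is not None and read.start >= 1 and read.end <= length
        )
        (kept if in_bounds else dropped).append(read)
    return kept, dropped


# Made-up chromosome lengths and reads, for illustration only.
chrom_lengths = {"chr1": 1000, "chr2": 500}
reads = [
    Read("chr1", 10, 45),      # fully inside chr1
    Read("chr1", 980, 1015),   # runs past the end of chr1
    Read("chr2", -5, 30),      # starts before position 1
    Read("chr2", 466, 500),    # ends exactly on the last base of chr2
]

kept, dropped = split_by_bounds(reads, chrom_lengths)
print(len(kept), "kept;", len(dropped), "dropped")
```

Keeping the dropped reads in a separate list makes it easy to see how many alignments were affected before the lengths are assigned. In R the same test should be expressible as a single vectorized comparison, since GRanges provides the `start()`, `end()`, `seqnames()` and `seqlengths()` accessors, but that translation is not shown or tested here.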
