Hi Steve,

Yes, good point! You would also need to check NM. You probably could not just test for a plain ##M CIGAR either, since soft clips and hard clips may appear in the CIGAR even though the aligned part of the read still matches perfectly.

Chu, the NM tags (as well as all the other components you might need) are in the BAM (or SAM) file and can be accessed with Rsamtools if you decide to go that way.
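To make the check concrete, here is a rough sketch in plain Python that works on SAM text records rather than through Rsamtools. The function names (`is_perfect_match`, `make_sam_line`) and the `allow_clips` switch are made up for illustration, and treating a missing NM tag as "not a perfect match" is an assumption, not something the SAM format requires.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def is_perfect_match(sam_line, allow_clips=True):
    """Return True if a SAM record has no indels and NM equals 0.

    Hypothetical helper: clips (S/H) are tolerated when allow_clips is True,
    and a record without an NM tag is assumed NOT to be a perfect match.
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    cigar = fields[5]

    # Unmapped reads (flag bit 0x4) or reads without a CIGAR cannot qualify.
    if flag & 0x4 or cigar == "*":
        return False

    ops = CIGAR_OP.findall(cigar)
    if not ops or "".join(n + op for n, op in ops) != cigar:
        raise ValueError("malformed CIGAR: %r" % cigar)

    # Only match operations (and optionally clips) are acceptable.
    allowed = "M=SH" if allow_clips else "M="
    if any(op not in allowed for _, op in ops):
        return False

    # Optional fields start at column 12; NM is the edit distance.
    for tag in fields[11:]:
        if tag.startswith("NM:i:"):
            return int(tag[5:]) == 0
    return False  # assumption: no NM tag means we cannot confirm a match


def make_sam_line(name, cigar, nm, length=38):
    """Build a minimal SAM record for demonstration (hypothetical values)."""
    cols = [name, "0", "chr1", "100", "60", cigar, "*", "0", "0",
            "A" * length, "I" * length, "NM:i:%d" % nm]
    return "\t".join(cols)


if __name__ == "__main__":
    for name, cigar, nm in [("r1", "38M", 0), ("r2", "38M", 1),
                            ("r3", "5S33M", 0), ("r4", "20M1I17M", 1)]:
        print(name, cigar, nm, is_perfect_match(make_sam_line(name, cigar, nm)))
```

The two 38M reads are indistinguishable by CIGAR alone, so only NM separates them, while the clipped read passes or fails depending on whether clips are tolerated.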
Cheers,
Paul

-----Original Message-----
From: Steve Lianoglou <[email protected]>
To: Paul Leo <[email protected]>
Cc: Chu Zhang <[email protected]>, [email protected]
Subject: Re: [Bioc-sig-seq] processing alignments
Date: Thu, 10 Feb 2011 20:58:51 -0500

Hi,

On Thu, Feb 10, 2011 at 8:48 PM, Paul Leo <[email protected]> wrote:
> Also if your alignments are in BAM format, you can use Rsamtools to
> extract that region. Inspection of the cigar will tell you which reads
> aligned perfectly. That would be an extremely fast calculation.

Actually, I'm not sure that that's true, is it? Don't cigar strings only really tell you about indels?

Say you have two reads, both 38bp long.
If one aligns perfectly, its cigar is 38M.
If the other aligns with 1 mismatch, it's still 38M.

You can use the NM tag if you're after 'perfect matches', though ...
