Hi All, I have Illumina RNA & DNA-seq data, and want to count the number of reads and mismatches at each position, and to discard the ends of the alignment, since these tend to have more mismatches (due to low read quality and errors in the local alignment)
I've generated a bam file using bwa, and then an mpileup file bwa aln -n 2 $(YEASTGENOME) $$S.fastq.gz > $$S.sai bwa samse $(YEASTGENOME) $$S.sai $$S.fastq.gz | samtools view -F 4 -q 30 -Shu - | samtools sort -m 30G -o - - > $$S.bam samtools mpileup -f $(YEASTGENOME) -d1000000 -O $${S}.bam | gzip > $${S}.mpileup.gz A quite a few positions there are more positions (output by -O) than bases read, eg: chrI 69088 G 78 .$..........................................................................^F.^F.^F. AIIHIHIFJGIHBIFIJIGDIIJGCIHGIIJHGEJGIIIHHCHIJIBJFGHJJGIJIAIHGHHHFHHHDDHHHHF@@C 50,45,43,43,43,43,43,42,42,42,42,42,42,42,42,42,42,42,42,42,40,40,40,40,39,39,39,39,38,34,34,34,34,34,33,33,33,33,32,23,22,22,22,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,12,12,12,12,12,12,12,12,12,12,12,12,12,12,11,8,1,1,1,1 there are 80 positions but only 78 qualities and 78 reads. You can see 4 position 1s, but only three ^Fs chrI 69097 T 75 ........................................................................... ????E7IIGJJJIIDJDIJJHJ<JJJJJJIGJJJJIJJEIHIJIIJJJHGHJIJJIFHHCFFFFFFDFFFFDDFD 49,49,49,49,48,48,48,48,47,43,43,43,43,43,42,42,42,42,41,32,31,31,31,25,25,25,25,25,25,25,25,25,25,25,25,25,25,25,25,25,21,21,21,21,21,21,21,21,21,21,21,21,21,21,20,17,10,10,10,10,9,8,8,8,8,9,8,8,8,8,8,9,8,8,7,6,5 there are 77 positions but only 75 qualities and 75 reads. -Lucas samtools Version: 1.2 (using htslib 1.2.1) bwa Version: 0.7.12-r1039 ------------------------------------------------------------------------------ One dashboard for servers and applications across Physical-Virtual-Cloud Widest out-of-the-box monitoring support with 50+ applications Performance metrics, stats and reports that give you Actionable Insights Deep dive visibility with transaction tracing using APM Insight. http://ad.doubleclick.net/ddm/clk/290420510;117567292;y _______________________________________________ Samtools-help mailing list Samtools-help@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/samtools-help