Hi All,

I have Illumina RNA & DNA-seq data, and want to count the number of
reads and mismatches at each position, and to discard the ends of the
alignment, since these tend to have more mismatches (due to low read
quality and errors in the local alignment)

I've generated a bam file using bwa, and then an mpileup file

  bwa aln -n 2 $(YEASTGENOME) $$S.fastq.gz > $$S.sai
  bwa samse $(YEASTGENOME) $$S.sai $$S.fastq.gz | samtools view -F 4
-q 30 -Shu - | samtools sort -m 30G -o - - > $$S.bam
  samtools  mpileup -f $(YEASTGENOME)   -d1000000   -O  $${S}.bam |
gzip > $${S}.mpileup.gz


A quite a few positions there are more positions (output by -O) than
bases read, eg:

chrI 69088 G 78
.$..........................................................................^F.^F.^F.
AIIHIHIFJGIHBIFIJIGDIIJGCIHGIIJHGEJGIIIHHCHIJIBJFGHJJGIJIAIHGHHHFHHHDDHHHHF@@C
50,45,43,43,43,43,43,42,42,42,42,42,42,42,42,42,42,42,42,42,40,40,40,40,39,39,39,39,38,34,34,34,34,34,33,33,33,33,32,23,22,22,22,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,12,12,12,12,12,12,12,12,12,12,12,12,12,12,11,8,1,1,1,1

there are 80 positions but only 78 qualities and 78 reads. You can see
4 position 1s, but only three ^Fs



chrI 69097 T 75
...........................................................................
????E7IIGJJJIIDJDIJJHJ<JJJJJJIGJJJJIJJEIHIJIIJJJHGHJIJJIFHHCFFFFFFDFFFFDDFD
49,49,49,49,48,48,48,48,47,43,43,43,43,43,42,42,42,42,41,32,31,31,31,25,25,25,25,25,25,25,25,25,25,25,25,25,25,25,25,25,21,21,21,21,21,21,21,21,21,21,21,21,21,21,20,17,10,10,10,10,9,8,8,8,8,9,8,8,8,8,8,9,8,8,7,6,5

there are 77 positions but only 75 qualities and 75 reads.




-Lucas
samtools Version: 1.2 (using htslib 1.2.1)
bwa Version: 0.7.12-r1039

------------------------------------------------------------------------------
One dashboard for servers and applications across Physical-Virtual-Cloud 
Widest out-of-the-box monitoring support with 50+ applications
Performance metrics, stats and reports that give you Actionable Insights
Deep dive visibility with transaction tracing using APM Insight.
http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help

Reply via email to