________________________________
From: Karina Borlaug <karina_ingeb...@hotmail.com>
Sent: Monday, 17 June 2019 13:26
To: James Bonfield
Subject: Re: [Samtools-help] inconsistency in samtools mpilup basequality [EXT]

Hi, and thank you for the quick reply!

I tried this command instead:

$ samtools mpileup -r 1:43149119-43149119 -q 0 -Q 0 -B -f Hg19.fasta sample.bam

But I got exactly the same output as without the -B option. However, as you point out, it might be more valuable to keep the BAQ. I am trying to extract the quality values with a Python script in order to create a frequency plot. I am still a little confused about why they are a mix of +33 and +64. Does it mean that all of the ASCII characters in the output string are BAQ scores and only some of them were converted to +64?

Output string:

b>^=_=@=??]@A^AAbBBccBAcBBcB<LcBBBcBAA>BcBAABcBBcAb,AcCBC,CBBCddCBBBdCCCBcBBBACCBBC?cdBC!8A!c!C!CBBCBB?CC7CABA!aaAB@B!Bc!CCBC;CCABC!cCBAB!CB!BBd@B!?CB@ABC!A!CC!B`Bb!C!!!!!CCA?B<CC!BBCCCCCCCAC!CCAACA!A!!!!!!A!=!=A!

Or does it mean that only the +64 values are BAQ scores and the remaining +33 characters are normal base quality values, so that I have to separate the two kinds of values when plotting a frequency plot?

Karina

________________________________
From: James Bonfield <j...@sanger.ac.uk>
Sent: Monday, 17 June 2019 11:33
To: Karina Borlaug
Cc: samtools-help@lists.sourceforge.net
Subject: Re: [Samtools-help] inconsistency in samtools mpilup basequality [EXT]

On Fri, Jun 14, 2019 at 02:33:50PM +0000, Karina Borlaug wrote:
> Hi, I am seeing some inconsistency in samtools mpileup. It seems to report
> a mix of phred + 33 and phred + 64 for the base quality in a .bam file.
> However, when viewing the same .bam file using samtools view, it appears to
> only use phred + 33.
>
> I am using samtools/1.8
>
> Example:
>
> $ samtools mpileup -r 1:43149119-43149119 -q 0 -Q 0 -f Hg19.fasta sample.bam

The +64 is simply a modified base quality coming in via the BAQ calculation. If you add the "-B" option you'll probably see the qualities you expect; however, this may not be what you want. See https://academic.oup.com/bioinformatics/article/27/8/1157/227268 for more information on what BAQ is doing.

James

--
James Bonfield (j...@sanger.ac.uk)
The Sanger Institute, Hinxton, Cambs, CB10 1SA
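
For the frequency plot Karina describes, a minimal Python sketch might look like the following. It assumes that every character in the mpileup quality column uses one and the same +33 offset, so that a character such as "b" decodes to 65 and is simply a high score, not a sign of a second encoding. That assumption, along with the helper names decode_qualities and quality_histogram, is illustrative and does not come from the thread.

```python
from collections import Counter

# Assumption: the whole quality column uses a single ASCII offset of 33.
PHRED_OFFSET = 33


def decode_qualities(qual_string, offset=PHRED_OFFSET):
    """Return one integer quality score per character of the quality string."""
    return [ord(ch) - offset for ch in qual_string]


def quality_histogram(qual_string, offset=PHRED_OFFSET):
    """Count how many times each decoded quality score occurs."""
    return Counter(decode_qualities(qual_string, offset))


if __name__ == "__main__":
    # First few characters of the output string quoted in the thread.
    example = "b>^=_=@=??]@A"
    for score, count in sorted(quality_histogram(example).items()):
        print(score, count)
```

The resulting Counter maps each score to its frequency and can be handed to any plotting library. Under the single-offset assumption there is no need to split the characters into two groups before counting.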
_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help