Hi Colin,

Here are a couple of alignments from the SAM file. I name-sorted the file for this demonstration so that the reads of both orientations appear together. For mpileup I am otherwise using the same BAM file, sorted with the plain sort command.

Jiri

M03964:25:000000000-AKT18:1:1101:1792:12037 83 chr16 3249344 42 173M = 3249342 -175 ACAACCCAGAGTTGTTGGGAAAATGAAGTAAGGCCCAGTGTGTCCAAGTGCCTGGCAGAGAAGAGCCCACAGGCAGGGAGTGCCTACCTTGTGTTCCAGGGCGACCTCCTCAATGGGGCGCACCCGGTGGCCTTGGTGCTCCTGACTCAGACTGCAGATGAGGCAGATGGGCT GGGFGGGGFGGGGGGGGGFGGGGGGGFGFFGGFGGFCDGGFGGGGGGGGGGGGGGGFGGGGGGGGGGFFGGGFFGGCGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGEGFGGGGGGGGGGGGGGGGEGGGGGFGGG AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:173 YS:i:0 YT:Z:CP M03964:25:000000000-AKT18:1:1101:1792:12037 163 chr16 3249342 42 173M = 3249344 175 CAACAACCCAGAGTTGTTGGGAAAATGAAGTAAGGCCCAGTGTGTCCAAGTGCCTGGCAGAGAAGAGCCCACAGGCAGGGAGTGCCTACCTTGTGTTCCAGGGCGACCTCCTCAATGGGGCGCACCCGGTGGCCTTGGTGCTCCTGACTCAGACTGCAGATGAGGCAGATGGG GGGGGGGGGDGGGGGGGGGCDGGF?FFFGAFGGGGGGGGFGGGGGGGGGGGGGGGGGGGEGGGGGGGGGFFECFF@EG7FCGGGGFGGGDFFGFFGGGGGG?CCCEFE>FGGGFFFCD8CEECGGGGGGGGGGGGGGGFCFCCGGFGF>F6F>FGGFGFDFFFAD>CAFFDAF AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:173 YS:i:0 YT:Z:CP M03964:25:000000000-AKT18:1:1101:1856:12139 83 chr16 3248843 42 198M = 3248841 -200 ACCCCTGCTCACTCTTCCCACCTTCCTCCCAGGGACGGATGGGCCATCAGCCACCTCTGACCTTACCAGAAAGCTCACTGCCTTCTCCTCCCCATAGGATCGCTGCTCCTCCCCTGATTTTCTCAGCTTCTTCAGATGCTCCAGCTGCTTCTGAATTTTCTTCTGGAAAAACAGCACTTGTTGAAAAGCTTGAATTTG FDAFFFFGGGFGGDF@5C77GGC>GEGEGGFEEGGGGGGGGFGGGGFDEF6E8FGGGGFGGGGGFFGGGEA9GGGGFGGGGGGGGGGGGGGDGGFGGGFGGGGGGGEGGGGGFGGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGG AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:198 YS:i:-3 YT:Z:CP M03964:25:000000000-AKT18:1:1101:1856:12139 163 chr16 3248841 42 198M = 3248843 200 GGACCCCTGCTCACTCTTCCCACCTTCCTCCCAGGGACGGATGGGCCATCAGCCACCTCTGACCTTACCAGAAAGCTCACTGCCTTCTCCTCCCCATAGGATCGCTGCTCCTCCCCTGATTTTCTCAGCTTCTTCAGATGCTCCAGCTGCTTCTGACTTTTCTTCTGGAAAAACAGCACTTGTTGAAAAGCTTGAATT 
GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGGGGGGGGGGGGGGGDEGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGGGGGGFGGGGGGFGGGGEGGGCFGGGGGGGGGGGGFGGGGDGG+<D7FFGGFGFFFGFGFGF?5?FFFCFFF<@C5==@5@=CFF AS:i:-3 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:156A41 YS:i:0 YT:Z:CP M03964:25:000000000-AKT18:1:1101:1936:12864 83 chr16 3254291 42 159M = 3254289 -161 AATTTCTGGATTTGCGGGCGCCTTCTCCCCTGTAGAAATGGTGACCTCAAGGCTTCTAGGTCGCATCTTTCCCGAGGGCAGGTACACTTCGAAGGGCCTGCACTCCTTCTGCCCCGGGGCGCCCCCCGCCAGCCCCTGCAGCCTCCCCGCGGAGCTGGC ?CGC:<FF<7:E>GGC>>DF88ECEGGEC<<+7@F<CC?CFGGGDFGGGGGGGF9DGEE>E:C7FGGFEGGDEE<EGFGGFGGGGGEFFGE6G?;5@CGGGGGGGGGGCEEE:GGGGGCCGGGGGCGGEDGF@7FGGGEGDGGGGGEEGCEE<GGGGDF AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:159 YS:i:0 YT:Z:CP M03964:25:000000000-AKT18:1:1101:1936:12864 163 chr16 3254289 42 159M = 3254291 161 AGAATTTCTGGATTTGCGGGCGCCTTCTCCCCTGTAGAAATGGTGACCTCAAGGCTTCTAGGTCGCATCTTTCCCGAGGGCAGGTACACTTCGAAGGGCCTGCACTCCTTCTGCCCCGGGGCGCCCCCCGCCAGCCCCTGCAGCCTCCCCGCGGAGCTG FCFGAEFCF9CEGGFFFGGGGEEGG:<@<:@B<FGGGGGGAF,C<EGFGD@FCA@FFFFGC,9C:@C@:FFG,EFGGGCB@F?@EFGFGCFDEG,BCGGGCEDCF>DFG9E9@DDFGC=6+@EEGG:@:EGGG*=:8**?F4D=<49EE7DDG557C46 AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:159 YS:i:0 YT:Z:CP M03964:25:000000000-AKT18:1:1101:1992:12226 83 chr16 3243623 42 218M = 3243621 -220 GTCTAACACTCTTCAGATCATCAGAGAAGATGAGGTTGGGGTAAGCGGTTTCTGCATCCAGAATCACATTAACTGCAAAGAAAATTTGAATACCTAGGTAGGGGTCCATGGGCAACATCCCTACAGGGTTCTCCCCACCTGCAGGAAACAGGGACAGGGTAGTTCTTCTGGAACGTGGTAGGGGAGAGCACAGGGATCCAGCAGGCCAGGGCCACTTG AFFFAFB:;+AFFA6FFFGFFGFGFFFGGGFFC>GGD?C6FEGGGED@CFGGGFDCFDECE,38FFEAGGGFGGGGGGGGGGECGGGGGCEFF@F9A=GGECFGGGGGDGEFGGGGGGGGGGGGE@CGGGGGGGGGGGGGGGGGGFGGGGFFGGGFGGFGGFGGGGGFGGFCGGGGGGGGDGGFCGFGGGFCFE@EFFDGFGFDCGGFF?GFFFFGFF AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:218 YS:i:-8 YT:Z:CP M03964:25:000000000-AKT18:1:1101:1992:12226 163 chr16 3243621 42 218M = 3243623 220 
AAGTCTAACACTCTTCAGATCATCAGAGAAGATGAGGTTGGGGTAAGCGGTTTCTGCATCCAGAATCACATTAACTGCAAAGAAAATTTGAATACCGAGGTAGGGGTCCATGGGCAACATCCCTACAGGGTTATCCCCACCTGCAGGAAACAGGGACAGGGTAGTTCTTCTGGAACGTGGTAGGGGAGAGGACAGGGATCCAGCAGGCCAGGGCCACT EEGCFDGFFEFFFD9CDEEFGGGFGGFGE,@@<AEF?C@E@FCFDGFFFE,@:EEFGFAE,B@FF,?AAFA9FFEFG,EFEFFF,,AFC,A<EFG,,@>CC4C@BC=?F?9EDCGGGF=,@DFGGGEC88D,+6@=2@D>E6=EFGGCG61=DDD?=*=?8*8=?F7CA+3<8@?5<F?>5@A5?)*:8))0::@BAD94>0*7((/58(/885>?04 AS:i:-8 XN:i:0 XM:i:3 XO:i:0 XG:i:0 NM:i:3 MD:Z:96T35C57C27 YS:i:0 YT:Z:CP M03964:25:000000000-AKT18:1:1101:2092:11235 99 chr16 3254128 42 189M = 3254130 191 CGATATAAAGTAGGAAAGAACACAATTTACCGGTGACCGAATTTTCTGGATTTCCAGGGCCTTCCTTCAGGTCCGCAGATGCCCCTCCATCCGGCGTGGGCCTTGCCCGGGGTTCTGTTGCCGAGTCCAGATTCGCAGCTGTCTTTTCTTCTAGAGTCAGGAGAGTTTCTGGATTTGCGGGCGCCTTCT @+F@:FF@EFFG9F9FE<88FEF8,;C6CFG@E:FGGFG@@C,6CFGFFFF9FFGGGDGGGGFFDFDF9FE,9EEE7++:,5,BFFGDG<EFGE+FEGG+:AFCF@,==FEF+8A,<FEG,?F7F@,3,8733EEG@+>EGFFCCFEC;,@7BDC,3@9C8<*,*45,2,2,2,,29:19?CDD5::2; AS:i:-12 XN:i:0 XM:i:4 XO:i:0 XG:i:0 NM:i:4 MD:Z:42G51A53C15A24 YS:i:-19 YT:Z:CP M03964:25:000000000-AKT18:1:1101:2092:11235 147 chr16 3254130 42 189M = 3254128 -191 ATATAAAGGAGGAAAGAACTCAATTTACCGGTGACCGAATGTTCTGGAGTTCCAGGGCCTTCCTTCAGGTCCGCAGAGGCCCCTCCATCCGGAGTGGGCCTTGCCCGGGGTTCTGTTGCCGAGGCCAGATTCGCAGCTGTCTTTTCCTCTAGAGTCAGGAGAATTTCTGGATTTGCGGGCGCCTGCTCC ))5A5655:8+8;1+1+4*+>:=+:81D8*C8@/?7GC9A70CFD;2**CFCC75;,@,EFE9C@53++++8D8>D8+ECB83,+@EGFGGGCF?@44+++F=+C:@+<GF8F=,+7CDFEAA,GEFC@+8:F8FC8FGFF8F@GCDC,9GFFF<<E<C,FFEDF6,GGGF6:::F7F7@@6CF7F@@, AS:i:-19 XN:i:0 XM:i:6 XO:i:0 XG:i:0 NM:i:6 MD:Z:8T10A28T28T45T60T4 YS:i:-12 YT:Z:CP
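
For readers less familiar with SAM, here is a minimal sketch of how the FLAG values in the records above (83, 163, 99, 147) decode, and how much the two mates of the first pair overlap on the reference. The helper names decode_flag and overlap_length are made up for this illustration, and the overlap calculation assumes a CIGAR made only of M operations, as in the records shown.

```python
# Decode SAM FLAG values and estimate how much two mates overlap.
# Positions and lengths below are copied from the first pair listed above.

FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "read1",
    0x80: "read2",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG (hypothetical helper)."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


def overlap_length(pos1, len1, pos2, len2):
    """Reference bases shared by two ungapped alignments.

    Assumes each CIGAR is a single M run, so the aligned length on the
    reference equals the read length.
    """
    start = max(pos1, pos2)
    end = min(pos1 + len1, pos2 + len2)  # half-open end coordinate
    return max(0, end - start)


for flag in (83, 163, 99, 147):
    print(flag, decode_flag(flag))

# First pair above: 173M at chr16:3249344 (reverse) and 173M at chr16:3249342.
print(overlap_length(3249344, 173, 3249342, 173))
```

Under these assumptions the two mates share 171 of their 173 bases. Mates that overlap almost completely are worth noting because samtools mpileup performs read-pair overlap detection (its documentation lists -x, --ignore-overlaps to disable it); the thread itself does not establish whether that is the cause here.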


On 2/20/2017 9:01 PM, Colin Hercus wrote:
Hi Jiri,

Could you paste a few alignments from the SAM file?

Colin

On 21 February 2017 at 00:31, Jiri Nehyba <ji...@utexas.edu> wrote:

    Hi Tom,

    Thanks for your help.

    I think Bowtie2 is not changing the base qualities. To verify that, I
    extracted the fastqs back from the Bowtie2 bam file (bedtools
    bamtofastq) and looked at the base scores in FastQC; they look the
    same as before alignment. Bowtie2 assigns alignment quality scores
    (fifth column of the bam file) and those are mostly perfect (151364 of
    158220 bam/sam lines have a score of 42).

    It is samtools mpileup that is changing the qualities. Below is one
    line from the mpileup file (samtools mpileup -Q 0 -d 1000000 -f hg38.fa
    PREFIX.bam > PREFIX.pileup). I folded the long fifth and sixth columns
    for better readability. This is the position where the reads have C
    instead of the T that is in the reference. You can see that about half
    of the bases have qualities higher than 0 (the rest are !, that is 0).
    The most frequent nonzero quality is m, which would correspond to
    109-33=76 on the Illumina 1.8+ scale. The original qualities, both in
    the fastqs and in the bam/sam file, contain no zeroes, and the most
    common quality is G, which is 38 on the Illumina 1.8+ scale (the
    lowest I can see, just by looking at random places in the bam file, is
    + at the ends of reads, which is quality 10).
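
The arithmetic in the paragraph above can be checked with a short sketch. It assumes the Phred+33 encoding used by the Illumina 1.8+ scale mentioned in the message; the function names are hypothetical.

```python
from collections import Counter


def phred33(ch):
    """Convert one quality character to its Phred score (Phred+33 encoding)."""
    return ord(ch) - 33


def most_common_quality(qual_string):
    """Return (character, Phred score, count) for the most frequent quality."""
    ch, count = Counter(qual_string).most_common(1)[0]
    return ch, phred33(ch), count


# Characters discussed in the message: m, G, + and !
for ch in "mG+!":
    print(repr(ch), ord(ch), phred33(ch))
```

This gives 76 for m, 38 for G, 10 for + and 0 for !, matching the values quoted in the message.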


    Jiri Nehyba

    chr16    3243407    T    928
    
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
    
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC.CCCCCCCCCCCC
    
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
    
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
    
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCACCC
    
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC.CCCCCCCCCCCCCCCcccccccccccccccc
    
cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
    
cccccccccccccccccccccccccccccccccccccccccccccccccccc,ccccccccccccccccccccccccccc
    
cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
    
cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
    
cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccacccccccccccccccccc
    cccccccccccccccccccccccccccccccccccccccccccccccc
    
mhmmmmklgmlmmlmmmmmlmmmmmmmimlmmmljmml_mmmmm\llmmRkmmkjm_mmimmmlmmkmmlmmkmmfllmm
    
m`mmlmmmmmmm[m_mmmbkkmimmmmemmmRmmmkiml`glmlmmmmhi`_mm]gmmmbmkll]emLlkmmmmmkchgi
    
jmlmlkm_ilmmmmmmmmmikmXjmhjmm\jjjm`llmmimZmmh`mmmmmllRmjkmkmmkmllmmmmmmmmmmmmm[m
    
l^mmmmlmmlmhmmmmmmmmmlmmmmlmmYmmiYmmmjmmmmkmmjImmmkmmmmmgmmmklmbimmlmkmmmmmlmmim
    
emmmmmlmmmlmmmmmmmmmmmmmmkimmmkhmmjlGlhmllmmmmllmm[mmmgmeimimm_bmmkmQmamjmmlmmmf
    
mmhimmmmmlmmmmimmmml_mmjmmmmmlmmlmmmmimhemhR_^mm)mmjlmmmmkmmmm`m!!!!!!!!!!!!!!!!
    
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
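
To quantify what the folded pileup line suggests (uppercase, forward-strand bases carrying nonzero qualities and lowercase, reverse-strand bases carrying !), one could tally zero-quality bases per strand. The following is only a sketch: the function names are invented for this illustration, the parser covers the usual pileup markers (^ plus a mapping-quality character, $, and +N/-N indels) without any error handling, and the short strings at the bottom are made-up toy data, not taken from the thread.

```python
import re
from collections import Counter


def split_pileup_bases(bases):
    """Return one symbol per read from a pileup base column.

    Skips read-start markers (^ and the mapping-quality character after
    it), read-end markers ($) and inserted/deleted sequences (+N.../-N...).
    """
    symbols = []
    i = 0
    while i < len(bases):
        c = bases[i]
        if c == "^":
            i += 2  # '^' plus the mapping-quality character
        elif c == "$":
            i += 1
        elif c in "+-":
            digits = re.match(r"\d+", bases[i + 1:]).group()
            i += 1 + len(digits) + int(digits)  # sign, length, indel bases
        else:
            symbols.append(c)
            i += 1
    return symbols


def strand_of(symbol):
    """Classify a pileup symbol as forward, reverse or other."""
    if symbol in ".>" or symbol.isupper():
        return "forward"
    if symbol in ",<" or symbol.islower():
        return "reverse"
    return "other"  # for example the '*' deletion placeholder


def tally_zero_quality(bases, quals):
    """Count bases keyed by (strand, quality_is_zero), assuming Phred+33."""
    symbols = split_pileup_bases(bases)
    if len(symbols) != len(quals):
        raise ValueError("base and quality columns differ in length")
    counts = Counter()
    for sym, q in zip(symbols, quals):
        counts[(strand_of(sym), ord(q) - 33 == 0)] += 1
    return counts


# Toy data only: three forward bases with nonzero quality, four reverse with '!'.
print(tally_zero_quality("^~CC$.cc,c", "mmG!!!!"))
```

Applied to the fifth and sixth columns of a real pileup line (after unfolding them), the resulting counts would show directly whether the zero qualities are confined to one strand.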


    On 2/20/2017 6:38 AM, Thomas W. Blackwell wrote:
    >
    > I think this is a bowtie2 question, not a samtools question. Why is
    > bowtie2 setting those base call qualities to zero?
    >
    >                              -  tom blackwell  -
    >
    > On Sun, 19 Feb 2017, Jiri Nehyba wrote:
    >
    >> Hi,
    >>
    >> I would like to ask for help with mutation calling using
    >> samtools-bcftools. Specifically, my problem is that I am losing
    >> (almost) all my reverse bases, and mutations are called only on
    >> forward bases (reads).
    >>
    >> I have installed the latest version of samtools and bcftools (1.3.1).
    >>
    >> I have paired fastqs from amplicon panel libraries sequenced on
    >> Illumina MiSeq. I cut off the primer sequences using Cutadapt,
    >> aligned the reads to the hg38 genome with Bowtie2 (paired alignment),
    >> sorted and indexed the bam file with samtools, and then used the
    >> following command:
    >>
    >> samtools mpileup -u -v -d 1000000 -f hg38.fa PREFIX.bam | bcftools
    >> call -c -v > PREFIX.vcf
    >>
    >> The resulting vcf file looks like this (showing just three rows and
    >> only some of the fields):
    >>
    >> chr16    3243407    .    T    C    221.999    . DP=928; . . .
    >> ;DP4=1,0,462,0;MQ=42;FQ=-281.989;PV4=1,1,0.454462,1 GT:PL
    >> 1/1:255,255,0
    >> chr16    3243888    .    C    T    221.999    . DP=2982; . . .
    >> ;DP4=2,0,1486,2;MQ=42;FQ=-281.989;PV4=1,0.458362,0.4298,1 GT:PL
    >> 1/1:255,255,0
    >> chr16    3243922    .    A    T    221.999    . DP=2982; . . .
    >> ;DP4=1,0,1486,4;MQ=42;FQ=-281.989;PV4=1,0.338485,0.450054,1 GT:PL
    >> 1/1:255,255,0
    >>
    >> What I don't like is the DP4 field, which suggests extreme strand
    >> bias.
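
As an illustration of how the DP4 values can be read, here is a small sketch that extracts DP4 from an INFO string and computes the fraction of counted bases that come from the reverse strand. It relies on the documented DP4 order (ref-forward, ref-reverse, alt-forward, alt-reverse); the function names are hypothetical and the INFO strings are shortened versions of the rows quoted in this message.

```python
def parse_dp4(info):
    """Extract DP4 as a tuple of four ints from a VCF INFO string.

    Returns None when the INFO string has no DP4 entry.
    """
    for item in info.split(";"):
        item = item.strip()
        if item.startswith("DP4="):
            return tuple(int(x) for x in item[len("DP4="):].split(","))
    return None


def reverse_fraction(dp4):
    """Fraction of DP4 bases on the reverse strand (ref-reverse + alt-reverse)."""
    ref_fwd, ref_rev, alt_fwd, alt_rev = dp4
    total = ref_fwd + ref_rev + alt_fwd + alt_rev
    return (ref_rev + alt_rev) / total if total else float("nan")


default_run = parse_dp4("DP=928;DP4=1,0,462,0;MQ=42")      # default -Q
q0_run = parse_dp4("DP=928;DP4=2,1,462,463;MQ=42")         # with -Q 0

print(default_run, reverse_fraction(default_run))
print(q0_run, reverse_fraction(q0_run))
```

For chr16:3243407 the reverse fraction is 0.0 with the default settings and 0.5 with -Q 0, which is the contrast the message describes.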
    >>
    >> I tried to find out why those reverse bases are discarded. There is
    >> no reason for generally lower quality scores: amplicons are ligated
    >> randomly in both orientations and I am getting both sides of each
    >> amplicon in R1 and R2 reads.
    >>
    >> The only thing that prevents the "loss" of the bases is setting -Q
    >> to 0, like this:
    >>
    >> samtools mpileup -u -v -Q 0 -d 1000000 -f hg38.fa PREFIX.bam |
    >> bcftools call -c -v > PREFIX.vcf
    >>
    >> Result:
    >>
    >> chr16    3243407    .    T    C    221.999    . DP=928; . . .
    >> ;DP4=2,1,462,463;MQ=42;FQ=-281.989;PV4=1,1,0.419714,1 GT:PL
    >> 1/1:255,255,0
    >> chr16    3243888    .    C    T    221.999    . DP=2982; . . .
    >> ;DP4=2,2,1489,1489;MQ=42;FQ=-281.989;PV4=1,0.493227,0.40081,1 GT:PL
    >> 1/1:255,255,0
    >> chr16    3243922    .    A    T    221.999    . DP=2982; . . .
    >> ;DP4=4,3,1487,1488;MQ=42;FQ=-281.989;PV4=1,1,1,1   GT:PL
    >> 1/1:255,255,0
    >>
    >> If I set -Q to more than 0, for example 1, I again lose almost
    >> all reverse bases.
    >>
    >> Finally, a look at the mpileup file obtained with this command: samtools
    >> mpileup -Q 0 -d 1000000 -f hg38.fa PREFIX.bam > PREFIX.pileup
    >>
    >> I see that half of the bases are getting the score !, that is 0:
    >>
    >>
    >>
    >> Thank you for your help,
    >>
    >> Jiri
    >>


    
------------------------------------------------------------------------------
    Check out the vibrant tech community on one of the world's most
    engaging tech sites, SlashDot.org! http://sdm.link/slashdot
    _______________________________________________
    Samtools-help mailing list
    Samtools-help@lists.sourceforge.net
    <mailto:Samtools-help@lists.sourceforge.net>
    https://lists.sourceforge.net/lists/listinfo/samtools-help
    <https://lists.sourceforge.net/lists/listinfo/samtools-help>


