Hi,
I would like to ask for help with mutation calling using
samtools-bcftools. Specifically my problem is that I am loosing (almost)
all my reverse bases and mutations are called only on forward bases (reads).
I have installed last version of samtools and bcftools (1.3.1).
I have paired fastqs from amplicon panel libraries sequenced on Illumina
MiSeq. I cut off the primer sequences using Cutadapt, aligned reads to
hg38 genome with Bowtie2 (paired alignment), sorted and indexed bam file
with samtools and then I used following command:
samtools mpileup -u -v -d 1000000 -f hg38.fa PREFIX.bam | bcftools call
-c -v > PREFIX.vcf
Resulting vcf file looks like this (showing just three rows and only
some of the fields)
chr16 3243407 . T C 221.999 . DP=928; . .
. ;DP4=1,0,462,0;MQ=42;FQ=-281.989;PV4=1,1,0.454462,1 GT:PL
1/1:255,255,0
chr16 3243888 . C T 221.999 . DP=2982; . .
. ;DP4=2,0,1486,2;MQ=42;FQ=-281.989;PV4=1,0.458362,0.4298,1 GT:PL
1/1:255,255,0
chr16 3243922 . A T 221.999 . DP=2982; . .
. ;DP4=1,0,1486,4;MQ=42;FQ=-281.989;PV4=1,0.338485,0.450054,1
GT:PL 1/1:255,255,0
What I don't like is the DP4 field that suggests extreme strand bias.
I tried to find out why are those reverse bases discarded - there is no
reason for generally lower quality score - amplicons are ligated
randomly in both orientations and I am getting both sides of each
amplicon in R1 and R2 reads.
The only thing that prevents the "loss" of the bases is setting the Q to
0, like that:
samtools mpileup -u -v -Q 0 -d 1000000 -f hg38.fa PREFIX.bam | bcftools
call -c -v > PREFIX.vcf
Result:
chr16 3243407 . T C 221.999 . DP=928; . .
. ;DP4=2,1,462,463;MQ=42;FQ=-281.989;PV4=1,1,0.419714,1 GT:PL
1/1:255,255,0
chr16 3243888 . C T 221.999 . DP=2982; . . .
;DP4=2,2,1489,1489;MQ=42;FQ=-281.989;PV4=1,0.493227,0.40081,1 GT:PL
1/1:255,255,0
chr16 3243922 . A T 221.999 . DP=2982; . .
. ;DP4=4,3,1487,1488;MQ=42;FQ=-281.989;PV4=1,1,1,1 GT:PL
1/1:255,255,0
If I set Q to more than 0, for example 1 I will again loose almost all
reverse bases.
Finally, a look at mpileup file obtained by this command: samtools
mpileup -Q 0 -d 1000000 -f hg38.fa PREFIX.bam > PREFIX.pileup
I see that half of the bases is getting score ! that is 0 :
Thank you for your help,
Jiri
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help