Hi,

I would like to ask for help with mutation calling using samtools-bcftools. Specifically my problem is that I am loosing (almost) all my reverse bases and mutations are called only on forward bases (reads).

I have installed last version of samtools and bcftools (1.3.1).

I have paired fastqs from amplicon panel libraries sequenced on Illumina MiSeq. I cut off the primer sequences using Cutadapt, aligned reads to hg38 genome with Bowtie2 (paired alignment), sorted and indexed bam file with samtools and then I used following command:

samtools mpileup -u -v -d 1000000 -f hg38.fa PREFIX.bam | bcftools call -c -v > PREFIX.vcf

Resulting vcf file looks like this (showing just three rows and only some of the fields)

chr16 3243407 . T C 221.999 . DP=928; . . . ;DP4=1,0,462,0;MQ=42;FQ=-281.989;PV4=1,1,0.454462,1 GT:PL 1/1:255,255,0 chr16 3243888 . C T 221.999 . DP=2982; . . . ;DP4=2,0,1486,2;MQ=42;FQ=-281.989;PV4=1,0.458362,0.4298,1 GT:PL 1/1:255,255,0 chr16 3243922 . A T 221.999 . DP=2982; . . . ;DP4=1,0,1486,4;MQ=42;FQ=-281.989;PV4=1,0.338485,0.450054,1 GT:PL 1/1:255,255,0

What I don't like is the DP4 field that suggests extreme strand bias.

I tried to find out why are those reverse bases discarded - there is no reason for generally lower quality score - amplicons are ligated randomly in both orientations and I am getting both sides of each amplicon in R1 and R2 reads.

The only thing that prevents the "loss" of the bases is setting the Q to 0, like that:

samtools mpileup -u -v -Q 0 -d 1000000 -f hg38.fa PREFIX.bam | bcftools call -c -v > PREFIX.vcf

Result:

chr16 3243407 . T C 221.999 . DP=928; . . . ;DP4=2,1,462,463;MQ=42;FQ=-281.989;PV4=1,1,0.419714,1 GT:PL 1/1:255,255,0 chr16 3243888 . C T 221.999 . DP=2982; . . . ;DP4=2,2,1489,1489;MQ=42;FQ=-281.989;PV4=1,0.493227,0.40081,1 GT:PL 1/1:255,255,0 chr16 3243922 . A T 221.999 . DP=2982; . . . ;DP4=4,3,1487,1488;MQ=42;FQ=-281.989;PV4=1,1,1,1 GT:PL 1/1:255,255,0

If I set Q to more than 0, for example 1 I will again loose almost all reverse bases.

Finally, a look at mpileup file obtained by this command: samtools mpileup -Q 0 -d 1000000 -f hg38.fa PREFIX.bam > PREFIX.pileup

I see that half of the bases is getting score ! that is 0  :



Thank you for your help,

Jiri
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help

Reply via email to