On 22 Jul 2016, at 21:49, Annie Cowell <annie.cow...@gmail.com> wrote:
> samtools mpileup -C50 -Bug -t AD -Q10 -f my.fasta sample.bam | bcftools call 
> -mv -Ov > my.vcf
> 
> However, I am getting unusual bases for a couple of indels in the reference 
> field, such as W (as below) and R.
> 
> AAKM01000065  8883    .       GGW     G       228     .       
> INDEL;IDV=40;IMF=0.888889;DP=45;VDB=0.880501;SGB=-0.692976;MQSB=0.114162;MQ0F=0;AC=2;AN=2;DP4=0,0,8,18;MQ=49
>     GT:PL:AD        1/1:255,78,0:0,26

As Thomas guessed, these are indeed ambiguity codes coming from the reference 
genome.  From http://www.ncbi.nlm.nih.gov/nuccore/AAKM01000065 we can see the 
GGW bases at location 8883 onwards:

        8881 aaggwtataa agaacgcata tatgcctttt gtcccctgcc cgcgtctgga tactcgctac

The issue of IUPAC ambiguity codes in VCF REF fields came up a while ago [1], 
and the consensus expressed in the VCF v4.3 spec is that they shouldn't appear 
-- instead being simplified to a non-ambiguous base, in this case GGA.  So this 
is an mpileup and/or bcftools bug, and apparently it should be suppressing 
ambiguity codes by default.

    John

[1] https://github.com/samtools/hts-specs/issues/54

-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 

------------------------------------------------------------------------------
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are 
consuming the most bandwidth. Provides multi-vendor support for NetFlow, 
J-Flow, sFlow and other flows. Make informed decisions using capacity planning
reports.http://sdm.link/zohodev2dev
_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help

Reply via email to