Hello Jody, On Thu, May 03, 2018 at 12:54:41AM +0100, Jody Phelan wrote: > The example finds a very large variant at position 3329088 however by > looking at the raw data I find very little support for it. > I have created a gist so you can replicate it if you would like. > https://gist.github.com/jodyphelan/e721112d91dcaaa67cd79217ef2af24f
Firstly, thank you greatly for such a clear and reproducable bug report. It is most welcome. This means I can reproduce this issue. Looking at the alignments in the bam file there is one single read with a large deletion at that point: SRR6117388.72300 has cigar ...10M1I2M105D11M1D4M...; note the 105D. The question therefore comes down to why does the variant caller produce a variant based on such scanty evidence. Sadly it wasn't what I expected it to be. The variant claims to have INFO fields of IDV=42 and IMF=0.724138, both of which are very significant, and I think *wrong*! IDV is the maximum number of reads supporting an indel, which is 1 for this specific indel, and IMF is the fraction of reads supporting it (close to 0.02 in my estimations). So obviously this is a bug. The number of reads supporting *an* indel (the -1A or various -2 and -3 forms) however is much higher. I think the tool has managed to confuse the two somehow and chosen the longest indel but claiming the evidence is for all combined and not the one it picked. Manually editing the IDV and IMF to 1 and 0.02 still means the indel is called by "bcftools call", however it's then trivial to filter out as a poor quality indel with things like -e "TYPE='indel' && IDV < 3". Are Petr or Heng able to comment on this? You both understand the code far better than I do. James -- James Bonfield (j...@sanger.ac.uk) The Sanger Institute, Hinxton, Cambs, CB10 1SA -- The Wellcome Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. ------------------------------------------------------------------------------ Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot _______________________________________________ Samtools-help mailing list Samtools-help@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/samtools-help