Hello Jody,

On Thu, May 03, 2018 at 12:54:41AM +0100, Jody Phelan wrote:
> The example finds a very large variant at position 3329088 however by
> looking at the raw data I find very little support for it.
> I have created a gist so you can replicate it if you would like.
> https://gist.github.com/jodyphelan/e721112d91dcaaa67cd79217ef2af24f

Firstly, thank you greatly for such a clear and reproducable bug
report.  It is most welcome.

This means I can reproduce this issue.  Looking at the alignments in
the bam file there is one single read with a large deletion at that
point:

SRR6117388.72300 has cigar ...10M1I2M105D11M1D4M...;  note the 105D.

The question therefore comes down to why does the variant caller
produce a variant based on such scanty evidence.  Sadly it wasn't what
I expected it to be.

The variant claims to have INFO fields of IDV=42 and IMF=0.724138,
both of which are very significant, and I think *wrong*!  IDV is the
maximum number of reads supporting an indel, which is 1 for this
specific indel, and IMF is the fraction of reads supporting it (close
to 0.02 in my estimations).  So obviously this is a bug.

The number of reads supporting *an* indel (the -1A or various -2 and
-3 forms) however is much higher.  I think the tool has managed to
confuse the two somehow and chosen the longest indel but claiming the
evidence is for all combined and not the one it picked.

Manually editing the IDV and IMF to 1 and 0.02 still means the indel
is called by "bcftools call", however it's then trivial to filter out
as a poor quality indel with things like -e "TYPE='indel' && IDV < 3".

Are Petr or Heng able to comment on this?  You both understand the
code far better than I do.

James

-- 
James Bonfield (j...@sanger.ac.uk)
The Sanger Institute, Hinxton, Cambs, CB10 1SA


-- 
 The Wellcome Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help

Reply via email to