It is probably because the default values of the mpileup -F and -m
options are way too small.

On Mon, 2018-02-05 at 16:23 +0000, James Bonfield wrote:
> Does anyone have any recommended parameters for filtering bcftools
> indel calls?
>
> It makes a *lot* of indel false positives and filtering on QUAL isn't
> great compared to GATK / FreeBayes results, however I noticed the IDV
> and IMF info fields give a strong way to separate false positives
> from true positives.
>
> For example with the SynDip truth set (CHM1 + CHM13) straight
> bcftools gives 439,484 true positives (TP) and 181,793 false
> positives (FP) unfiltered. Filtering by QUAL >= 30 changes this to
> TP 417,136, FP 163,720 - so not much at all.
>
> Filtering instead on IDV >= 3 && IMF >= 0.3 gives TP 436,037, FP
> 20,708. IDV >= 6 && IMF >= 0.1 gives TP 426,163, FP 11,400. Given
> the total number of true indels in the SynDip truth set, this means
> we went from 79.0% recall 67.3% precision to 76.6% recall 98.0%
> precision.
>
> These are enormously better metrics than QUAL for discriminating
> between correct and incorrect results, but they appear to be
> completely undocumented other than in the header of the VCF file.
>
> I've tried a few other parameters, but haven't had such good results,
> but in theory they could all be combined together in some phred-style
> classifier system, similar to VQSR with GATK. Has anyone done this
> already? If not, do people have specific hard-filtering parameters
> they use?
>
> James
>
> --
> James Bonfield (j...@sanger.ac.uk)
> The Sanger Institute, Hinxton, Cambs, CB10 1SA
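
For anyone wanting to try the IDV / IMF hard filter from the quoted message outside of bcftools, here is a minimal Python sketch. It assumes tab-separated VCF text where indel records carry the INDEL flag plus IDV and IMF in the INFO column, as bcftools mpileup output does. The function names, the default thresholds (taken from the first example in the message) and the choice to drop indels that lack IDV or IMF are all assumptions made for illustration, not anything bcftools itself defines.

```python
def parse_info(info_field):
    """Turn a VCF INFO string such as 'INDEL;IDV=4;IMF=0.5' into a dict."""
    info = {}
    for item in info_field.split(";"):
        if not item or item == ".":
            continue
        key, sep, value = item.partition("=")
        info[key] = value if sep else True  # flags such as INDEL carry no value
    return info


def passes_indel_filter(vcf_line, min_idv=3, min_imf=0.3):
    """Return True for non-indel records and for indels meeting both thresholds."""
    fields = vcf_line.rstrip("\n").split("\t")
    info = parse_info(fields[7])  # INFO is the 8th VCF column
    if "INDEL" not in info:
        return True  # this sketch only filters indel records
    try:
        idv = int(info["IDV"])
        imf = float(info["IMF"])
    except (KeyError, ValueError):
        return False  # assumption: indels without usable IDV/IMF are dropped
    return idv >= min_idv and imf >= min_imf


def filter_vcf(lines, min_idv=3, min_imf=0.3):
    """Yield header lines unchanged plus records that pass the indel filter."""
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("#") or passes_indel_filter(line, min_idv, min_imf):
            yield line


if __name__ == "__main__":
    # Made-up records, purely for illustration.
    records = [
        "##fileformat=VCFv4.2",
        "chr1\t100\t.\tA\tG\t50\t.\tDP=30",
        "chr1\t200\t.\tAT\tA\t40\t.\tINDEL;IDV=6;IMF=0.4;DP=15",
        "chr1\t300\t.\tC\tCA\t35\t.\tINDEL;IDV=1;IMF=0.05;DP=20",
    ]
    for kept in filter_vcf(records):
        print(kept)
```

Passing min_idv=6 and min_imf=0.1 to filter_vcf gives the second threshold pair from the message. SNP records pass through untouched, so the sketch can be run over a mixed call set.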
_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help