Hi--

I'm working on a comparison of variant callers (see detail 1 below). The
Samtools 1.1 results are within 0.2% of the maximum sensitivity
(congrats!). Each of the other callers I'm looking at has a default minimal
filter strategy, and I'm hoping you can suggest one for Samtools.

We work on many different kinds of experiments, so we're looking for a
robust initial filter that can be broadly applied. Under the conditions of
my comparison, the other callers' default filters reduce sensitivity by
roughly 1% and improve specificity to a similar extent.
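
For readers who want the trade-off spelled out, here is a minimal sketch of
how the two metrics move when a filter is applied. The counts are made up
purely for illustration and are not results from this comparison. It also
assumes that "specificity" is approximated by precision, TP / (TP + FP),
since true negatives are not well defined for variant calls.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true variants that were called: TP / (TP + FN)."""
    return tp / (tp + fn)


def precision(tp: int, fp: int) -> float:
    """Fraction of calls that are true variants: TP / (TP + FP)."""
    return tp / (tp + fp)


# Hypothetical counts (assumption): 1000 true variants in the callable regions.
unfiltered = {"tp": 990, "fn": 10, "fp": 30}
# A minimal filter drops 10 true calls and 10 false calls.
filtered = {"tp": 980, "fn": 20, "fp": 20}

for name, c in (("unfiltered", unfiltered), ("filtered", filtered)):
    print(
        f"{name}: sensitivity={sensitivity(c['tp'], c['fn']):.3f} "
        f"precision={precision(c['tp'], c['fp']):.3f}"
    )
```

With these invented counts, sensitivity falls by one percentage point while
precision rises by about the same amount, which is the shape of trade-off
described above.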

Samtools calls don't need to be filtered for specificity under my current
conditions, because the unfiltered specificity is comparable to the filtered
values for the other callers (again, congrats). However, I expect
specificity to worsen for all callers when I don't have a defined set of
callable regions (particularly with less well-annotated model organisms) or
when an experiment calls for analysis of more difficult regions. I tried
your example filters (see detail 2 below), although they weren't intended
for this purpose, and found them far more stringent than the minimal
filters used by the other callers.

Thanks and best regards,
Holly

Details:
(1) I'm comparing high-confidence genotype calls on NA12878 from the Genome
in a Bottle consortium (GiaB) with the results of variant calls made by
Samtools, Freebayes, Platypus, and GATK on chr20 of NA12878 with 40x
coverage, excluding regions not defined as callable by GiaB. I use
vcfallelicprimitives from vcflib to try to standardize the representation
of indels and complex variants.
(2) -g3 -G10 %QUAL<10 || (RPB<0.1 && %QUAL<15) || (AC<2 && %QUAL<15) ||
%MAX(DV)<=3 || %MAX(DV)/%MAX(DP)<=0.3
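
To make the logic of the expression in (2) easier to follow, here is a rough
Python sketch of the same boolean test applied to one record. It is an
illustration only, not how the tool evaluates it. It assumes that the
expression marks records to be excluded, that a missing RPB or AC value
never triggers a clause, and that DV and DP are per-sample lists. The
`Record` class and `fails_filter` function are hypothetical names, and the
two gap options preceding the expression are not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    """Hypothetical, simplified view of one VCF record."""
    qual: float            # QUAL column
    rpb: Optional[float]   # INFO/RPB, None if absent
    ac: Optional[int]      # INFO/AC, None if absent
    dv: List[int]          # per-sample count of variant-supporting reads
    dp: List[int]          # per-sample read depth


def fails_filter(rec: Record) -> bool:
    """Return True if any clause of the expression in (2) matches."""
    max_dv = max(rec.dv)
    max_dp = max(rec.dp)
    low_qual = rec.qual < 10
    # Assumption: a missing tag means the clause does not match.
    pos_bias = rec.rpb is not None and rec.rpb < 0.1 and rec.qual < 15
    low_count = rec.ac is not None and rec.ac < 2 and rec.qual < 15
    few_reads = max_dv <= 3
    low_fraction = max_dp > 0 and max_dv / max_dp <= 0.3
    return low_qual or pos_bias or low_count or few_reads or low_fraction


# A well-supported heterozygous call at 40x passes every clause.
print(fails_filter(Record(50.0, 0.5, 1, [18], [40])))
# A confident call supported by only 10 of 40 reads is still flagged.
print(fails_filter(Record(50.0, None, 2, [10], [40])))
```

The last two clauses depend only on read support, so they can flag a call
regardless of its QUAL, which is one reason the expression is stricter than
a QUAL-only minimal filter.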

*Holly Beale, PhD*
Computational Biologist
hbe...@maverixbio.com
Maverix Biomics, Inc.
------------------------------------------------------------------------------
_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help
