Hello,

I am currently working on a project involving about 300 BAM files
containing whole-exome data with a mean coverage of 150x. Since I am
interested in analyses involving homozygous genotypes, I performed
multi-sample variant calling on all samples together, following a
procedure similar to the one explained here:
http://samtools.sourceforge.net/mpileup.shtml

The issue is that when I filter the variants with all default values, i.e.

   bcftools view var.raw.bcf | vcfutils.pl varFilter > var.flt.vcf

I tend to lose some variants that have quite high QUAL and MAPQ scores
as well as high DP (even for 300 samples), and which I think are
good-quality variants.

I was also able to figure out that most of these variants are removed by
the default PV4 thresholds.
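
For reference, this kind of check can be sketched in a few lines of
Python. This is only a minimal sketch, not the vcfutils.pl implementation.
It assumes that the PV4 INFO tag holds four comma-separated P-values in
the order strand bias, baseQ bias, mapQ bias, tail distance bias, that the
default minimum P-values are those printed in the varFilter usage text
(options -1 to -4: 1e-4, 1e-100, 0, 1e-4), and that a variant is dropped
when any value falls below its threshold. The function names are
hypothetical.

```python
from collections import Counter

# Order of the four PV4 values (assumed from the mpileup documentation).
PV4_LABELS = ("strand", "baseQ", "mapQ", "tail_distance")

# Assumed varFilter defaults for options -1, -2, -3 and -4.
DEFAULT_MIN_P = (1e-4, 1e-100, 0.0, 1e-4)


def parse_pv4(info):
    """Return the four PV4 P-values from a VCF INFO string, or None if absent."""
    for field in info.split(";"):
        if field.startswith("PV4="):
            return tuple(float(x) for x in field[len("PV4="):].split(","))
    return None


def failing_tests(info, min_p=DEFAULT_MIN_P):
    """List the PV4 components whose P-value is below its minimum threshold."""
    pv4 = parse_pv4(info)
    if pv4 is None:
        return []
    return [label for label, p, t in zip(PV4_LABELS, pv4, min_p) if p < t]


def tally_failures(vcf_lines, min_p=DEFAULT_MIN_P):
    """Count, per PV4 component, how many records fall below the threshold."""
    counts = Counter()
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip header lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 8:
            continue  # not a complete VCF record
        counts.update(failing_tests(cols[7], min_p))
    return counts
```

Passing the lines of the unfiltered calls (after converting them to VCF
text with "bcftools view var.raw.bcf") to tally_failures() shows which of
the four tests accounts for most of the removed variants, and passing a
different min_p tuple shows how the counts change with looser thresholds.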

Is it advisable to use the default PV4 thresholds for a large number of
samples? And how does samtools summarize the P-values for strand bias,
baseQ bias, mapQ bias, and tail distance bias into single values across
multiple samples?

Thanks,
Nihir



------------------------------------------------------------------------------
_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help
