Hello, I am currently working on a project involving about 300 BAM files containing whole-exome data with a mean coverage of 150x. Since I am interested in analyses involving homozygous genotypes, I performed multi-sample variant calling on all samples together, following a procedure similar to the one explained here: http://samtools.sourceforge.net/mpileup.shtml.
The issue is that when I filter the variants with all default values (i.e. "bcftools view var.raw.bcf | vcfutils.pl varFilter > var.flt.vcf"), I tend to lose some variants that have quite high QUAL and MAPQ scores as well as high DP (even for 300 samples), which I think are good-quality variants. I was also able to figure out that most of these variants are removed because of the default PV4 thresholds. Is it advisable to use the default PV4 thresholds for a large number of samples? How are the P-values for strand bias, baseQ bias, mapQ bias, and tail distance bias summarized into single values across multiple samples by samtools?

Thanks,
Nihir
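For anyone who wants to reproduce the PV4 check described above, here is a minimal Python sketch of how one could see which of the four PV4 tests falls below its threshold for a given variant. It is not part of samtools or vcfutils.pl. The threshold values are assumed from the varFilter usage text (options -1 to -4), the function names `parse_pv4` and `failing_tests` are made up for illustration, and the INFO string in the usage note is an invented example, not real data.

```python
# Minimum P-values in PV4 order: strand bias, baseQ bias, mapQ bias,
# tail distance bias. ASSUMPTION: these mirror the vcfutils.pl varFilter
# defaults (-1 1e-4, -2 1e-100, -3 0, -4 1e-4); check them against the
# usage text of your own vcfutils.pl version.
DEFAULT_MIN_P = {
    "strand": 1e-4,
    "baseQ": 1e-100,
    "mapQ": 0.0,
    "tail": 1e-4,
}


def parse_pv4(info):
    """Return the four PV4 P-values from a VCF INFO string, or None if absent."""
    for field in info.split(";"):
        if field.startswith("PV4="):
            values = [float(x) for x in field[len("PV4="):].split(",")]
            if len(values) != 4:
                raise ValueError("PV4 should hold exactly four P-values")
            # Pair each value with its test name, in PV4 order.
            return dict(zip(DEFAULT_MIN_P, values))
    return None


def failing_tests(info, min_p=DEFAULT_MIN_P):
    """List the PV4 tests whose P-value is below the given minimum."""
    pv4 = parse_pv4(info)
    if pv4 is None:
        return []  # no PV4 tag, so a PV4-based filter has nothing to test
    return [name for name, p in pv4.items() if p < min_p[name]]
```

For example, `failing_tests("DP=4500;PV4=2.1e-07,1,0.3,0.5")` reports only the strand-bias test, so a variant like that would be a candidate for relaxing the -1 threshold alone rather than all four.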
------------------------------------------------------------------------------
_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help