Hello,
In the VCF output of mpileup, the I16 field is formatted as:
1 #reference Q13 bases on the forward strand
2 #reference Q13 bases on the reverse strand
3 #non-ref Q13 bases on the forward strand
4 #non-ref Q13 bases on the reverse strand
5 sum of reference base qualities
6 sum of squares of reference base qualities
7 sum of non-ref base qualities
8 sum of squares of non-ref base qualities
9 sum of ref mapping qualities
10 sum of squares of ref mapping qualities
11 sum of non-ref mapping qualities
12 sum of squares of non-ref mapping qualities
13 sum of tail distance for ref bases
14 sum of squares of tail distance for ref bases
15 sum of tail distance for non-ref bases
16 sum of squares of tail distance for non-ref bases
The first 4 values give the number of reference and variant reads that
pass the base quality cutoff (default 13).
However, as best as I can tell, the remaining 12 categories are calculated
using all reads, not just the reads that surpassed the base quality cutoff.
This is making it difficult to calculate the mean base quality of my
reference alleles and my variant alleles.
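To make the calculation concrete, here is a minimal Python sketch of the means in question. The function names `parse_i16` and `i16_means` are hypothetical, not part of samtools. The sketch assumes that values 5-16 are summed over the same reads counted in values 1-4, which is exactly the assumption that appears not to hold, so the results are only meaningful if it does.

```python
def parse_i16(info_field):
    """Return the 16 I16 values from a VCF INFO string as floats."""
    for entry in info_field.split(";"):
        if entry.startswith("I16="):
            values = [float(v) for v in entry[len("I16="):].split(",")]
            if len(values) != 16:
                raise ValueError("expected 16 I16 values, got %d" % len(values))
            return values
    raise ValueError("no I16 entry in INFO field")


def i16_means(i16):
    """Mean base quality, mapping quality and tail distance per allele class.

    ASSUMPTION: the sums (values 5-16) cover the same reads as the
    counts (values 1-4). If the sums include low-quality reads that the
    counts exclude, these means will be wrong.
    """
    ref_depth = i16[0] + i16[1]   # values 1 and 2: ref bases, both strands
    alt_depth = i16[2] + i16[3]   # values 3 and 4: non-ref bases, both strands

    def mean(total, depth):
        # None when no reads support this allele class
        return total / depth if depth else None

    return {
        "ref_base_qual": mean(i16[4], ref_depth),    # value 5
        "alt_base_qual": mean(i16[6], alt_depth),    # value 7
        "ref_map_qual": mean(i16[8], ref_depth),     # value 9
        "alt_map_qual": mean(i16[10], alt_depth),    # value 11
        "ref_tail_dist": mean(i16[12], ref_depth),   # value 13
        "alt_tail_dist": mean(i16[14], alt_depth),   # value 15
    }
```

For example, `i16_means(parse_i16(info))` could be called on the INFO column of each VCF record, where `info` is that column's text.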
Is there any way to change I16 values 5-16 so that they only use high
quality reads?
Is there some other option that I have missed that will show the base
quality, mapping quality, and tail distance of just the high quality reads?
Thank you,
Emily
_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help