Hi, I am finding I can't rely on mpileup to report what is happening at each position (described in a bed file) in a .bam file. Some positions are omitted and I can't predict the reason this occurs. Example below:
# First, make a small bed file: cat smallBed.bed 2 186689185 186689205 2 179511757 179511777 # Then, get a subset of a larger bam file: ${samtools_htslib_path}/samtools view -b $bamFile 2:179511657-186689305 > test1.bam # Now, run mpileup: ${samtools_htslib_path}/samtools mpileup -B -d10000000 -l smallBed.bed -t DPR,INFO/DPR,DP4 -uf ref/${refFile} test1.bam | ${bcftools_htslib_path}/bcftools view - > result1.vcf # Check a position in first interval: grep 186689186 result1.vcf # nothing reported. # Check a position in the second interval: grep 179511758 result1.vcf 2 179511758 . T <X> 0 . DP=0;I16=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0;QS=0,0;MQ0F=0;DPR=0,0 PL:DP4:DPR 0,0,0:0,0,0,0:0,0 Locus reported with a depth of 0. I repeated the call to mpileup without the -B or the -d parameters to see if this had something to do with it, but the result was the same. I can repeat this experiment using a second bam file, and see that in the second case, both positions are reported, even though both have a depth of 0 just like in the first sample. ${samtools_htslib_path}/samtools view -b $bamFile2 2:179511657-186689305 > test2.bam ${samtools_htslib_path}/samtools mpileup -B -d10000000 -l smallBed.bed -t DPR,INFO/DPR,DP4 -uf ref/${refFile} test2.bam | ${bcftools_htslib_path}/bcftools view - > result2.vcf grep 186689186 result2.vcf 2 186689186 . C <X> 0 . DP=0;I16=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0;QS=0,0;MQ0F=0;DPR=0,0 PL:DP4:DPR 0,0,0:0,0,0,0:0,0 grep 179511758 result2.vcf 2 179511758 . T <X> 0 . DP=0;I16=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0;QS=0,0;MQ0F=0;DPR=0,0 PL:DP4:DPR 0,0,0:0,0,0,0:0,0 Because I am not comfortable drawing conclusions about positions that are invisible to me, it would be great to learn what is happening at the omitted locus, and others like it. In earlier experiments using Samtools 0.1.18, I found some positions of importance to us with plenty of coverage were omitted, and I presume this may still be occurring in the latest samtools. Or perhaps I'm just missing some key info in the documentation that would explain this phenomenon? I am pasting a link to the test files, in case this is helpful for anyone interested in troubleshooting the software (I am using GRCh37 for the ref). https://drive.google.com/folderview?id=0B_o0arNq2sU0eTFYMTZUV2xzQTg&usp=sharing Thanks for any assistance. Best, Jonathan
------------------------------------------------------------------------------
_______________________________________________ Samtools-help mailing list Samtools-help@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/samtools-help