Hi,

I’ve found something very strange with samtools mpileup, looks like a bug.

If I provide the reference fasta I get completely different results than if I don’t.

Look at this example:
./samtools-1.3.1/samtools mpileup -r 20:36052981 -f human_g1k_v37.fasta file.bam
20 36052981 A 2 ,G >1
20 36052982 G 2 ,. >1
20 36052983 G 2 ,$. <3
20 36052984 A 9 .,..,,.,. <B7C77779
20 36052985 G 9 .,..,,.,. <B7C77779
20 36052986 A 9 .,..$,,.,. <F7=77779
20 36052987 A 8 GgGggGgG </777779
20 36052988 G 8 .,.,,.,. <F777779
20 36052989 G 8 .,.,,.,. <F777779
20 36052990 A 8 .,.,,.,. FFKKKKKK
20 36052991 G 8 .,.,,.,. FFKKKKKK
20 36052992 C 8 .,.,,.,. FFKKKKKK
20 36052993 T 8 .,.,,.,. FFKKKKKK
20 36052994 G 8 .,.,,.,. FFKKKKKK
20 36052995 A 8 .,.,,.,. BFKKKKKK
20 36052996 G 8 .,.$,,.,. BF>KKKKK

Now look what we get if we do not provide a reference:
./samtools-1.3.1/samtools mpileup -r 20:36052981 file.bam
20 36052981 N 11 aaGgGGggGgG BBFFKKKKKKK
20 36052982 N 11 g$gGgGGggGgG BBFFKKKKKKK
20 36052983 N 10 g$GgGGggGgG <FFKKKKKKK
20 36052984 N 9 AaAAaaAaA FBKKKKKKK
20 36052985 N 9 GgGGggGgG FBKKKKKKK
20 36052986 N 9 AaAA$aaAaA FFKKKKKKK
20 36052987 N 8 GgGggGgG F/KKKKKK
20 36052988 N 8 GgGggGgG FFKKKKKK
20 36052989 N 8 GgGggGgG FFKKKKKK
20 36052990 N 8 AaAaaAaA FFKKKKKK
20 36052991 N 8 GgGggGgG FFKKKKKK
20 36052992 N 8 CcCccCcC FFKKKKKK
20 36052993 N 8 TtTttTtT FFKKKKKK
20 36052994 N 8 GgGggGgG FFKKKKKK
20 36052995 N 8 AaAaaAaA BFKKKKKK
20 36052996 N 8 GgG$ggGgG BFKKKKKK

For site 36052981, if I provide the reference only two bases are reported, whereas without the reference 11 bases are found.
Note that quality bases also change.

I checked with IGV that the second case is the correct one (see snapshot). Interestingly, all but two of the reads that overlap 36052981 have an indel three bases before, so I guess the bug may be related to this. 

I am attaching a very small bam of the region. If it helps diagnosing the problem I can send more information for other “strange” cases.

Regards,
Federico



Attachment: caca.bam
Description: Binary data



Attachment: smime.p7s
Description: S/MIME cryptographic signature

------------------------------------------------------------------------------
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are 
consuming the most bandwidth. Provides multi-vendor support for NetFlow, 
J-Flow, sFlow and other flows. Make informed decisions using capacity planning
reports. http://sdm.link/zohomanageengine
_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help

Reply via email to