> -----Original Message-----
> From: John Marshall [mailto:j...@sanger.ac.uk]
> Sent: 28 July 2016 14:50
> To: Adam Witney <awit...@sgul.ac.uk>
> Cc: samtools-help@lists.sourceforge.net
> Subject: Re: [Samtools-help] different output when upgrading samtools from 0.1.19 to 1.3.1
>
> On 27 Jul 2016, at 15:34, Adam Witney <awit...@sgul.ac.uk> wrote:
> > I am upgrading an analysis from samtools version 0.1.19-44428cd to
> > version 1.3.1 (using htslib 1.3.1). I am trying to understand why the
> > pileup changes so much in this example
> >
> > samtools mpileup -s -f genomes/NC_000962.fna alignments/sample-1.bam | head
> > [mpileup] 1 samples in 1 input files
> > <mpileup> Set max per-file depth to 8000
> > NC_000962.3 1 T 13 ^],^],^],^],^],^].^].^].^].^].^].^],^], ;GF/GGGGGBCDH ]]]]]]]]]]]]]
> > NC_000962.3 2 T 13 ,,,,,......,, ?GFF?GDGGBGGA ]]]]]]]]]]]]]
> > [...]
> >
> > ./samtools-1.3.1/samtools mpileup -s -f genomes/NC_000962.fna alignments/sample-1.bam | head
> > [mpileup] 1 samples in 1 input files
> > <mpileup> Set max per-file depth to 8000
> > NC_000962.3 1 T 0
> > NC_000962.3 2 T 0
> > [...]
> >
> > Is this change a change in the default settings somewhere such that the
> > reads are filtered out?
>
> Does this difference persist throughout the genome, or just in these first few
> tens of positions in NC_000962.3 and other chromosomes? This sort of
> difference in mpileup read filtering often comes down to BAQ calculations...
> If you rerun with samtools-0.1.19 -B and samtools-1.3.1 -B to disable BAQ
> recalculation, do they come back with more similar results?
>
> If they do and if the differences you have noticed occur at the beginning of
> chromosomes but the two samtools versions produce more similar results
> further on, I suspect what you are seeing is this bug fix in samtools 1.3:
>
> • The mpileup command now applies BAQ calculations at all base positions,
>   regardless of which -l or -r options are used (previously with -l it was not
>   applied to the first few tens of bases of each chromosome, leading to
>   different mpileup results with -l vs. -r; #79, #125, #286, #407).

Yes, -B does improve it, although it still shows only 8 reads instead of 13:

./samtools-1.3.1/samtools mpileup -B -s -f genomes/NC_000962.fna alignments/sample-1.bam | head
[mpileup] 1 samples in 1 input files
<mpileup> Set max per-file depth to 8000
NC_000962.3 1 T 8 ^],^],^],^],^],^].^],^], amlPiGDH ]]]]]]]]
NC_000962.3 2 T 8 ,,,,,.,, emlgeDGA ]]]]]]]]

And you were right, it does seem to be less of a problem later in the file.

Incidentally, why are the base qualities different between the versions? Aren't they dependent on the base qualities in the FASTQ?

Thanks again

Adam
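
For comparing the quality columns between the two versions, it may be easier to look at the numbers behind the characters. This is a minimal sketch, assuming the base-quality and mapping-quality columns of the mpileup output are ASCII-encoded as Phred+33. The helper name decode_phred33 is made up for illustration and is not part of samtools.

```python
from typing import List


def decode_phred33(qual: str) -> List[int]:
    """Convert an ASCII quality string to integer scores (assumes Phred+33)."""
    return [ord(ch) - 33 for ch in qual]


# Quality columns for position 1, copied from the outputs quoted in this thread.
old_quals = ";GF/GGGGGBCDH"   # samtools 0.1.19, BAQ enabled (default)
new_quals = "amlPiGDH"        # samtools 1.3.1 with -B

print("0.1.19      :", decode_phred33(old_quals))
print("1.3.1 (-B)  :", decode_phred33(new_quals))

# The -s column uses the same encoding for mapping quality.
print("mapq ']'    :", decode_phred33("]"))
```

Running the same decoding over the matching position in the FASTQ-derived BAM would show which of the two outputs, if either, reports the qualities unchanged.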