> -----Original Message-----
> From: John Marshall [mailto:j...@sanger.ac.uk]
> Sent: 28 July 2016 14:50
> To: Adam Witney <awit...@sgul.ac.uk>
> Cc: samtools-help@lists.sourceforge.net
> Subject: Re: [Samtools-help] different output when upgrading samtools from
> 0.1.19 to 1.3.1
> 
> On 27 Jul 2016, at 15:34, Adam Witney <awit...@sgul.ac.uk> wrote:
> > I am upgrading an analysis from samtools version 0.1.19-44428cd to
> > version 1.3.1 (using htslib 1.3.1). I am trying to understand why the
> > pileup changes so much in this example
> >
> > samtools mpileup -s -f genomes/NC_000962.fna alignments/sample-1.bam | head
> > [mpileup] 1 samples in 1 input files
> > <mpileup> Set max per-file depth to 8000
> > NC_000962.3     1       T       13      ^],^],^],^],^],^].^].^].^].^].^].^],^], ;GF/GGGGGBCDH   ]]]]]]]]]]]]]
> > NC_000962.3     2       T       13      ,,,,,......,,   ?GFF?GDGGBGGA   ]]]]]]]]]]]]]
> > [...]
> > ./samtools-1.3.1/samtools mpileup -s -f genomes/NC_000962.fna alignments/sample-1.bam | head
> > [mpileup] 1 samples in 1 input files
> > <mpileup> Set max per-file depth to 8000
> > NC_000962.3     1       T       0
> > NC_000962.3     2       T       0
> > [...]
> > Is this a change in the default settings somewhere, such that the
> > reads are filtered out?
> 
> Does this difference persist throughout the genome, or just in these first few
> tens of positions in NC_000962.3 and other chromosomes?  This sort of
> difference in mpileup read filtering often comes down to BAQ calculations...
> If you rerun with samtools-0.1.19 -B and samtools-1.3.1 -B to disable BAQ
> recalculation, do they come back with more similar results?
> 
> If they do and if the differences you have noticed occur at the beginning of
> chromosomes but the two samtools versions produce more similar results
> further on, I suspect what you are seeing is this bug fix in samtools 1.3:
> 
> • The mpileup command now applies BAQ calculations at all base positions,
> regardless of which -l or -r options are used (previously with -l it was not
> applied to the first few tens of bases of each chromosome, leading to
> different mpileup results with -l vs. -r; #79, #125, #286, #407).


Yes, -B does improve it, although it still shows only 8 reads instead of 13:

./samtools-1.3.1/samtools mpileup -B -s -f genomes/NC_000962.fna alignments/sample-1.bam | head
[mpileup] 1 samples in 1 input files
<mpileup> Set max per-file depth to 8000
NC_000962.3     1       T       8       ^],^],^],^],^],^].^],^],        amlPiGDH        ]]]]]]]]
NC_000962.3     2       T       8       ,,,,,.,,        emlgeDGA        ]]]]]]]]

And you were right, it does seem less of a problem later in the file.
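
To check where along the genome the two versions disagree, the depth column of the two pileups can be compared position by position. The following is only a minimal sketch: the helper names parse_depths and depth_differences are made up for illustration, and it assumes the usual mpileup text layout (chromosome, position, reference base, depth, ...) with each record on a single line.

```python
def parse_depths(pileup_lines):
    """Map (chromosome, position) -> depth from mpileup text lines.

    Lines that do not look like pileup records (for example the
    "[mpileup] ..." status messages) are skipped.
    """
    depths = {}
    for line in pileup_lines:
        fields = line.split()
        if len(fields) < 4:
            continue
        if not (fields[1].isdigit() and fields[3].isdigit()):
            continue
        depths[(fields[0], int(fields[1]))] = int(fields[3])
    return depths


def depth_differences(old_lines, new_lines):
    """Return (chromosome, position, old_depth, new_depth) where depths differ.

    A position missing from one pileup is treated as depth 0.
    """
    old = parse_depths(old_lines)
    new = parse_depths(new_lines)
    diffs = []
    for key in sorted(set(old) | set(new)):
        old_depth = old.get(key, 0)
        new_depth = new.get(key, 0)
        if old_depth != new_depth:
            diffs.append((key[0], key[1], old_depth, new_depth))
    return diffs


# The first two positions from the outputs quoted above (without -B).
old_pileup = [
    "[mpileup] 1 samples in 1 input files",
    "NC_000962.3\t1\tT\t13\t^],^],^],^],^],^].^].^].^].^].^].^],^],\t;GF/GGGGGBCDH\t]]]]]]]]]]]]]",
    "NC_000962.3\t2\tT\t13\t,,,,,......,,\t?GFF?GDGGBGGA\t]]]]]]]]]]]]]",
]
new_pileup = [
    "NC_000962.3\t1\tT\t0",
    "NC_000962.3\t2\tT\t0",
]

for chrom, pos, old_depth, new_depth in depth_differences(old_pileup, new_pileup):
    print(chrom, pos, old_depth, new_depth)
```

Run over the full pileups rather than the first ten lines, this would show whether the disagreement is confined to the start of each chromosome.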

Incidentally, why are the base qualities different between the versions? Isn't 
this dependent on the base qualities in the FASTQ file?
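
To make the comparison concrete, the quality strings can be decoded to numbers. This is a small sketch that assumes the pileup quality columns use the standard Phred+33 ASCII encoding; decode_phred33 is a name chosen here for illustration, not a samtools function.

```python
def decode_phred33(quality_string):
    """Convert a Phred+33 ASCII quality string to a list of integer scores."""
    return [ord(char) - 33 for char in quality_string]


# Base-quality columns at position 1, copied from the outputs above.
old_quals = ";GF/GGGGGBCDH"   # samtools 0.1.19, without -B
new_quals = "amlPiGDH"        # samtools 1.3.1, with -B

print(decode_phred33(old_quals))
print(decode_phred33(new_quals))

# The mapping-quality column from -s uses the same encoding.
print(decode_phred33("]"))
```

Several of the 1.3.1 characters decode to values above 60 (for example "a" is 64 and "m" is 76), while the 0.1.19 values stay at 38 or below. That would be consistent with mpileup adjusting the qualities rather than passing the FASTQ values through unchanged, although nothing in this thread so far confirms the cause.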

Thanks again

Adam

------------------------------------------------------------------------------
_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help
