Hi Nadia,

samtools stats counts duplicates using the duplicate flag (0x400). If no other
filtering is applied when running stats, the count includes duplicates among
supplementary and even unmapped reads. So your calculation is probably
subtracting some of the same reads twice.
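
To illustrate the overlap, here is a minimal sketch in shell arithmetic. The
six FLAG values are made up for the example (they are not from your data);
only the bit values 4 (unmapped) and 1024 (duplicate) come from the SAM
specification.

```shell
# Hypothetical FLAG values for six reads (illustration only).
# 1028 = unmapped + duplicate, 3072 = supplementary + duplicate.
flags="0 1024 4 1028 2048 3072"

UNMAPPED=4      # SAM flag bit 0x4
DUPLICATE=1024  # SAM flag bit 0x400

total=0; unmapped=0; dup=0; either=0
for f in $flags; do
  total=$((total + 1))
  if [ $((f & UNMAPPED)) -ne 0 ]; then unmapped=$((unmapped + 1)); fi
  if [ $((f & DUPLICATE)) -ne 0 ]; then dup=$((dup + 1)); fi
  # Count a read once if it carries either flag.
  if [ $((f & (UNMAPPED | DUPLICATE))) -ne 0 ]; then either=$((either + 1)); fi
done

# Subtracting the two category totals removes the unmapped duplicate twice.
naive=$((total - unmapped - dup))
# Subtracting the union removes each read at most once.
correct=$((total - either))

echo "naive=$naive correct=$correct"   # prints: naive=1 correct=2
```

With real data, one way to avoid the overlap might be to let samtools apply
all the exclusions in a single pass, for example
samtools view -c -q 1 -F 0xD04 in.bam (in.bam is a placeholder), where 0xD04
combines unmapped (0x4), secondary (0x100), duplicate (0x400) and
supplementary (0x800), so each read is excluded at most once.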

Andrew
________________________________________
From: Nadia Baig <ba...@hhu.de>
Sent: 16 July 2020 11:57:14
To: Andrew Whitwham; samtools-help@lists.sourceforge.net
Subject: Re: [Samtools-help] samtools stats [EXT]

Dear Andrew,

I used picard.

On 16.07.20 12:41, Andrew Whitwham wrote:
> Hello Nadia,
>
> What did you use to mark the duplicates in the first place?
>
> Regards,
>
> Andrew
>
> ________________________________________
> From: Nadia Baig <ba...@hhu.de>
> Sent: 16 July 2020 10:22:07
> To: samtools-help@lists.sourceforge.net
> Subject: [Samtools-help] samtools stats [EXT]
>
> Dear support team,
>
> I have a 10x Genomics data set (paired-end). After alignment I used
> samtools to get stats.
>
> The statistics are as follows:
>
> Total reads: 622018758
> Paired in sequencing: 311009379
> Mapped: 596403847
> Unmapped: 25614911
> Duplicates: 111855687
> Properly paired: 415286153
> Non-primary alignments: 44200142
> Reads MQ0: 22973571
>
> This means that
> 622018758 - 25614911 - 22973571 - 44200142 - 111855687 = 417374447
> reads are unique in this sample.
>
> I am using the following command to get the uniquely aligned reads:
>
> samtools view -q 1 -F 4 -F 256 -h $input_bam | grep -v -e 'XA:Z:' -e
> 'SA:Z:' | samtools view -b > $path/output_filtered.bam
>
> The result is 412805249 unique reads, which does not match the above
> calculation.
>
> I also used the sambamba tool, with the following command:
>
> sambamba view -t 12 -h -f bam -F "mapping_quality >= 1 and not (unmapped
> or secondary_alignment) and not ([XA] != null or [SA] != null)" $input2
> -o $path/Felsina-uniq.bam
>
> Result: 412805249 unique reads
>
> I used Picard to remove duplicates. Is there any difference between
> duplicate detection in samtools and in Picard?
>
>
> Looking forward to hearing from you soon.
>
>
> Best wishes,
>
> Nadia
>
> _______________________________________________
> Samtools-help mailing list
> Samtools-help@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/samtools-help
>
>



