Dear support team,

I have a 10x Genomics data set (paired-end). After alignment, I used samtools to get the alignment statistics.

The statistics are as follows:

Total reads: 622018758

Paired in sequencing: 311009379

Mapped: 596403847

Unmapped: 25614911

Duplicates: 111855687

Properly paired: 415286153

Non-primary alignments: 44200142

Reads with MQ0: 22973571

From these numbers I calculate that 622018758 - 25614911 - 22973571 - 44200142 - 111855687 = 417374447 reads in this sample are unique.
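The subtraction can be reproduced in the shell as a quick check. This is only a sketch: the variable names are illustrative, and the calculation assumes that the categories do not overlap (for example, that no read is counted both as a duplicate and as MQ0), which may not hold for real data.

```shell
#!/bin/sh
# Counts copied from the samtools statistics above.
total=622018758
unmapped=25614911
mq0=22973571
non_primary=44200142
duplicates=111855687

# Subtract each category from the total.
# Assumption: the categories are disjoint; overlapping reads would be
# subtracted more than once and make this estimate too low.
expected_unique=$(( total - unmapped - mq0 - non_primary - duplicates ))
echo "expected unique reads: $expected_unique"
```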

I am using the following command to get the uniquely aligned reads:

samtools view -q 1 -F 4 -F 256 -h $input_bam | grep -v -e 'XA:Z:' -e 'SA:Z:' | samtools view -b > $path/output_filtered.bam

The result is 412805249 unique reads, which does not match the calculation above.
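To make the mismatch concrete, the gap between the two numbers and the flag mask that the two -F options are meant to express can be worked out in the shell. This is a sketch with illustrative variable names; the values 4, 256 and 1024 are the SAM flag bits for unmapped, secondary (not primary) and duplicate reads.

```shell
#!/bin/sh
# Numbers taken from the calculation and from the filter result above.
calculated=417374447
filtered=412805249
gap=$(( calculated - filtered ))
echo "calculated minus filtered: $gap"

# SAM flag bits used in this message.
unmapped_bit=4
secondary_bit=256
duplicate_bit=1024

# Excluding both unmapped and secondary reads corresponds to one combined mask.
exclude_mask=$(( unmapped_bit | secondary_bit ))
echo "combined exclusion mask: $exclude_mask"

# Check whether the duplicate bit is part of that mask.
if [ $(( exclude_mask & duplicate_bit )) -eq 0 ]; then
    echo "duplicate bit ($duplicate_bit) is not in the mask"
fi
```

The filter as written does not test the duplicate flag, while the manual calculation subtracts duplicates, so the two numbers are not counting exactly the same thing.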

I also tried sambamba with the following command:

sambamba view -t 12 -h -f bam -F "mapping_quality >= 1 and not (unmapped or secondary_alignment) and not ([XA] != null or [SA] != null)" $input2 -o $path/Felsina-uniq.bam

The result is again 412805249 unique reads.

I used Picard to remove duplicates. Is there any difference between duplicate detection in samtools and in Picard?


Looking forward to hearing from you soon.


Best wishes,

Nadia

_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help
