Dear support team,
I have a 10x Genomics data set (paired-end). After alignment, I used
samtools to collect statistics on the BAM file.
The statistics are as follows:
Total reads: 622018758
Paired in sequencing: 311009379
Mapped: 596403847
Unmapped: 25614911
Duplicates: 111855687
Properly paired: 415286153
Non-primary alignments: 44200142
Reads with MQ0: 22973571
By my calculation, this means that
622018758 - 25614911 - 22973571 - 44200142 - 111855687 = 417374447
reads in this sample are unique.
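For reference, the subtraction can be reproduced with shell arithmetic. This is only a sketch of the calculation as I wrote it; the variable names are mine, and it assumes the four subtracted categories do not overlap, which may not be true (for example, a duplicate read could also have MQ0).

```shell
# Values copied from the samtools statistics listed above.
total=622018758
unmapped=25614911
mq0=22973571
non_primary=44200142
duplicates=111855687

# Assumption: the categories are disjoint, so each read is subtracted once.
expected=$(( total - unmapped - mq0 - non_primary - duplicates ))
echo "$expected"   # prints 417374447
```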
I am using the following command to extract the uniquely aligned reads:
samtools view -q 1 -F 4 -F 256 -h $input_bam \
    | grep -v -e 'XA:Z:' -e 'SA:Z:' \
    | samtools view -b > $path/output_filtered.bam
The result is 412805249 unique reads, which does not match the
calculation above.
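To check my own understanding of what this filter keeps, here is a minimal sketch of the same tests applied to a single record. The function name passes_filter and the example records are hypothetical, and it only looks at the FLAG, MAPQ, and tag fields (grep matches anywhere on the line). As far as I can tell, the filter never tests the duplicate flag (0x400), while my calculation subtracts duplicates, so the two numbers may not be counting the same thing.

```shell
# Hypothetical helper: succeeds (exit status 0) if a record would survive
# the filter. Arguments: FLAG, MAPQ, and the optional-tag string of one record.
passes_filter() {
    [ "$2" -ge 1 ] || return 1                # -q 1: drop MAPQ 0
    [ $(( $1 & 4 )) -eq 0 ] || return 1       # -F 4: drop unmapped
    [ $(( $1 & 256 )) -eq 0 ] || return 1     # -F 256: drop secondary
    case "$3" in
        *XA:Z:*|*SA:Z:*) return 1 ;;          # grep -v: drop XA/SA-tagged reads
    esac
    return 0
}

# Made-up example records (FLAG 1123 = 99 + 1024, i.e. flagged as duplicate).
passes_filter 1123 60 "NM:i:0" && echo "duplicate-flagged read: kept"
passes_filter 99 0 "NM:i:0" || echo "MAPQ 0 read: dropped"
passes_filter 355 60 "NM:i:0" || echo "secondary alignment: dropped"
passes_filter 99 60 "XA:Z:chr1,+100,100M,0;" || echo "XA-tagged read: dropped"
```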
I also tried sambamba with the following command:
sambamba view -t 12 -h -f bam \
    -F "mapping_quality >= 1 and not (unmapped or secondary_alignment) and not ([XA] != null or [SA] != null)" \
    $input2 -o $path/Felsina-uniq.bam
Result: 412805249 unique reads, the same count as with samtools.
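Since both tools report the same count, the gap to my calculation is the same in both cases. A quick sketch of the difference (the variable names are mine):

```shell
calculated=417374447   # from my subtraction of the samtools statistics
observed=412805249     # reads remaining after filtering

gap=$(( calculated - observed ))
echo "$gap"   # prints 4569198
```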
I used Picard to remove duplicates. Is there any difference between
duplicate detection in samtools and in Picard?
Looking forward to hearing from you soon.
Best wishes,
Nadia
_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help