Hi,

I am trying to get an idea of number of mapped reads and average coverage of my 
genome. I have used samtools flagstat, but the number of mapped reads is 
greater than the number of reads in my fastq file. For what I understand, this 
is the number of mapped positions, thus it takes into consideration multiple 
alignments.

I could proceed in this way:


Step 1)

samtools view -F 0x100 myfile.sorted.bam > myfile.sorted.unique.bam


In this way, only what is considered primary alignment would be retained

Step 2)

samtools flagstat myfile.sorted.unique.bam > myfile.stats


And then, look at (reads with itself and mate mapped)- ( reads with mate mapped 
to a different chr) =

number of mapped reads. From here I could also do:

number of mapped reads * read length / genome length

to get an idea of average coverage.


My two questions:

1) Is this strategy correct?

2) if in step 1) I want to get rid of flag 0x100 and 0x400 (duplicates which I 
marked with Picard), how can I combine the two flags?


Thanks for your help,

Max
------------------------------------------------------------------------------
Find and fix application performance issues faster with Applications Manager
Applications Manager provides deep performance insights into multiple tiers of
your business applications. It resolves application problems quickly and
reduces your MTTR. Get your free trial!
https://ad.doubleclick.net/ddm/clk/302982198;130105516;z
_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help

Reply via email to