Hi,
I am trying to get an idea of number of mapped reads and average coverage of my
genome. I have used samtools flagstat, but the number of mapped reads is
greater than the number of reads in my fastq file. For what I understand, this
is the number of mapped positions, thus it takes into consideration multiple
alignments.
I could proceed in this way:
Step 1)
samtools view -F 0x100 myfile.sorted.bam > myfile.sorted.unique.bam
In this way, only what is considered primary alignment would be retained
Step 2)
samtools flagstat myfile.sorted.unique.bam > myfile.stats
And then, look at (reads with itself and mate mapped)- ( reads with mate mapped
to a different chr) =
number of mapped reads. From here I could also do:
number of mapped reads * read length / genome length
to get an idea of average coverage.
My two questions:
1) Is this strategy correct?
2) if in step 1) I want to get rid of flag 0x100 and 0x400 (duplicates which I
marked with Picard), how can I combine the two flags?
Thanks for your help,
Max
------------------------------------------------------------------------------
Find and fix application performance issues faster with Applications Manager
Applications Manager provides deep performance insights into multiple tiers of
your business applications. It resolves application problems quickly and
reduces your MTTR. Get your free trial!
https://ad.doubleclick.net/ddm/clk/302982198;130105516;z
_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help