Hi,

On Wed, Apr 28, 2010 at 3:12 PM, Pratap, Abhishek
<[email protected]> wrote:
> Hi Guys
>
> I posted the same question on SEQanswers a couple of days ago but didn't get
> a response. Maybe you guys can educate me on this.
>
> I am trying to calculate RPKM from TopHat output but have come across an
> issue that I believe could skew my results.
>
> I gave TopHat ~49 million input reads, and it reports ~55 million mapped
> reads. I assume the mapped count exceeds the input count because of the
> "--max-multihits 15" option I set, which the manual describes as:
> "Instructs TopHat to allow up to this many alignments to the reference for a
> given read, and suppresses all alignments for reads with more than this many
> alignments."
>
> Now, for the RPKM calculation, I am not sure which number I should use for
> total mapped reads:
>
> 1. Total reads mapped by Tophat including multireads
> 2. Total uniquely mapped reads
>
> If I go with #2, then I think I should also remove all multireads when
> counting the reads that map to my genes, which could eliminate the RPKM
> values for paralogous genes.
>
>
> What do you think is my best bet for getting #total_mapped_reads?

It sounds like what you propose is reasonable either way, and yes,
if you go with #2, I would remove multireads when counting for RPKM.

Also, if you go with #2, you might want to ensure that your K (the
per-kilobase length term) is calculated from the number of uniquely
mappable positions in your gene model, so that the numerator and the
denominator are both restricted to unique alignments.
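
To make the two options concrete, here is a minimal Python sketch of the
usual RPKM formula (reads * 1e9 / (length in bp * total mapped reads)).
The function name and every number below are made up for illustration;
in particular, the unique-read total and the mappable length are
assumptions, not values from your data.

```python
def rpkm(read_count, length_bp, total_mapped_reads):
    """Reads per kilobase of gene model per million mapped reads."""
    if length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    # Equivalent to (reads / (length_bp / 1e3)) / (total_mapped_reads / 1e6).
    return read_count * 1e9 / (length_bp * total_mapped_reads)


# Hypothetical counts for a single gene, for illustration only.
# Option 1: multireads included, full gene-model length as K.
rpkm_all = rpkm(read_count=1200, length_bp=2000, total_mapped_reads=55_000_000)

# Option 2: unique reads only, uniquely mappable length as K.
rpkm_unique = rpkm(read_count=900, length_bp=1800, total_mapped_reads=40_000_000)

print(f"option 1: {rpkm_all:.2f}  option 2: {rpkm_unique:.2f}")
```

The point is only that all three inputs change together under #2: the
per-gene count, the length term, and the total.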

Why don't you try calculating RPKM using both 1 and 2, then plot the
expression of gene x from #1 vs. its expression from #2. I suspect the
plot you get will be pretty close to the diagonal, but you never know
unless you try.
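
If you would rather get a list of the genes that fall off the diagonal
than eyeball a plot, something like the following Python sketch could
work. It assumes you already have two gene-to-RPKM dictionaries; the
helper names, the pseudocount, the two-fold threshold, and the example
values are all hypothetical choices of mine.

```python
import math


def log2_ratios(rpkm_a, rpkm_b, pseudocount=1.0):
    """log2((a + pc) / (b + pc)) for every gene present in both dicts."""
    shared = rpkm_a.keys() & rpkm_b.keys()
    return {
        gene: math.log2((rpkm_a[gene] + pseudocount) / (rpkm_b[gene] + pseudocount))
        for gene in sorted(shared)
    }


def off_diagonal(ratios, threshold=1.0):
    """Genes whose two estimates differ by at least 2**threshold fold."""
    return sorted(gene for gene, r in ratios.items() if abs(r) >= threshold)


# Made-up values: one ordinary gene and one paralog that loses most of
# its reads once multireads are removed.
with_multireads = {"geneA": 10.9, "paralog1": 30.0}
unique_only = {"geneA": 12.5, "paralog1": 2.0}

ratios = log2_ratios(with_multireads, unique_only)
print(off_diagonal(ratios))
```

The pseudocount is there only to keep genes with zero RPKM in one set
from producing a division by zero or an infinite ratio.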

Let us know :-)

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
