The Picard library (Java-based) is very useful for this type of
thing.  It can be done in R, but the Picard folks have put a lot of
thought into how to find and mark duplicates, including optical
duplicates.  That matters most if you have paired-end data.
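
If you just want a rough count before reaching for a full tool, the
basic idea for single-end reads can be sketched in a few lines. This is
only a minimal illustration in Python, not Picard's algorithm: it
assumes you have already extracted each alignment as a hypothetical
(chrom, start, end, strand) tuple with 1-based inclusive coordinates,
and it treats reads sharing the same chromosome, strand, and 5' position
as duplicates, regardless of mismatches. It ignores optical duplicates,
base qualities, and clipping.

```python
from collections import Counter


def count_duplicates(alignments):
    """Count position-based duplicates among single-end alignments.

    alignments: iterable of (chrom, start, end, strand) tuples,
    1-based inclusive coordinates (an assumed input format).
    """
    sites = Counter()
    for chrom, start, end, strand in alignments:
        # The 5' end of a reverse-strand read is its rightmost base.
        five_prime = start if strand == "+" else end
        sites[(chrom, strand, five_prime)] += 1

    total = sum(sites.values())
    unique_sites = len(sites)
    # Every read beyond the first at a site is counted as a duplicate.
    return {
        "total": total,
        "unique_sites": unique_sites,
        "duplicates": total - unique_sites,
    }


# Toy data, made up for illustration only.
reads = [
    ("chr1", 100, 149, "+"),
    ("chr1", 100, 149, "+"),
    ("chr1", 100, 147, "+"),  # same 5' start, shorter alignment
    ("chr1", 101, 150, "-"),
    ("chr1", 103, 150, "-"),  # same 5' end on the reverse strand
    ("chr2", 100, 149, "+"),
]
summary = count_duplicates(reads)
print(summary)
```

Keep in mind that with RNA-seq, highly expressed genes will produce
many reads at the same position for reasons other than PCR, so a count
like this is an upper bound on true duplicates rather than an estimate.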

Sean


On Thu, Aug 11, 2011 at 12:50 PM, Kunbin Qu <k...@genomichealth.com> wrote:
> Hi, I have some human single-end RNA-seq runs from a HiSeq. Can I have some
> suggestions on how to assess how many reads in these libraries are
> duplicates? I looked at srFilter() in ShortRead, but I am not clear on how
> to implement it. Should I use IRanges as an alternative, to assess the
> unique start sites after mapping? If so, what function do you suggest?
> I'd like to count reads that map to the same location (even with some
> mismatches) as duplicates. Thanks.
>
> -Kunbin
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> Bioc-sig-sequencing@r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>
