On Wed, Nov 3, 2010 at 11:14 PM, Kunbin Qu <[email protected]> wrote:
> Dear all,
>
> I have a mapping file from Bowtie from a RNA-seq run against human genome.
> I created RangedDataList to represent the mapping coordinates from different
> chromosomes, strands and lanes. Now I would like to eliminate the RangedData
> entries which have the same IRanges start and end, chromosome number and
> strand orientation.
>
> In the following example, entry 1, 3 and 4 have the same chromosome,
> strand, start and end, and after the procedure, they should be reduced to
> one entry. Is there a function I can use? Or is there some other better ways
> to represent the mapping info which include chromosome, strand, start and
> end, rather than RangedData? Thanks.
>

I would say GRanges is better, but it does not have a unique() function.
Are you aware that ShortRead performs this sort of filtering on AlignedRead,
even during input? See help(occurrenceFilter).

> -Kunbin
>
> > head(sLane[["s_1"]][3])
> RangedData with 6 rows and 2 value columns across 1 space
>         space                 ranges |   strand     index
>   <character>              <IRanges> | <factor> <integer>
> 1        chr1 [223780005, 223780055] |        +         6
> 2        chr1 [ 89018675,  89018725] |        -        55
> 3        chr1 [223780005, 223780055] |        +        68
> 4        chr1 [223780005, 223780055] |        +        69
> 5        chr1 [107921032, 107921082] |        -        75
> 6        chr1 [243086472, 243086522] |        -        86
> > class(sLane[["s_1"]][3])
> [1] "RangedData"
> attr(,"package")
> [1] "IRanges"
> > class(sLane[["s_1"]])
> [1] "RangedData"
> attr(,"package")
> [1] "IRanges"
> > class(sLane)
> [1] "RangedDataList"
> attr(,"package")
> [1] "IRanges"
> > sessionInfo()
> R version 2.11.0 (2010-04-22)
> x86_64-unknown-linux-gnu
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>  [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] ShortRead_1.6.2     Rsamtools_1.0.1     lattice_0.19-11
> [4] Biostrings_2.16.7   GenomicRanges_1.0.1 IRanges_1.6.8
>
> loaded via a namespace (and not attached):
> [1] Biobase_2.8.0 grid_2.11.0   hwriter_1.2   tools_2.11.0
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> [email protected]
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
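
If staying with RangedData for now, the de-duplication itself can be sketched in base R, independent of ShortRead and occurrenceFilter. This is only a rough sketch under stated assumptions: the data frame below is typed in by hand from the six rows quoted above, the names `reads`, `key` and `uniqueReads` are made up for illustration, and it is assumed (not verified here) that as.data.frame() on the real RangedData object yields equivalent space, start, end and strand columns.

```r
# Hypothetical data frame mirroring the six rows in the quoted output.
# With the real object, as.data.frame(sLane[["s_1"]]) is ASSUMED to give
# equivalent columns; that conversion is not exercised in this sketch.
reads <- data.frame(
  space  = rep("chr1", 6),
  start  = c(223780005, 89018675, 223780005, 223780005, 107921032, 243086472),
  end    = c(223780055, 89018725, 223780055, 223780055, 107921082, 243086522),
  strand = c("+", "-", "+", "+", "-", "-"),
  index  = c(6L, 55L, 68L, 69L, 75L, 86L),
  stringsAsFactors = FALSE
)

# Columns that together define "the same alignment".
key <- c("space", "start", "end", "strand")

# duplicated() flags every repeat after the first occurrence of a key,
# so negating it keeps exactly one row per distinct alignment.
uniqueReads <- reads[!duplicated(reads[, key]), ]
print(uniqueReads)
```

On this toy input, entries 1, 3 and 4 collapse to entry 1, leaving four rows. The first occurrence is the one that is kept, so the index values of the dropped rows (68 and 69) are lost; if they matter, they would have to be collected before filtering.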
