Hi Vincent and Steve (and now Patrick), Thanks for directing me to the S4 methods functions, and teaching me something about Rle objects ... Specifically, because (strand(gr)) is a S4 Rle objects, we have:
R> is.logical(elementMetadata(gr)$score >6 & strand(gr)=="+") [1] FALSE If my previous example requires an explicit coercion, this suggests to me that a *mojo* enhanced subset() function would be even more useful than I initially thought. R> subset(gr, score>6 & strand == '+') ... instead of ... R> subset(gr, elementMetadata(gr)$score >6 & as.logical(strand(gr)=="+")) On the other hand, any changes would be non-trivial in comparison to the current implementation ;-) Best, - Stu Stuart Andrews, Ph.D. Postdoctoral Associate Institute for Computational Biomedicine Weill Cornell Medical College, New York, NY On Jun 5, 2010, at 12:58 AM, Vincent Carey wrote: > the sessionInfo() was > > > sessionInfo() > R version 2.12.0 Under development (unstable) (2010-05-03 r51901) > x86_64-apple-darwin10.3.0 > > locale: > [1] C > > attached base packages: > [1] stats graphics grDevices datasets tools utils > methods > [8] base > > other attached packages: > [1] GenomicRanges_1.1.12 IRanges_1.7.6 weaver_1.15.0 > [4] codetools_0.2-2 digest_0.4.2 > > > On Sat, Jun 5, 2010 at 7:57 AM, Vincent Carey <[email protected] > > wrote: > It seems better to discuss this with a concrete example -- we have > one in the GRanges man page. Thus > if we do > > library(GenomicRanges) > example(GRanges) > > we get > > > gr > GRanges with 10 ranges and 2 elementMetadata values > seqnames ranges strand | score GC > <Rle> <IRanges> <Rle> | <integer> <numeric> > a Chrom1 [ 1, 10] - | 1 1.0000000 > b Chrom2 [ 2, 10] + | 2 0.8888889 > c Chrom2 [ 3, 10] + | 3 0.7777778 > d Chrom2 [ 4, 10] * | 4 0.6666667 > e Chrom1 [ 5, 10] * | 5 0.5555556 > f Chrom1 [ 6, 10] + | 6 0.4444444 > g Chrom3 [ 7, 10] + | 7 0.3333333 > h Chrom3 [ 8, 10] + | 8 0.2222222 > i Chrom3 [ 9, 10] - | 9 0.1111111 > j Chrom3 [10, 10] - | 10 0.0000000 > > seqlengths > Chrom1 Chrom2 Chrom3 > NA NA NA > > now I find that Stuart's language does not quite work (at least for > the sessionInfo value given below) > > > subset(gr, elementMetadata(gr)$score >6 & strand(gr)=="+") > Error in .local(x, ...) : 'subset' must be logical > > elementMetadata(gr)$score >6 & strand(gr)=="+" > 'logical' Rle of length 10 with 3 runs > Lengths: 6 2 2 > Values : FALSE TRUE FALSE > > as.logical(.Last.value) > [1] FALSE FALSE FALSE FALSE FALSE FALSE TRUE TRUE FALSE FALSE > > so a coercion of the predicate above would be needed. On the other > hand, he mentions the > square bracket for subsetting: > > > gr[ strand(gr)=="+" & elementMetadata(gr)$score>6,] # you can > omit the comma, or use it > GRanges with 2 ranges and 2 elementMetadata values > seqnames ranges strand | score GC > <Rle> <IRanges> <Rle> | <integer> <numeric> > g Chrom3 [7, 10] + | 7 0.3333333 > h Chrom3 [8, 10] + | 8 0.2222222 > > which works fine and seems quite natural; the simplification that > Steve seems to be asking for would > allow implicit references to elementMetadata variables in the > predicate. I am not in favor of such > an extension of semantics of bracket. > > In response to the question about documentation, note that > showMethods("[") works, and in particular, > method?"[,GRanges" returns a good man page > > > > > > On Sat, Jun 5, 2010 at 7:32 AM, Stuart Andrews > <[email protected]> wrote: > > Hi, > > To answer Steve's question first question ... yes. People (n = 1) > are indeed subsetting GRanges objects. In my case, I used > GRanges::subset() like this: > > R> subset(tags, elementMetadata(tags)$genome.hits < 5 & strand(tag) > == '+') > > My question is, how does one discover that the singe-index square > bracket is implemented like this for GRanges? I didn't see any > mention of this in GenomicRanges.pdf of "?GRanges" . > > Are there other methods that do not appear in the documentation, and > how can I learn about the existence of undocumented method in the > future? > > Thx, > - Stu > > Stuart Andrews, Ph.D. > Postdoctoral Associate > Institute for Computational Biomedicine > Weill Cornell Medical College, New York, NY > > > > > On Jun 4, 2010, at 9:18 PM, Michael Lawrence wrote: > > On Fri, Jun 4, 2010 at 2:27 PM, Steve Lianoglou < > [email protected]> wrote: > > Hi, > > Random question I thought I'd shoot out there ... > > I'm finding myself wanting to slice and dice IRanges-like objects (I'm > playing with GRanges right now) based on some column of their > elementMetadata. > > > I guess if this makes sense then it would make sense to support this > for all > Sequence derivatives. It works already for RangedData, btw. > > Michael > > > Are other people finding that they want to do this, too? > Would it make sense to add some subset-mojo to do that? > > Here's a motivating example: > > Say I have a GRanges object (`tags`), that looks something like: > > GRanges with 2217486 ranges and 8 elementMetadata values > seqnames ranges strand | tag.id genome.hits gene.hits > <Rle> <IRanges> <Rle> | <integer> <integer> <integer> > [1] chr1 [ 4850, 4866] - | 405384 > 10 3 > [2] chr1 [ 7804, 7820] - | 405387 > 6 4 > [3] chr1 [13162, 13178] - | 405397 > 5 4 > [4] chr1 [16712, 16728] + | 35 12164 > 2475 > [5] chr1 [21381, 21397] + | 45 > 497 79 > [6] chr1 [21479, 21495] - | 1466 3823 > 957 > > And say that I want all "tags" with < 5 genome.hits on the "+" strand. > I'd like to: > > R> subset(tags, genome.hits < 5 & strand == '+') > > To do the same as: > > R> tags[elementMetadata(tags)$genome.hits < 5 & strand(tag) == '+'] > > I realize that using `genome.hits` (from the elementMetadata) and > `strand` (not in the metadata) is crossing some boundaries, but I just > wanted to point out one of the more "complex" cases. > > Just curious, > -steve > > -- > Steve Lianoglou > Graduate Student: Computational Systems Biology > | Memorial Sloan-Kettering Cancer Center > | Weill Medical College of Cornell University > Contact Info: > http://cbio.mskcc.org/~lianos/contact<http://cbio.mskcc.org/%7Elianos/contact > > > > _______________________________________________ > Bioc-sig-sequencing mailing list > [email protected] > https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing > > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioc-sig-sequencing mailing list > [email protected] > https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing > > _______________________________________________ > Bioc-sig-sequencing mailing list > [email protected] > https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing > > [[alternative HTML version deleted]] _______________________________________________ Bioc-sig-sequencing mailing list [email protected] https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
