Hi Vincent and Steve (and now Patrick),

Thanks for directing me to the S4 methods functions, and teaching me  
something about Rle objects ... Specifically, because  (strand(gr))   
is a S4 Rle objects, we have:

R> is.logical(elementMetadata(gr)$score >6 & strand(gr)=="+")
[1]  FALSE


If my previous example requires an explicit coercion, this suggests to  
me that a *mojo* enhanced subset() function would be even more useful  
than I initially thought.

R> subset(gr, score>6 & strand == '+')
   ... instead of ...
R> subset(gr, elementMetadata(gr)$score >6 &  
as.logical(strand(gr)=="+"))


On the other hand, any changes would be non-trivial in comparison to  
the current implementation ;-)

Best,
- Stu


Stuart Andrews, Ph.D.
Postdoctoral Associate
Institute for Computational Biomedicine
Weill Cornell Medical College, New York, NY


On Jun 5, 2010, at 12:58 AM, Vincent Carey wrote:


> the sessionInfo() was
>
> > sessionInfo()
> R version 2.12.0 Under development (unstable) (2010-05-03 r51901)
> x86_64-apple-darwin10.3.0
>
> locale:
> [1] C
>
> attached base packages:
> [1] stats     graphics  grDevices datasets  tools     utils      
> methods
> [8] base
>
> other attached packages:
> [1] GenomicRanges_1.1.12 IRanges_1.7.6        weaver_1.15.0
> [4] codetools_0.2-2      digest_0.4.2
>
>
> On Sat, Jun 5, 2010 at 7:57 AM, Vincent Carey <[email protected] 
> > wrote:
> It seems better to discuss this with a concrete example -- we have  
> one in the GRanges man page.  Thus
> if we do
>
> library(GenomicRanges)
> example(GRanges)
>
> we get
>
> > gr
> GRanges with 10 ranges and 2 elementMetadata values
>   seqnames    ranges strand |     score        GC
>      <Rle> <IRanges>  <Rle> | <integer> <numeric>
> a   Chrom1  [ 1, 10]      - |         1 1.0000000
> b   Chrom2  [ 2, 10]      + |         2 0.8888889
> c   Chrom2  [ 3, 10]      + |         3 0.7777778
> d   Chrom2  [ 4, 10]      * |         4 0.6666667
> e   Chrom1  [ 5, 10]      * |         5 0.5555556
> f   Chrom1  [ 6, 10]      + |         6 0.4444444
> g   Chrom3  [ 7, 10]      + |         7 0.3333333
> h   Chrom3  [ 8, 10]      + |         8 0.2222222
> i   Chrom3  [ 9, 10]      - |         9 0.1111111
> j   Chrom3  [10, 10]      - |        10 0.0000000
>
> seqlengths
>  Chrom1 Chrom2 Chrom3
>      NA     NA     NA
>
> now I find that Stuart's language does not quite work (at least for  
> the sessionInfo value given below)
>
> > subset(gr, elementMetadata(gr)$score >6 & strand(gr)=="+")
> Error in .local(x, ...) : 'subset' must be logical
> > elementMetadata(gr)$score >6 & strand(gr)=="+"
> 'logical' Rle of length 10 with 3 runs
>   Lengths:     6     2     2
>   Values : FALSE  TRUE FALSE
> > as.logical(.Last.value)
>  [1] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE
>
> so a coercion of the predicate above would be needed.  On the other  
> hand, he mentions the
> square bracket for subsetting:
>
> > gr[ strand(gr)=="+" & elementMetadata(gr)$score>6,]  # you can  
> omit the comma, or use it
> GRanges with 2 ranges and 2 elementMetadata values
>   seqnames    ranges strand |     score        GC
>      <Rle> <IRanges>  <Rle> | <integer> <numeric>
> g   Chrom3   [7, 10]      + |         7 0.3333333
> h   Chrom3   [8, 10]      + |         8 0.2222222
>
> which works fine and seems quite natural; the simplification that  
> Steve seems to be asking for would
> allow implicit references to elementMetadata variables in the  
> predicate.  I am not in favor of such
> an extension of semantics of bracket.
>
> In response to the question about documentation, note that  
> showMethods("[") works, and in particular,
> method?"[,GRanges" returns a good man page
>
>
>
>
>
> On Sat, Jun 5, 2010 at 7:32 AM, Stuart Andrews  
> <[email protected]> wrote:
>
> Hi,
>
> To answer Steve's question first question ... yes.  People (n = 1)  
> are indeed subsetting GRanges objects.   In my case, I used  
> GRanges::subset() like this:
>
> R> subset(tags, elementMetadata(tags)$genome.hits < 5 & strand(tag)  
> == '+')
>
> My question is, how does one discover that the singe-index square  
> bracket is implemented like this for GRanges?   I didn't see any  
> mention of this in GenomicRanges.pdf of "?GRanges" .
>
> Are there other methods that do not appear in the documentation, and  
> how can I learn about the existence of undocumented method in the  
> future?
>
> Thx,
> - Stu
>
> Stuart Andrews, Ph.D.
> Postdoctoral Associate
> Institute for Computational Biomedicine
> Weill Cornell Medical College, New York, NY
>
>
>
>
> On Jun 4, 2010, at 9:18 PM, Michael Lawrence wrote:
>
> On Fri, Jun 4, 2010 at 2:27 PM, Steve Lianoglou <
> [email protected]> wrote:
>
> Hi,
>
> Random question I thought I'd shoot out there ...
>
> I'm finding myself wanting to slice and dice IRanges-like objects (I'm
> playing with GRanges right now) based on some column of their
> elementMetadata.
>
>
> I guess if this makes sense then it would make sense to support this  
> for all
> Sequence derivatives. It works already for RangedData, btw.
>
> Michael
>
>
> Are other people finding that they want to do this, too?
> Would it make sense to add some subset-mojo to do that?
>
> Here's a motivating example:
>
> Say I have a GRanges object (`tags`), that looks something like:
>
> GRanges with 2217486 ranges and 8 elementMetadata values
>  seqnames           ranges strand   |    tag.id genome.hits gene.hits
>     <Rle>        <IRanges>  <Rle>   | <integer>   <integer> <integer>
> [1]     chr1   [ 4850,  4866]      -   |    405384           
> 10         3
> [2]     chr1   [ 7804,  7820]      -   |    405387            
> 6         4
> [3]     chr1   [13162, 13178]      -   |    405397            
> 5         4
> [4]     chr1   [16712, 16728]      +   |        35       12164       
> 2475
> [5]     chr1   [21381, 21397]      +   |        45          
> 497        79
> [6]     chr1   [21479, 21495]      -   |      1466        3823        
> 957
>
> And say that I want all "tags" with < 5 genome.hits on the "+" strand.
> I'd like to:
>
> R> subset(tags, genome.hits < 5 & strand == '+')
>
> To do the same as:
>
> R> tags[elementMetadata(tags)$genome.hits < 5 & strand(tag) == '+']
>
> I realize that using `genome.hits` (from the elementMetadata) and
> `strand` (not in the metadata) is crossing some boundaries, but I just
> wanted to point out one of the more "complex" cases.
>
> Just curious,
> -steve
>
> --
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
> | Memorial Sloan-Kettering Cancer Center
> | Weill Medical College of Cornell University
> Contact Info: 
> http://cbio.mskcc.org/~lianos/contact<http://cbio.mskcc.org/%7Elianos/contact 
> >
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> [email protected]
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>
>
>        [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> [email protected]
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> [email protected]
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>
>


        [[alternative HTML version deleted]]

_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing

Reply via email to