It seems better to discuss this with a concrete example -- we have one in
the GRanges man page.  Thus
if we do

library(GenomicRanges)
example(GRanges)

we get

> gr
GRanges with 10 ranges and 2 elementMetadata values
  seqnames    ranges strand |     score        GC
     <Rle> <IRanges>  <Rle> | <integer> <numeric>
a   Chrom1  [ 1, 10]      - |         1 1.0000000
b   Chrom2  [ 2, 10]      + |         2 0.8888889
c   Chrom2  [ 3, 10]      + |         3 0.7777778
d   Chrom2  [ 4, 10]      * |         4 0.6666667
e   Chrom1  [ 5, 10]      * |         5 0.5555556
f   Chrom1  [ 6, 10]      + |         6 0.4444444
g   Chrom3  [ 7, 10]      + |         7 0.3333333
h   Chrom3  [ 8, 10]      + |         8 0.2222222
i   Chrom3  [ 9, 10]      - |         9 0.1111111
j   Chrom3  [10, 10]      - |        10 0.0000000

seqlengths
 Chrom1 Chrom2 Chrom3
     NA     NA     NA

now I find that Stuart's language does not quite work (at least for the
sessionInfo value given below)

> subset(gr, elementMetadata(gr)$score >6 & strand(gr)=="+")
Error in .local(x, ...) : 'subset' must be logical
> elementMetadata(gr)$score >6 & strand(gr)=="+"
'logical' Rle of length 10 with 3 runs
  Lengths:     6     2     2
  Values : FALSE  TRUE FALSE
> as.logical(.Last.value)
 [1] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE

so a coercion of the predicate above would be needed.  On the other hand, he
mentions the
square bracket for subsetting:

> gr[ strand(gr)=="+" & elementMetadata(gr)$score>6,]  # you can omit the
comma, or use it
GRanges with 2 ranges and 2 elementMetadata values
  seqnames    ranges strand |     score        GC
     <Rle> <IRanges>  <Rle> | <integer> <numeric>
g   Chrom3   [7, 10]      + |         7 0.3333333
h   Chrom3   [8, 10]      + |         8 0.2222222

which works fine and seems quite natural; the simplification that Steve
seems to be asking for would
allow implicit references to elementMetadata variables in the predicate.  I
am not in favor of such
an extension of semantics of bracket.

In response to the question about documentation, note that showMethods("[")
works, and in particular,
method?"[,GRanges" returns a good man page




On Sat, Jun 5, 2010 at 7:32 AM, Stuart Andrews <[email protected]>wrote:

>
> Hi,
>
> To answer Steve's question first question ... yes.  People (n = 1) are
> indeed subsetting GRanges objects.   In my case, I used GRanges::subset()
> like this:
>
> R> subset(tags, elementMetadata(tags)$genome.hits < 5 & strand(tag) == '+')
>
> My question is, how does one discover that the singe-index square bracket
> is implemented like this for GRanges?   I didn't see any mention of this in
> GenomicRanges.pdf of "?GRanges" .
>
> Are there other methods that do not appear in the documentation, and how
> can I learn about the existence of undocumented method in the future?
>
> Thx,
> - Stu
>
> Stuart Andrews, Ph.D.
> Postdoctoral Associate
> Institute for Computational Biomedicine
> Weill Cornell Medical College, New York, NY
>
>
>
>
> On Jun 4, 2010, at 9:18 PM, Michael Lawrence wrote:
>
>  On Fri, Jun 4, 2010 at 2:27 PM, Steve Lianoglou <
>> [email protected]> wrote:
>>
>>  Hi,
>>>
>>> Random question I thought I'd shoot out there ...
>>>
>>> I'm finding myself wanting to slice and dice IRanges-like objects (I'm
>>> playing with GRanges right now) based on some column of their
>>> elementMetadata.
>>>
>>>
>>>  I guess if this makes sense then it would make sense to support this for
>> all
>> Sequence derivatives. It works already for RangedData, btw.
>>
>> Michael
>>
>>
>>  Are other people finding that they want to do this, too?
>>> Would it make sense to add some subset-mojo to do that?
>>>
>>> Here's a motivating example:
>>>
>>> Say I have a GRanges object (`tags`), that looks something like:
>>>
>>> GRanges with 2217486 ranges and 8 elementMetadata values
>>>  seqnames           ranges strand   |    tag.id genome.hits gene.hits
>>>     <Rle>        <IRanges>  <Rle>   | <integer>   <integer> <integer>
>>> [1]     chr1   [ 4850,  4866]      -   |    405384          10         3
>>> [2]     chr1   [ 7804,  7820]      -   |    405387           6         4
>>> [3]     chr1   [13162, 13178]      -   |    405397           5         4
>>> [4]     chr1   [16712, 16728]      +   |        35       12164      2475
>>> [5]     chr1   [21381, 21397]      +   |        45         497        79
>>> [6]     chr1   [21479, 21495]      -   |      1466        3823       957
>>>
>>> And say that I want all "tags" with < 5 genome.hits on the "+" strand.
>>> I'd like to:
>>>
>>> R> subset(tags, genome.hits < 5 & strand == '+')
>>>
>>> To do the same as:
>>>
>>> R> tags[elementMetadata(tags)$genome.hits < 5 & strand(tag) == '+']
>>>
>>> I realize that using `genome.hits` (from the elementMetadata) and
>>> `strand` (not in the metadata) is crossing some boundaries, but I just
>>> wanted to point out one of the more "complex" cases.
>>>
>>> Just curious,
>>> -steve
>>>
>>> --
>>> Steve Lianoglou
>>> Graduate Student: Computational Systems Biology
>>> | Memorial Sloan-Kettering Cancer Center
>>> | Weill Medical College of Cornell University
>>> Contact Info: 
>>> http://cbio.mskcc.org/~lianos/contact<http://cbio.mskcc.org/%7Elianos/contact>
>>> <http://cbio.mskcc.org/%7Elianos/contact>
>>>
>>> _______________________________________________
>>> Bioc-sig-sequencing mailing list
>>> [email protected]
>>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>>
>>>
>>        [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> Bioc-sig-sequencing mailing list
>> [email protected]
>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> [email protected]
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>

        [[alternative HTML version deleted]]

_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing

Reply via email to