It seems better to discuss this with a concrete example -- we have one in
the GRanges man page. Thus
if we do
library(GenomicRanges)
example(GRanges)
we get
> gr
GRanges with 10 ranges and 2 elementMetadata values
seqnames ranges strand | score GC
<Rle> <IRanges> <Rle> | <integer> <numeric>
a Chrom1 [ 1, 10] - | 1 1.0000000
b Chrom2 [ 2, 10] + | 2 0.8888889
c Chrom2 [ 3, 10] + | 3 0.7777778
d Chrom2 [ 4, 10] * | 4 0.6666667
e Chrom1 [ 5, 10] * | 5 0.5555556
f Chrom1 [ 6, 10] + | 6 0.4444444
g Chrom3 [ 7, 10] + | 7 0.3333333
h Chrom3 [ 8, 10] + | 8 0.2222222
i Chrom3 [ 9, 10] - | 9 0.1111111
j Chrom3 [10, 10] - | 10 0.0000000
seqlengths
Chrom1 Chrom2 Chrom3
NA NA NA
now I find that Stuart's language does not quite work (at least for the
sessionInfo value given below)
> subset(gr, elementMetadata(gr)$score >6 & strand(gr)=="+")
Error in .local(x, ...) : 'subset' must be logical
> elementMetadata(gr)$score >6 & strand(gr)=="+"
'logical' Rle of length 10 with 3 runs
Lengths: 6 2 2
Values : FALSE TRUE FALSE
> as.logical(.Last.value)
[1] FALSE FALSE FALSE FALSE FALSE FALSE TRUE TRUE FALSE FALSE
so a coercion of the predicate above would be needed. On the other hand, he
mentions the
square bracket for subsetting:
> gr[ strand(gr)=="+" & elementMetadata(gr)$score>6,] # you can omit the
comma, or use it
GRanges with 2 ranges and 2 elementMetadata values
seqnames ranges strand | score GC
<Rle> <IRanges> <Rle> | <integer> <numeric>
g Chrom3 [7, 10] + | 7 0.3333333
h Chrom3 [8, 10] + | 8 0.2222222
which works fine and seems quite natural; the simplification that Steve
seems to be asking for would
allow implicit references to elementMetadata variables in the predicate. I
am not in favor of such
an extension of semantics of bracket.
In response to the question about documentation, note that showMethods("[")
works, and in particular,
method?"[,GRanges" returns a good man page
On Sat, Jun 5, 2010 at 7:32 AM, Stuart Andrews <[email protected]>wrote:
>
> Hi,
>
> To answer Steve's question first question ... yes. People (n = 1) are
> indeed subsetting GRanges objects. In my case, I used GRanges::subset()
> like this:
>
> R> subset(tags, elementMetadata(tags)$genome.hits < 5 & strand(tag) == '+')
>
> My question is, how does one discover that the singe-index square bracket
> is implemented like this for GRanges? I didn't see any mention of this in
> GenomicRanges.pdf of "?GRanges" .
>
> Are there other methods that do not appear in the documentation, and how
> can I learn about the existence of undocumented method in the future?
>
> Thx,
> - Stu
>
> Stuart Andrews, Ph.D.
> Postdoctoral Associate
> Institute for Computational Biomedicine
> Weill Cornell Medical College, New York, NY
>
>
>
>
> On Jun 4, 2010, at 9:18 PM, Michael Lawrence wrote:
>
> On Fri, Jun 4, 2010 at 2:27 PM, Steve Lianoglou <
>> [email protected]> wrote:
>>
>> Hi,
>>>
>>> Random question I thought I'd shoot out there ...
>>>
>>> I'm finding myself wanting to slice and dice IRanges-like objects (I'm
>>> playing with GRanges right now) based on some column of their
>>> elementMetadata.
>>>
>>>
>>> I guess if this makes sense then it would make sense to support this for
>> all
>> Sequence derivatives. It works already for RangedData, btw.
>>
>> Michael
>>
>>
>> Are other people finding that they want to do this, too?
>>> Would it make sense to add some subset-mojo to do that?
>>>
>>> Here's a motivating example:
>>>
>>> Say I have a GRanges object (`tags`), that looks something like:
>>>
>>> GRanges with 2217486 ranges and 8 elementMetadata values
>>> seqnames ranges strand | tag.id genome.hits gene.hits
>>> <Rle> <IRanges> <Rle> | <integer> <integer> <integer>
>>> [1] chr1 [ 4850, 4866] - | 405384 10 3
>>> [2] chr1 [ 7804, 7820] - | 405387 6 4
>>> [3] chr1 [13162, 13178] - | 405397 5 4
>>> [4] chr1 [16712, 16728] + | 35 12164 2475
>>> [5] chr1 [21381, 21397] + | 45 497 79
>>> [6] chr1 [21479, 21495] - | 1466 3823 957
>>>
>>> And say that I want all "tags" with < 5 genome.hits on the "+" strand.
>>> I'd like to:
>>>
>>> R> subset(tags, genome.hits < 5 & strand == '+')
>>>
>>> To do the same as:
>>>
>>> R> tags[elementMetadata(tags)$genome.hits < 5 & strand(tag) == '+']
>>>
>>> I realize that using `genome.hits` (from the elementMetadata) and
>>> `strand` (not in the metadata) is crossing some boundaries, but I just
>>> wanted to point out one of the more "complex" cases.
>>>
>>> Just curious,
>>> -steve
>>>
>>> --
>>> Steve Lianoglou
>>> Graduate Student: Computational Systems Biology
>>> | Memorial Sloan-Kettering Cancer Center
>>> | Weill Medical College of Cornell University
>>> Contact Info:
>>> http://cbio.mskcc.org/~lianos/contact<http://cbio.mskcc.org/%7Elianos/contact>
>>> <http://cbio.mskcc.org/%7Elianos/contact>
>>>
>>> _______________________________________________
>>> Bioc-sig-sequencing mailing list
>>> [email protected]
>>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>>
>>>
>> [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> Bioc-sig-sequencing mailing list
>> [email protected]
>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> [email protected]
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>
[[alternative HTML version deleted]]
_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing