But what is the scope of space? For example, the reduce operation has no
concept of space (see below). In GenomicRanges, we introduced the
concept of seqlengths to a number of classes including GRanges and
GRangesList. There are certain restrictions on what can be held in a
seqlengths slot; for example, you can't mix NAs with non-NAs. Perhaps we
can formalize space for all List objects so that either the names are
NULL or all the names must be distinct, non-empty strings. We would
also have to define what happens in a binary operation involving two
List objects when name sets are not identical.
> RangesList(a = IRanges(1,1), a = IRanges(1,2))
SimpleRangesList of length 2
$a
IRanges of length 1
start end width
[1] 1 1 1
$a
IRanges of length 1
start end width
[1] 1 2 2
> validObject(RangesList(a = IRanges(1,1), a = IRanges(1,2)))
[1] TRUE
> reduce(RangesList(a = IRanges(1,1), a = IRanges(1,2)))
SimpleRangesList of length 2
$a
IRanges of length 1
start end width
[1] 1 1 1
$a
IRanges of length 1
start end width
[1] 1 2 2
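
To make the proposal above concrete, here is a rough sketch in base R of the kind of check and name-based alignment I have in mind. The names hasValidSpaceNames and alignByName are made up for illustration and are not part of IRanges; plain lists stand in for List objects, and stopping with an error on mismatched name sets is only one possible policy.

```r
# Hypothetical helper (not in IRanges): TRUE when the names are either
# absent (NULL) or all distinct, non-empty, non-NA strings.
hasValidSpaceNames <- function(x) {
    nms <- names(x)
    if (is.null(nms))
        return(TRUE)
    !anyNA(nms) && all(nzchar(nms)) && !anyDuplicated(nms)
}

# Hypothetical helper: reorder y to follow the names of x, failing loudly
# when the name sets differ instead of silently pairing by position.
alignByName <- function(x, y) {
    stopifnot(hasValidSpaceNames(x), hasValidSpaceNames(y))
    if (is.null(names(x)) || is.null(names(y)))
        return(y)  # no names on one side: fall back to positional pairing
    if (!setequal(names(x), names(y)))
        stop("name sets differ")
    y[names(x)]
}

hasValidSpaceNames(list(a = 1, a = 2))  # FALSE: duplicated name
alignByName(list(chrA = 1, chrB = 2), list(chrB = 20, chrA = 10))
```

With something like this, a binary operation could call alignByName on its second argument before pairing elements, so a reordered list would give the same answer and a duplicated or missing name would be an error rather than a silent mismatch.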
Patrick
On 6/12/10 5:47 AM, Michael Lawrence wrote:
>
>
> On Sat, Jun 12, 2010 at 12:17 AM, Patrick Aboyoun <[email protected]> wrote:
>
> Janet,
> Most functions in the IRanges package follow the R convention of
> considering the elements of names to be loosely linked attributes
> rather than rigid keys. For convenience, functions such as $, [,
> [[ treat a list as a hash if it has names, but in most
> circumstances the names are ignored or copied without use. Even
> when there are names on elements, there are some odd corner cases
> that can cause problems. For example, if I wanted to have multiple
> list elements with the same name, then some important operations
> give unexpected results:
>
> > list(a = 1, a = 2)["a"]
> $a
> [1] 1
>
> If the issue is limited to enhancing the seqselect function to make
> it name-aware, it probably makes sense to go ahead with the
> enhancement. But the scope of this issue can grow quite large. For
> example, should names be used when adding RleList objects? What
> should the following produce?
>
> RleList(a = Rle(1)) + RleList(a = Rle(2), a = Rle(3), b = Rle(4))
>
> Due to these types of ambiguities, I would rather focus on
> educating the user to be mindful that these are position-oriented
> rather than key-oriented objects and have them ensure that
> elements are in alignment.
>
> Thoughts?
>
>
>
> Sometimes in IRanges the names have a special semantic -- that of a
> "space". I guess this is limited to RangesList. Other data structures,
> like RleList, are often treated as being separated by space or
> chromosome, though their names have never explicitly been treated as
> the space. This inconsistency is probably OK, but it needs to be
> documented.
>
> Patrick
>
>
>
>
> On 6/11/10 4:06 PM, Janet Young wrote:
>
> Hi,
>
> I've been playing around with seqselect on scores stored in a
> SimpleRleList object to get subregions defined in a RangesList
> object.
>
> I found a couple of things: first an enhancement request -
> would it be possible to allow seqselect to deal with cases
> where not every space (name) in the SimpleRleList has a
> corresponding space/name in the RangesList object?
>
> The second is either a bug or else I've misunderstood the way
> seqselect is supposed to work, in a dangerous way - it looks
> like seqselect doesn't use the names of the list items to
> select scores, it just assumes that in the two lists the
> elements have the same names in the same order.
>
> The code below should explain both issues much better
> than those descriptions.
>
> thanks,
>
> Janet
>
>
>
> > library(IRanges)
>
> Attaching package: 'IRanges'
>
> The following object(s) are masked from 'package:base':
>
> cbind, Map, mapply, order, paste, pmax, pmax.int, pmin,
> pmin.int, rbind, rep.int, table
>
> >
> > ### generate some arbitrary scores
> > track <- RangedData(RangesList(chrA = IRanges(start = c(1,
> 4, 6), width=c(3, 2, 4)),chrB = IRanges(start = c(1, 3, 6),
> width=c(3, 3, 4))) )
> > trackCoverage <- coverage(track,
> weight=list(chrA=c(2,7,3),chrB=c(1,1,1)) )
> >
> > ### define subregions
> > exons <- RangesList(chrA = IRanges(start = c(2, 4), width =
> c(2,2)),chrB = IRanges(start = 3, width = 5))
> >
> > ### seqselect works if all spaces in trackCoverage have an
> > ### element in exons
> > seqselect(trackCoverage,exons )
> SimpleRleList of length 2
> $chrA
> 'integer' Rle of length 4 with 2 runs
> Lengths: 2 2
> Values : 2 7
>
> $chrB
> 'integer' Rle of length 5 with 2 runs
> Lengths: 1 4
> Values : 2 1
>
> >
> > ### define subregions only on one chr
> > exons_chrAonly <- RangesList(chrA = IRanges(start = c(2, 4),
> width = c(2, 2)))
> > ### now seqselect doesn't work if some spaces don't have any
> > ### elements
> > seqselect(trackCoverage,exons_chrAonly )
> Error in seqselect(trackCoverage, exons_chrAonly) :
> 'length(start)' must equal 'length(x)' when 'end' and 'width'
> are NULL
> >
> >
> > ##### also, defining the regions with spaces in a different
> > ##### order seems to cause trouble as seqselect doesn't seem to be
> > ##### using the list's names - just going by order of elements
> > exons_reorderchrs <- RangesList(chrB = IRanges(start = 3,
> width = 5),chrA = IRanges(start = c(2, 4), width = c(2,2)))
> > seqselect(trackCoverage,exons_reorderchrs )
> SimpleRleList of length 2
> $chrA
> 'integer' Rle of length 5 with 3 runs
> Lengths: 1 2 2
> Values : 2 7 3
>
> $chrB
> 'integer' Rle of length 4 with 3 runs
> Lengths: 1 1 2
> Values : 1 2 1
>
> >
> > identical ( seqselect(trackCoverage,exons ) ,
> seqselect(trackCoverage,exons_reorderchrs ) )
> [1] FALSE
> >
> > sessionInfo()
> R version 2.11.1 (2010-05-31)
> i386-apple-darwin9.8.0
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods
> base
>
> other attached packages:
> [1] IRanges_1.6.6
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> [email protected]
> <mailto:[email protected]>
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>
>