Simon,
Could you provide timings for Rle element extraction? I have been
trying to speed up the bottlenecks. If the need is to extract multiple
elements at once, as Wolfgang suggests, then "[" for Rle should perform
well, since it calculates the start values only once:
## from the internals of "[" for Rle
output <- runValue(x)[findInterval(i, start(x))]
if (!drop) output <- Rle(output)
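To illustrate why this is cheap for many indices at once, here is a
rough sketch in base R only. The vectors "values" and "lengths" are a
hypothetical stand-in for an Rle, not the actual IRanges internals:

```r
# Hypothetical stand-in for an Rle: run values plus run lengths (base R only).
values  <- c("a", "b", "c")
lengths <- c(3L, 2L, 4L)   # expands to: a a a b b c c c c

# The run starts are computed once, at a cost linear in the number of runs.
starts <- cumsum(c(1L, lengths[-length(lengths)]))

# Each requested position is then located by a binary search over the starts.
i <- c(1L, 3L, 4L, 5L, 9L)
extracted <- values[findInterval(i, starts)]

# Cross-check against the fully expanded vector.
expanded <- rep(values, lengths)
stopifnot(identical(extracted, expanded[i]))
```

The cumsum() is paid once per call, so its cost is shared across all
requested indices rather than repeated for each one.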
Patrick
Wolfgang Huber wrote:
Hi Simon
just to be sure - what is n? Number of segments, or length of the
(expanded) sequence?
And rather than looking at the time needed to access a single value at
a certain position, shouldn't you be looking at the time needed to
access the values on a complete equi-spaced grid from begin to end of
the sequence?
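A grid-based timing of that kind could be sketched as follows. This is
base R only; the run data, the grid size of 1000 and all variable names
are assumptions made for illustration, not measurements from the thread:

```r
# Hypothetical run-length data in base R (not an IRanges Rle object).
set.seed(1)
n_runs  <- 1e5
lengths <- sample(1:20, n_runs, replace = TRUE)
values  <- rnorm(n_runs)
total   <- sum(lengths)    # length of the expanded sequence

# Equi-spaced grid from the beginning to the end of the expanded sequence.
grid <- as.integer(round(seq(1, total, length.out = 1000)))

# Time the whole grid extraction, including the one-off start computation.
timing <- system.time({
  starts  <- cumsum(c(1L, lengths[-n_runs]))
  on_grid <- values[findInterval(grid, starts)]
})
print(timing)
```

Varying n_runs and the grid size separately would show which of the two
candidates for n drives the cost.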
bw Wolfgang
Simon Anders wrote:
Hi Michael
Michael Lawrence wrote:
An Rle object, even if it only stores the widths, would be better
than RangedData. Just getting the starts out of a RangedData is an
O(n) operation, and there is in general a lot of overhead for
functionality that is not useful in your case.
Thanks.
But wait a second: Isn't there a slot "starts" in a RangedData object?
So why would it be O(n) if this information is already there?
My concern was that getting the starts (or even just getting a value at
a given position) from an Rle object would be O(n) because the Rle
object does not contain the starts, only the lengths of the intervals.
So, what information is now stored where?
Cheers
Simon
Best wishes
Wolfgang
------------------------------------------------
Wolfgang Huber, EMBL, http://www.ebi.ac.uk/huber
_______________________________________________
Bioc-sig-sequencing mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing