Hi Pete,

Thanks for suggesting this fast method. I've formalized it a bit as a
generic (identicalVals) plus methods. I also tweaked it to avoid the
false negatives that can occur when 'x' and 'y' have different names
or different seqlevels. So no more fallback to 'all(x == y)'.
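
To illustrate the kind of false negative involved, here is a minimal
base R sketch. It is not the code committed to SummarizedExperiment:
make_ranges() and identical_vals_sketch() are hypothetical names, and a
plain list of parallel vectors stands in for a GenomicRanges object.

```r
# Hypothetical stand-in for a GenomicRanges object: a list of parallel
# vectors. 'seqlevels' controls the factor levels of 'seqnames'.
make_ranges <- function(seqnames, start, end, strand, names = NULL,
                        seqlevels = unique(seqnames)) {
  list(seqnames = factor(seqnames, levels = seqlevels),
       start    = setNames(as.integer(start), names),
       end      = as.integer(end),
       strand   = factor(strand, levels = c("+", "-", "*")))
}

# Compare the values only: drop names and compare seqnames as character
# so that a different ordering of the levels does not matter. Each
# identical() call returns a single TRUE/FALSE, so no logical vector of
# length(x) is allocated the way it is with all(x == y).
identical_vals_sketch <- function(x, y) {
  identical(as.character(x$seqnames), as.character(y$seqnames)) &&
    identical(unname(x$start), unname(y$start)) &&
    identical(x$end, y$end) &&
    identical(as.character(x$strand), as.character(y$strand))
}

x <- make_ranges(c("chr1", "chr2"), c(1, 11), c(10, 20), c("+", "-"),
                 names = c("a", "b"), seqlevels = c("chr1", "chr2"))
y <- make_ranges(c("chr1", "chr2"), c(1, 11), c(10, 20), c("+", "-"),
                 seqlevels = c("chr2", "chr1"))
z <- make_ranges(c("chr1", "chr2"), c(1, 12), c(10, 20), c("+", "-"))

identical(x, y)              # FALSE: names and level order differ
identical_vals_sketch(x, y)  # TRUE: the values are the same
identical_vals_sketch(x, z)  # FALSE: one start differs
```

A plain identical() on the whole objects (or on their raw slots)
reports a difference here even though the ranges themselves match,
which is why the comparison strips names and level order first.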

Committed in SummarizedExperiment 1.3.82.

BTW please note that 'x == y' and 'identicalVals(x, y)' both ignore
circularity of the underlying sequences, e.g. ranges [1, 10] and
[101, 110] represent the same position on a circular sequence of
length 100, so they should be considered equal. However, both 'x == y'
and 'identicalVals(x, y)' currently treat them as different. Something
we should address at some point...
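
For what it's worth, a circularity-aware comparison could look roughly
like the base R sketch below. This is only an illustration under
simplifying assumptions (1-based closed ranges, a single known sequence
length); normalize_circular() and equal_on_circle() are hypothetical
helpers, not functions from any Bioconductor package.

```r
# Map a 1-based start position onto a circular sequence of length
# 'seqlength', so that e.g. 101 becomes 1 when seqlength is 100.
normalize_circular <- function(start, seqlength) {
  ((start - 1L) %% seqlength) + 1L
}

# Two ranges denote the same position on the circle when they have the
# same width and their starts coincide after wrapping around.
equal_on_circle <- function(start1, end1, start2, end2, seqlength) {
  width1 <- end1 - start1 + 1L
  width2 <- end2 - start2 + 1L
  width1 == width2 &
    normalize_circular(start1, seqlength) ==
      normalize_circular(start2, seqlength)
}

# The example above: [1, 10] vs [101, 110] on a circular sequence of
# length 100.
res_same <- equal_on_circle(1L, 10L, 101L, 110L, 100L)
# A range shifted by one position is still different.
res_diff <- equal_on_circle(1L, 10L, 102L, 111L, 100L)
```

With this sketch res_same is TRUE and res_diff is FALSE. It compares
element-wise, so it would still need an all() around it, and it does
not deal with strand or with multiple sequences.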

Cheers,
H.

On 08/30/2016 05:57 AM, Peter Hickey wrote:
The cbind,SummarizedExperiment-method checks that the rowRanges slots
are equal by calling `all(x == x1)`, where x and x1 are GenomicRanges
objects. This can be kind of slow and makes a large, temporary vector
when length(x) is large.

I wrote a fast method to check equality of two GenomicRanges objects,
see https://gist.github.com/PeteHaitch/13787125a165928e652dcfea2a8d166a.
It cuts the check from 13.7 seconds to 0.004 seconds for a GenomicRanges
object with 100M elements on my machine. It uses identical() on key
slots of the GenomicRanges objects, and I'm not sure if this could
return false negatives, so I fall back to all(x == x1) if the fast
method returns FALSE.

Could cbind,SummarizedExperiment-method be updated to use something like this?

Cheers,
Pete

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319
