Michael Kao <mkao006rmail <at> gmail.com> writes:
>

Your solution is fast, but not completely correct, because you are also
counting possible duplicates within the second matrix. The 'refitted'
function could look as follows:

    compMat2 <- function(A, B) {  # number of distinct rows of B present in A
        # drop repeats within B; drop = FALSE keeps a matrix even for one row
        B0 <- B[!duplicated(B), , drop = FALSE]
        na <- nrow(A); nb <- nrow(B0)
        AB <- rbind(A, B0)
        # a row of B0 is flagged only if it already occurred in A
        ab <- duplicated(AB)[(na+1):(na+nb)]
        return(sum(ab))
    }

and testing an example the size the OP was asking for:

    set.seed(8237)
    A <- matrix(sample(1:1000, 2*67420, replace=TRUE), 67420, 2)
    B <- matrix(sample(1:1000, 2*59199, replace=TRUE), 59199, 2)

    system.time(n <- compMat2(A, B))  # n = 3790

while compMat() will return 5522 rows, with 1732 duplicates within B!
A 3.06 GHz iMac needs about 2 to 2.5 seconds.

Hans Werner

> On 2/12/2011 2:48 p.m., David Winsemius wrote:
> >
> > On Dec 2, 2011, at 4:20 AM, oluwole oyebamiji wrote:
> >
> >> Hi all,
> >> I have matrix A of 67420 by 2 and another matrix B of 59199 by 2.
> >> I would like to find the number of rows of matrix B that I can find
> >> in matrix A (rows that are common to both matrices with or without
> >> sorting).
> >>
> >> I have tried the "intersection" and "is.element" functions in R but
> >> they only work for vectors and not matrices,
> >> i.e., intersection(A,B) and is.element(A,B).
> >
> > Have you considered the 'duplicated' function?
> >
>
> Here is an example based on the duplicated function
>
> test.mat1 <- matrix(1:20, nc = 5)
>
> # test.mat1 has only 4 rows, so sample from 1:4
> test.mat2 <- rbind(test.mat1[sample(1:4, 2), ], matrix(101:120, nc = 5))
>
> compMat <- function(mat1, mat2){
>     nr1 <- nrow(mat1)
>     nr2 <- nrow(mat2)
>     mat2[duplicated(rbind(mat1, mat2))[(nr1 + 1):(nr1 + nr2)], ]
> }
>
> compMat(test.mat1, test.mat2)
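
For a case small enough to check by hand, the difference between the two counts can be sketched as below. This is only an illustration: the names count_naive and count_unique are hypothetical stand-ins for the two approaches in this thread (the first returns a count rather than the matching rows), and the 2-column matrices are made up.

```r
# Stack mat1 on top of mat2 and count flagged rows in the mat2 part.
# Repeats inside mat2 are flagged as well, which inflates the count.
count_naive <- function(mat1, mat2) {
  nr1 <- nrow(mat1)
  sum(duplicated(rbind(mat1, mat2))[nr1 + seq_len(nrow(mat2))])
}

# De-duplicate mat2 first, so only matches against mat1 are counted.
count_unique <- function(mat1, mat2) {
  u <- mat2[!duplicated(mat2), , drop = FALSE]
  nr1 <- nrow(mat1)
  sum(duplicated(rbind(mat1, u))[nr1 + seq_len(nrow(u))])
}

A <- matrix(c(1, 2,
              3, 4), ncol = 2, byrow = TRUE)
B <- matrix(c(1, 2,    # present in A
              5, 6,    # not in A
              5, 6,    # repeat within B, not in A
              1, 2),   # repeat of a row that is in A
            ncol = 2, byrow = TRUE)

count_naive(A, B)   # 3: one true match plus two repeats within B
count_unique(A, B)  # 1: only the distinct row (1, 2) is shared with A
```

Whether repeated rows of B that do occur in A should count once or several times depends on what the original question meant; the second function counts each distinct row once.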