Hello, I have been interested in setdiff() for data frames that operates row-wise. I looked in the documentation, mailing lists, etc., and didn't find exactly the right thing. Given data frames A, B with the same columns, the goal is to extract the rows that are in A, but not in B. Of course, one can usually do setdiff(rownames(A), rownames(B)) but that is cheating. :-)
I played around a little bit and came up with setdiff.data.frame = function(A, B){ g <- function( y, B){ any( apply(B, 1, FUN = function(x) identical(all.equal(x, y), TRUE) ) ) } unique( A[ !apply(A, 1, FUN = function(t) g(t, B) ), ] ) } I am sure that somebody can do this a better/faster way... any ideas? Any chance we could get a data.frame method for set.diff in future R versions? (The notion of "set" is somewhat ambiguous with respect to rows, columns, and entries in the data frame case.) Jay P.S. You can see what I'm looking for with A <- expand.grid( 1:3, 1:3 ) B <- A[ 2:5, ] setdiff.data.frame(A,B) *************************************************** G. Jay Kerns, Ph.D. Assistant Professor / Statistics Coordinator Department of Mathematics & Statistics Youngstown State University Youngstown, OH 44555-0002 USA Office: 1035 Cushwa Hall Phone: (330) 941-3310 Office (voice mail) -3302 Department -3170 FAX E-mail: [EMAIL PROTECTED] http://www.cc.ysu.edu/~gjkerns/ ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel