The anti_join() function from the dplyr package might also be handy:

    install.packages("dplyr")
    library(dplyr)
    anti_join(x1, x2)

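A minimal sketch of how that could look when you want row numbers back, reusing the small x1 and x2 from Bill's example quoted further down. The `row` column is my own addition for illustration (anti_join() does not report positions by itself); it ties each record to its position, so a record counts as a mismatch when x2 has no identical record at the same position. This assumes both data frames have the same columns.

```r
library(dplyr)

# Toy data taken from the example quoted later in this thread
x1 <- data.frame(X = 1:5, Y = rep(c("A", "B"), c(3, 2)),
                 stringsAsFactors = FALSE)
x2 <- data.frame(X = c(1, 2, -3, -4, 5), Y = rep(c("A", "B"), c(2, 3)),
                 stringsAsFactors = FALSE)

# Hypothetical helper column: the position of each record
x1$row <- seq_len(nrow(x1))
x2$row <- seq_len(nrow(x2))

# Records of x1 with no identical record (same position, same values) in x2
mismatches <- anti_join(x1, x2, by = c("row", "X", "Y"))
mismatches$row
```

With real data you would put every shared column name in `by`, or leave `by` out so that all common columns are used.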
You can get help on any function with ?function.name, so ?anti_join will bring up the help page, with examples, for anti_join().

It might be worth testing your approach on a small subset of the data. That makes it easier to follow what happens and to evaluate the outcome.

HTH
Ulrik

Marsh Hardy ARA/RISK <mha...@ara.com> wrote on Sun, Jan 28, 2018, 04:14:

> Cool, looks like that'd do it, almost as if converting an entire record to
> a character string and comparing strings.
>
> -- M. B. Hardy, statistician
> Applied Research Associates, S. E. Div.
> ________________________________________
> From: William Dunlap [wdun...@tibco.com]
> Sent: Saturday, January 27, 2018 4:57 PM
> To: Marsh Hardy ARA/RISK
> Cc: Ulrik Stervbo; Eric Berger; r-help@r-project.org
> Subject: Re: [R] Newbie wants to compare 2 huge RDSs row by row.
>
> If your two objects have class "data.frame" (look at class(objectName)),
> both have the same number of columns in the same order, and the
> column types match closely enough (use all.equal(x1, x2) for that), then
> you can try
>     which( rowSums( x1 != x2 ) > 0)
> E.g.,
> > x1 <- data.frame(X=1:5, Y=rep(c("A","B"),c(3,2)))
> > x2 <- data.frame(X=c(1,2,-3,-4,5), Y=rep(c("A","B"),c(2,3)))
> > x1
>   X Y
> 1 1 A
> 2 2 A
> 3 3 A
> 4 4 B
> 5 5 B
> > x2
>    X Y
> 1  1 A
> 2  2 A
> 3 -3 B
> 4 -4 B
> 5  5 B
> > which( rowSums( x1 != x2 ) > 0)
> [1] 3 4
>
> If you want to allow small numeric differences but exact character
> matches, you will have to get a bit fancier. Splitting the data.frames into
> character and numeric parts and comparing each works well.
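One way that split could be sketched in base R, assuming both data frames have the same columns in the same order and contain no missing values. The toy data and the tolerance `tol` are made up for illustration; choose a tolerance that suits your data.

```r
# Toy data: row 2 differs in a character cell, row 3 in a numeric cell;
# the 1e-10 difference in row 2 of X is below the tolerance and is ignored
x1 <- data.frame(X = c(1, 2, 3), Y = c("A", "B", "C"),
                 stringsAsFactors = FALSE)
x2 <- data.frame(X = c(1, 2 + 1e-10, 3.5), Y = c("A", "b", "C"),
                 stringsAsFactors = FALSE)

tol <- 1e-8  # assumed absolute tolerance for numeric columns

# Which columns are numeric (taken from x1; x2 is assumed to match)
is_num <- vapply(x1, is.numeric, logical(1))

# Numeric part: a cell differs when the absolute difference exceeds tol
num_diff <- abs(as.matrix(x1[is_num]) - as.matrix(x2[is_num])) > tol

# Remaining part: compared exactly, as character strings
chr_diff <- as.matrix(x1[!is_num]) != as.matrix(x2[!is_num])

# Row numbers with at least one differing cell in either part
bad_rows <- which(rowSums(num_diff) + rowSums(chr_diff) > 0)
bad_rows
```

If the data contain NAs, the comparisons return NA for those cells, so they would need separate handling (for example by comparing is.na() of both data frames first).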
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Sat, Jan 27, 2018 at 1:18 PM, Marsh Hardy ARA/RISK <mha...@ara.com> wrote:
> Hi Guys, I apologize for my rank & utter newness at R.
>
> I used summary() and found about 95 variables, both character and numeric,
> all with "Length:368842", which I assume is the # of records.
>
> I'd like to know the record number (row #?) of any record where the data
> doesn't match in the 2 files of what should be the same output.
>
> Thanks in advance, M.
>
> //
> ________________________________________
> From: Ulrik Stervbo [ulrik.ster...@gmail.com]
> Sent: Saturday, January 27, 2018 10:00 AM
> To: Eric Berger
> Cc: Marsh Hardy ARA/RISK; r-help@r-project.org
> Subject: Re: [R] Newbie wants to compare 2 huge RDSs row by row.
>
> Also, it will be easier to provide helpful information if you'd describe
> what in your data you want to compare and what you hope to get out of the
> comparison.
>
> Best wishes,
> Ulrik
>
> Eric Berger <ericjber...@gmail.com> wrote on Sat, Jan 27, 2018, 08:18:
> Hi Marsh,
> An RDS is not a data structure such as a data.frame. It can be anything.
> For example if I want to save my objects a, b, c I could do:
> > saveRDS( list(a,b,c), file="tmp.RDS")
> Then read them back later with
> > myList <- readRDS( "tmp.RDS" )
>
> Do you have additional information about your "RDSs" ?
>
> Eric
>
>
> On Sat, Jan 27, 2018 at 6:54 AM, Marsh Hardy ARA/RISK <mha...@ara.com> wrote:
>
> > Each RDS is 40 MBs. What's a slick code to compare them row by row, IDing
> > row numbers with mismatches?
> >
> > Thanks in advance.
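For the original question, a rough sketch of the whole round trip might look like the following: write two objects to RDS files, read them back, confirm that they really are data frames of the same shape, then list the mismatching rows. The data frames `a` and `b` and the temporary file names are invented for illustration; with real files you would only need the readRDS() calls and what follows.

```r
# Invented example data; row 2 differs in column v
a <- data.frame(id = 1:3, v = c("x", "y", "z"), stringsAsFactors = FALSE)
b <- data.frame(id = 1:3, v = c("x", "Y", "z"), stringsAsFactors = FALSE)

# Temporary files stand in for the two real RDS files
f1 <- tempfile(fileext = ".RDS")
f2 <- tempfile(fileext = ".RDS")
saveRDS(a, f1)
saveRDS(b, f2)

x1 <- readRDS(f1)
x2 <- readRDS(f2)

# An RDS file can hold any R object, so check before comparing
stopifnot(is.data.frame(x1), is.data.frame(x2))
stopifnot(identical(dim(x1), dim(x2)), identical(names(x1), names(x2)))

# Overall report: TRUE, or a description of the differences
print(all.equal(x1, x2))

# Row numbers where at least one cell differs
mismatch_rows <- which(rowSums(x1 != x2) > 0)
mismatch_rows
```

The stopifnot() checks are there so that the comparison fails early, with a clear message, if the two files do not hold data frames of the same shape.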
> >
> > //
> >
> > ______________________________________________
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
>         [[alternative HTML version deleted]]