I think merge() can do what's wanted, but you do have to be careful that the
values match exactly. Here's an example where a row in each of two data frames
prints identically in columns 'a' and 'b', but the values are not exactly the
same, so merge() returns zero rows. Rounding fixes the problem in this case,
but it is not a good general solution, because two very close numbers can
round to different values, e.g., 1.499 and 1.501.
Here are examples:
> x <- data.frame(a=c(1.0000001,2), b=c(3,4), c=LETTERS[1:2])
> y <- data.frame(a=c(1,2), b=c(3,5), c=LETTERS[3:4])
> x
  a b c
1 1 3 A
2 2 4 B
> y
  a b c
1 1 3 C
2 2 5 D
> # x[1,"a"] and y[1,"a"] look the same, but are very slightly different
> merge(x, y, by=c("a", "b"))
[1] a   b   c.x c.y
<0 rows> (or 0-length row.names)
> # make x1 a version of x where the values are rounded to whole numbers
> x1 <- x
> x1$a <- round(x1$a)
> merge(x1, y, by=c("a", "b"))
  a b c.x c.y
1 1 3   A   C
> # intersect() returns columns that are the same in each dataframe, not rows
> intersect(x, y)
  c
1 C
2 D
> intersect(x1, y)
  a c
1 1 C
2 2 D
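
If rounding is too fragile, one alternative is to match the key columns to
within a tolerance instead of exactly. The following is only a minimal sketch:
near_merge() and its 'tol' argument are hypothetical names made up for this
illustration (they are not part of base R), and the default tolerance of 1e-6
is an assumption that would need to be chosen to suit the data.

```r
# Hypothetical helper (not in base R): pair up rows of x and y whose numeric
# key columns 'by' agree to within an absolute tolerance 'tol'.
near_merge <- function(x, y, by, tol = 1e-6) {
  # Start from every (row of x, row of y) combination ...
  idx <- expand.grid(i = seq_len(nrow(x)), j = seq_len(nrow(y)))
  # ... and keep only the pairs that are close in every key column.
  for (k in by) {
    close <- abs(x[[k]][idx$i] - y[[k]][idx$j]) <= tol
    idx <- idx[close, , drop = FALSE]
  }
  xs <- x[idx$i, , drop = FALSE]
  # Take the non-key columns of y and suffix them to avoid name clashes.
  ys <- y[idx$j, setdiff(names(y), by), drop = FALSE]
  names(ys) <- paste0(names(ys), ".y")
  out <- cbind(xs, ys)
  rownames(out) <- NULL
  out
}

x <- data.frame(a = c(1.0000001, 2), b = c(3, 4), c = LETTERS[1:2])
y <- data.frame(a = c(1, 2), b = c(3, 5), c = LETTERS[3:4])

# One row: x's row 'A' paired with y's row 'C'; the key values are x's.
res <- near_merge(x, y, by = c("a", "b"))
res
```

For the original question, the corresponding call would be something like
near_merge(data_frame_x, data_frame_y, by = c("Latitude", "Longitude")).
Note that expand.grid() builds every pair of rows, so memory use grows with
nrow(x) * nrow(y); that is acceptable for a few thousand rows but not for
much larger data.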
-- Tony Plate
jim holtman wrote:
You are missing a comma:
common <- intersect(data_frame_x[,c("Latitude", "Longitude")],
                    data_frame_y[,c("Latitude","Longitude")])
On Tue, Apr 28, 2009 at 5:49 AM, Steve Murray <smurray...@hotmail.com> wrote:
Thanks for the reply, however, when I do the following command, I receive the
message: 'data frame with 0 columns and 0 rows'. I've checked again though, and
there should be several thousand rows where the Latitude and Longitude pairs
are the same.
common <- intersect(data_frame_x[c("Latitude", "Longitude")],
                    data_frame_y[c("Latitude","Longitude")])
common
data frame with 0 columns and 0 rows
Is there an obvious solution to this? Should I be using 'unique' instead, and
if so, how would I get the above to correspond to this command?
Thanks,
Steve
________________________________
Date: Tue, 28 Apr 2009 13:36:51 +0530
Subject: Re: [R] Finding rows common to two datasets
From: umesh.sriniva...@gmail.com
To: smurray...@hotmail.com
CC: r-help@r-project.org
Dear Steve,
Try
? intersect
and see if that might help.
Cheers,
Umesh
On Tue, Apr 28, 2009 at 1:29 PM, Steve Murray wrote:
Dear all,
I have 2 data frames, both with 14 columns of data and differing numbers of
rows. The first two columns are 'Latitude' and 'Longitude'. I want to find the
pairs of Latitude and Longitude coordinates which are common to both datasets,
and output a new data frame which is composed of these coincident rows. I tried
using the 'unique' command, but had difficulties interpreting the help file.
Many thanks for any help offered,
Steve
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.