[R] Quickest way to match two vectors besides %in%?

2005-11-08 Thread Pete Cap
Hello list,

I have two data frames, X (48469,2) and Y (79771,5).

X[,1] contains distinct values of Y[,2].
I want to match values in X[,1] and Y[,2], then take
the corresponding value in X[,2] and place it in
Y[,4].

So far I have been doing it like so:
for(i in 1:48469) {
  y[which(x[i,1]==y[,3]),4] <- x[i,2]
}

But it chunks along so very slowly that I can't help
but wonder if there's a faster way, mainly because on
my box it takes R about 30 seconds to simply COUNT to
48,469 in the for loop.

I have already tried using %in%.  It tells me if the
values in X[,1] are IN Y[,2], which is useful in
removing unnecessary values from X[,1].  But it does
not tell me exactly where they match.  which(X[,1]
%in% Y[,2]) does but it only matches on the first
instance.
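
A toy illustration of the distinction, with made-up vectors (`keys` and `lookup` are assumed names standing in for X[,1] and Y[,2], not the real data): `%in%` gives a logical vector, `which()` turns it into positions in the left-hand vector, and `match()` gives, for each left-hand element, the position of its first match in the right-hand vector.

```r
# Toy vectors, assumed for illustration only
keys   <- c("a", "b", "c")            # stands in for X[,1] (distinct values)
lookup <- c("b", "a", "b", "d", "a")  # stands in for Y[,2] (repeats allowed)

in_lookup <- keys %in% lookup         # logical: is each key present in lookup?
pos_keys  <- which(keys %in% lookup)  # positions within keys, not within lookup
idx       <- match(lookup, keys)      # for each lookup element, its row in keys (NA if absent)
```

Because `keys` holds distinct values, calling `match()` with the longer vector first yields one index per element of `lookup`, repeats included.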

This is the slowest part of the script I'm working
on--if I could improve it I could shave off some
serious operating time.  Any pointers?

Regards,

Pete

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Quickest way to match two vectors besides %in%?

2005-11-08 Thread Duncan Murdoch
On 11/8/2005 2:28 PM, Pete Cap wrote:
> Hello list,
>
> I have two data frames, X (48469,2) and Y (79771,5).
>
> X[,1] contains distinct values of Y[,2].
> I want to match values in X[,1] and Y[,2], then take
> the corresponding value in X[,2] and place it in
> Y[,4].
>
> So far I have been doing it like so:
> for(i in 1:48469) {
>   y[which(x[i,1]==y[,3]),4] <- x[i,2]
> }
 

Look at the merge() function to add the X and Y columns to a new 
dataframe, then process that to merge the X[,2] and Y[,4] values.

It will be something like

Z <- merge(X, Y, by.x=1, by.y=2, all.y=TRUE)

changes <- !is.na(Z[,2])
Z[changes,5] <- Z[changes,2]

but you are almost certainly better off (from a maintenance point of 
view) to use the names of the columns, rather than guessing at column 
numbers.
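
A minimal sketch of the same idea using column names. The data and the names `key`, `new_val`, `val` and `id` are assumptions made up for illustration; they are not the poster's actual columns.

```r
# Toy data; all column names here are hypothetical
X <- data.frame(key = c(1, 2, 3), new_val = c(50, 60, 70))
Y <- data.frame(id = 1:5, key = c(3, 1, 4, 1, 2), val = c(0, 0, 0, 0, 0))

# Join on the shared column name; all.x = TRUE keeps every row of Y
Z <- merge(Y, X, by = "key", all.x = TRUE)

# Overwrite val only where a matching row in X was found
changes <- !is.na(Z$new_val)
Z$val[changes] <- Z$new_val[changes]
```

Note that merge() sorts the result by the key column by default, so the rows of Z will generally not be in the original order of Y; carrying an identifier column such as `id` makes it possible to restore that order afterwards.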

Duncan Murdoch



Re: [R] Quickest way to match two vectors besides %in%?

2005-11-08 Thread Weiwei Shi
?match

> x
  X1 X2
1  1  5
2  2  6
3  3  7
4  4  8

> y
  Y1 Y4
1  1  8
2  2  9
3  3 10
4  4 11
5  1 12
6  2 13
7  3 14
8  4 15

> y.orig <- y # backup

> y$Y4 <- x$X2[match(y$Y1, x$X1)]
> y
  Y1 Y4
1  1  5
2  2  6
3  3  7
4  4  8
5  1  5
6  2  6
7  3  7
8  4  8
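
The assignment above works cleanly when every value of y$Y1 occurs in x$X1. If some do not, match() returns NA for them and the corresponding Y4 entries would be overwritten with NA. A hedged variant that leaves unmatched rows untouched (toy data again; the extra key 9 is an assumption added to show the unmatched case):

```r
x <- data.frame(X1 = 1:4, X2 = 5:8)
y <- data.frame(Y1 = c(1:4, 1:3, 9), Y4 = 8:15)  # 9 has no match in x$X1

idx   <- match(y$Y1, x$X1)        # row of x for each row of y, NA if absent
found <- !is.na(idx)              # rows of y that have a partner in x
y$Y4[found] <- x$X2[idx[found]]   # update matched rows only
```

The last row keeps its original Y4 value because its key is absent from x$X1.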


HTH,

Weiwei



--
Weiwei Shi, Ph.D

Did you always know?
No, I did not. But I believed...
---Matrix III



Re: [R] Quickest way to match two vectors besides %in%?

2005-11-08 Thread paul sorenson
Pete Cap wrote:
> Hello list,
>
> I have two data frames, X (48469,2) and Y (79771,5).
>
> X[,1] contains distinct values of Y[,2].
> I want to match values in X[,1] and Y[,2], then take
> the corresponding value in X[,2] and place it in
> Y[,4].
>
> So far I have been doing it like so:
> for(i in 1:48469) {
>   y[which(x[i,1]==y[,3]),4] <- x[i,2]
> }

I'm not sure but isn't that a case where merge() can help?

cheers
