Re: [R] Need a vectorized way to avoid two nested FOR loops

2009-10-08 Thread jim holtman
Here is one way of doing it:

    n <- 20
    set.seed(2)
    # create test dataframe
    x <- as.data.frame(matrix(sample(1:2, n*6, TRUE), nrow=n))
    x
      V1 V2 V3 V4 V5 V6
    1  1  2  2  2  1  1
    2  2  1  1  2  2  1
    3  2  2  1  2  1  2
    4  1  1  1  1  1  2
    5  2  1  2  2  1  1
    6  2  1  2  1  2  2
    7  1  1  2  1

Re: [R] Need a vectorized way to avoid two nested FOR loops

2009-10-08 Thread jim holtman
I answered the wrong question. Here is the code to find all the matches for each row:

    n <- 20
    set.seed(2)
    # create test dataframe
    x <- as.data.frame(matrix(sample(1:2, n*6, TRUE), nrow=n))
    x
    x.col <- c(1,3,5)
    # match against all the other rows
    x.match1 <- apply(x[, x.col], 1, function(a){
        .mat <-

Re: [R] Need a vectorized way to avoid two nested FOR loops

2009-10-08 Thread Dimitris Rizopoulos
Another approach is:

    n <- 20
    set.seed(2)
    x <- as.data.frame(matrix(sample(1:2, n*6, TRUE), nrow = n))
    x.col <- c(1, 3, 5)
    # build one key string per row from the selected columns
    values <- do.call(paste, c(x[x.col], sep = "\r"))
    # for each row, find the other rows with the same key
    out <- lapply(seq_along(values), function (i) {
        ind <- which(values == values[i])
        ind[!ind %in% i]
    })
    out

Best,
Dimitris

Re: [R] Need a vectorized way to avoid two nested FOR loops

2009-10-08 Thread Bert Gunter
If I understand your intent, I believe you can get what you want much faster (no interpreted loops and linear times) by looking at this slightly differently. First of all, the choice of columns is unimportant, as indexing can be used to create a data frame containing only the columns of
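The message above is cut off in this listing, so the following is not Bert's code. It is only a minimal sketch of one loop-light way to group matching rows, assuming the names `d` and `vars` from the original question. The helper name `group_matches` and the test data are hypothetical.

```r
# Hypothetical test data; `d` and `vars` follow the naming in the question.
set.seed(2)
d <- as.data.frame(matrix(sample(1:2, 20 * 6, TRUE), nrow = 20))
vars <- c(1, 3, 5)

# Hypothetical helper: group rows by a key built from the chosen columns.
group_matches <- function(d, vars) {
  # Indexing keeps only the columns of interest.
  sub <- d[, vars, drop = FALSE]
  # One key string per row; "\r" is a separator unlikely to occur in data.
  key <- do.call(paste, c(unname(as.list(sub)), sep = "\r"))
  # A single pass groups the row indices that share a key.
  groups <- split(seq_len(nrow(d)), key)
  # For each row, look up its group and drop the row itself.
  lapply(seq_len(nrow(d)), function(i) setdiff(groups[[key[i]]], i))
}

matches <- group_matches(d, vars)
```

The grouping is done once by `split()`, so rows are never compared pairwise. The final `lapply()` is still an R-level loop, but it only does a lookup per row.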

Re: [R] Need a vectorized way to avoid two nested FOR loops

2009-10-08 Thread Rama Ramakrishnan
Bert, Jim, Dimitris and Joris, Thank you all very much for your prompt help and suggestions. After trying the ideas out, I have decided to go with Bert's approach since it is by far the fastest of the lot. Thanks again! Rama Ramakrishnan On Oct 8, 2009, at 12:49 PM, Bert Gunter wrote:

Re: [R] Need a vectorized way to avoid two nested FOR loops

2009-10-08 Thread Dimitris Rizopoulos
Bert Gunter wrote: If I understand your intent, I believe you can get what you want much faster (no interpreted loops and linear times) by looking at this slightly differently. First of all, the choice of columns is unimportant, as indexing can be used to create a data frame containing only the

[R] Need a vectorized way to avoid two nested FOR loops

2009-10-07 Thread Rama Ramakrishnan
Hi Friends, I have a data frame d. Let vars be the column indices for a subset of the columns in d (e.g., vars <- c(1,3,4,8)). For each row r in d, I want to collect all the other rows in d that match the values in row r for just the columns in vars. The naive way to do this is to have a
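The question is truncated here, but the subject line refers to two nested FOR loops. A naive version of that idea might look like the sketch below. The function name `naive_matches` and the test data are assumptions made for illustration, not code from the original post.

```r
# Hypothetical test data; `d` and `vars` follow the naming in the question.
set.seed(2)
d <- as.data.frame(matrix(sample(1:2, 20 * 6, TRUE), nrow = 20))
vars <- c(1, 3, 5)

# Hypothetical helper: compare every row with every other row (O(n^2)).
naive_matches <- function(d, vars) {
  # Work on a matrix of just the selected columns for simple row comparison.
  m <- as.matrix(d[, vars, drop = FALSE])
  n <- nrow(m)
  out <- vector("list", n)
  for (r in seq_len(n)) {
    hits <- integer(0)
    for (s in seq_len(n)) {
      # Row s matches row r if all selected columns agree.
      if (s != r && all(m[r, ] == m[s, ])) {
        hits <- c(hits, s)
      }
    }
    out[[r]] <- hits
  }
  out
}

res <- naive_matches(d, vars)
```

Each element of `res` holds the indices of the other rows that agree with that row on the columns in `vars`. The cost grows with the square of the number of rows, which is what the replies in this thread try to avoid.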