[R] Find the 50 highest values in a matrix

2010-06-18 Thread uschlecht
Hi, I have a huge matrix (4000 * 2000 data points) and I would like to retrieve the coordinates (column and row) for the top 50 (or x) values. Some positions in the matrix have NA as a value. These should be discarded. My current method is to replace all NAs by 0, then rank all the values and

Re: [R] Find the 50 highest values in a matrix

2010-06-18 Thread Nikhil Kaza
A matrix is just a vector, so order() should work. I haven't verified the following code.
    a <- matrix(rnorm(4000*2000), 4000, 2000)
    b <- order(a, na.last=TRUE, decreasing=TRUE)[1:50]
Use %% or %/% to get the row and column numbers. Nikhil Kaza Asst. Professor, City and Regional Planning University of
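The index arithmetic hinted at above can be sketched as follows. This is a minimal illustration on a small stand-in matrix, not the poster's own code; the names `n`, `idx`, `rows`, `cols`, and `top` are made up for the example. It relies on R storing matrices in column-major order, so a linear position converts to a 1-based row and column with `%%` and `%/%` after subtracting 1.

```r
# Small stand-in matrix; the thread's real data is 4000 x 2000.
set.seed(1)
a <- matrix(rnorm(20 * 10), nrow = 20, ncol = 10)
a[sample(length(a), 15)] <- NA       # sprinkle in some missing values

n <- 5                               # how many top values to retrieve

# Linear (column-major) positions of the n largest values; NAs sort last.
idx <- order(a, na.last = TRUE, decreasing = TRUE)[seq_len(n)]

# Convert linear positions to 1-based row and column numbers.
rows <- (idx - 1) %% nrow(a) + 1
cols <- (idx - 1) %/% nrow(a) + 1

top <- data.frame(row = rows, col = cols, value = a[idx])
```

Base R's `arrayInd(idx, dim(a))` should give the same row and column pairs without the manual arithmetic. Note that if fewer than `n` non-NA values exist, the trailing positions returned by `order()` will point at NAs.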

Re: [R] Find the 50 highest values in a matrix

2010-06-18 Thread Dennis Murphy
Hi: Here's a faked up example:
    a <- matrix(rnorm(4000*2000), 4000, 2000)
    # Generate some NAs in the matrix
    nr <- sample(1:4000, 50)
    nc <- sample(1:2000, 50)
    a[nr, nc] <- NA
    # convert to data frame:
    b <- data.frame(row = rep(1:4000, 2000), col = rep(1:2000, each = 4000), x =

Re: [R] Find the 50 highest values in a matrix

2010-06-18 Thread Peter Ehlers
    m <- matrix(round(rnorm(4000 * 2000), 4), nr = 4000)
    is.na(m) <- sample(8e6, 1e6)
    system.time( idx <- which( matrix(m %in% head(sort(m, TRUE), 50), nr = nrow(m)), arr.ind = TRUE))
    #   user  system elapsed
    #   3.12    0.19    3.18
-Peter Ehlers On 2010-06-18 5:13, Dennis

Re: [R] Find the 50 highest values in a matrix

2010-06-18 Thread Henrik Bengtsson
You might also want to consider _partial sorting_ by using the 'partial' argument of sort(), especially when the number of data points is really large. Since argument 'decreasing=TRUE' is not supported when using 'partial', you have to flip it yourself by negating the values, e.g. x <-
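The negation trick described above might look like the sketch below. The message is cut off before its own example, so this is an illustration under assumptions rather than the poster's code; the names `xv`, `top_unordered`, and `top_n` are hypothetical. With `partial = n`, `sort()` only guarantees that the element at position `n` is in its sorted place and that everything before it is no larger, so the first `n` entries are the `n` smallest values in no particular order.

```r
set.seed(2)
x <- rnorm(1000)
is.na(x) <- sample(length(x), 100)   # inject some NAs

n <- 50
xv <- x[!is.na(x)]                   # drop NAs explicitly before sorting

# Partially sort the negated values: the first n entries are then the
# n smallest of -xv, i.e. the n largest of xv (not yet ordered).
top_unordered <- -sort(-xv, partial = n)[seq_len(n)]

# Fully sorting just these n values is cheap.
top_n <- sort(top_unordered, decreasing = TRUE)
```

The appeal is that the full vector is never completely sorted; only the small result set is.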

Re: [R] Find the 50 highest values in a matrix

2010-06-18 Thread Dennis Murphy
Hi: From what I can tell, Henrik efficiently finds the 50 largest values without the matrix indices and Peter efficiently finds the matrix indices without the corresponding values. Let's combine the two:
    x <- rnorm(8e6)
    is.na(x) <- sample(8e6, 1e6)
    n <- 50
    x1 <- sort(x, decreasing=TRUE)[1:n]
    # Find
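One way to combine the two ideas, returning both the values and their matrix indices, is sketched below. The message above is truncated, so this is not a reconstruction of its code; it is a small example under assumptions, with hypothetical names `vals`, `cutoff`, `pos`, and `res`. It uses a partial sort to find the n-th largest value and then `which(..., arr.ind = TRUE)` to locate every cell at or above that cutoff.

```r
set.seed(3)
m <- matrix(rnorm(40 * 20), nrow = 40)   # small stand-in for the 4000 x 2000 case
is.na(m) <- sample(length(m), 100)       # inject some NAs

n <- 10
vals <- m[!is.na(m)]

# n-th largest value, via a partial sort of the negated values.
cutoff <- -sort(-vals, partial = n)[n]

# Row/column positions of all cells at or above the cutoff.
# Comparisons involving NA yield NA, which which() skips.
pos <- which(m >= cutoff, arr.ind = TRUE)

# Attach the values and order the result from largest to smallest.
res <- data.frame(pos, value = m[pos])
res <- res[order(res$value, decreasing = TRUE), ]
```

If several cells tie with the cutoff value, `res` can contain more than `n` rows; taking `head(res, n)` would trim it, at the cost of picking arbitrarily among the ties.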