Re: [R] Correlation of huge matrix saved as binary file

2012-03-03 Thread Thomas Lumley
On Sat, Mar 3, 2012 at 2:36 PM, Peter Langfelder peter.langfel...@gmail.com wrote: 3. Instead of calculating the correlations one-by-one, calculate them in small blocks (if you have enough memory and you run a 64-bit R). With 900M rows, you will only be able to put a 900Mx2 into an R object,
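The block-wise idea in the quoted point 3 can be sketched as follows. This is only an illustration on a small in-memory toy matrix, not the code used in the thread: the sizes, `block_size`, and the variable names are assumptions, and it assumes there are no missing values. It relies on `cor(x, y)` returning the correlations between the columns of `x` and the columns of `y`.

```r
# Toy stand-in for the real data (hypothetical sizes, far smaller than 900M x 9k).
set.seed(1)
x <- matrix(rnorm(1000 * 12), nrow = 1000, ncol = 12)

block_size <- 4                      # columns per block (assumed; tune to memory)
p <- ncol(x)
starts <- seq(1, p, by = block_size)
result <- matrix(NA_real_, p, p)

for (i in starts) {
  cols_i <- i:min(i + block_size - 1, p)
  # Only visit blocks on or above the diagonal; the rest follows by symmetry.
  for (j in starts[starts >= i]) {
    cols_j <- j:min(j + block_size - 1, p)
    # Cross-correlation block between two groups of columns.
    blk <- cor(x[, cols_i, drop = FALSE], x[, cols_j, drop = FALSE])
    result[cols_i, cols_j] <- blk
    result[cols_j, cols_i] <- t(blk)
  }
}
```

In the real setting each column block would be read from the binary file rather than sliced from `x`, so only two blocks need to be in memory at a time.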

[R] Correlation of huge matrix saved as binary file

2012-03-02 Thread Bryo
Hi, I have a 900,000,000*9,000 matrix where I need to calculate the correlation between all entries along the smaller dimension, thus creating a 9k*9k correlation matrix. This matrix is too big to be loaded into R, and is saved as a binary file. To access the data in the file I use mmap and some
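One possible way to get a full correlation matrix without ever holding the whole matrix in memory is to stream over chunks of rows and accumulate the row count, the column sums, and the cross-product matrix. This is a hedged sketch, not the approach proposed in the thread: it assumes no missing values, uses a small in-memory toy matrix in place of the binary file, and all names (`chunk_rows`, `col_sums`, `cross`) are hypothetical. The single-pass sums can also lose precision when column means are large relative to their spread.

```r
# Toy stand-in for the file-backed data (hypothetical sizes).
set.seed(42)
x <- matrix(rnorm(5000 * 6), nrow = 5000, ncol = 6)

chunk_rows <- 1000                   # rows per chunk (assumed; tune to memory)
p <- ncol(x)
n <- 0                               # running row count
col_sums <- numeric(p)               # running sum of each column
cross <- matrix(0, p, p)             # running sum of t(chunk) %*% chunk

for (start in seq(1, nrow(x), by = chunk_rows)) {
  rows <- start:min(start + chunk_rows - 1, nrow(x))
  chunk <- x[rows, , drop = FALSE]   # in practice: read this chunk from the file
  n <- n + nrow(chunk)
  col_sums <- col_sums + colSums(chunk)
  cross <- cross + crossprod(chunk)
}

# Sample covariance from the accumulated sums, then rescale to correlations.
cov_mat <- (cross - tcrossprod(col_sums) / n) / (n - 1)
cor_mat <- cov2cor(cov_mat)
```

For 9,000 columns the accumulator `cross` is a 9,000 x 9,000 double matrix, roughly 650 MB, which is the only large object that has to stay in memory.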

Re: [R] Correlation of huge matrix saved as binary file

2012-03-02 Thread Peter Langfelder
I don't think you can speed it up by a whole lot... but you can try a few things, especially if you don't have missing data in the matrix (which you probably don't). The main question is what takes most of the time: the API calls or the cor() call? If it's cor(), here's what you can try: 1.