Re: [R] SVD on very large data matrix

Stefan Evert Mon, 08 Apr 2013 17:45:53 -0700

On 8 Apr 2013, at 23:21, Andy Cooper <[email protected]> wrote:


> So, no one has direct experience running irlba on a data matrix as large as 
> 500,000 x 1,000 or larger?

I haven't used irlba in production code, but I ran a few benchmarks on much 
smaller matrices.  My impression (also from the documentation, I think) was 
that irlba is designed for use cases where only a few singular values are 
needed, up to 10 or so.  With 50 singular values, I found randomized SVD to be 
faster than irlba.
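
For readers who have not seen the randomized approach, here is a minimal 
sketch in base R.  It is not the code used in the benchmarks above; the 
function name rsvd_sketch and the parameters k (number of singular values), 
p (oversampling) and q (power iterations) are assumptions made for 
illustration, and the input is assumed to have at least as many rows as 
columns.

```r
# Randomized SVD sketch: random range finder followed by a small exact SVD.
# rsvd_sketch, k, p and q are hypothetical names, not part of any package.
rsvd_sketch <- function(A, k, p = 10, q = 2) {
  n <- ncol(A)
  l <- min(k + p, n)                     # oversampled subspace dimension
  Omega <- matrix(rnorm(n * l), n, l)    # Gaussian test matrix
  Q <- qr.Q(qr(A %*% Omega))             # orthonormal basis for the sampled range
  for (i in seq_len(q)) {                # power iterations sharpen the basis
    Q <- qr.Q(qr(crossprod(A, Q)))       # orthonormalise t(A) %*% Q
    Q <- qr.Q(qr(A %*% Q))
  }
  B <- crossprod(Q, A)                   # small l x n projection of A
  s <- svd(B)
  list(d = s$d[1:k],
       u = (Q %*% s$u)[, 1:k, drop = FALSE],
       v = s$v[, 1:k, drop = FALSE])
}

# Small stand-in matrix of exact rank 5, so the sketch can be checked
# against svd() on any machine.
set.seed(1)
A <- matrix(rnorm(200 * 5), 200, 5) %*% matrix(rnorm(5 * 30), 5, 30)
r <- rsvd_sketch(A, k = 5)
max(abs(r$d - svd(A)$d[1:5]))            # difference from the exact values
```

With the irlba package installed, the comparable call would be 
irlba(A, nv = 50); timing both on a subsample of the real data is probably 
the most reliable way to choose between them.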

If you're working with a dense 500,000 x 1000 matrix, you'll need a lot of RAM. 
Have you tried the svd() function?  Most good BLAS/LAPACK libraries include 
highly optimised SVD code; if your machine has enough CPU cores, even a 
high-dimensional SVD might be fast enough.
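
To put a rough number on the RAM requirement, the sketch below computes the 
size of the dense matrix in double precision and then calls svd() on a small 
stand-in matrix.  The variable names and the stand-in dimensions are 
assumptions for illustration; nu = 0 asks svd() not to return the left 
singular vectors, which for the full problem would be a second 
500,000 x 1000 matrix.

```r
# Storage for the dense matrix alone, at 8 bytes per double-precision entry.
n_rows <- 500000
n_cols <- 1000
bytes  <- n_rows * n_cols * 8
gib    <- bytes / 2^30         # about 3.7 GiB, before any working copies
gib

# Small stand-in so the call pattern can be tried on any machine.
set.seed(42)
A <- matrix(rnorm(2000 * 50), 2000, 50)
s <- svd(A, nu = 0, nv = 10)   # no left vectors, first 10 right vectors
length(s$d)                    # all min(nrow, ncol) singular values are returned
dim(s$v)
```

svd() also makes internal copies of its input, so the actual peak memory use 
will be some multiple of the figure computed here.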

Best,
Stefan

