You may try SSVD (stochastic SVD):
https://cwiki.apache.org/confluence/display/MAHOUT/Stochastic+Singular+Value+Decomposition
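
For a feel of what the method does, here is a minimal single-machine
sketch of the random-projection idea behind stochastic SVD. It is
written with NumPy purely for illustration and is not Mahout's
distributed implementation; the function name and the oversample /
power_iters parameters are hypothetical names chosen for this sketch.

```python
import numpy as np

def randomized_svd(a, k, oversample=10, power_iters=1, seed=0):
    """Approximate the top-k singular triplets of `a` by random projection.

    Illustrative sketch only; names and defaults are assumptions.
    """
    rng = np.random.default_rng(seed)
    m, n = a.shape
    r = min(k + oversample, m, n)          # projected width: k plus oversampling

    # Sample the range of `a` with a Gaussian test matrix.
    omega = rng.standard_normal((n, r))
    q, _ = np.linalg.qr(a @ omega)

    # Optional power iterations sharpen the basis when the spectrum decays slowly.
    for _ in range(power_iters):
        z, _ = np.linalg.qr(a.T @ q)
        q, _ = np.linalg.qr(a @ z)

    # Solve the small (r x n) problem exactly, then lift back to m dimensions.
    b = q.T @ a
    u_b, s, vt = np.linalg.svd(b, full_matrices=False)
    u = q @ u_b
    return u[:, :k], s[:k], vt[:k, :]
```

For a symmetric positive semidefinite matrix the leading singular
vectors coincide with the leading eigenvectors, which is why this can
stand in for an eigensolver in that case.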

But 4k eigenvectors (or, rather, singular values) is still quite a lot
and may push the precision outside the error estimates. I don't think
we had a precision study for that many. It also needs quite a bit of
memory to compute (not to mention flops). More realistically, you
could probably try 1k singular values. You may try more if you have
access to more powerful hardware than we did in the studies, but
distributed computation time will grow at about k^1.5, i.e. faster
than linear, even if you have enough nodes for the tasks.
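
To put rough numbers on that, here is a back-of-the-envelope sketch. It
assumes the k^1.5 growth holds exactly and uses 1k as the baseline;
both are simplifications, and relative_time is just a helper name made
up for this example. The Lanczos line only restates the figures from
the question (4000 vectors at about 1.5 minutes each).

```python
def relative_time(k, k_ref=1000, exponent=1.5):
    """Runtime at rank k relative to rank k_ref, assuming time ~ k**exponent."""
    return (k / k_ref) ** exponent

# Relative SSVD cost for a few ranks, with k = 1000 as the baseline.
for k in (1000, 2000, 4000):
    print(f"k={k}: about {relative_time(k):.2f}x the k=1000 run")

# Lanczos estimate from the question: 4000 eigenvectors at ~1.5 minutes each.
lanczos_hours = 4000 * 1.5 / 60
print(f"Lanczos: about {lanczos_hours:.0f} hours")
```

Under that assumption, going from 1k to 4k costs about 8 times the
1k run rather than 4 times.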

-d

On Thu, Jul 19, 2012 at 6:12 PM, Aniruddha Basak <[email protected]> wrote:
> Hi,
> I am working on a clustering problem which involves determining the
> largest "k" eigenvectors of a very large matrix. The matrices I work on
> are typically of the order of 10^6 by 10^6.
> Trying to do this using the Lanczos solver available in Mahout, I found it
> is very slow and takes around 1.5 minutes to compute each eigenvector.
> Hence to get 4000 eigenvectors, it takes 100 hours or 4 days !!
>
> So I am looking for something faster to solve the "Eigen decomposition"
> problem for very large sparse matrices. Please suggest what I should use.
>
>
> Thanks,
> Aniruddha
>
