Both of those complaints are addressed by Dmitriy's work on the stochastic
SVD (SSVD) decomposition.
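
For context, the stochastic approach follows the randomized-projection idea
of Halko, Martinsson, and Tropp: sketch the range of the matrix with a few
random probes, then orthogonalize and decompose the small sketch. Below is a
minimal single-machine NumPy sketch of that idea; the function name and
parameters are mine, and it illustrates the technique rather than Mahout's
actual SSVD code.

import numpy as np

def randomized_svd(A, k, p=10, seed=0):
    # Rank-k truncated SVD via random projection (Halko et al. style).
    # Only products with A and A.T are needed, so in a distributed
    # setting A can be streamed row by row and the probe matrix Omega
    # regenerated on the fly from the seed -- nothing of length n has to
    # sit in one machine's memory (complaint (a)).
    rng = np.random.default_rng(seed)
    m, n = A.shape
    Omega = rng.standard_normal((n, k + p))   # p = oversampling
    Y = A @ Omega                             # first pass over A
    # One thin QR on the small (m, k+p) sketch replaces the per-iteration
    # full re-orthogonalization that Lanczos needs (complaint (b)).
    Q, _ = np.linalg.qr(Y)
    B = Q.T @ A                               # second pass over A
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ Ub)[:, :k], s[:k], Vt[:k, :]

For matrices with a decaying spectrum, something like
U, s, Vt = randomized_svd(np.random.rand(1000, 500), k=10) recovers the top
singular triples well; accuracy improves with the oversampling p or an
extra power iteration.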

On Sat, Dec 17, 2011 at 6:00 AM, Sebastian Schelter <[email protected]> wrote:

> Hi there,
>
> Lately I have been playing with Mahout to decompose the adjacency
> matrices of large graphs. I stumbled on a paper by Christos Faloutsos
> that describes a variation of the Lanczos algorithm they run on top of
> Hadoop for this.
> They even explicitly mention Mahout:
>
> "Very recently(March 2010), the Mahout project [2] provides
> SVD on top of HADOOP. Due to insufficient documentation, we were not
> able to find the input format and run a head-to-head comparison. But,
> reading the source code, we discovered that Mahout suffers from two
> major issues: (a) it assumes that the vector (b, with n=O(billion)
> entries) fits in the memory of a single machine, and (b) it implements
> the full re-orthogonalization which is inefficient."
>
> http://www.cs.cmu.edu/~ukang/papers/HeigenPAKDD2011.pdf
>
> --sebastian
>
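
To make the two quoted complaints concrete, here is single-machine
pseudocode (NumPy, my own naming, not the HEigen or Mahout source) for
Lanczos with full re-orthogonalization on a symmetric matrix such as a
graph adjacency matrix:

import numpy as np

def lanczos_full_reorth(A, b, k):
    # k Lanczos steps on symmetric A, starting from a dense length-n
    # vector b -- the vector the paper notes must fit in one machine's
    # memory (roughly 8 GB for n = 1e9 doubles).
    n = A.shape[0]
    V = np.zeros((n, k + 1))
    alpha, beta = np.zeros(k), np.zeros(k)
    V[:, 0] = b / np.linalg.norm(b)
    for j in range(k):
        w = A @ V[:, j]
        alpha[j] = V[:, j] @ w
        w -= alpha[j] * V[:, j]
        if j > 0:
            w -= beta[j - 1] * V[:, j - 1]
        # Full re-orthogonalization: project w against every stored basis
        # vector, an O(n*j) job at step j. This is the inefficiency the
        # paper flags.
        w -= V[:, : j + 1] @ (V[:, : j + 1].T @ w)
        beta[j] = np.linalg.norm(w)
        if beta[j] == 0:
            break
        V[:, j + 1] = w / beta[j]
    # Eigenvalues of the small tridiagonal T approximate the extremal
    # eigenvalues of A.
    T = np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)
    return np.linalg.eigvalsh(T), V

Every step touches dense length-n vectors and re-projects against all
previous basis vectors, which is exactly what the stochastic decomposition
above sidesteps.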
