Hi there, I have been playing with Mahout lately to decompose the adjacency matrices of large graphs. I stumbled upon a paper by Christos Faloutsos and co-authors that describes a variation of the Lanczos algorithm which they run on top of Hadoop for this purpose. They even mention Mahout explicitly:
"Very recently(March 2010), the Mahout project [2] provides SVD on top of HADOOP. Due to insufficient documentation, we were not able to find the input format and run a head-to-head comparison. But, reading the source code, we discovered that Mahout suffers from two major issues: (a) it assumes that the vector (b, with n=O(billion) entries) fits in the memory of a single machine, and (b) it implements the full re-orthogonalization which is inefficient."

http://www.cs.cmu.edu/~ukang/papers/HeigenPAKDD2011.pdf

--sebastian
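
For anyone unsure what "full re-orthogonalization" refers to, here is a minimal single-machine sketch in plain Python. It is neither Mahout's code nor the paper's algorithm; the function names (lanczos_full_reorth, matvec, dot) and the toy 4-node path graph are my own assumptions for illustration. The inner loop over all previous basis vectors is the part the quote calls inefficient, and every vector it touches has n entries that have to be held in memory.

```python
import math
import random


def matvec(A, v):
    """Dense matrix-vector product for a matrix given as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def lanczos_full_reorth(A, k, seed=0):
    """Run up to k Lanczos steps on a symmetric matrix A.

    Returns (alphas, betas, basis): the diagonal and off-diagonal of the
    tridiagonal matrix, plus the orthonormal Lanczos vectors.
    """
    n = len(A)
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(dot(v, v))
    basis = [[x / norm for x in v]]
    alphas, betas = [], []
    for j in range(k):
        w = matvec(A, basis[j])
        alphas.append(dot(w, basis[j]))
        # Full re-orthogonalization: remove the component of w along
        # EVERY basis vector built so far, not just the last two.
        for q in basis:
            c = dot(w, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        beta = math.sqrt(dot(w, w))
        if j == k - 1 or beta < 1e-12:
            break  # requested steps done, or Krylov space exhausted
        betas.append(beta)
        basis.append([x / beta for x in w])
    return alphas, betas, basis


# Toy input (assumption): adjacency matrix of a 4-node path graph.
A = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
alphas, betas, basis = lanczos_full_reorth(A, k=4)
```

Step j does O(j * n) extra work for the projections and needs all j earlier vectors available, which is cheap for a toy graph but adds up when n is in the billions.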
