Really? I guess PageRank in Mahout was removed due to the inherent network
bottleneck of MapReduce. But I didn't know MLlib had an implementation.
Is the MLlib implementation based on Lanczos or SSVD? Just curious...
On 17/02/2014 11:11 PM, Dmitriy Lyubimov wrote:
I bet PageRank in MLlib on Spark finds the stationary distribution much faster.
On Feb 17, 2014 1:33 PM, "peng" wrote:
Agreed, and this is the case where the Lanczos algorithm is obsolete.
My point is: if SSVD is unable to find the eigenvector of an asymmetric
matrix (a common formulation of PageRank, some random walks,
and many other things), then we still have to rely on a large-scale Lanczos
algorithm.
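To make the asymmetric eigenvector problem concrete, here is a minimal sketch of PageRank as plain power iteration on a tiny directed graph. It is neither the Mahout nor the Spark implementation; the function name `pagerank`, the damping factor 0.85, the tolerance, and the four-page `links` graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=1000):
    """Power iteration for the stationary distribution of a damped random walk.

    `links` maps each page to the list of pages it links to (hypothetical input
    format). The implied transition matrix is not symmetric in general.
    """
    nodes = sorted(links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}  # start from the uniform distribution
    for _ in range(max_iter):
        # Rank held by pages without out-links is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not links[v])
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for w in links[v]:
                new[w] += damping * rank[v] / len(links[v])
        delta = sum(abs(new[v] - rank[v]) for v in nodes)  # L1 change
        rank = new
        if delta < tol:
            break
    return rank


# Hypothetical graph: "a" links to "b" and "c", but neither links back to both.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(links)
```

The result is the dominant eigenvector (eigenvalue 1) of the damped transition matrix, normalised to sum to 1. That is the quantity an SVD of the same matrix does not directly provide.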
On Mon 17 Feb 2014 04:25:16 PM EST, Ted Dunning wrote:
For the symmetric case, SVD is eigen decomposition.
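As a small check of the statement above, the sketch below compares eigenvalues and singular values of a symmetric 2x2 matrix using closed-form expressions. The matrix values and helper names are assumptions for illustration. One caveat it makes visible: for a symmetric matrix that is not positive semidefinite, the singular values are the absolute values of the eigenvalues, so the two decompositions agree only up to sign.

```python
import math


def sym_eigenvalues(a, b, d):
    """Eigenvalues of the symmetric 2x2 matrix [[a, b], [b, d]], largest first."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return mean + radius, mean - radius


def singular_values(a, b, d):
    """Singular values of [[a, b], [b, d]]: square roots of eigenvalues of A^T A."""
    # For symmetric A, A^T A = A * A, which is again symmetric.
    hi, lo = sym_eigenvalues(a * a + b * b, b * (a + d), b * b + d * d)
    return math.sqrt(hi), math.sqrt(max(lo, 0.0))


# Hypothetical indefinite symmetric matrix [[1, 2], [2, -3]].
eig = sym_eigenvalues(1.0, 2.0, -3.0)
sv = singular_values(1.0, 2.0, -3.0)
abs_eig = tuple(sorted((abs(x) for x in eig), reverse=True))
```

Here `eig` contains one negative value, while `sv` matches `abs_eig` to rounding error.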
On Mon, Feb 17, 2014 at 1:12 PM, "peng" wrote:
If SSVD is not designed for such eigenvector problems, then I would vote
for retaining the Lanczos algorithm.
However, I would like to see the opposite case: I have tested both
algorithms on the symmetric case, and SSVD is much faster and more accurate
than its competitor.
Yours Peng
On Wed 12 Feb 2014 03:25:47 PM EST, peng wrote:
In PageRank I'm afraid I have no other option than the eigenvector
for \lambda, not the singular vectors u & v :) The PageRank in Mahout was
removed along with the other graph-based algorithms.
On Tue 11 Feb 2014 06:34:17 PM EST, Ted Dunning wrote:
SSVD is very probably better than Lanczos for any large decomposition.
That said, it does SVD, not eigen decomposition, which means that the
question of symmetric matrices or positive definiteness doesn't much
matter.
Do you really need eigen-decomposition?
On Tue, Feb 11, 2014 at 2:55 PM, "peng" wrote:
Just asking about a possible replacement for our Lanczos-based PageRank
implementation. - Peng