Yep. Essentially, PageRank is just the first left singular vector of (L - B), where L is the L1-normalized outbound link graph matrix and B is the constant matrix whose entries all have the value "beta" (beta being the probability of randomly jumping to some other page).
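
To make the roles of L and B concrete, here is a rough sketch in plain Java. It uses no Mahout classes, and it is ordinary power iteration rather than an SVD. It assumes the usual damped formulation, in which the link term is scaled by (1 - beta) and the constant term contributes beta/n per entry; that scaling, the dense array input, and the names pageRank and links are illustrative assumptions only. The point it shows is that the constant matrix never has to be stored: multiplying it by a vector only needs the sum of that vector.

```java
import java.util.Arrays;

public class PageRankSketch {

    /**
     * Power iteration for PageRank on a small dense graph.
     * links[i][j] = 1.0 if page i links to page j (hypothetical input format).
     */
    static double[] pageRank(double[][] links, double beta, int iterations) {
        int n = links.length;

        // L: L1-normalize each row so the outbound weights of a page sum to 1.
        double[][] l = new double[n][n];
        for (int i = 0; i < n; i++) {
            double rowSum = Arrays.stream(links[i]).sum();
            for (int j = 0; j < n; j++) {
                // Assumption: a page with no outbound links is treated as linking to every page.
                l[i][j] = rowSum > 0 ? links[i][j] / rowSum : 1.0 / n;
            }
        }

        double[] rank = new double[n];
        Arrays.fill(rank, 1.0 / n);

        for (int it = 0; it < iterations; it++) {
            double total = Arrays.stream(rank).sum();
            double[] next = new double[n];
            // Constant-matrix term: every entry receives (beta / n) times the sum of the vector,
            // so the dense n-by-n matrix is never materialized.
            Arrays.fill(next, beta * total / n);
            // Link term: (1 - beta) * L^T * rank, which only touches actual links in a sparse setting.
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < n; j++) {
                    next[j] += (1.0 - beta) * l[i][j] * rank[i];
                }
            }
            rank = next;
        }
        return rank;
    }

    public static void main(String[] args) {
        // Three pages: 0 links to 1 and 2, 1 links to 2, 2 links to 0.
        double[][] links = {
            {0, 1, 1},
            {0, 0, 1},
            {1, 0, 0},
        };
        System.out.println(Arrays.toString(pageRank(links, 0.15, 50)));
    }
}
```

The same rank-one shortcut is what would let a distributed matrix-vector product account for the random-jump term without ever building a dense matrix.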

To compute this with minimal effort, using the distributed SVD already in Mahout is probably the way to go. The trick will be to have a modified DistributedRowMatrix.timesSquared() method, which takes into account the B matrix. I can spell that out in a little more detail if you'd like.

  -jake

On Wed, Jun 30, 2010 at 10:46 PM, Ted Dunning <[email protected]> wrote:

> Also note that there *is* a pretty large scale SVD solver in Mahout. That
> can give you a short-cut to pageRank.
>
> On Wed, Jun 30, 2010 at 12:11 PM, Grant Ingersoll <[email protected]> wrote:
>
> > > If not, I'd like to implement it. Any advice appreciated,
> >
> > Have a look at the matrix/vector libraries. See also the How To Contribute
> > page on the wiki.
