We could attempt to patch it to terminate once most score values change by
less than a specified threshold between iterations.
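
As a rough sketch of that idea (toy Python, not the Nutch code): the loop
below applies the update quoted further down, (1 - damping) + damping *
sum(inlinkScore), and stops early once a given fraction of nodes moved by
less than a threshold. The names link_rank, threshold and fraction are made
up for illustration, and both the starting score of 1.0 and the splitting of
a node's score evenly across its outlinks are assumptions on my part.

```python
def link_rank(outlinks, damping=0.85, max_iterations=10,
              threshold=1e-4, fraction=0.95):
    """Toy iterative link scoring with an early-exit convergence check.

    outlinks: dict mapping each node to the list of nodes it links to.
    Returns (scores, iterations_run).
    """
    # Collect every node that appears as a source or as a target.
    nodes = set(outlinks)
    for targets in outlinks.values():
        nodes.update(targets)

    # Assumption: every node starts with the same score.
    scores = {n: 1.0 for n in nodes}
    iterations = 0

    for iterations in range(1, max_iterations + 1):
        # Sum the score each node receives from its inlinks, using the
        # scores produced by the previous iteration.
        inlink_sum = {n: 0.0 for n in nodes}
        for src, targets in outlinks.items():
            if targets:
                share = scores[src] / len(targets)
                for t in targets:
                    inlink_sum[t] += share

        new_scores = {n: (1 - damping) + damping * inlink_sum[n]
                      for n in nodes}

        # Count nodes whose score changed by less than the threshold.
        settled = sum(1 for n in nodes
                      if abs(new_scores[n] - scores[n]) < threshold)
        scores = new_scores

        # Stop once "most" nodes have settled.
        if settled >= fraction * len(nodes):
            break

    return scores, iterations


graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
scores, used = link_rank(graph, max_iterations=100, threshold=1e-6)
print(used, scores)
```

The fixed loop count stays in place as an upper bound, so link sets that
never settle still terminate after max_iterations.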

> Also, to add to what Markus has said: a true PageRank-type calculation
> would run until it converges. LinkRank, being iterative, runs a given
> number of loops, 10 by default. This tends to converge for many, but
> not all, link sets.
> 
> Dennis
> 
> On 10/14/2011 08:26 AM, Markus Jelsma wrote:
> > On Friday 14 October 2011 15:03:16 Thomas Anderson wrote:
> >> I read the wiki (http://wiki.apache.org/nutch/NewScoring#LinkRank), which
> >> states that the LinkRank process is iterative and that scores tend to
> >> converge over the iterations. However, from the source it seems that
> >> the job always reads from the same input path and writes to the same
> >> output path. For instance,
> >> 
> >> runCounter() reads input from nodes and returns the number of nodes
> >> runInitializer() reads from nodes and initializes inLinkScore
> >> 
> >> then iteration (default is 10)
> >> runInverted() reads from nodes, where inLinkScore is initialized,
> >> outlinks, and loops; then produces output to
> >> linkrank-<random>/inverted
> >> runAnalysis() reads from nodes (inLinkScore is inited), and inverted
> >> path (in previous step); then produces output to
> >> linkrank-<random>/nodes
> > 
> > The scores for X and Y after the first iteration are (1 - damping) +
> > (damping * sum(inlinkScore)). Suppose X also links to Y; then
> > sum(inlinkScore) for Y will change, as X has a new value after the first
> > iteration.
> > 
> > This is convergence: the deltas between iterations flatten out with
> > each iteration.
> > 
> >> It seems to me that, with the same process used to calculate the scores,
> >> the result of LinkRank will always be the same at each iteration. So I
> >> can't quite understand how the scores would converge. What would be the
> >> key place to look? Or is there any doc that explains this in more
> >> detail?
> >> 
> >> Thanks.
