It's a modification of the Google PigeonRank: http://www.google.com/technology/pigeonrank.html
. . . . . . . . . . . . . .

Okay, serious now. Here's an overview of Nutch scoring:

1. Text scoring: a TF-IDF (Term Frequency - Inverse Document Frequency) score is computed at query time.

2. Link analysis boost: this is done by computing PageRank as a separate analysis step. That step takes too long and is not actively supported in newer versions, so as an alternative you can use the "poor man's PageRank" and apply the following link boost: sqrt(inlinks) (controlled via properties files, see nutch-default.xml).

3. Field boosts: you can set a boost for specific fields, i.e. if the query matches the title or anchor text, the final score is boosted by a factor.

I would encourage you to look at the explain page. Here are a couple of links that might help:

JavaDoc: http://lucene.apache.org/nutch/apidocs/net/nutch/indexer/NutchSimilarity.html
Erik's presentation: http://www.cooug.org/java/presentations/september2004/NEJUG.pdf (look at the scoring slide)

-----Original Message-----
From: Santi Gori [mailto:[EMAIL PROTECTED]
Sent: Wednesday, June 08, 2005 3:05 PM
To: [email protected]
Subject: Re: ranking algorithm

On 6/8/05, Santi Gori <[EMAIL PROTECTED]> wrote:
> Hello. I would like to know how nutch`s ranking algorithm works.
> Thanks
>
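In case it helps to see the three pieces of the scoring overview combined, here is a rough sketch in Python. It is not Nutch's actual code: the function names, the field boost values, and the exact TF-IDF formula are all assumptions made for illustration, and the real values come from the Nutch configuration files. Only the sqrt(inlinks) link boost is taken directly from the description above.

```python
import math

# Hypothetical field boosts, chosen only for illustration.
# The real factors are set in the Nutch configuration.
FIELD_BOOSTS = {"title": 1.5, "anchor": 2.0, "content": 1.0}


def tf_idf(term, tokens, doc_freq, num_docs):
    """Toy TF-IDF weight: sqrt(tf) * (1 + ln(N / (df + 1))).

    This particular formula is an assumption, not Nutch's exact one.
    """
    tf = tokens.count(term)
    if tf == 0:
        return 0.0
    idf = 1.0 + math.log(num_docs / (doc_freq + 1))
    return math.sqrt(tf) * idf


def link_boost(inlinks):
    """The "poor man's PageRank": square root of the inlink count.

    Taken literally, a page with zero inlinks gets a boost of 0.
    """
    return math.sqrt(inlinks)


def score(query_terms, doc, doc_freqs, num_docs):
    """Sum field-boosted TF-IDF over all query terms, then apply the link boost."""
    text_score = 0.0
    for term in query_terms:
        df = doc_freqs.get(term, 0)
        for field, boost in FIELD_BOOSTS.items():
            tokens = doc["fields"].get(field, [])
            text_score += boost * tf_idf(term, tokens, df, num_docs)
    return text_score * link_boost(doc["inlinks"])


# Two made-up pages with identical text but different inlink counts.
page_a = {"fields": {"title": ["nutch"], "content": ["nutch", "search"]}, "inlinks": 4}
page_b = {"fields": {"title": ["nutch"], "content": ["nutch", "search"]}, "inlinks": 16}
doc_freqs = {"nutch": 10}  # hypothetical document frequency of the term

score_a = score(["nutch"], page_a, doc_freqs, num_docs=1000)
score_b = score(["nutch"], page_b, doc_freqs, num_docs=1000)
```

With identical text, the page with four times as many inlinks ends up with twice the score, which is the dampening effect the square root is meant to give.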
