RE: ranking algorithm

Chirag Chaman Wed, 08 Jun 2005 13:52:45 -0700

It's a modification of the Google PigeonRank:
http://www.google.com/technology/pigeonrank.html

.
.
.
.
.
.
.
.
.
.
.
.
.
.
Okay, serious now:

Here's an overview of Nutch scoring:

1. TF-IDF (Term Frequency - Inverse Document Frequency) scoring,
computed at query time.

2. Link analysis boost. This is done by computing PageRank as a
separate analysis step. That step takes too long and is not actively
supported in newer versions, so as an alternative you can use the "poor
man's PageRank" with the following link boost: sqrt(inlinks)
(controlled via properties files, see nutch-default.xml).

3. Field-specific boosts, i.e. if the query matches the title or anchor
text, the final score is boosted by a factor. I would encourage you to
look at the explain page.
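
To make the three pieces concrete, here is a rough Python sketch of how they
could combine. It is not Nutch's actual code: the field weights in
FIELD_BOOSTS, the smoothed idf formula, and the function names are all
assumptions made up for illustration. Only the sqrt(inlinks) boost comes
straight from the description above.

```python
import math

# Hypothetical per-field weights, chosen only for illustration.
FIELD_BOOSTS = {"title": 1.5, "anchor": 2.0, "content": 1.0}


def tf_idf(term, tokens, corpus):
    """Plain TF-IDF: raw term count times a smoothed log inverse document frequency."""
    tf = tokens.count(term)
    df = sum(1 for doc in corpus if term in doc)
    idf = math.log((1 + len(corpus)) / (1 + df)) + 1.0
    return tf * idf


def link_boost(inlinks):
    """The "poor man's PageRank": square root of the inlink count."""
    return math.sqrt(inlinks)


def score(term, page, pages):
    """Sum the boosted per-field TF-IDF values, then scale by the link boost."""
    total = 0.0
    for field, boost in FIELD_BOOSTS.items():
        # Document frequency is counted per field across all pages.
        corpus = [p["fields"].get(field, []) for p in pages]
        total += boost * tf_idf(term, page["fields"].get(field, []), corpus)
    return total * link_boost(page["inlinks"])


pages = [
    {"fields": {"title": ["nutch", "search"], "content": ["open", "source"]},
     "inlinks": 4},
    {"fields": {"title": ["other"], "content": ["nutch"]},
     "inlinks": 1},
]
ranked = sorted(pages, key=lambda p: score("nutch", p, pages), reverse=True)
```

In this toy data the first page ranks higher: it matches in the title and has
more inlinks. Note that taking sqrt(inlinks) literally gives a page with no
inlinks a score of zero in this sketch.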

Here are a couple of links that might help: 

JavaDoc:
http://lucene.apache.org/nutch/apidocs/net/nutch/indexer/NutchSimilarity.html

Erik's presentation:
http://www.cooug.org/java/presentations/september2004/NEJUG.pdf
(look at scoring slide)

-----Original Message-----
From: Santi Gori [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, June 08, 2005 3:05 PM
To: [email protected]
Subject: Re: ranking algorithm

On 6/8/05, Santi Gori <[EMAIL PROTECTED]> wrote:
> Hello. I would like to know how nutch`s ranking algorithm works.
> Thanks
>
