Re: Nutch scoring algorithm

Andy Liu Mon, 11 Apr 2005 05:17:40 -0700

fieldNorm is lengthNorm * document boost.  The lengthNorm formula is
defined in Lucene's Similarity class and is a function of the number
of terms in the field.  The document boost is calculated in
IndexSegment.java.
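
As a rough illustration of that product (a sketch, not Nutch's actual
code), the snippet below assumes Lucene's default lengthNorm of
1/sqrt(numTerms). Nutch may override this per field in its own
Similarity subclass, and the class and method names here are made up
for the example. Lucene also stores each norm in a single byte, so the
value that explain prints is rounded, which may be why it looks like a
magic number.

```java
public class FieldNormSketch {

    // Assumption: Lucene's DefaultSimilarity formula, 1 / sqrt(numTerms).
    // Longer fields get a smaller norm, so a match in them counts for less.
    public static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    // fieldNorm as described above: lengthNorm multiplied by the document boost.
    public static float fieldNorm(int numTerms, float documentBoost) {
        return lengthNorm(numTerms) * documentBoost;
    }

    public static void main(String[] args) {
        // A 100-term field with no extra boost, then with a boost of 2.
        System.out.println(fieldNorm(100, 1.0f));
        System.out.println(fieldNorm(100, 2.0f));
    }
}
```

The exact value seen at search time will differ slightly from this
computation because of the one-byte encoding mentioned above.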


Nutch assigns different boosts to each field so that you can tune your
search results.  For example, you can use explain to see if anchor
matches are too strong, and adjust accordingly.
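
To show where a field boost enters the numbers that explain prints,
here is a minimal sketch following the usual layout of Lucene's
explanation for a single term: queryWeight (idf * boost * queryNorm)
times fieldWeight (tf * idf * fieldNorm). The class, method, and
parameter names are hypothetical, and the input values are invented
for illustration. It also holds queryNorm fixed, whereas in a real
query queryNorm itself depends on the boosts.

```java
public class TermScoreSketch {

    // Query-side weight of one term: idf scaled by the field boost and queryNorm.
    public static float queryWeight(float idf, float fieldBoost, float queryNorm) {
        return idf * fieldBoost * queryNorm;
    }

    // Document-side weight of one term: tf * idf * fieldNorm.
    public static float fieldWeight(float tf, float idf, float fieldNorm) {
        return tf * idf * fieldNorm;
    }

    // Contribution of one term in one field to the document score.
    public static float termScore(float tf, float idf, float fieldBoost,
                                  float queryNorm, float fieldNorm) {
        return queryWeight(idf, fieldBoost, queryNorm) * fieldWeight(tf, idf, fieldNorm);
    }

    public static void main(String[] args) {
        // Same term statistics, first with boost 1.0 and then with boost 4.0.
        float unboosted = termScore(1.0f, 2.0f, 1.0f, 0.5f, 0.25f);
        float boosted = termScore(1.0f, 2.0f, 4.0f, 0.5f, 0.25f);
        System.out.println("boost 1.0: " + unboosted);
        System.out.println("boost 4.0: " + boosted);
    }
}
```

With everything else held equal, the boosted field contributes
proportionally more, which is the lever to adjust if explain shows one
field dominating the score.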

Andy

On Apr 11, 2005 12:17 AM, Kannan Sundaramoorthy
<[EMAIL PROTECTED]> wrote:
> 
> Hi,
> I am trying to understand how Nutch computes the score for each
> document. I could figure out how tf, idf and queryNorm are computed,
> but I do not understand how the fieldNorm (normalisation for each
> field) value is computed. It seems to be a magic number to me, and
> this is where Nutch seems to differ from Lucene in computing the score.
> Also, Nutch assigns different boosts to different fields (e.g., 4.0 for
> the url field) and uses this value while computing queryWeight. Can
> anyone explain these, please?
> 
> Thanks,
> Kannan
> 
