RE: Result scoring question

Armbrust, Daniel C. Thu, 15 Apr 2004 09:42:14 -0700

Thanks for the advice.

I created a class that extends DefaultSimilarity and made it return 10 for the idf 
value.  (I don't have any data to back up picking 10, other than that it seems to 
work.)
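
A minimal sketch of that kind of override, for illustration only. To keep it
compilable without Lucene on the classpath, DefaultSimilaritySketch is a
hypothetical stand-in for Lucene's DefaultSimilarity, and the class names and
the idf(int docFreq, int numDocs) signature shown here are assumptions rather
than the code actually used in this thread:

```java
// Hypothetical stand-in for Lucene's DefaultSimilarity, so the sketch
// compiles on its own. Only the idf hook is modelled here.
class DefaultSimilaritySketch {
    // Assumed default behaviour: log(numDocs / (docFreq + 1)) + 1
    float idf(int docFreq, int numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }
}

// Inherits everything else and overrides only idf(), returning a constant
// so that rare and frequent terms get the same idf weight.
class ConstantIdfSimilarity extends DefaultSimilaritySketch {
    static final float CONSTANT_IDF = 10.0f; // the value picked in this post

    @Override
    float idf(int docFreq, int numDocs) {
        return CONSTANT_IDF;
    }
}

class ConstantIdfDemo {
    public static void main(String[] args) {
        DefaultSimilaritySketch similarity = new ConstantIdfSimilarity();
        // Both calls print the same constant, whatever the document frequency.
        System.out.println(similarity.idf(2, 10000));
        System.out.println(similarity.idf(500, 10000));
    }
}
```

With the real library, the subclass would presumably extend
org.apache.lucene.search.DefaultSimilarity and be installed on the searcher
(for example via setSimilarity) before running the query.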


This did indeed cause my exact matches to float up to the top.  Your explanation 
makes sense, because for this particular query, there were only 2 documents in the 
index that contained the words "renal calculus" in the preferred_designation field, 
while there were hundreds that contained those words in the other_designation field.

I'll keep testing it to make sure that nothing odd happens in other searches now, but 
it seems good so far.

Thanks, 

Dan



************************************ 
-----Original Message-----
From: Ype Kingma [mailto:[EMAIL PROTECTED] 
Sent: Thursday, April 15, 2004 2:00 AM
To: Lucene Users List
Subject: Re: Result scoring question


It seems that the problem is in the idf weights.
Try using a scorer that returns a constant for the idf.
You can inherit all the default behaviour and only override the idf().

The idf weights are established for Lucene terms, which are a combination
of a field and a text term. If a text term occurs infrequently in one field, it
will score higher than in a field in which it occurs frequently.
(idf means inverse document frequency).
My guess is this is what's happening here.
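
To make the effect concrete, here is a rough sketch of how far apart the idf
values can be for the same text in two fields. It assumes the classic default
formula log(numDocs / (docFreq + 1)) + 1; the index size and document
frequencies are hypothetical numbers picked for illustration, not figures
from a real index:

```java
class IdfGapDemo {
    // Assumed default idf formula: log(numDocs / (docFreq + 1)) + 1
    static float defaultIdf(int docFreq, int numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    public static void main(String[] args) {
        int numDocs = 10000;      // hypothetical index size
        int rareDocFreq = 2;      // term is rare in this field
        int frequentDocFreq = 300; // hypothetical: term is frequent in this field

        float rareIdf = defaultIdf(rareDocFreq, numDocs);
        float frequentIdf = defaultIdf(frequentDocFreq, numDocs);

        // The rare field/term combination gets the larger weight, so hits
        // in that field contribute more to the score.
        System.out.printf("idf where the term is rare:     %.3f%n", rareIdf);
        System.out.printf("idf where the term is frequent: %.3f%n", frequentIdf);
    }
}
```

Returning a constant from idf() removes this difference, so the remaining
scoring factors decide the ranking.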


Good luck,
Ype


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
