Try using IndexSearcher.explain (and then a toString on the resulting Explanation object) to see the details of why things are scoring how they are. This can be most enlightening!

Erik


On Apr 14, 2004, at 12:16 PM, Armbrust, Daniel C. wrote:


I know that the lucene scoring algorithm is pretty complicated, I know I don't understand all the pieces. But given these documents:

A) - <preferred_designation> left renal calculus
B) - <other_designation> renal calculus

Should a query of

other_designation:("renal calculus") OR preferred_designation:("renal calculus")

Score document B higher than document A?

Those documents are a made up example. Here are the documents and scores I am getting back from the query on my real index:

Score 1.0 - Document<Text<first_word:left> Text<preferred_designation:left renal calculus in calyceal diverticulum> Unindexed<frequency:4> Text<codeTokenized:M00004001> Keyword<code:M00004001> Keyword<UNIQUE_DOCUMENT_IDENTIFIER_FIELD:48270>>

Score 0.85714287 - Document<Keyword<UNIQUE_DOCUMENT_IDENTIFIER_FIELD:514631> Keyword<code:M00035214> Text<codeTokenized:M00035214> Unindexed<frequency:4> Text<preferred_designation:left renal calculus in a solitary left kidney> Text<first_word:left>>

Score 0.7409672 - Document<Text<first_word:renal> Text<other_designation:renal calculus> Unindexed<frequency:3> Text<codeTokenized:M00032753> Keyword<code:M00032753> Keyword<UNIQUE_DOCUMENT_IDENTIFIER_FIELD:481129>>


Am I just making a dumb mistake somewhere?


Thanks,

Dan

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Reply via email to