Alex,

If you have length normalization turned on, the length of the tagKey field in the second document (the number of tokens, and perhaps even the distance between the tokens) is much greater than in the first document. The length is the total number of tokens in the field: if you add more than one field with the same name to a document, their values are concatenated. With a lengthNorm of 1.0 / tokenCount, the longer field gets a much smaller norm, which is why the first hit scores as the better match.
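To make the effect concrete, here is a rough sketch of the arithmetic for the two documents in your mail. It is not Lucene code: the class and method names are made up for illustration, and it assumes the analyzer splits on whitespace only and that lengthNorm is exactly 1.0 / tokenCount as you describe. Under those assumptions the tagKey field has 5 tokens in document 5117 and 40 tokens in document 11274, so the norms are 0.2 and 0.025.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: hypothetical helper, not part of Lucene.
public class LengthNormSketch {

    // tagKey values of document 5117, copied from the question.
    public static final List<String> DOC_5117 =
            Arrays.asList("wholesale hot dog stand equipment");

    // tagKey values of document 11274, copied from the question.
    public static final List<String> DOC_11274 = Arrays.asList(
            "hot dog meal", "hot dog restaurant", "hotdog", "hot dog",
            "hot dog dining", "best hotdog", "cuisine hot dog", "hotdog stand",
            "hotdog restaurant", "hot dog grill", "hot dog cuisine",
            "hot dog stand", "hot dog menu", "hot dog shop",
            "hotdog vendor", "hotdog grill");

    // Counts tokens over all values of a same-named field, since the values
    // are treated as one concatenated field. Assumes whitespace tokenization.
    public static int countTokens(List<String> fieldValues) {
        int total = 0;
        for (String value : fieldValues) {
            String trimmed = value.trim();
            if (!trimmed.isEmpty()) {
                total += trimmed.split("\\s+").length;
            }
        }
        return total;
    }

    // The lengthNorm from the question: 1.0 / tokenCount.
    // Returns 0 for an empty field to avoid dividing by zero.
    public static float lengthNorm(int numTokens) {
        return numTokens <= 0 ? 0.0f : 1.0f / numTokens;
    }

    public static void main(String[] args) {
        int first = countTokens(DOC_5117);
        int second = countTokens(DOC_11274);
        System.out.println("doc 5117:  tokens=" + first + " norm=" + lengthNorm(first));
        System.out.println("doc 11274: tokens=" + second + " norm=" + lengthNorm(second));
    }
}
```

The second document matches "hot" and "dog" more often, but its norm is eight times smaller, so the other factors in your Similarity have to outweigh that before it can rank first. The real score also depends on those other factors and on how Lucene stores the norm, so treat these numbers as an indication only.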

Try the Searcher#explain method for more details:

http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Searcher.html#explain(org.apache.lucene.search.Query,%20int)


   karl

On 26 Nov 2008, at 20:22, AlexElba wrote:


Hello,
I have two documents in my Lucene index:

Document<stored/uncompressed,indexed<tagId:5117>
stored/uncompressed<tagName:Wholesale Hot Dog Stand Equipment>
stored/uncompressed,indexed,tokenized<tagKey:wholesale hot dog stand
equipment> stored/uncompressed>

Document<stored/uncompressed,indexed<tagId:11274>
stored/uncompressed<tagName:Hot Dogs>
stored/uncompressed,indexed,tokenized<tagKey:hot dog meal>
stored/uncompressed,indexed,tokenized<tagKey:hot dog restaurant>
stored/uncompressed,indexed,tokenized<tagKey:hotdog>
stored/uncompressed,indexed,tokenized<tagKey:hot dog>
stored/uncompressed,indexed,tokenized<tagKey:hot dog dining>
stored/uncompressed,indexed,tokenized<tagKey:best hotdog>
stored/uncompressed,indexed,tokenized<tagKey:cuisine hot dog>
stored/uncompressed,indexed,tokenized<tagKey:hotdog stand>
stored/uncompressed,indexed,tokenized<tagKey:hotdog restaurant>
stored/uncompressed,indexed,tokenized<tagKey:hot dog grill>
stored/uncompressed,indexed,tokenized<tagKey:hot dog cuisine>
stored/uncompressed,indexed,tokenized<tagKey:hot dog stand>
stored/uncompressed,indexed,tokenized<tagKey:hot dog menu>
stored/uncompressed,indexed,tokenized<tagKey:hot dog shop>
stored/uncompressed,indexed,tokenized<tagKey:hotdog vendor>
stored/uncompressed,indexed,tokenized<tagKey:hotdog grill>>

and I am searching for +tagKey:hot +tagKey:dog

which is an exact match for the 2nd document, but I am getting a score of 1.0 for the first
document and 0.7 for the second one.

I have a custom Similarity where lengthNorm is (1.0 / tokenCount); the other factors are
constants.

Why is my first document getting the higher score?
--
View this message in context: 
http://www.nabble.com/Scoring-issue-tp20707410p20707410.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


