[
https://issues.apache.org/jira/browse/LUCENE-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12998136#comment-12998136
]
Robert Muir commented on LUCENE-2936:
-------------------------------------
Koji: the issue is the document boost of zero.
Because of this, the explanation does not indicate a match by default (see
Explanation.java):
{noformat}
/**
 * Indicates whether or not this Explanation models a good match.
 *
 * <p>
 * By default, an Explanation represents a "match" if the value is positive.
 * </p>
 * @see #getValue
 */
public boolean isMatch() {
  return (0.0f < getValue());
}
{noformat}
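To make the consequence concrete, here is a minimal sketch of that default rule. ExplanationSketch is a hypothetical stand-in, not Lucene's Explanation class, and multiplying the score directly by the boost is a simplification (in Lucene the boost reaches the score through the encoded norm). It only shows that any factor of zero drives the value to zero, so isMatch() reports false even though the document was hit:

```java
// Hypothetical stand-in for Lucene's Explanation, reduced to the stored
// value and the default isMatch() rule quoted above.
class ExplanationSketch {
    private final float value;

    ExplanationSketch(float value) {
        this.value = value;
    }

    float getValue() {
        return value;
    }

    // Same rule as the default: a "match" only if the value is positive.
    boolean isMatch() {
        return 0.0f < getValue();
    }

    public static void main(String[] args) {
        float rawScore = 0.15004885f; // an arbitrary positive score

        // Simplification: apply the boost as a plain multiplier.
        ExplanationSketch normalBoost = new ExplanationSketch(rawScore * 1.0f);
        ExplanationSketch zeroBoost = new ExplanationSketch(rawScore * 0.0f);

        System.out.println("boost 1.0 isMatch: " + normalBoost.isMatch());
        System.out.println("boost 0.0 isMatch: " + zeroBoost.isMatch());
    }
}
```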
Separately, we should decide what to do about norm values of zero. In my
opinion, a norm value of zero should not necessarily decode to a floating-point
value of zero (we might want to adjust our default norm decoder so that it
does not).
Otherwise, in addition to your problem, the search degrades into a pure boolean
ranking model (as TF and IDF are completely zeroed out).
This is really unlikely with the default relevance ranking (unless you use a
boost of zero or similar), but it is possible, e.g. if you use a different
SmallFloat quantization. I raised this question on LUCENE-1360: if you were
to use that "short field" quantization on a large document, what should we do?
So in my opinion, we should consider adjusting the NORM_TABLE in Similarity so
that if the norm happens to be zero, it does not decode to a float of zero.
This will have no impact on performance, as it is a statically calculated table.
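As a rough illustration of what such an adjustment could look like, here is a self-contained sketch. The decoder is re-implemented here after SmallFloat.byte315ToFloat as I understand it (3 mantissa bits, zero exponent of 15), so treat the bit layout as an assumption. NormTableSketch, decodeBits and buildTable are hypothetical names, and dropping the special case for byte 0 is only one possible choice, not a decided fix:

```java
// Sketch only: a 256-entry norm decode table built once, with and without
// the special case that maps the byte 0 to the float 0.0f.
class NormTableSketch {

    // Assumed layout: 3 mantissa bits, 5 exponent bits, zero exponent 15.
    static float decodeBits(byte b) {
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    // Decoder with the special case: byte 0 decodes to exactly 0.0f.
    static float byte315ToFloat(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        return decodeBits(b);
    }

    // Builds the table up front, so lookups cost the same either way.
    static float[] buildTable(boolean zeroDecodesToZero) {
        float[] table = new float[256];
        for (int i = 0; i < 256; i++) {
            byte b = (byte) i;
            table[i] = zeroDecodesToZero ? byte315ToFloat(b) : decodeBits(b);
        }
        return table;
    }

    public static void main(String[] args) {
        float[] current = buildTable(true);
        float[] adjusted = buildTable(false);
        System.out.println("current[0]  = " + current[0]);
        System.out.println("adjusted[0] = " + adjusted[0]);
        System.out.println("adjusted[1] = " + adjusted[1]);
    }
}
```

In this sketch the adjusted entry for byte 0 is a tiny positive value that sorts below the entry for byte 1, so TF and IDF still contribute to the score and the ordering of the table is preserved. Only entry 0 differs between the two tables.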
> score and explain don't match
> -----------------------------
>
> Key: LUCENE-2936
> URL: https://issues.apache.org/jira/browse/LUCENE-2936
> Project: Lucene - Java
> Issue Type: Bug
> Affects Versions: 2.9.4, 3.0.3, 3.1, 4.0
> Reporter: Koji Sekiguchi
> Priority: Minor
> Attachments: TestScore.java
>
>
> I've faced this problem recently. I'll attach a program to reproduce the
> problem soon. The program outputs the following:
> {noformat}
> ** score = 0.10003257
> ** explain
> 0.050016284 = (MATCH) product of:
>   0.15004885 = (MATCH) sum of:
>     0.15004885 = weight(f1:"note book" in 0), product of:
>       0.3911943 = queryWeight(f1:"note book"), product of:
>         0.61370564 = idf(f1: note=1 book=1)
>         0.6374299 = queryNorm
>       0.38356602 = fieldWeight(f1:"note book" in 0), product of:
>         1.0 = tf(phraseFreq=1.0)
>         0.61370564 = idf(f1: note=1 book=1)
>         0.625 = fieldNorm(field=f1, doc=0)
>   0.33333334 = coord(1/3)
> {noformat}