[ 
https://issues.apache.org/jira/browse/LUCENE-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12998136#comment-12998136
 ] 

Robert Muir commented on LUCENE-2936:
-------------------------------------

Koji: the issue is the document boost of zero.

Because of this, the explanation does not indicate a match by default (see 
Explanation.java):
{noformat}
  /**
   * Indicates whether or not this Explanation models a good match.
   *
   * <p>
   * By default, an Explanation represents a "match" if the value is positive.
   * </p>
   * @see #getValue
   */
  public boolean isMatch() {
    return (0.0f < getValue());
  }
{noformat}
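
To make the effect concrete, here is a minimal sketch in plain Java (not Lucene's actual scoring code) of why a zero boost yields a non-match. It assumes the field weight is simply tf * idf * fieldNorm, as in the explain output quoted in the issue, and that a boost of zero ends up as a fieldNorm of 0. The class and method names are made up for illustration.

```java
// Simplified stand-in for the fieldWeight product in the explain tree.
// Assumption: a document boost of zero is stored as a fieldNorm of 0.
class ZeroBoostMatchSketch {

    /** fieldWeight = tf * idf * fieldNorm, mirroring the explain output. */
    static float fieldWeight(float tf, float idf, float fieldNorm) {
        return tf * idf * fieldNorm;
    }

    /** Same test as the default Explanation.isMatch(): strictly positive value. */
    static boolean isMatch(float value) {
        return 0.0f < value;
    }

    public static void main(String[] args) {
        float idf = 0.61370564f; // idf value taken from the explain output

        float normalWeight = fieldWeight(1.0f, idf, 0.625f);
        float zeroBoostWeight = fieldWeight(1.0f, idf, 0.0f);

        // A positive weight counts as a match; a zero weight does not,
        // even though the document did contain the phrase.
        System.out.println("normal:     " + normalWeight + " isMatch=" + isMatch(normalWeight));
        System.out.println("zero boost: " + zeroBoostWeight + " isMatch=" + isMatch(zeroBoostWeight));
    }
}
```

The point is only that isMatch() looks at the value, not at whether the query terms were found, so anything that multiplies the value down to zero hides the match.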

Separately, we should decide what to do about norm values of zero. In my 
opinion, norm values of zero should not necessarily decode to a floating point 
value of zero (we might want to adjust our norm decoder by default to not do 
this). 

Otherwise, in addition to your problem, the search degrades into a pure boolean 
ranking model (as TF and IDF are completely zeroed out).

This is really unlikely with the default relevance ranking (unless you use a 
boost of zero or similar), but it is possible, e.g. if you use a different 
SmallFloat quantization. I raised this on LUCENE-1360: if you were to use the 
"short field" quantization from that issue on a large document, what should 
we do?

So in my opinion, we should consider adjusting the NORM_TABLE in Similarity so 
that if the norm happens to be zero, it does not decode to a float of zero. 
This will have no impact on performance, as it is a statically calculated table.
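
A rough sketch of what such an adjustment could look like, in plain Java. The decode method re-implements the 3-bit-mantissa byte decoding for illustration rather than calling Lucene, and the choice of replacement value (the smallest nonzero decodable norm) is only one hypothetical option; nothing here is a proposed patch. The class name, buildTable, and the avoidZero flag are all made up.

```java
// Illustrative only: a statically built norm table whose entry 0 can be
// replaced so that a zero norm byte never decodes to 0.0f.
class NormTableSketch {

    /** Decodes a byte with 3 mantissa bits and a zero exponent of 15. */
    static float decode(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    /** Builds the 256-entry lookup table once, optionally avoiding zero. */
    static float[] buildTable(boolean avoidZero) {
        float[] table = new float[256];
        for (int i = 0; i < 256; i++) {
            table[i] = decode((byte) i);
        }
        if (avoidZero) {
            // Hypothetical policy: reuse the smallest nonzero decodable value.
            table[0] = table[1];
        }
        return table;
    }

    // Computed once at class load, so lookups cost the same as before.
    static final float[] NORM_TABLE = buildTable(true);

    static float decodeNorm(byte b) {
        return NORM_TABLE[b & 0xff];
    }

    public static void main(String[] args) {
        System.out.println("plain decode of 0:    " + decode((byte) 0));
        System.out.println("adjusted decode of 0: " + decodeNorm((byte) 0));
    }
}
```

Since the adjustment happens while the table is built, the per-document cost at search time stays a single array lookup.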


> score and explain don't match
> -----------------------------
>
>                 Key: LUCENE-2936
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2936
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 2.9.4, 3.0.3, 3.1, 4.0
>            Reporter: Koji Sekiguchi
>            Priority: Minor
>         Attachments: TestScore.java
>
>
> I've faced this problem recently. I'll attach a program to reproduce the 
> problem soon. The program outputs the following:
> {noformat}
> ** score = 0.10003257
> ** explain
> 0.050016284 = (MATCH) product of:
>   0.15004885 = (MATCH) sum of:
>     0.15004885 = weight(f1:"note book" in 0), product of:
>       0.3911943 = queryWeight(f1:"note book"), product of:
>         0.61370564 = idf(f1: note=1 book=1)
>         0.6374299 = queryNorm
>       0.38356602 = fieldWeight(f1:"note book" in 0), product of:
>         1.0 = tf(phraseFreq=1.0)
>         0.61370564 = idf(f1: note=1 book=1)
>         0.625 = fieldNorm(field=f1, doc=0)
>   0.33333334 = coord(1/3)
> {noformat}

