[jira] [Commented] (LUCENE-8072) Improve accuracy of similarity scores

Robert Muir (JIRA) Fri, 01 Dec 2017 07:02:12 -0800

    [ 
https://issues.apache.org/jira/browse/LUCENE-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274490#comment-16274490
 ]


Robert Muir commented on LUCENE-8072:
-------------------------------------

Also, I don't see the benefit to relevance. I am fine with taking a perf hit 
if it really helps, but I think we need to do this carefully on a case-by-case 
basis, not blindly across the board.

For example, in the BM25 case it does not help to do this in the IDF: such tiny 
IDF values are the stopword case, so additional precision does not matter. It 
also does not help "behavior", since Math.log is already required to be 
semi-monotonic.
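
As a rough illustration of the IDF point, here is a minimal sketch. It assumes 
the usual BM25 idf formula, log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)), 
and a final score stored as a float. The class and method names are 
hypothetical and are not taken from the patch.

```java
// Hypothetical demo class; names are illustrative, not from LUCENE-8072.patch.
class IdfPrecisionDemo {

  // Assumed BM25 idf argument: (docCount - docFreq + 0.5) / (docFreq + 0.5).
  static double ratio(long docFreq, long docCount) {
    return (docCount - docFreq + 0.5D) / (docFreq + 0.5D);
  }

  // idf computed with Math.log(1 + x).
  static double idfWithLog(long docFreq, long docCount) {
    return Math.log(1 + ratio(docFreq, docCount));
  }

  // idf computed with Math.log1p(x), as proposed in the issue.
  static double idfWithLog1p(long docFreq, long docCount) {
    return Math.log1p(ratio(docFreq, docCount));
  }

  public static void main(String[] args) {
    long docCount = 1_000_000L;
    long docFreq = 1_000_000L; // term occurs in every document: the stopword case

    double viaLog = idfWithLog(docFreq, docCount);
    double viaLog1p = idfWithLog1p(docFreq, docCount);

    System.out.println("log(1+x) as double: " + viaLog);
    System.out.println("log1p(x) as double: " + viaLog1p);
    System.out.println("log(1+x) as float:  " + (float) viaLog);
    System.out.println("log1p(x) as float:  " + (float) viaLog1p);

    // Semi-monotonicity check: idf must never increase as docFreq grows.
    double previous = Double.POSITIVE_INFINITY;
    boolean monotonic = true;
    for (long df = docCount - 1_000L; df <= docCount; df++) {
      double current = idfWithLog(df, docCount);
      if (current > previous) {
        monotonic = false;
      }
      previous = current;
    }
    System.out.println("idf non-increasing with Math.log: " + monotonic);
  }
}
```

The two double results differ only in their low-order digits, which is below 
what a float score can represent, and the Math.log version stays non-increasing 
in docFreq.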

> Improve accuracy of similarity scores
> -------------------------------------
>
>                 Key: LUCENE-8072
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8072
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-8072.patch
>
>
> I noticed two things we could do to improve the accuracy of our scores:
>  - use {{Math.log1p(x)}} instead of {{Math.log(1+x)}}, especially when x is 
> expected to be small
>  - use doubles for intermediate values that are used to compute norms in 
> BM25Similarity
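
A minimal sketch of the second item in the description, assuming the common 
BM25 length-normalization term k1 * ((1 - b) + b * dl / avgdl). The class name, 
method names, and sample values are hypothetical and may not match what 
BM25Similarity or the patch actually does.

```java
// Hypothetical demo class; not the BM25Similarity implementation.
class Bm25NormDemo {

  // Every intermediate value is a float, so each operation rounds to float precision.
  static float normWithFloats(float k1, float b, float docLength, float avgDocLength) {
    return k1 * ((1 - b) + b * docLength / avgDocLength);
  }

  // Intermediates are widened to double; only the final result is rounded to float.
  static float normWithDoubles(float k1, float b, float docLength, float avgDocLength) {
    double norm = k1 * ((1 - (double) b) + (double) b * docLength / avgDocLength);
    return (float) norm;
  }

  public static void main(String[] args) {
    float k1 = 1.2f;                // assumed sample parameters
    float b = 0.75f;
    float docLength = 137f;         // made-up document length
    float avgDocLength = 212.37f;   // made-up average length

    System.out.println("float intermediates:  " + normWithFloats(k1, b, docLength, avgDocLength));
    System.out.println("double intermediates: " + normWithDoubles(k1, b, docLength, avgDocLength));
  }
}
```

Any difference between the two results is limited to the last few float bits, 
so whether it matters has to be judged per similarity.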



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

