[ 
https://issues.apache.org/jira/browse/LUCENE-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-7997:
--------------------------------
    Attachment: LUCENE-7997_wip.patch

Updated patch that also tests floating point tf values. We assume 
computeSlopFactor has the range {{(0 .. 1]}} for testing. This found a leftover 
buggy float cast in DFR {{I(F)}} but also a new bug: Axiomatic model F1 will 
most likely return NaN values if you use SloppyPhraseQuery! Frequency values < 
1 cause its first log to go negative, and the next log can then go NaN: the 
formula is {{1 + log(1 + log(freq))}}. Imagine freq=0.3, this is 
{{1 + log(1 + -1.2)}} = {{1 + log(-0.2)}} = NaN. If we alter the formula to use 
{{log(1 + freq)}} then tests pass, but this needs investigation and may not be 
an appropriate solution, so I marked it AwaitsFix for now.
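
To make the arithmetic above easy to reproduce, here is a minimal standalone 
sketch. It is not the Lucene code or the patch: the class and method names are 
made up for illustration, and {{tfAltered}} is only one reading of "use 
{{log(1 + freq)}}" (the inner log replaced, the outer log left alone).

```java
public class AxiomaticF1TfSketch {

    // tf term as quoted in the comment: 1 + log(1 + log(freq)), natural log.
    public static double tfOriginal(double freq) {
        return 1 + Math.log(1 + Math.log(freq));
    }

    // Assumed variant: the inner log(freq) replaced by log(1 + freq).
    // This is an interpretation of the comment, not the patch itself.
    public static double tfAltered(double freq) {
        return 1 + Math.log(1 + Math.log(1 + freq));
    }

    public static void main(String[] args) {
        // Sloppy phrase frequencies can be fractional, so include values below 1.
        double[] freqs = {0.3, 0.5, 1.0, 2.0};
        for (double freq : freqs) {
            System.out.println("freq=" + freq
                + " original=" + tfOriginal(freq)
                + " altered=" + tfAltered(freq));
        }
    }
}
```

With freq=0.3 the original form yields NaN, because {{Math.log}} returns NaN 
for a negative argument. The outer log's argument {{1 + log(freq)}} only goes 
negative once freq drops below 1/e (about 0.37); for freq between 1/e and 1 the 
original form stays finite, though it can be negative.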

> More sanity testing of similarities
> -----------------------------------
>
>                 Key: LUCENE-7997
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7997
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, 
> LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, 
> LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, 
> LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, 
> LUCENE-7997_wip.patch
>
>
> LUCENE-7993 is a potential optimization that we could only apply if the 
> similarity is an increasing function of {{freq}} (all other things like DF 
> and length being equal). This sounds like a very reasonable requirement for a 
> similarity, so we should test it in the base similarity test case and maybe 
> move broken similarities to sandbox?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

