[
https://issues.apache.org/jira/browse/LUCENE-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Muir updated LUCENE-7997:
--------------------------------
Attachment: LUCENE-7997_wip.patch
Updated patch with DFR passing/failing the new tests as expected:
* scoring models without warnings in the javadocs pass: models {{G}}, {{I(F)}},
{{I(n)}}, {{I(ne)}}
* ones with warnings in javadocs all fail: models {{BE}}, {{D}}, and {{P}}
I think this is a good sign that the tests do what we need. To get DFR to pass
at all, I changed SimilarityBase to use {{double}} everywhere internally and
cast to a 32-bit float only at the end. This fixed all the numerical errors. I
think this makes sense, as this subclass is supposed to be simple and easy to
use (separately, we should take another look at the whole thing now that a lot
of ClassicSimilarity's complexity has been removed). It also makes the formulas
more elegant in many cases, because constants like {{5.0}} are naturally
doubles and most {{java.lang.Math}} functions (such as {{Math.log}}) take
doubles, so some casts go away.
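To illustrate the change described above, here is a minimal sketch of the
"compute in double, cast to float once at the end" pattern. The class name,
method names, and the formula are hypothetical stand-ins chosen for
illustration; this is not the patch, not SimilarityBase's API, and not one of
the DFR models.

```java
final class DoubleScoreSketch {

  // Float intermediates: every Math call forces a cast back to float,
  // and each intermediate result is rounded to 32 bits.
  static float scoreWithFloats(float freq, float docLen, float avgDocLen) {
    float tfn = freq * (float) (Math.log(1.0f + avgDocLen / docLen) / Math.log(2.0));
    return tfn / (tfn + 1.0f);
  }

  // Double intermediates: no casts inside the formula.
  static double scoreDouble(double freq, double docLen, double avgDocLen) {
    double tfn = freq * Math.log(1.0 + avgDocLen / docLen) / Math.log(2.0);
    return tfn / (tfn + 1.0);
  }

  // Single narrowing cast at the very end, as described in the comment above.
  static float score(float freq, float docLen, float avgDocLen) {
    return (float) scoreDouble(freq, docLen, avgDocLen);
  }
}
```

The only narrowing conversion in {{score}} is the final cast, so rounding to 32
bits happens once per score instead of once per intermediate step.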
I will work through the other models and also look at potential improvements
to {{explain}} here, for consistency.
> More sanity testing of similarities
> -----------------------------------
>
> Key: LUCENE-7997
> URL: https://issues.apache.org/jira/browse/LUCENE-7997
> Project: Lucene - Core
> Issue Type: Task
> Reporter: Adrien Grand
> Priority: Minor
> Attachments: LUCENE-7997_wip.patch, LUCENE-7997_wip.patch,
> LUCENE-7997_wip.patch, LUCENE-7997_wip.patch, LUCENE-7997_wip.patch,
> LUCENE-7997_wip.patch
>
>
> LUCENE-7993 is a potential optimization that we could only apply if the
> similarity is an increasing function of {{freq}} (all other things like DF
> and length being equal). This sounds like a very reasonable requirement for a
> similarity, so we should test it in the base similarity test case and maybe
> move broken similarities to sandbox?
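
The requirement quoted above (the score must not decrease as {{freq}} grows,
with everything else held fixed) could be checked along these lines. This is
only a sketch: {{FreqMonotonicityCheck}}, {{firstDecrease}}, and the idea of
passing the scorer as a function of {{freq}} are hypothetical and are not taken
from the patch or from Lucene's test framework.

```java
import java.util.function.DoubleUnaryOperator;

final class FreqMonotonicityCheck {

  /**
   * Scores freq = 1..maxFreq with all other statistics assumed fixed inside
   * the scorer, narrowing each score to float as the final score would be.
   * Returns the first freq whose score is lower than the score for freq - 1,
   * or -1 if the scores never decrease.
   */
  static int firstDecrease(DoubleUnaryOperator scorer, int maxFreq) {
    float prev = (float) scorer.applyAsDouble(1);
    for (int freq = 2; freq <= maxFreq; freq++) {
      float cur = (float) scorer.applyAsDouble(freq);
      if (cur < prev) {
        return freq; // score went down although freq went up
      }
      prev = cur;
    }
    return -1;
  }
}
```

A scorer that passes returns -1, for example
{{firstDecrease(f -> Math.log(1 + f), 1000)}}; any other return value is the
first {{freq}} at which the score dropped.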