[jira] [Updated] (LUCENE-7730) Better encode length normalization in similarities

Adrien Grand (JIRA) Tue, 16 May 2017 09:15:27 -0700

     [ 
https://issues.apache.org/jira/browse/LUCENE-7730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Adrien Grand updated LUCENE-7730:
---------------------------------
    Attachment: LUCENE-7730.patch

New patch. It is not really possible to switch ClassicSimilarity to the new 
encoding: it is built on the assumption that the norm encodes the 
normalization factor directly, while the new encoding I have been working on 
encodes the length. So I ended up doing the following:
 - ClassicSimilarity will still encode norms the same way in 7.0 as it did 
before, which means it is no longer index-time compatible with, say, 
BM25Similarity
 - ClassicSimilarity docs have been updated to advise using BM25Similarity 
instead
 - ClassicSimilarity has been moved out of similarity randomization in the test 
framework

I'd like to get it in 7.0 as this change can only be done in a major release 
(it uses the major version that created the index to know which encoding to 
use), so please speak up if you have concerns.
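
To make the version switch concrete, here is a minimal sketch of choosing the 
norm encoding from the major version that created the index. It is not code 
from the patch: the class name, method names, the cut-over constant and both 
one-byte quantization schemes are assumptions made up for illustration, and 
the real encodings are likely more compact than this.

```java
// Hypothetical sketch, not Lucene code: every name and both byte encodings
// below are invented to illustrate the idea of a version-dependent norm.
class NormEncodingSketch {

    // Assumed cut-over: indices created by major version 7 or later
    // store the field length rather than the normalization factor.
    static final int LENGTH_ENCODING_MAJOR = 7;

    // Legacy-style norm: quantize the factor 1/sqrt(length) into 0..255.
    static int encodeFactor(int length) {
        double factor = 1.0 / Math.sqrt(Math.max(1, length));
        return (int) Math.round(factor * 255);
    }

    // New-style norm: store the length itself, clamped to 1..255.
    static int encodeLength(int length) {
        return Math.min(Math.max(1, length), 255);
    }

    // Pick the encoding from the major version that created the index.
    static int encodeNorm(int indexCreatedMajor, int length) {
        return indexCreatedMajor >= LENGTH_ENCODING_MAJOR
            ? encodeLength(length)
            : encodeFactor(length);
    }

    public static void main(String[] args) {
        System.out.println("created by 6.x, length 4: " + encodeNorm(6, 4));
        System.out.println("created by 7.x, length 4: " + encodeNorm(7, 4));
    }
}
```

Because the stored value means something different on each side of the 
cut-over, a similarity that reads the norm as a factor cannot share an index 
with one that reads it as a length, which is the compatibility break 
described in the list above.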

> Better encode length normalization in similarities
> --------------------------------------------------
>
>                 Key: LUCENE-7730
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7730
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Adrien Grand
>         Attachments: LUCENE-7730.patch, LUCENE-7730.patch, LUCENE-7730.patch
>
>
> Now that index-time boosts are gone (LUCENE-6819) and that indices record the 
> version that was used to create them (for backward compatibility, 
> LUCENE-7703), we can look into storing the length normalization factor more 
> efficiently.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

