[ 
https://issues.apache.org/jira/browse/LUCENE-6819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-6819:
---------------------------------
    Attachment: LUCENE-6819-wip.patch

Here's a patch in case someone would like to run some relevancy tests. It goes 
even further and uses a completely different encoding that stores lengths in a 
byte. It is fully accurate up to 40, and then accuracy degrades linearly with 
the log of the length. Its restriction is that it does not support index 
boosts, but on the other hand, assuming that index boosts are not used lets 
it make all 256 values useful. With the current encoding, if index boosts are 
not used, only 63 values represent valid lengths: the other values are either 
less than 1 or greater than MAX_VALUE.
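
To make the idea concrete, here is a minimal sketch of such a one-byte length 
encoding. It is not the code from the attached patch: the class name, the 
constants and the bucketing rule are all assumptions chosen for illustration. 
Lengths 0..40 are stored verbatim, and the remaining 215 codes are spread 
evenly over the log of the length up to Integer.MAX_VALUE.

```java
// Hypothetical illustration only; names and constants are not from the patch.
final class LengthByteCodec {
    // Lengths 0..EXACT_MAX are stored as-is (assumed cut-off).
    static final int EXACT_MAX = 40;
    // Codes left over for longer fields: 41..255.
    static final int LOG_CODES = 255 - EXACT_MAX;
    // Width of one log-spaced bucket, so that code 255 reaches Integer.MAX_VALUE.
    static final double STEP =
        Math.log((double) Integer.MAX_VALUE / EXACT_MAX) / LOG_CODES;

    /** Encodes a non-negative field length into a single byte. */
    static byte encode(int length) {
        if (length < 0) {
            throw new IllegalArgumentException("length must be >= 0: " + length);
        }
        if (length <= EXACT_MAX) {
            return (byte) length; // exact range
        }
        // Index of the log-spaced bucket whose upper bound covers this length.
        int bucket = (int) Math.ceil(Math.log((double) length / EXACT_MAX) / STEP);
        bucket = Math.max(1, Math.min(LOG_CODES, bucket));
        return (byte) (EXACT_MAX + bucket);
    }

    /** Decodes a byte back to the representative length of its bucket. */
    static int decode(byte code) {
        int c = code & 0xFF; // treat the byte as unsigned
        if (c <= EXACT_MAX) {
            return c;
        }
        // Upper bound of the bucket, rounded and clamped to the int range.
        double v = EXACT_MAX * Math.exp((c - EXACT_MAX) * STEP);
        return (int) Math.min((long) Integer.MAX_VALUE, Math.round(v));
    }
}
```

With these assumed constants, neighbouring buckets differ by a factor of 
roughly 1.09, so a decoded length beyond 40 stays within about 10% of the 
original, and every one of the 256 byte values maps to a valid length.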

The patch is just a proof of concept and does not try to tackle the removal of 
index-time boosts or backward compatibility, which are the hard problems here.

> Deprecate index-time boosts?
> ----------------------------
>
>                 Key: LUCENE-6819
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6819
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-6819-wip.patch
>
>
> Follow-up of this comment: 
> https://issues.apache.org/jira/browse/LUCENE-6818?focusedCommentId=14934801&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14934801
> Index-time boosts are a very expert feature whose behaviour is tied to the 
> Similarity impl. Additionally, users have often been confused by the poor 
> precision due to the fact that we encode values on a single byte. But now we 
> have doc values, which allow you to encode any values the way you want with as 
> much precision as you need, so maybe we should deprecate index-time boosts and 
> recommend encoding index-time scoring factors into doc values fields instead.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
