Adrien Grand created LUCENE-10672:
-------------------------------------

             Summary: Re-evaluate different ways to encode postings
                 Key: LUCENE-10672
                 URL: https://issues.apache.org/jira/browse/LUCENE-10672
             Project: Lucene - Core
          Issue Type: Task
            Reporter: Adrien Grand


In Lucene 4, we moved to FOR to encode postings because it would give better 
throughput compared to VInts that we had been using until then. This was a time 
when Lucene would often need to evaluate entire postings lists, and 
optimizations like BS1 were very important for good performance.

Nowadays, Lucene performs more dynamic pruning and less often needs to 
evaluate all hits that match a query. So the performance of {{nextDoc()}} has 
become a bit less relevant, while the performance of {{advance(target)}} has 
become more relevant.

I wonder if we should re-evaluate other ways to encode postings that are 
theoretically better at skipping, such as Elias-Fano coding, since they support 
skipping directly on the encoded representation instead of requiring decoding a 
full block of integers where only a couple of them would be relevant.
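
To make the skipping argument concrete, here is a minimal Python sketch of 
Elias-Fano coding. It is a toy under stated assumptions, not Lucene code and 
not a proposed implementation: the class name EliasFano, the list-of-bools 
bit vector and the linear scan over the upper bits are all illustrative 
choices. A practical version would pack the bits into longs and use a select 
index or broadword tricks to jump over zeros instead of scanning them. 
NO_MORE_DOCS mirrors the sentinel value of DocIdSetIterator.

```python
NO_MORE_DOCS = 2**31 - 1  # same sentinel value as Lucene's DocIdSetIterator


class EliasFano:
    """Toy Elias-Fano encoding of a non-empty, strictly increasing doc ID list."""

    def __init__(self, docs):
        n = len(docs)
        universe = docs[-1] + 1
        # Low bits per value: floor(log2(universe / n)), never negative.
        self.low_bits = max(0, (universe // n).bit_length() - 1)
        mask = (1 << self.low_bits) - 1
        # Low parts are stored verbatim (fixed width in a real encoder).
        self.lows = [d & mask for d in docs]
        # High parts are stored in unary: value i with high part h sets bit h + i.
        self.upper = [False] * ((docs[-1] >> self.low_bits) + n)
        for i, d in enumerate(docs):
            self.upper[(d >> self.low_bits) + i] = True

    def __iter__(self):
        """Decode every value in order (the nextDoc()-style access pattern)."""
        i = 0
        for pos, bit in enumerate(self.upper):
            if bit:
                yield ((pos - i) << self.low_bits) | self.lows[i]
                i += 1

    def advance(self, target):
        """Return the first value >= target, or NO_MORE_DOCS if there is none."""
        high = target >> self.low_bits
        # Skip `high` zeros in the upper bits. Every one passed on the way is
        # a value with a smaller high part, so it cannot match and its low
        # bits are never read.
        pos = zeros = 0
        while zeros < high and pos < len(self.upper):
            if not self.upper[pos]:
                zeros += 1
            pos += 1
        i = pos - zeros  # index of the first value not yet ruled out
        # Only now reconstruct candidates, one at a time.
        while pos < len(self.upper):
            if self.upper[pos]:
                value = ((pos - i) << self.low_bits) | self.lows[i]
                if value >= target:
                    return value
                i += 1
            pos += 1
        return NO_MORE_DOCS


ef = EliasFano([3, 4, 7, 13, 14, 15, 21, 43])
print(ef.advance(16))  # 21
```

The point of the sketch is that advance() only touches the unary upper bits 
until it reaches the target's bucket, and reads low bits only for the few 
candidates in that bucket, whereas a FOR block has to be unpacked in full 
before any value in it can be compared against the target.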



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
