Adrien Grand created LUCENE-10672:
-------------------------------------
Summary: Re-evaluate different ways to encode postings
Key: LUCENE-10672
URL: https://issues.apache.org/jira/browse/LUCENE-10672
Project: Lucene - Core
Issue Type: Task
Reporter: Adrien Grand
In Lucene 4, we moved to FOR to encode postings because it would give better
throughput compared to VInts that we had been using until then. This was a time
when Lucene would often need to evaluate entire postings lists, and
optimizations like BS1 were very important for good performance.
Nowadays, Lucene performs more dynamic pruning and less often needs to
evaluate all hits that match a query. So the performance of
{{nextDoc()}} has become a bit less relevant while the performance of
{{advance(target)}} has become more relevant.
I wonder if we should re-evaluate other ways to encode postings that are
theoretically better at skipping, such as Elias-Fano coding, since they support
skipping directly on the encoded representation instead of decoding a full
block of integers of which only a couple are relevant.
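
To make the skipping argument concrete, here is a minimal sketch of Elias-Fano
coding with an {{advance(target)}} that works on the encoded bits. It is an
illustration only, not Lucene code: the class {{EliasFanoSketch}} and its
helper names are hypothetical, the input is assumed to be a strictly
increasing array of non-negative doc IDs, and the select step is a plain
linear scan over the upper bits where a real implementation would presumably
keep a sampled index. The {{Integer.MAX_VALUE}} sentinel mirrors
{{DocIdSetIterator.NO_MORE_DOCS}}.

```java
import java.util.BitSet;

/** Hypothetical illustration class; not part of Lucene. */
final class EliasFanoSketch {
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  private final int numLowBits;              // L: low bits stored verbatim per doc ID
  private final long[] lower;                // packed low bits, L bits per doc ID
  private final BitSet upper = new BitSet(); // high parts, unary-coded
  private final int last;                    // largest encoded doc ID

  EliasFanoSketch(int[] sortedDocs) {
    int n = sortedDocs.length;
    if (n == 0) throw new IllegalArgumentException("empty postings list");
    last = sortedDocs[n - 1];
    long universe = (long) last + 1;
    // L = floor(log2(universe / n)), the usual Elias-Fano choice.
    numLowBits = 63 - Long.numberOfLeadingZeros(Math.max(1, universe / n));
    lower = new long[(int) (((long) n * numLowBits + 63) >>> 6)];
    for (int i = 0; i < n; i++) {
      int doc = sortedDocs[i];
      if (doc < 0 || (i > 0 && doc <= sortedDocs[i - 1])) {
        throw new IllegalArgumentException("doc IDs must be non-negative and strictly increasing");
      }
      // The i-th doc ID sets the bit at (high part + i) in the upper bitset.
      upper.set((doc >>> numLowBits) + i);
      writeLow(i, doc & ((1L << numLowBits) - 1));
    }
  }

  private void writeLow(int i, long low) {
    if (numLowBits == 0) return;
    long bit = (long) i * numLowBits;
    int word = (int) (bit >>> 6), shift = (int) (bit & 63);
    lower[word] |= low << shift;
    if (shift + numLowBits > 64) lower[word + 1] |= low >>> (64 - shift);
  }

  private long readLow(int i) {
    if (numLowBits == 0) return 0;
    long bit = (long) i * numLowBits;
    int word = (int) (bit >>> 6), shift = (int) (bit & 63);
    long v = lower[word] >>> shift;
    if (shift + numLowBits > 64) v |= lower[word + 1] << (64 - shift);
    return v & ((1L << numLowBits) - 1);
  }

  /** Returns the smallest encoded doc ID >= target, or NO_MORE_DOCS. */
  int advance(int target) {
    if (target > last) return NO_MORE_DOCS;
    int high = Math.max(target, 0) >>> numLowBits;
    // Skip `high` zeros in the upper bits to reach the bucket of the target.
    // Linear scan here; a sampled select index would make this a jump.
    int pos = 0;
    for (int k = 0; k < high; k++) pos = upper.nextClearBit(pos) + 1;
    int index = pos - high; // number of doc IDs that precede this bucket
    for (int p = upper.nextSetBit(pos); p >= 0; p = upper.nextSetBit(p + 1), index++) {
      // Only candidates from this bucket onwards are ever reconstructed.
      int doc = (int) (((long) (p - index) << numLowBits) | readLow(index));
      if (doc >= target) return doc;
    }
    return NO_MORE_DOCS; // not reached, since target <= last
  }

  public static void main(String[] args) {
    EliasFanoSketch postings = new EliasFanoSketch(new int[] {3, 4, 7, 13, 14, 15, 21, 43});
    System.out.println(postings.advance(5));  // 7
    System.out.println(postings.advance(16)); // 21
  }
}
```

Note that {{advance}} never materializes the doc IDs before the target's
bucket: it only counts zeros in the upper bits and then reads the low bits of
the few candidates it inspects, which is the property that a block-based FOR
decoder lacks.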