[PR] Use a more coarse-grained competitive iterator for skipper-based numeric sorts [lucene]

via GitHub Thu, 29 Jan 2026 06:09:20 -0800


romseygeek opened a new pull request, #15632:
URL: https://github.com/apache/lucene/pull/15632


   Numeric sorts against a field with DocValuesSkippers enabled currently use
   DocValuesRangeIterator (DVRI) to implement competitive iterators.  This has
   a number of disadvantages:
   - DVRI cannot efficiently implement docIDRunEnd() or intoBitSet(), meaning
     that bulk conjunction filtering may end up falling into slower code paths
   - For field value distributions that are essentially random, DVRI falls back
     to doc-by-doc value checking, so no skipping happens at all and the checks
     only add overhead.
   
   This commit adds a new SkipBlockRangeIterator that only skips whole blocks
   where no document will be competitive, avoiding any individual doc-by-doc
   value checks.  The docIDRunEnd() and intoBitSet() implementations are very
   fast and mean that bulk conjunction filtering will be efficient.  The
   overheads as a whole are very low, so randomly distributed values are much
   less adversarial, while queries against indexes where the document order is
   roughly correlated with the query sort get significant boosts.
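
   To illustrate the block-level idea, here is a minimal, self-contained
   sketch. It is not the code from this PR and does not use Lucene classes:
   SkipBlockSketch, Block, and the method bodies are hypothetical, and it
   assumes each block covers a dense doc ID range and records the min/max
   field value it contains. The iterator accepts every doc in a block whose
   value range overlaps the competitive range [lo, hi] and skips all other
   blocks whole, so it never inspects an individual value.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of a block-granular competitive iterator (not Lucene code). */
final class SkipBlockSketch {
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  /** Assumed skip metadata: docs [minDoc, maxDoc] with values in [minValue, maxValue]. */
  static final class Block {
    final int minDoc, maxDoc;
    final long minValue, maxValue;

    Block(int minDoc, int maxDoc, long minValue, long maxValue) {
      this.minDoc = minDoc;
      this.maxDoc = maxDoc;
      this.minValue = minValue;
      this.maxValue = maxValue;
    }
  }

  private final List<Block> blocks; // sorted by doc ID, non-overlapping
  private final long lo, hi;        // inclusive competitive value range
  private int blockIdx = 0;
  private int doc = -1;

  SkipBlockSketch(List<Block> blocks, long lo, long hi) {
    this.blocks = blocks;
    this.lo = lo;
    this.hi = hi;
  }

  /** A block may hold a competitive doc only if its value range overlaps [lo, hi]. */
  private boolean mayMatch(Block b) {
    return b.maxValue >= lo && b.minValue <= hi;
  }

  int docID() {
    return doc;
  }

  /** Moves to the first doc >= target that lies in a possibly-competitive block. */
  int advance(int target) {
    for (int i = blockIdx; i < blocks.size(); i++) {
      Block b = blocks.get(i);
      if (b.maxDoc < target || !mayMatch(b)) {
        continue; // skip the whole block without looking at any doc value
      }
      blockIdx = i;
      doc = Math.max(target, b.minDoc);
      return doc;
    }
    blockIdx = blocks.size();
    doc = NO_MORE_DOCS;
    return doc;
  }

  /** Exclusive end of the run of accepted docs containing the current doc. */
  int docIDRunEnd() {
    if (doc < 0 || doc == NO_MORE_DOCS) {
      throw new IllegalStateException("iterator is not positioned on a doc");
    }
    return blocks.get(blockIdx).maxDoc + 1; // O(1): read from block metadata
  }

  public static void main(String[] args) {
    SkipBlockSketch it = new SkipBlockSketch(
        Arrays.asList(
            new Block(0, 9, 0, 5),       // overlaps [0, 10]: kept
            new Block(10, 19, 50, 60),   // no overlap: skipped
            new Block(20, 29, 3, 8),     // overlaps: kept
            new Block(30, 39, 100, 200)), // no overlap: skipped
        0, 10);
    // Walk run by run instead of doc by doc.
    for (int d = it.advance(0); d != NO_MORE_DOCS; d = it.advance(it.docIDRunEnd())) {
      System.out.println("run [" + d + ", " + it.docIDRunEnd() + ")");
    }
  }
}
```

   With the sample blocks in main, the loop visits two runs, [0, 10) and
   [20, 30). Because the run end comes straight from block metadata, a caller
   doing bulk filtering can consume a whole run at once. The trade-off in this
   sketch is that an accepted block may still contain non-competitive docs; it
   assumes the collector compares each value itself, so over-accepting costs
   extra comparisons but does not change results.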


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


