romseygeek opened a new pull request, #15632:
URL: https://github.com/apache/lucene/pull/15632
Numeric sorts against a field with DocValuesSkippers enabled currently use
DocValuesRangeIterator (DVRI) to implement competitive iterators. This has a
number of disadvantages:
- DVRI cannot efficiently implement docIDRunEnd() or intoBitSet(), meaning
  that bulk conjunction filtering may end up falling into slower code paths
- For field value distributions that are essentially random, DVRI falls back
  to doc-by-doc value checking, so no skipping happens at all and the checks
  only add overhead.
This commit adds a new SkipBlockRangeIterator that only skips whole blocks
where no document will be competitive, avoiding any individual doc-by-doc
value checks. The docIDRunEnd() and intoBitSet() implementations are very
fast and mean that bulk conjunction filtering will be efficient. The
overheads as a whole are very low, so randomly distributed values are much
less adversarial, while queries against indexes where the document order is
roughly correlated with the query sort get significant boosts.
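
As a rough illustration of the block-level idea described above, here is a
minimal, self-contained toy model. It is not the code in this PR and does not
use any Lucene classes: SkipBlockSketch, Block, and the dense, contiguous
block layout are all assumptions made up for the example. It only shows why
skipping whole blocks makes a run-end query cheap: a block is either entirely
skipped or entirely returned, so no per-document value is ever read.

```java
/** Toy model of block-level skipping; a hypothetical sketch, NOT Lucene code. */
final class SkipBlockSketch {
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  /** Assumed per-block summary: inclusive doc ID range and value range. */
  static final class Block {
    final int minDoc, maxDoc;
    final long minValue, maxValue;

    Block(int minDoc, int maxDoc, long minValue, long maxValue) {
      this.minDoc = minDoc;
      this.maxDoc = maxDoc;
      this.minValue = minValue;
      this.maxValue = maxValue;
    }
  }

  private final Block[] blocks; // sorted by doc ID, non-overlapping
  private final long lo, hi;    // competitive value range, inclusive
  private int blockIdx = 0;
  private int doc = -1;

  SkipBlockSketch(Block[] blocks, long lo, long hi) {
    this.blocks = blocks;
    this.lo = lo;
    this.hi = hi;
  }

  /** A block may hold competitive docs iff its value range overlaps [lo, hi]. */
  private boolean competitive(Block b) {
    return b.maxValue >= lo && b.minValue <= hi;
  }

  /** Moves to the first doc >= target that lies in a competitive block. */
  int advance(int target) {
    while (blockIdx < blocks.length) {
      Block b = blocks[blockIdx];
      if (b.maxDoc >= target && competitive(b)) {
        doc = Math.max(target, b.minDoc);
        return doc;
      }
      blockIdx++; // whole block skipped without reading any per-doc value
    }
    doc = NO_MORE_DOCS;
    return doc;
  }

  /** One past the last doc of the current run of adjacent competitive blocks. */
  int docIDRunEnd() {
    if (doc < 0 || doc == NO_MORE_DOCS) {
      throw new IllegalStateException("iterator is not positioned on a doc");
    }
    int i = blockIdx;
    int end = blocks[i].maxDoc + 1;
    // Extend the run while the next block is contiguous and also competitive.
    while (i + 1 < blocks.length
        && blocks[i + 1].minDoc == end
        && competitive(blocks[i + 1])) {
      i++;
      end = blocks[i].maxDoc + 1;
    }
    return end;
  }
}
```

In this sketch the iterator returns a superset of the competitive documents
(every doc in an overlapping block), so a caller would still compare actual
values when collecting. The trade-off is the one described above: less precise
filtering in exchange for very cheap run-end and bulk operations.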
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]