I see. So this is a 50kB allocation per segment, which is fine under normal
usage but becomes noticeable with percolator queries, which create a new
MaxScoreBulkScorer for every document?

In general, bulk scorers will want to allocate large arrays/bit sets to
help with bulk processing of documents. Several other bulk scorers do this
as well: BatchScoreBulkScorer, BlockMaxConjunctionBulkScorer,
DenseConjunctionBulkScorer, DisjunctionMaxBulkScorer.
I wonder if a better fix would be to disable bulk scoring for
percolator/monitor-style usage and force doc-at-a-time evaluation by calling
ScorerSupplier#get() instead of ScorerSupplier#bulkScorer(). If you'd like
to keep consuming hits via the BulkScorer API while still doing
doc-at-a-time evaluation, the resulting Scorer could be wrapped in a
DefaultBulkScorer.
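
To make the trade-off concrete, here is a rough toy model in plain Java. It
is only a sketch: none of these types are Lucene classes, and the names
(PercolatorAllocationSketch, scoreInBulk, scoreDocAtATime) and the window
size of 4096 are assumptions picked for illustration. It compares a
windowed scorer that allocates its scratch buffer eagerly with a
doc-at-a-time loop that allocates none. In real code the doc-at-a-time path
would go through ScorerSupplier#get().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model: every type here is a hypothetical stand-in, not a Lucene class.
final class PercolatorAllocationSketch {

    // Assumed window size, chosen for illustration only.
    static final int WINDOW_SIZE = 4096;

    // Running total of scratch bytes allocated by the bulk path.
    static long scratchBytes = 0;

    // Bulk style: allocate a window-sized score buffer eagerly, then fill
    // and flush it one window at a time.
    static List<Integer> scoreInBulk(int[] sortedMatches, int maxDoc) {
        double[] windowScores = new double[WINDOW_SIZE];
        scratchBytes += (long) WINDOW_SIZE * Double.BYTES;
        List<Integer> hits = new ArrayList<>();
        int i = 0;
        for (int base = 0; base < maxDoc; base += WINDOW_SIZE) {
            int end = Math.min(base + WINDOW_SIZE, maxDoc);
            // Reset the part of the buffer that this window uses.
            Arrays.fill(windowScores, 0, end - base, 0.0);
            // Record a score for every match that falls in this window.
            while (i < sortedMatches.length && sortedMatches[i] < end) {
                windowScores[sortedMatches[i] - base] = 1.0;
                i++;
            }
            // Flush the window: emit every doc with a non-zero score.
            for (int doc = base; doc < end; doc++) {
                if (windowScores[doc - base] > 0) {
                    hits.add(doc);
                }
            }
        }
        return hits;
    }

    // Doc-at-a-time style: visit matches one by one, with no scratch buffer.
    static List<Integer> scoreDocAtATime(int[] sortedMatches, int maxDoc) {
        List<Integer> hits = new ArrayList<>();
        for (int doc : sortedMatches) {
            if (doc < maxDoc) {
                hits.add(doc);
            }
        }
        return hits;
    }

    // Percolator-style usage: many stored queries, each evaluated against a
    // one-document segment. Returns the scratch bytes the bulk path used.
    static long bulkScratchBytesFor(int numQueries) {
        scratchBytes = 0;
        int[] matches = {0};
        for (int q = 0; q < numQueries; q++) {
            scoreInBulk(matches, 1);
        }
        return scratchBytes;
    }
}
```

Both paths return the same hits. In this model, bulkScratchBytesFor(1000)
comes to 1,000 x 32 KiB (about 31 MiB) of short-lived buffers just to score
one document per query, while the doc-at-a-time path allocates no scratch
space at all.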

On Wed, Jul 1, 2026 at 12:40 PM Alan Woodward <[email protected]> wrote:

> Hi all,
>
> We’ve found a regression in 10.5.0 due to eager allocation of large array
> buffers in MaxScoreBulkScorer - fix proposed here:
> https://github.com/apache/lucene/pull/16316
>
> This particularly hits boolean queries with an expensive two-phase
> subclause (in our case, some percolator queries got a lot slower).  I think
> it probably warrants a 10.5.1 bugfix.
>
> - Alan
>


-- 
Adrien
