jimczi opened a new pull request, #16259: URL: https://github.com/apache/lucene/pull/16259
#16177 routed every two-phase window through the unified bit-set path. That regressed sparse conjunctions whose surviving docs must be confirmed by an expensive two-phase clause (e.g. a phrase under a selective filter, such as the CountFilteredPhrase nightly task): the per-window bit-set materialization (intoBitSet, applyMask, cardinality, BitSetDocIdStream) is not amortized when only a few docs survive, so it loses to plain leap-frog.

Dispatch such windows to a restored scoreWindowUsingLeapFrog. A window uses leap-frog when its lead is sparse (leadCost <= maxDoc/4) and some two-phase clause is skippable (approximation().cost() < maxDoc). A skip-indexed doc-values range reports cost == NO_MORE_DOCS (>= maxDoc): it cannot be skipped, so it stays on the bit-set path and keeps the SIMD intoBitSet win from #16177.

Plain approximations are now also sorted before two-phase ones so matches() only runs on docs that already satisfy every approximation.
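As a rough illustration of the dispatch rule described above, the check can be sketched as a standalone predicate. This is not the code from the PR: the class name WindowDispatchSketch, the method useLeapFrog, and the idea of passing the two-phase approximation costs as a plain array are all hypothetical, and integer division for maxDoc/4 is an assumption. Only the two thresholds and the NO_MORE_DOCS behavior come from the description.

```java
// Hypothetical sketch of the leap-frog vs. bit-set dispatch rule; not the PR's code.
final class WindowDispatchSketch {
  // Same value as Lucene's DocIdSetIterator.NO_MORE_DOCS (Integer.MAX_VALUE).
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  /**
   * Returns true when a window should use leap-frog: the lead is sparse
   * (leadCost <= maxDoc/4) and at least one two-phase clause has an
   * approximation cost below maxDoc, i.e. it is skippable.
   */
  static boolean useLeapFrog(long leadCost, long[] twoPhaseApproximationCosts, int maxDoc) {
    // Assumption: integer division, as the description writes "maxDoc/4".
    if (leadCost > maxDoc / 4) {
      return false; // dense lead: bit-set materialization is amortized
    }
    for (long cost : twoPhaseApproximationCosts) {
      if (cost < maxDoc) {
        return true; // a skippable two-phase clause exists
      }
    }
    // No two-phase clause is skippable (e.g. every cost is NO_MORE_DOCS).
    return false;
  }

  public static void main(String[] args) {
    int maxDoc = 1_000_000;
    // Sparse lead, skippable phrase-like clause.
    System.out.println(useLeapFrog(10_000, new long[] {50_000}, maxDoc)); // true
    // Sparse lead, but the only two-phase clause reports NO_MORE_DOCS.
    System.out.println(useLeapFrog(10_000, new long[] {NO_MORE_DOCS}, maxDoc)); // false
    // Dense lead: stays on the bit-set path regardless of clause costs.
    System.out.println(useLeapFrog(600_000, new long[] {50_000}, maxDoc)); // false
  }
}
```

The second case mirrors the skip-indexed doc-values range: its cost of NO_MORE_DOCS is never below maxDoc, so on its own it does not make a window eligible for leap-frog.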
