jimczi opened a new pull request, #16259:
URL: https://github.com/apache/lucene/pull/16259

   #16177 routed every two-phase window through the unified bit-set path. That
regressed sparse conjunctions whose surviving docs must be confirmed by an
expensive two-phase clause, e.g. a phrase under a selective filter, as in the
CountFilteredPhrase nightly task. The per-window bit-set materialization
(intoBitSet, applyMask, cardinality, BitSetDocIdStream) is a fixed cost that is
not amortized when only a few docs survive, so the bit-set path loses to plain
leap-frog on those windows.
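
   For readers unfamiliar with the term, here is a minimal sketch of a leap-frog
conjunction with a two-phase confirmation step. It is not the code in this PR:
it uses sorted int arrays in place of Lucene iterators, and the names
LeapFrogSketch, advance and intersect are hypothetical. The point it
illustrates is that the expensive matches check runs only on docs that both
approximations agree on, with no per-window bit set built along the way.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Illustrative only: sorted int arrays stand in for Lucene doc-id iterators.
public final class LeapFrogSketch {

  /** Returns the smallest index i >= from with docs[i] >= target, or docs.length. */
  static int advance(int[] docs, int from, int target) {
    int i = from;
    // Linear scan for simplicity; a real iterator would use skip data here.
    while (i < docs.length && docs[i] < target) {
      i++;
    }
    return i;
  }

  /** Intersects two sorted doc lists, confirming each candidate with matches. */
  static List<Integer> intersect(int[] lead, int[] other, IntPredicate matches) {
    List<Integer> hits = new ArrayList<>();
    int i = 0;
    int j = 0;
    while (i < lead.length && j < other.length) {
      if (lead[i] < other[j]) {
        i = advance(lead, i, other[j]); // lead jumps forward to the other's doc
      } else if (other[j] < lead[i]) {
        j = advance(other, j, lead[i]); // other jumps forward to the lead's doc
      } else {
        // Both approximations agree: only now pay for the expensive check.
        if (matches.test(lead[i])) {
          hits.add(lead[i]);
        }
        i++;
        j++;
      }
    }
    return hits;
  }

  public static void main(String[] args) {
    int[] lead = {3, 10, 42, 97};
    int[] other = {1, 3, 5, 10, 11, 42, 50, 97, 100};
    // Stand-in for an expensive two-phase check such as phrase verification.
    System.out.println(intersect(lead, other, doc -> doc % 2 == 0));
  }
}
```

   With a sparse lead, the loop touches only a handful of docs, which is why
the fixed per-window cost of the bit-set path is hard to recover in this case.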
   
   Dispatch such windows to a restored scoreWindowUsingLeapFrog. A window uses
leap-frog when its lead is sparse (leadCost <= maxDoc/4) and at least one
two-phase clause is skippable (approximation().cost() < maxDoc). A skip-indexed
doc-values range reports cost == NO_MORE_DOCS (>= maxDoc), so it never counts
as skippable: it stays on the bit-set path and keeps the SIMD intoBitSet win
from #16177.
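
   The dispatch rule above can be sketched as a small predicate. This is an
illustration of the two conditions as stated in this description, not the
actual code in the PR: the class name WindowDispatchSketch, the method
useLeapFrog and its parameters are hypothetical, and costs are passed in as
plain longs rather than read from iterators. Integer division for maxDoc/4 is
also an assumption.

```java
// Illustrative only: names and signature are hypothetical, not from the PR.
public final class WindowDispatchSketch {

  /** Same value as Lucene's DocIdSetIterator.NO_MORE_DOCS (Integer.MAX_VALUE). */
  static final long NO_MORE_DOCS = Integer.MAX_VALUE;

  /**
   * Returns true when a window should use leap-frog under the stated rule:
   * a sparse lead and at least one skippable two-phase approximation.
   */
  static boolean useLeapFrog(long leadCost, long[] twoPhaseApproximationCosts, int maxDoc) {
    if (leadCost > maxDoc / 4) {
      return false; // dense lead: stay on the bit-set path
    }
    for (long cost : twoPhaseApproximationCosts) {
      if (cost < maxDoc) {
        return true; // this approximation is considered skippable
      }
    }
    // No two-phase clause, or every one reports cost >= maxDoc.
    return false;
  }

  public static void main(String[] args) {
    int maxDoc = 1_000_000;
    // Sparse lead plus a phrase-like clause with a modest approximation cost.
    System.out.println(useLeapFrog(1_000, new long[] {5_000}, maxDoc));
    // Sparse lead, but the only two-phase clause reports NO_MORE_DOCS.
    System.out.println(useLeapFrog(1_000, new long[] {NO_MORE_DOCS}, maxDoc));
    // Dense lead: the bit-set path is kept regardless of the clauses.
    System.out.println(useLeapFrog(500_000, new long[] {5_000}, maxDoc));
  }
}
```

   Under these assumptions only the first call selects leap-frog; the other
two stay on the bit-set path.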

   Plain approximations are now also sorted before two-phase ones, so that
matches() runs only on docs that already satisfy every approximation.
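
   A minimal sketch of that ordering, assuming a hypothetical Clause type that
only records whether a clause is two-phase. The real change may apply further
tie-breaks (for example by cost); this sketch relies on a stable sort to keep
the existing relative order within each group and makes no claim beyond
"plain first".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: Clause and order are hypothetical, not Lucene types.
public final class ClauseOrderSketch {

  static final class Clause {
    final String name;
    final boolean twoPhase;

    Clause(String name, boolean twoPhase) {
      this.name = name;
      this.twoPhase = twoPhase;
    }
  }

  /** Returns clause names with plain approximations ahead of two-phase ones. */
  static List<String> order(List<Clause> clauses) {
    List<Clause> sorted = new ArrayList<>(clauses);
    // List.sort is stable, so clauses keep their relative order within a group.
    sorted.sort((a, b) -> Boolean.compare(a.twoPhase, b.twoPhase));
    List<String> names = new ArrayList<>();
    for (Clause clause : sorted) {
      names.add(clause.name);
    }
    return names;
  }

  /** Small fixed example used by main. */
  static List<String> demo() {
    return order(Arrays.asList(
        new Clause("phrase", true),
        new Clause("term", false),
        new Clause("range", true),
        new Clause("filter", false)));
  }

  public static void main(String[] args) {
    System.out.println(demo()); // prints [term, filter, phrase, range]
  }
}
```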


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

