iprithv commented on PR #16069:
URL: https://github.com/apache/lucene/pull/16069#issuecomment-4506997765

   @romseygeek I used your patch, but changed one thing. The original called 
intoBitSet on "top", but that is the scorer, not the filter, so I switched it 
to use the filter iterator, the same way DenseConjunctionBulkScorer does. The 
rest follows your idea:
   - bitset created eagerly when filter is dense
   - single path in scoreInnerWindowWithFilter
   - helper methods for leapfrog vs bitset
   - removed extra method and wrapper
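
To make the two helper paths concrete, here is a minimal, self-contained sketch. It is not the code from the patch: sorted `int[]` arrays stand in for doc ID iterators, `java.util.BitSet` stands in for the window bitset, and the class and method names (`FilterWindowSketch`, `filterToWindowBits`, `matchWithBits`, `matchWithLeapfrog`) are hypothetical. The point it illustrates is that the bitset has to be filled from the filter's docs, and the scorer's docs are then checked against it.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical stand-in, not Lucene code: sorted int arrays play the role of iterators. */
final class FilterWindowSketch {

  /** Bitset path, step 1: load the FILTER's docs in [min, max) into a window-relative bitset. */
  static BitSet filterToWindowBits(int[] filterDocs, int min, int max) {
    BitSet bits = new BitSet(max - min);
    for (int doc : filterDocs) {
      if (doc >= max) break;               // past the window, stop
      if (doc >= min) bits.set(doc - min); // store relative to the window start
    }
    return bits;
  }

  /** Bitset path, step 2: keep the scorer's docs whose bit is set in the filter bitset. */
  static List<Integer> matchWithBits(int[] scorerDocs, BitSet bits, int min, int max) {
    List<Integer> out = new ArrayList<>();
    for (int doc : scorerDocs) {
      if (doc >= max) break;
      if (doc >= min && bits.get(doc - min)) out.add(doc);
    }
    return out;
  }

  /** Leapfrog path: advance whichever side is behind until both land on the same doc. */
  static List<Integer> matchWithLeapfrog(int[] scorerDocs, int[] filterDocs, int min, int max) {
    List<Integer> out = new ArrayList<>();
    int i = 0, j = 0;
    while (i < scorerDocs.length && j < filterDocs.length) {
      int s = scorerDocs[i], f = filterDocs[j];
      if (s >= max || f >= max) break;     // no further match can fall inside the window
      if (s < f) {
        i++;
      } else if (f < s) {
        j++;
      } else {
        if (s >= min) out.add(s);
        i++;
        j++;
      }
    }
    return out;
  }
}
```

Both paths should return the same matches for a given window. Which one is cheaper depends on how dense the filter is, which is why the bitset is only built when the filter is dense.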
   
   benchmark (wikimediumall, 33M docs)
   
   | task          | baseline qps | candidate qps | diff   |
   |---------------|--------------|---------------|--------|
   | AndHighMed    | 126.32       | 105.09        | -16.8% |
   | OrHighMed     | 238.84       | 217.16        | -9.1%  |
   | AndHighHigh   | 97.19        | 83.47         | -14.1% |
   | OrHighHigh    | 128.01       | 111.37        | -13.0% |
   | HighTerm      | 1147.51      | 1060.69       | -7.6%  |
   | LowTerm       | 2346.38      | 2169.87       | -7.5%  |
   
   So overall it regresses. I think the bitset path here iterates docs one by 
one through the priority queue (nextDoc + updateTop). Earlier we used 
nextDocsAndScores, which batches docs and scores together, so it:
   - does fewer PQ updates
   - processes many docs in one go
   - uses arrays instead of repeated PQ ops
   
   This version is cleaner, but I think we lose that bulk scoring advantage. 
Maybe we can keep this structure and still use nextDocsAndScores for the 
essential scorers?
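
A rough sketch of the difference I mean, as self-contained Java rather than the actual scorer code. Everything here is a hypothetical stand-in: `Postings` is a toy replacement for an essential scorer, `scorePerDoc` mimics the nextDoc + updateTop pattern with a `java.util.PriorityQueue`, and `scoreBulk` mimics the batched style by draining each scorer's window in one tight loop into a score array. It only shows the shape of the two loops and is not a benchmark.

```java
import java.util.BitSet;
import java.util.PriorityQueue;

/** Hypothetical stand-in, not Lucene code. */
final class BulkVsPerDocSketch {

  /** Toy postings for one essential scorer: parallel, doc-sorted arrays plus a cursor. */
  static final class Postings {
    final int[] docs;
    final float[] scores;
    int pos = 0;

    Postings(int[] docs, float[] scores) {
      this.docs = docs;
      this.scores = scores;
    }

    /** Current doc, or Integer.MAX_VALUE once exhausted. */
    int doc() {
      return pos < docs.length ? docs[pos] : Integer.MAX_VALUE;
    }
  }

  /** Per-doc path: one PQ poll and one PQ add for every (scorer, doc) pair in the window. */
  static float[] scorePerDoc(Postings[] scorers, BitSet filter, int min, int max) {
    float[] window = new float[max - min];
    PriorityQueue<Postings> pq =
        new PriorityQueue<>((a, b) -> Integer.compare(a.doc(), b.doc()));
    for (Postings p : scorers) pq.add(p);
    while (!pq.isEmpty() && pq.peek().doc() < max) {
      Postings top = pq.poll();
      int doc = top.doc();
      if (doc >= min && filter.get(doc - min)) window[doc - min] += top.scores[top.pos];
      top.pos++;   // advance while outside the queue, then re-insert
      pq.add(top);
    }
    return window;
  }

  /** Bulk path: no PQ at all, each scorer is drained up to max in a single array loop. */
  static float[] scoreBulk(Postings[] scorers, BitSet filter, int min, int max) {
    float[] window = new float[max - min];
    for (Postings p : scorers) {
      for (; p.doc() < max; p.pos++) {
        int doc = p.doc();
        if (doc >= min && filter.get(doc - min)) window[doc - min] += p.scores[p.pos];
      }
    }
    return window;
  }
}
```

Both methods fill the same window score array, with the filter bitset applied in the same place. The bulk version just avoids touching the queue once per doc, which is the part I suspect we are now paying for.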


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]