mikemccand opened a new issue, #12358: URL: https://github.com/apache/lucene/issues/12358
### Description

Context: we (Amazon's customer-facing product search team, and also AWS) are attempting to understand the amazing performance Tantivy (Rust search engine) has over Lucene, iterating in [this GitHub repo](https://github.com/Tony-X/search-benchmark-game). That repo is sort of a merger of Lucene's benchmarking code ([luceneutil](https://github.com/mikemccand/luceneutil)), including its tasks and `enwiki` corpus, and the [open source Tantivy benchmark](https://github.com/quickwit-oss/search-benchmark-game). Tantivy is impressively fast :)

This issue is a spinoff from [this fascinating comment](https://github.com/Tony-X/search-benchmark-game/issues/30#issuecomment-1579761787) by @fulmicoton, creator and maintainer of [Tantivy](https://github.com/quickwit-oss/tantivy).

Tantivy optimizes `count()` for `BooleanQuery` disjunctions much like Lucene's `BooleanScorer`, by scoring in a windowed bitset of N docs at once, and then pop-counting the set bits in each window. This is not technically a sub-linear implementation: it is still linear, but I suspect with a smaller constant factor than the default `count()` fallback Lucene implements. Perhaps, for all cases where `BooleanQuery` uses the windowed `BooleanScorer`, we could also implement this `count()` optimization.

From my read of Lucene's `BooleanWeight.count`, I don't think Lucene has this optimization? Maybe we should port over Tantivy's optimization? It should make disjunctive counting quite a bit faster?

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
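To make the windowed counting idea from the description concrete, here is a minimal, self-contained Java sketch. It is not Lucene's or Tantivy's actual code: the class name `WindowedDisjunctionCount`, the method `countUnion`, the `WINDOW_SIZE` value, and the use of plain sorted `int[]` arrays as stand-ins for postings lists are all assumptions made for illustration. Each clause marks its matching docs in a small reusable bitset covering one window of doc IDs, and the window's hits are then tallied with `Long.bitCount`, so a doc matched by several clauses is counted only once.

```java
import java.util.Arrays;

final class WindowedDisjunctionCount {
    // Illustrative window size in docs; a hypothetical choice for this sketch.
    static final int WINDOW_SIZE = 2048;

    /**
     * Counts distinct doc IDs across all postings lists.
     * Assumes each list is sorted ascending and every doc ID is in [0, maxDoc).
     */
    static int countUnion(int[][] postings, int maxDoc) {
        long[] window = new long[WINDOW_SIZE / 64]; // one bit per doc in the window
        int[] cursors = new int[postings.length];   // next unread position per clause
        int count = 0;

        for (int base = 0; base < maxDoc; base += WINDOW_SIZE) {
            int limit = Math.min(base + WINDOW_SIZE, maxDoc);
            Arrays.fill(window, 0L);

            // Each clause sets the bits for its docs that fall in [base, limit).
            for (int c = 0; c < postings.length; c++) {
                int[] docs = postings[c];
                int i = cursors[c];
                while (i < docs.length && docs[i] < limit) {
                    int offset = docs[i] - base;
                    window[offset >>> 6] |= 1L << (offset & 63);
                    i++;
                }
                cursors[c] = i;
            }

            // Pop-count the window: duplicates across clauses collapse to one bit.
            for (long word : window) {
                count += Long.bitCount(word);
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int[][] postings = {
            {1, 5, 3000},
            {5, 7, 4096},
        };
        System.out.println(countUnion(postings, 5000));
    }
}
```

The work is still linear in the total number of postings, as the description notes; the potential saving is in the per-doc overhead, since no per-hit collection or scoring is done and the final tally is a handful of `Long.bitCount` calls per window.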
For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org