Govind,

It seems you meant to post this to the Lucene dev mailing list, but this is the Lucene.NET mailing list. So, I am reposting your question on your behalf to the correct mailing list.
Thanks,
Shad Storhaug (NightOwl888)
Project Chairperson - Apache Lucene.NET

-----Original Message-----
From: Govind Balaji <govind.bal...@glean.com.INVALID>
Sent: Wednesday, July 23, 2025 6:15 PM
To: d...@lucenenet.apache.org
Subject: [I] Index level caching policy is thrashed by segment-specific query rewrites

Copying the GitHub issue I created, since it doesn't look like every GitHub issue is automatically copied to this mailing list. Apologies if this ends up as a duplicate.

https://github.com/apache/lucene/issues/14986

Description

1. Each IndexSearcher has its own UsageTrackingQueryCachingPolicy that is shared across all segments.
2. This caching policy uses a 256-length ring buffer to keep track of recently used queries.
3. A TermInSetQuery with rewriteMethod = MultiTermQuery.CONSTANT_SCORE_BLENDED_REWRITE yields a RewritingWeight.
4. Getting a scorer from this RewritingWeight for a segment could involve rewriting to a BooleanQuery of multiple TermQuery clauses containing only the terms present in that particular segment. See org.apache.lucene.search.AbstractMultiTermQueryConstantScoreWrapper.RewritingWeight#scorerSupplier.
5. Thus a single TermInSetQuery ends up thrashing the ring buffer, because it is recorded as multiple distinct BooleanQuery instances, one per segment.
6. This leads to a poor caching rate for indexes with a large number of segments.

We could verify this behavior with a new caching policy that logs the onUse() and shouldCache() calls and then delegates to UsageTrackingQueryCachingPolicy.

Is there a good reason not to do this ring buffer tracking at a per-segment level? That would fix this issue.

Version and environment details

Lucene 9.12.1
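
To make the effect described in points 1 to 6 concrete, here is a minimal, self-contained sketch of the thrashing. It is a simplified model and does not use Lucene's own classes: RecentKeys is a hypothetical stand-in for the policy's ring buffer, the string keys stand in for the per-segment rewritten BooleanQuery instances, and the minimum frequency of 4 is an assumed threshold chosen for illustration. It also assumes that every segment produces a distinct rewritten query and that each use is recorded before the cacheability check.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;

final class RingBufferThrashSketch {

    /** Hypothetical stand-in for a fixed-size history of recently used queries. */
    static final class RecentKeys {
        private final int capacity;
        private final ArrayDeque<String> window = new ArrayDeque<>();
        private final Map<String, Integer> counts = new HashMap<>();

        RecentKeys(int capacity) {
            this.capacity = capacity;
        }

        /** Records one use, evicting the oldest entry once the window is full. */
        void add(String key) {
            if (window.size() == capacity) {
                String evicted = window.removeFirst();
                // Decrement the evicted key's count, dropping it when it reaches zero.
                counts.computeIfPresent(evicted, (k, v) -> v == 1 ? null : v - 1);
            }
            window.addLast(key);
            counts.merge(key, 1, Integer::sum);
        }

        /** Number of times the key occurs in the current window. */
        int frequency(String key) {
            return counts.getOrDefault(key, 0);
        }
    }

    /**
     * Runs the same logical query `searches` times over `segments` segments and
     * counts how many per-segment checks see a frequency >= minFrequency.
     * perSegment = false models one history shared by all segments;
     * perSegment = true models one history per segment.
     */
    static int cacheableChecks(boolean perSegment, int segments, int searches,
                               int capacity, int minFrequency) {
        RecentKeys shared = new RecentKeys(capacity);
        RecentKeys[] bySegment = new RecentKeys[segments];
        for (int i = 0; i < segments; i++) {
            bySegment[i] = new RecentKeys(capacity);
        }
        int passed = 0;
        for (int s = 0; s < searches; s++) {
            for (int seg = 0; seg < segments; seg++) {
                // Assumption: the rewrite differs per segment, so each segment has its own key.
                String key = "rewritten-for-segment-" + seg;
                RecentKeys history = perSegment ? bySegment[seg] : shared;
                history.add(key);
                if (history.frequency(key) >= minFrequency) {
                    passed++;
                }
            }
        }
        return passed;
    }

    public static void main(String[] args) {
        System.out.println("shared history:      " + cacheableChecks(false, 100, 10, 256, 4));
        System.out.println("per-segment history: " + cacheableChecks(true, 100, 10, 256, 4));
    }
}
```

With 100 segments, 10 searches, a 256-entry history and the assumed threshold of 4, the shared history never sees any key often enough (each key recurs every 100 entries, so at most 3 copies fit in the window) and the count is 0. With one history per segment, every check from the fourth search onward passes, giving 700. The real policy's thresholds and hashing differ, so treat the numbers as an illustration of the mechanism only.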