Govind,

It seems you meant to post this to the Lucene dev mailing list, but this is the 
Lucene.NET mailing list. So, I am reposting your question on your behalf to the 
correct mailing list.

Thanks,
Shad Storhaug (NightOwl888)
Project Chairperson - Apache Lucene.NET

-----Original Message-----
From: Govind Balaji <govind.bal...@glean.com.INVALID> 
Sent: Wednesday, July 23, 2025 6:15 PM
To: d...@lucenenet.apache.org
Subject: [I] Index level caching policy is thrashed by segment-specific query 
rewrites

Copying the GitHub issue I created, since it doesn't look like every GitHub 
issue is automatically copied to this mailing list. Apologies if this ends up 
as a duplicate.
https://github.com/apache/lucene/issues/14986

Description

   1. Each IndexSearcher has its own UsageTrackingQueryCachingPolicy that
   is shared across all segments.
   2. This caching policy uses a 256-length ring buffer to keep track of
   recently used queries.
   3. A TermInSetQuery with rewriteMethod =
   MultiTermQuery.CONSTANT_SCORE_BLENDED_REWRITE yields a RewritingWeight.
   4. Getting a scorer from this RewritingWeight for a segment could
   involve rewriting to a BooleanQuery of multiple TermQuery with only the
   terms present in that particular segment (see
   org.apache.lucene.search.AbstractMultiTermQueryConstantScoreWrapper.RewritingWeight#scorerSupplier).
   5. Thus a single TermInSetQuery ends up thrashing the ring buffer, because it
   is recorded as multiple distinct BooleanQuery instances, one per segment.
   6. This leads to a poor caching rate for indexes with a large number of
   segments.
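
To make the effect in points 2 to 6 concrete, here is a minimal, self-contained
sketch. It does not use Lucene at all: SharedHistoryDemo, the string keys such
as "bool@segment0", and simulate() are hypothetical stand-ins, and the only
number taken from the description above is the 256-entry history. It assumes
every search records one distinct rewritten query per segment in a single
shared ring buffer.

```java
/** Toy model (not Lucene code) of one usage history shared by all segments. */
class SharedHistoryDemo {
    // Ring buffer length taken from the description above.
    static final int HISTORY_SIZE = 256;

    private final String[] ring = new String[HISTORY_SIZE];
    private int next = 0;

    /** Records one use, overwriting the oldest entry once the buffer is full. */
    void onUse(String queryKey) {
        ring[next] = queryKey;
        next = (next + 1) % HISTORY_SIZE;
    }

    /** Counts how often the key occurs in the current history. */
    int frequency(String queryKey) {
        int count = 0;
        for (String key : ring) {
            if (queryKey.equals(key)) {
                count++;
            }
        }
        return count;
    }

    /**
     * Runs the same logical query 'searches' times over 'segments' segments.
     * Each segment contributes its own distinct key, mimicking a
     * segment-specific rewrite. Returns the final frequency of segment 0's key.
     */
    static int simulate(int segments, int searches) {
        SharedHistoryDemo history = new SharedHistoryDemo();
        for (int s = 0; s < searches; s++) {
            for (int seg = 0; seg < segments; seg++) {
                history.onUse("bool@segment" + seg);
            }
        }
        return history.frequency("bool@segment0");
    }

    public static void main(String[] args) {
        for (int segments : new int[] {10, 100, 300}) {
            System.out.println(segments + " segments -> frequency "
                + simulate(segments, 1000));
        }
    }
}
```

In this toy model the frequency seen for any one rewritten query is capped at
roughly 256 divided by the number of segments: 25 with 10 segments, 2 with 100,
and 0 with 300, no matter how many times the search is repeated. A policy that
only caches queries whose frequency reaches some minimum would therefore stop
caching them once the segment count is large enough.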

We were able to verify this behavior with a custom caching policy that logs the 
onUse() and shouldCache() calls and then delegates to 
UsageTrackingQueryCachingPolicy.

Is there a good reason not to do this ring buffer tracking at a per-segment 
level? That would fix this issue.
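
As a rough illustration of what per-segment tracking could look like, the
sketch below keeps one 256-entry history per segment. It is only a toy model
under stated assumptions, not a proposed patch: PerSegmentHistoryDemo, the
integer segment ids, and the string keys are all hypothetical, and no Lucene
API is used.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model (not Lucene code) of usage tracking with one history per segment. */
class PerSegmentHistoryDemo {
    // Same history length as in the description, but allocated per segment.
    static final int HISTORY_SIZE = 256;

    /** A small ring buffer of query keys for a single segment. */
    static final class History {
        private final String[] ring = new String[HISTORY_SIZE];
        private int next = 0;

        void onUse(String queryKey) {
            ring[next] = queryKey;
            next = (next + 1) % HISTORY_SIZE;
        }

        int frequency(String queryKey) {
            int count = 0;
            for (String key : ring) {
                if (queryKey.equals(key)) {
                    count++;
                }
            }
            return count;
        }
    }

    // Hypothetical integer segment id mapped to that segment's own history.
    private final Map<Integer, History> bySegment = new HashMap<>();

    void onUse(int segment, String queryKey) {
        bySegment.computeIfAbsent(segment, s -> new History()).onUse(queryKey);
    }

    int frequency(int segment, String queryKey) {
        History history = bySegment.get(segment);
        return history == null ? 0 : history.frequency(queryKey);
    }

    /** Same workload as a shared history would see: one distinct key per segment. */
    static int simulate(int segments, int searches) {
        PerSegmentHistoryDemo policy = new PerSegmentHistoryDemo();
        for (int s = 0; s < searches; s++) {
            for (int seg = 0; seg < segments; seg++) {
                policy.onUse(seg, "bool@segment" + seg);
            }
        }
        return policy.frequency(0, "bool@segment0");
    }
}
```

In this model, simulate(300, 3) returns 3 and simulate(300, 1000) returns 256,
so the frequency of a segment's rewritten query no longer depends on how many
other segments exist. The trade-off visible in the sketch is memory: the total
history grows linearly with the number of segments.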

Version and environment details

Lucene 9.12.1

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
