Hey Adrien,

Apologies for the delay in my response. Thank you so much for the detailed
reply and feedback on my write-up. I will incorporate your suggestions and
follow up with a new iteration soon.
Regards,
Shradha

> On 11 Jul 2022, at 10:37, Adrien Grand <jpou...@gmail.com> wrote:
>
> Hey Shradha,
>
> This correctly describes the what, but I think it could add more color
> about why the cache behaves this way to be more useful, e.g.
> - Why doesn't the cache cache all queries? Lucene is relatively good at
> evaluating only a subset of the matching documents, e.g. queries sorted
> by a numeric field can leverage point index structures to look at only a
> small subset of the matching docs. Yet caching a query requires consuming
> all of its matches, so it could significantly hurt latencies. It's
> important not to cache all queries, to preserve the benefit of Lucene's
> filtering and dynamic pruning capabilities.
> - A corollary of the above is that the query cache is essentially a way
> to trade latency for throughput. Use cases that care more about latency
> than throughput may want to disable the cache entirely.
> - LRUQueryCache takes a `skipCacheFactor`, which aims at limiting the
> impact of query caching on latency by not caching clauses whose cost is
> much higher than that of the overall query. It only helps for filters
> within conjunctions though, not in the dynamic pruning case, where we
> don't know how many matches are going to be consumed.
> - Why are small segments never cached? Small segments are likely going
> to be merged soon, so it would be wasteful to build cache entries that
> would get evicted shortly afterwards.
> - The queries that never get cached don't get cached because a cached
> entry wouldn't perform faster than its uncached counterpart. An inverted
> index is already a cache of the matches for every unique term of the
> index.
>
> On Fri, Jul 8, 2022 at 3:20 PM Shradha Shankar <shrdsha2...@gmail.com>
> wrote:
>
>> Hello!
>>
>> I work at Amazon Product Search and I've recently been looking into
>> understanding how Lucene's LRU Query Cache works. I've written up a
>> summary of my understanding here. (Also attached as a markdown file
>> with this email.)
>>
>> Would really appreciate feedback/improvements/corrections on my
>> understanding, and whether this is worthy of contributing to the
>> documentation for LRUQueryCache. :)
>>
>> =================================
>> A brief overview of Lucene's Query Cache
>>
>> We first get acquainted with Lucene's caching in IndexSearcher's
>> createWeight method, which is called for every query (and consequently
>> for the sub-queries within that query, e.g. see BooleanWeight) before we
>> can actually find matching documents in our index and score them. A
>> Weight is really another representation of a query, one that is specific
>> to the statistics of the IndexSearcher being used. This definition makes
>> it easier to see why caching logic would start while creating the weight
>> for the query - we want to make a weight that will be responsible for
>> caching matching docs per segment. Since segments are specific to the
>> IndexReader used by the IndexSearcher, they are, transitively, specific
>> to the IndexSearcher.
>>
>> QueryCache in Lucene is an interface with the signature for just one
>> method - doCache. doCache takes in a Weight (the weight of the query in
>> question, e.g. TermWeight for a TermQuery) and operates on it based on
>> the rules defined by a QueryCachingPolicy (yet another interface, which
>> defines two methods - onUse and shouldCache) to return a Weight. This
>> "new" returned Weight possibly wraps the original weight and bestows
>> upon it some caching abilities.
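>>
>> To make this contract concrete, here is a minimal sketch of the two
>> interfaces (paraphrased from Lucene's org.apache.lucene.search package,
>> with javadocs elided - treat it as illustrative rather than the exact
>> current definitions):
>>
>>   public interface QueryCache {
>>     // Returns a Weight that (possibly) wraps the given weight and
>>     // consults the policy to decide what actually gets cached.
>>     Weight doCache(Weight weight, QueryCachingPolicy policy);
>>   }
>>
>>   public interface QueryCachingPolicy {
>>     // Callback invoked every time a query is used, so the policy can
>>     // track usage frequencies.
>>     void onUse(Query query);
>>
>>     // Whether the given query is worth adding to the cache now.
>>     boolean shouldCache(Query query) throws IOException;
>>   }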
>>
>> As of now, Lucene has one core implementation each of QueryCache and
>> QueryCachingPolicy. All IndexSearcher instances have a default query
>> cache - LRUQueryCache - and use the default policy -
>> UsageTrackingQueryCachingPolicy. In the IndexSearcher's createWeight
>> method, we first create a weight for the incoming query and then subject
>> it to the LRUQueryCache's doCache method. An important thing to note
>> here is that we only call doCache when the score mode passed to the
>> search method does not need scores. Calling doCache does nothing
>> complex; it just returns the input weight wrapped as a
>> CachingWrapperWeight that encapsulates the caching policy information.
>> No real caching has happened yet!
>>
>> After getting a weight from the createWeight method, the IndexSearcher
>> iterates over each leaf and uses the weight to create a BulkScorer. A
>> BulkScorer, as the name suggests, is used to score a range of documents
>> - generally the range being all the matches found in that leaf. Given
>> context information for a leaf, every weight should know how to create a
>> bulk scorer for that leaf. In our case, the CachingWrapperWeight's
>> bulkScorer method does a little bit extra, and this is where the actual
>> caching happens!
>>
>> A brief dive into the query caching policy: While LRU determines what we
>> evict from a full cache, query caching policies let us define additional
>> rules that are applied in conjunction with the cache's eviction policy.
>> The default UsageTrackingQueryCachingPolicy dictates which queries will
>> be put into the cache. This policy uses a ring buffer data structure
>> optimised to track and retrieve frequencies for a given query. The
>> policy also defines some queries that will never be cached (TermQuery,
>> MatchAllDocsQuery, MatchNoDocsQuery, to name a few). The onUse method on
>> this policy checks whether the incoming query is something it would like
>> to cache and, if so, registers the frequency for this query. The
>> UsageTrackingQueryCachingPolicy defines minimum frequencies for queries
>> - 2 for costly queries and 5 for others. When a query can be cached and
>> has been seen at least the minimum number of times, the shouldCache
>> method of the policy gives it a green light for caching.
>>
>> A brief dive into LRUQueryCache: There are many attributes in this class
>> that track RAM usage and the allowed max memory size for the cache, and
>> they play important roles, but for easy understanding we want to
>> highlight two attributes - Map<IndexReader.CacheKey, LeafCache> cache
>> and Map<Query, Query> uniqueQueries.
>>
>> uniqueQueries: A LinkedHashMap with a capacity bound and accessOrder set
>> to true behaves like an LRU cache, and uniqueQueries is just that. While
>> the keys in uniqueQueries are Query instances, the values are also Query
>> instances and not the DocIdSets corresponding to the matches, as we
>> might expect. This LRU cache is shared across all segments, and its
>> purpose is to store a singleton for each of the most recently seen
>> queries so that we don't need to store multiple copies of the same
>> query.
>>
>> cache: The cache maps each leaf in the index, represented by a CacheKey,
>> to a LeafCache instance. The LeafCache isn't itself a cache but a
>> Map<Query, CacheAndCount>, where Query corresponds to a cached Query
>> instance from uniqueQueries and CacheAndCount is the actual data
>> structure that stores the DocIdSet of the matches found for the query in
>> that leaf.
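>>
>> As a small aside, here is a self-contained, JDK-only sketch of the
>> LinkedHashMap behaviour described above. (This only illustrates the
>> accessOrder mechanism; LRUQueryCache's real eviction logic additionally
>> tracks RAM usage and evicts explicitly.)
>>
>>   import java.util.LinkedHashMap;
>>   import java.util.Map;
>>
>>   public class LruDemo {
>>     public static void main(String[] args) {
>>       // accessOrder=true makes iteration order go from least recently
>>       // used to most recently used; overriding removeEldestEntry turns
>>       // the map into a bounded LRU cache.
>>       Map<String, String> lru =
>>           new LinkedHashMap<String, String>(16, 0.75f, true) {
>>             @Override
>>             protected boolean removeEldestEntry(
>>                 Map.Entry<String, String> eldest) {
>>               return size() > 2; // keep at most 2 entries
>>             }
>>           };
>>       lru.put("a", "1");
>>       lru.put("b", "2");
>>       lru.get("a");      // touch "a": "b" becomes the eldest entry
>>       lru.put("c", "3"); // exceeds capacity, so "b" is evicted
>>       System.out.println(lru.keySet()); // prints [a, c]
>>     }
>>   }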
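>>
>> And one practical note, tying back to the latency-versus-throughput
>> trade-off mentioned above: applications can replace or disable the
>> default cache per searcher. A rough sketch follows - setQueryCache and
>> the LRUQueryCache constructor are Lucene's actual API, while
>> CacheConfigDemo and newSearcher are made-up names and the cache bounds
>> are arbitrary example numbers:
>>
>>   import org.apache.lucene.index.IndexReader;
>>   import org.apache.lucene.search.IndexSearcher;
>>   import org.apache.lucene.search.LRUQueryCache;
>>
>>   public class CacheConfigDemo {
>>     static IndexSearcher newSearcher(IndexReader reader) {
>>       IndexSearcher searcher = new IndexSearcher(reader);
>>       // Latency-sensitive use cases may disable caching entirely:
>>       // searcher.setQueryCache(null);
>>       // Or install a cache with custom bounds: at most 1000 unique
>>       // queries, capped at roughly 64 MB of RAM:
>>       searcher.setQueryCache(new LRUQueryCache(1000, 64 * 1024 * 1024));
>>       return searcher;
>>     }
>>   }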
>>
>> Returning to the creation of the BulkScorer from the
>> CachingWrapperWeight: we first register the query (and hence its
>> frequency) with the policy via the onUse method. At present, Lucene only
>> wants to cache queries on leaves that have more than 10k documents and
>> hold more than 3% of the total number of documents in the index. Once we
>> determine that the segment is eligible for caching, we use a CacheHelper
>> to get the CacheKey for the segment so that we can access the LeafCache
>> corresponding to it. We check whether the query has a CacheAndCount
>> entry in the LeafCache and return it (a cache hit!) or return null (a
>> cache miss). In the case of a cache miss, we use the policy's
>> shouldCache to determine whether this query is cache-eligible and, if it
>> is, we create a CacheAndCount instance for the query-leaf pair. We add
>> the Query to uniqueQueries, and the query and its CacheAndCount instance
>> to the corresponding LeafCache for that leaf. We've finally cached a
>> query and its match set!
>>
>> When creating a CacheAndCount instance for a query that is being newly
>> cached, we want to find the matches for that query. We use the actual
>> weight for that query (the one wrapped inside the CachingWrapperWeight)
>> to create a BulkScorer and then call score on that scorer to populate a
>> LeafCollector with the matches for the query in this leaf. Based on the
>> density of the result set, we create a DocIdSet backed by either a
>> FixedBitSet or a roaring bit set (RoaringDocIdSet).
>>
>> Finally, we create a DocIdSetIterator from the retrieved or newly
>> created CacheAndCount instance and return the most commonly used
>> BulkScorer - a DefaultBulkScorer with a ConstantScoreScorer (score set
>> to 0) over that DocIdSetIterator. When there is a cache miss and the
>> query is ineligible for caching, we return the BulkScorer from the
>> actual weight wrapped inside the CachingWrapperWeight, as we would if no
>> caching logic were present!
>>
>> Thanks,
>> Shradha
>
> --
> Adrien

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org