Thank you for the explanation, Adrien! Looks like you're right. When I tried making the code changes, it's not as straightforward as I imagined.
Regards,
Mohammad Sadiq

> On 19 Jul 2022, at 23:19, Adrien Grand <jpou...@gmail.com> wrote:
>
> 1. I believe that this would require pulling a ScorerSupplier twice for the
> same segment, which is a costly operation.
>
> 2. The cost is computed in order to know whether the top-level query is
> likely to consume a large-enough portion of the matches of the query that
> we are considering caching, so that caching this query wouldn't hurt latency
> too much. Making a bad decision here because the cost is unknown would lead
> to a worse situation than computing the cost on every query that we are
> considering caching.
>
> On both of these questions, I feel like I may be missing the point about
> the suggestion you are making, so feel free to show a simple code change
> that could help me understand the change that you are suggesting.
>
> On Thu, Jul 14, 2022 at 12:26 PM Mohammad Sadiq <sadiqli...@gmail.com> wrote:
>
>> Thanks for the deep-dive Shradha. Thank you Adrien for the additional
>> questions and answers.
>>
>> I had a couple of questions when looking around the cache code.
>>
>> 1. The `QueryCachingPolicy` [1] makes decisions based on `Query`. Why not
>> use `Weight`?
>> The `scorerSupplier` [2] in the `LRUQueryCache` decides whether something
>> should be cached by determining the cost [3] using the `Weight`. IIUC, this
>> was introduced because "Query caching leads to absurdly slow queries" [4].
>> What if the `QueryCachingPolicy` was called with the `Weight` instead?
>> Since the `Query` can be obtained from the `Weight`, we can have all such
>> caching decisions in the policy, and de-couple that decision from the
>> `LRUQueryCache` class. What do you think?
>>
>> 2. Why do we invoke a possibly costly `cost()` method for every cache
>> addition?
>> During the above cost computation, we call the `supplier.cost()` method,
>> but its documentation [5] states that it "may be a costly operation, so it
>> should only be called if necessary".
>> This means that we're incurring a (possibly) costly operation for every
>> cache addition. If we de-couple these, then, for cases where the cache
>> addition is expensive, we can use the call to `cost`, but for other cases,
>> we can avoid this expensive call.
>>
>> If you, or the community, think that this is a good idea, then I can open
>> a JIRA and submit a PR.
>>
>> References:
>> [1] https://github.com/apache/lucene/blob/d5d6dc079395c47cd6d12dcce3bcfdd2c7d9dc63/lucene/core/src/java/org/apache/lucene/search/QueryCachingPolicy.java
>> [2] https://github.com/apache/lucene/blob/941df98c3f718371af4702c92bf6537739120064/lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java#L725
>> [3] https://github.com/apache/lucene/blob/941df98c3f718371af4702c92bf6537739120064/lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java#L767
>> [4] https://github.com/apache/lucene-solr/pull/940/files
>> [5] https://github.com/apache/lucene/blob/d5d6dc079395c47cd6d12dcce3bcfdd2c7d9dc63/lucene/core/src/java/org/apache/lucene/search/ScorerSupplier.java#L39-L40
>>
>> Regards,
>> Mohammad Sadiq
>>
>>> On 11 Jul 2022, at 10:37, Adrien Grand <jpou...@gmail.com> wrote:
>>>
>>> Hey Shradha,
>>>
>>> This correctly describes the what, but I think it could add more color
>>> about why the cache behaves this way to be more useful, e.g.:
>>> - Why doesn't the cache cache all queries? Lucene is relatively good at
>>> evaluating a subset of the matching documents; e.g. queries sorted by a
>>> numeric field can leverage point index structures to only look at a small
>>> subset of the matching docs. Yet caching a query requires consuming all its
>>> matches, so it could significantly hurt latencies.
>>> It's important to not cache all queries, to preserve the benefit of
>>> Lucene's filtering and dynamic pruning capabilities.
>>> - A corollary of the above is that the query cache is essentially a way to
>>> trade latency for throughput. Use-cases that care more about latency than
>>> throughput may want to disable the cache entirely.
>>> - LRUQueryCache takes a `skipCacheFactor` which aims at limiting the
>>> impact of query caching on latency by not caching clauses whose cost is
>>> much higher than that of the overall query. It only helps for filters within
>>> conjunctions though, not in the dynamic pruning case when we don't know how
>>> many matches are going to be consumed.
>>> - Why are small segments never cached? Small segments are likely going to
>>> be merged soon, so it would be wasteful to build cache entries that would
>>> get evicted shortly.
>>> - The queries that never get cached don't get cached because a cached
>>> entry wouldn't perform faster than their uncached counterpart. An inverted
>>> index is already a cache of the matches for every unique term of the index.
>>>
>>> On Fri, Jul 8, 2022 at 3:20 PM Shradha Shankar <shrdsha2...@gmail.com>
>>> wrote:
>>>
>>>> Hello!
>>>>
>>>> I work at Amazon Product Search and I've recently been looking into
>>>> understanding how Lucene's LRU Query Cache works. I've written up a
>>>> summary of my understanding here. (Also attached as a markdown file with
>>>> this email.)
>>>>
>>>> Will really appreciate feedback/improvements/corrections for my
>>>> understanding and if this is worthy of contributing to the documentation
>>>> for LRU QueryCache.
>>>> :)
>>>>
>>>> =================================
>>>> A brief overview of Lucene's Query Cache
>>>>
>>>> We first get acquainted with Lucene's caching at IndexSearcher's
>>>> createWeight method, which is called for every query (and consequently
>>>> for sub-queries within that query, e.g. see BooleanWeight) before we can
>>>> actually find matching documents in our index and score them. A Weight is
>>>> really another representation of a query, one that is specific to the
>>>> statistics of the IndexSearcher being used. This definition makes it
>>>> easier to see why caching logic would start while creating the weight for
>>>> the query: we want to make a weight that will be responsible for caching
>>>> matching docs per segment. Since segments are specific to the IndexReader
>>>> being used by the IndexSearcher, they are, transitively, specific to the
>>>> IndexSearcher.
>>>>
>>>> QueryCache in Lucene is an interface that has the signature for just one
>>>> method: doCache. doCache takes in a Weight (the weight of the query in
>>>> question, e.g. TermWeight for a TermQuery) and operates on it based on
>>>> the rules defined by a QueryCachingPolicy (yet another interface, which
>>>> defines two methods: onUse and shouldCache) to return a Weight. This
>>>> "new" returned Weight possibly wraps the original weight and bestows upon
>>>> it some caching abilities.
>>>>
>>>> As of now, Lucene has one core implementation each of QueryCache and
>>>> QueryCachingPolicy. All IndexSearcher instances have a default query
>>>> cache, LRUQueryCache, and use the default policy,
>>>> UsageTrackingQueryCachingPolicy. In the IndexSearcher's createWeight
>>>> method, we first create a weight for the incoming query and then subject
>>>> it to the LRUQueryCache's doCache method. An important thing to note here
>>>> is that we only call doCache when the score mode passed to the search
>>>> method does not need scores.
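[Editor's note] To make the wrapping pattern described above concrete, here is a minimal, self-contained sketch. It is an illustration only, not the real Lucene API: the single-method Weight and the String-based queries are simplifications invented for this sketch, though the interface names mirror Lucene's.

```java
// Hypothetical, reduced versions of the interfaces described above.
// Real Lucene Weights, Queries, and policies carry much more state.
interface Weight { String query(); }

interface QueryCachingPolicy {
  void onUse(String query);          // register that a query was seen
  boolean shouldCache(String query); // decide whether it may be cached
}

interface QueryCache {
  Weight doCache(Weight in, QueryCachingPolicy policy);
}

// doCache only wraps the weight; no caching work happens at wrap time.
class CachingWrapperWeight implements Weight {
  private final Weight in;
  private final QueryCachingPolicy policy;

  CachingWrapperWeight(Weight in, QueryCachingPolicy policy) {
    this.in = in;
    this.policy = policy;
  }

  @Override
  public String query() {
    policy.onUse(in.query()); // frequencies are registered on use, not on wrap
    return in.query();
  }
}

public class DoCacheSketch {
  public static void main(String[] args) {
    QueryCachingPolicy policy = new QueryCachingPolicy() {
      public void onUse(String q) { System.out.println("onUse: " + q); }
      public boolean shouldCache(String q) { return true; }
    };
    QueryCache cache = CachingWrapperWeight::new;
    Weight wrapped = cache.doCache(() -> "title:lucene", policy);
    System.out.println(wrapped.query());
  }
}
```

The point of the shape is that wrapping is cheap and deferred: the policy is only consulted when the wrapped weight is actually used.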
>>>> Calling doCache does nothing complex; it just returns the input weight
>>>> wrapped as a CachingWrapperWeight that encapsulates the caching-policy
>>>> information. No real caching has happened yet!
>>>>
>>>> After getting a weight from the createWeight method, the IndexSearcher
>>>> iterates over each leaf and uses the weight to create a BulkScorer. A
>>>> BulkScorer, as the name suggests, is used to score a range of documents,
>>>> generally the range being all the matches found in that leaf. Given
>>>> context information for a leaf, every weight should know how to create a
>>>> bulk scorer for that leaf. In our case, the CachingWrapperWeight's
>>>> bulkScorer method does a little bit extra, and this is where the actual
>>>> caching happens!
>>>>
>>>> A brief dive into the query caching policy: while LRU determines what we
>>>> want to evict from a full cache, query caching policies let us define
>>>> other rules to use in conjunction with the cache's eviction scheme. The
>>>> default UsageTrackingQueryCachingPolicy dictates which queries will be
>>>> put into the cache. This policy uses a ring-buffer data structure
>>>> optimised to track and retrieve frequencies for a given query. The policy
>>>> also defines some queries that will never be cached (TermQuery,
>>>> MatchAllDocsQuery, and MatchNoDocsQuery, to name a few). The onUse method
>>>> of this policy checks whether the incoming query is something it would
>>>> like to cache and, if so, registers the frequency for this query. The
>>>> UsageTrackingQueryCachingPolicy defines minimum frequencies for queries:
>>>> 2 for costly queries and 5 for others. When a query that can be cached
>>>> has been seen more than the minimum number of times, the shouldCache
>>>> method of the policy gives it the green light for caching.
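[Editor's note] The frequency rule above can be sketched with a toy policy. This is a deliberate simplification: the real UsageTrackingQueryCachingPolicy tracks query hash codes in a fixed-size ring buffer and infers costliness from the query type, whereas here a plain HashMap and an explicit `costly` flag stand in for both, and the exact threshold comparison is an assumption of the sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for the frequency bookkeeping described above: costly
// queries need roughly 2 sightings, other queries roughly 5, before the
// policy agrees to cache them. (Counting structure and thresholds are
// simplifications for illustration.)
public class FrequencyPolicySketch {
  private final Map<String, Integer> counts = new HashMap<>();

  public void onUse(String query) {
    counts.merge(query, 1, Integer::sum); // bump the seen-count
  }

  public boolean shouldCache(String query, boolean costly) {
    int minFrequency = costly ? 2 : 5;
    return counts.getOrDefault(query, 0) >= minFrequency;
  }

  public static void main(String[] args) {
    FrequencyPolicySketch policy = new FrequencyPolicySketch();
    for (int i = 0; i < 4; i++) policy.onUse("color:red");
    // 4 sightings clear the costly threshold (2) but not the default (5)
    System.out.println(policy.shouldCache("color:red", true));
    System.out.println(policy.shouldCache("color:red", false));
  }
}
```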
>>>>
>>>> A brief dive into LRUQueryCache: there are many attributes in this class,
>>>> tracking RAM usage and the allowed maximum memory size for the cache,
>>>> that play important roles, but for easy understanding we want to
>>>> highlight two attributes: Map<IndexReader.CacheKey, LeafCache> cache and
>>>> Map<Query, Query> uniqueQueries.
>>>>
>>>> uniqueQueries: a LinkedHashMap with a capacity and with accessOrder set
>>>> to true behaves like an LRU cache, and uniqueQueries is just that. While
>>>> the keys in uniqueQueries are Query instances, the values are also Query
>>>> instances, not the DocIdSet corresponding to the matches as we might
>>>> expect. This LRU cache is common across all segments, and the purpose of
>>>> this map is to store a singleton corresponding to each of the most
>>>> recently seen queries, so that we don't need to store multiple copies of
>>>> the same query.
>>>>
>>>> cache: this maps the CacheKey representing each leaf in the index to a
>>>> LeafCache instance. The LeafCache isn't a cache but a Map<Query,
>>>> CacheAndCount>, where Query corresponds to a cached Query instance from
>>>> uniqueQueries and CacheAndCount is the actual data structure that stores
>>>> the DocIdSet of the matches found for the query in that leaf.
>>>>
>>>> Returning to the creation of the BulkScorer from the CachingWrapperWeight:
>>>> we first register the query (and hence its frequency) with the policy via
>>>> the onUse method. At present, Lucene only wants to cache queries on
>>>> leaves that have more than 10k documents and more than 3% of the total
>>>> number of documents in the index. Once we determine that the segment is
>>>> eligible for caching, we use a CacheHelper to get the CacheKey for the
>>>> segment so that we can access the LeafCache corresponding to it. We check
>>>> whether the query has a CacheAndCount entry in the LeafCache and return
>>>> it (a cache hit!) or return null (a cache miss).
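[Editor's note] The LinkedHashMap-as-LRU behaviour that uniqueQueries relies on can be seen in a few lines. The explicit capacity bound via removeEldestEntry is an assumption made for this sketch; LRUQueryCache itself performs eviction through its own entry-count and memory accounting.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruSketch {
  // accessOrder=true makes iteration order least-recently-accessed first,
  // so evicting the "eldest" entry evicts the least recently used one.
  static <K, V> Map<K, V> lru(int capacity) {
    return new LinkedHashMap<K, V>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // sketch-only bound, not Lucene's rule
      }
    };
  }

  public static void main(String[] args) {
    Map<String, String> uniqueQueries = lru(2);
    uniqueQueries.put("q1", "q1"); // values are the queries themselves,
    uniqueQueries.put("q2", "q2"); // mirroring Map<Query, Query>
    uniqueQueries.get("q1");       // touch q1; q2 is now least recent
    uniqueQueries.put("q3", "q3"); // evicts q2
    System.out.println(uniqueQueries.keySet()); // [q1, q3]
  }
}
```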
>>>> In the case of a cache miss, we use the policy's shouldCache to
>>>> determine whether this query is cache-eligible and, if it is, we create a
>>>> CacheAndCount instance for the query-leaf pair. We add the Query to
>>>> uniqueQueries, and the query and its CacheAndCount instance to the
>>>> corresponding LeafCache for that leaf. We've finally cached a query and
>>>> its match set!
>>>>
>>>> When creating a CacheAndCount instance for a query that is being newly
>>>> cached, we want to find the matches for that query. We use the actual
>>>> weight for that query (the one wrapped inside the CachingWrapperWeight)
>>>> to create a BulkScorer and then call score on that scorer to populate a
>>>> LeafCollector with the matches for the query in this leaf. Based on the
>>>> density of the result set, we create a DocIdSet over either a FixedBitSet
>>>> or a roaring bit set.
>>>>
>>>> Finally, we create a DocIdSetIterator from the retrieved or created
>>>> CacheAndCount instance and return the most commonly used BulkScorer, the
>>>> DefaultBulkScorer, with a ConstantScoreScorer (score set to 0) over the
>>>> DocIdSetIterator. When there is a cache miss and the query is ineligible
>>>> for caching, we return the BulkScorer from the actual weight wrapped
>>>> inside the CachingWrapperWeight, as we would if no caching logic were
>>>> present!
>>>>
>>>> Thanks,
>>>> Shradha
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>>
>>> --
>>> Adrien

--
Adrien