[
https://issues.apache.org/jira/browse/LUCENE-8213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16404750#comment-16404750
]
Amir Hadadi commented on LUCENE-8213:
-------------------------------------
The issue is that the execution path for (q1 AND q2) depends on whether q2 gets
cached or not.
When q2 does not get cached, doc values are used to execute q2 and only the
single document matching q1 is evaluated against the range.
When q2 gets cached, it gets cached as a query that stands by itself, i.e. not
in the context of (q1 AND q2).
So all 10M documents that q2 matches are scanned in the BKD tree and cached in
a bit set.
To keep the caching of q2 from making the latency of (q1 AND q2) too high,
Adrian added maxCostFactor.
Caching is skipped when the cost of caching q2 is at least maxCostFactor times
the cost of evaluating (q1 AND q2), which is leadCost in the code below.
This is the relevant code from LRUQueryCache:
{code:java}
double costFactor = (double) inSupplier.cost() / leadCost;
if (costFactor >= maxCostFactor) {
  // too costly, caching might make the query much slower
  return inSupplier.get(leadCost);
}
{code}
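To make the check concrete, here is a minimal standalone sketch of the same
arithmetic using the numbers from this comment (q2 matches 10M documents, q1
matches one). The class name, the method name and the threshold of 100 are
assumptions made for illustration; they are not taken from LRUQueryCache.

```java
public class CostFactorSketch {

    /**
     * Mirrors the quoted check: returns true when caching is considered
     * too costly relative to the lead cost and should be skipped.
     */
    public static boolean tooCostlyToCache(long cacheCost, long leadCost, double maxCostFactor) {
        double costFactor = (double) cacheCost / leadCost;
        return costFactor >= maxCostFactor;
    }

    public static void main(String[] args) {
        double maxCostFactor = 100.0; // hypothetical threshold, for illustration only

        // q2 alone costs 10M, the lead iterator (q1) costs 1:
        // costFactor = 10,000,000, so caching is skipped.
        System.out.println(tooCostlyToCache(10_000_000L, 1L, maxCostFactor));

        // With a non-selective lead (cost 5M) costFactor = 2,
        // so caching q2 is considered acceptable.
        System.out.println(tooCostlyToCache(10_000_000L, 5_000_000L, maxCostFactor));
    }
}
```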
My suggestion is to always evaluate (q1 AND q2) using the optimal path, and
cache q2 asynchronously.
A refinement is to cache q2 synchronously if the cost of caching it is not too
high.
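A rough sketch of what this could look like, outside of Lucene. Everything
here is hypothetical: the class AsyncCachingSketch, the String cache keys, the
BitSet supplier and the method names are illustrative stand-ins, not
LRUQueryCache APIs. The querying thread is assumed to have already evaluated
(q1 AND q2) through the optimal path; this code only decides where the cache
entry for q2 gets built.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

public class AsyncCachingSketch {
    private final Map<String, BitSet> cache = new ConcurrentHashMap<>();
    private final ExecutorService cachingPool; // dedicated pool, separate from query threads
    private final double maxCostFactor;

    public AsyncCachingSketch(ExecutorService cachingPool, double maxCostFactor) {
        this.cachingPool = cachingPool;
        this.maxCostFactor = maxCostFactor;
    }

    /**
     * Populates the cache entry for key. If building the entry is cheap
     * relative to leadCost it is built on the calling thread (the
     * refinement); otherwise it is handed to the caching pool.
     * The returned future completes once the entry is available.
     */
    public CompletableFuture<Void> maybeCache(String key, long cacheCost, long leadCost,
                                              Supplier<BitSet> buildFullBitSet) {
        if (cache.containsKey(key)) {
            return CompletableFuture.completedFuture(null);
        }
        double costFactor = (double) cacheCost / leadCost;
        if (costFactor < maxCostFactor) {
            // Cheap enough: cache synchronously on the querying thread.
            cache.computeIfAbsent(key, k -> buildFullBitSet.get());
            return CompletableFuture.completedFuture(null);
        }
        // Too costly for the querying thread: build the bit set in the background.
        return CompletableFuture.runAsync(
                () -> cache.computeIfAbsent(key, k -> buildFullBitSet.get()), cachingPool);
    }

    public boolean isCached(String key) {
        return cache.containsKey(key);
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            AsyncCachingSketch sketch = new AsyncCachingSketch(pool, 100.0);
            // q2 costs 10M against a lead cost of 1, so it is cached asynchronously.
            CompletableFuture<Void> pending =
                    sketch.maybeCache("q2", 10_000_000L, 1L, BitSet::new);
            pending.join(); // a real querying thread would not wait here
            System.out.println(sketch.isCached("q2"));
        } finally {
            pool.shutdown();
        }
    }
}
```

The querying thread never waits on the returned future in the expensive case,
so its latency stays tied to the optimal path regardless of whether q2 is
being cached.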
> offload caching to a dedicated threadpool
> -----------------------------------------
>
> Key: LUCENE-8213
> URL: https://issues.apache.org/jira/browse/LUCENE-8213
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/query/scoring
> Affects Versions: 7.2.1
> Reporter: Amir Hadadi
> Priority: Minor
> Labels: performance
>
> IndexOrDocValuesQuery allows combining non-selective range queries with a
> selective lead iterator in an optimized way. However, the range query at some
> point gets cached by a querying thread in LRUQueryCache, which negates the
> optimization of IndexOrDocValuesQuery for that specific query.
> It would be nice to see a caching implementation that offloads to a different
> thread pool, so that queries involving IndexOrDocValuesQuery would have
> consistent performance characteristics.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)