[
https://issues.apache.org/jira/browse/LUCENE-7680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15856493#comment-15856493
]
Adrien Grand commented on LUCENE-7680:
--------------------------------------
I see this class as a default set of heuristics that should work well for most
use-cases. If someone wants something more specific, I think the way to go is
to write a new implementation; the API should be pretty simple to implement. As
it stands, the class is indeed not designed for inheritance: in addition to
those pkg-private methods, it is final.
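To give a rough idea of how small a custom policy could be, here is a minimal
sketch. It does not use Lucene types: SketchCachingPolicy is a hypothetical
stand-in with an onUse/shouldCache pair of methods, and the history size and
frequency threshold are arbitrary constructor parameters chosen for
illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a caching-policy API; this is NOT a Lucene type.
interface SketchCachingPolicy<Q> {
  // Called every time a filter is used.
  void onUse(Q query);

  // Returns true if the filter looks worth caching.
  boolean shouldCache(Q query);
}

// Caches a filter once it occurs at least minFrequency times in a bounded
// history of recently used filters.
final class FrequencyPolicy<Q> implements SketchCachingPolicy<Q> {
  private final int historySize;
  private final int minFrequency;
  private final Deque<Q> history = new ArrayDeque<>();
  private final Map<Q, Integer> counts = new HashMap<>();

  FrequencyPolicy(int historySize, int minFrequency) {
    this.historySize = historySize;
    this.minFrequency = minFrequency;
  }

  @Override
  public void onUse(Q query) {
    history.addLast(query);
    counts.merge(query, 1, Integer::sum);
    if (history.size() > historySize) {
      // Drop the oldest entry and decrement its count (removing it at zero).
      Q evicted = history.removeFirst();
      counts.computeIfPresent(evicted, (k, v) -> v == 1 ? null : v - 1);
    }
  }

  @Override
  public boolean shouldCache(Q query) {
    return counts.getOrDefault(query, 0) >= minFrequency;
  }

  public static void main(String[] args) {
    FrequencyPolicy<String> policy = new FrequencyPolicy<>(4, 2);
    policy.onUse("category:books");
    System.out.println(policy.shouldCache("category:books"));
    policy.onUse("category:books");
    System.out.println(policy.shouldCache("category:books"));
  }
}
```

A real implementation would plug into the actual caching-policy interface and
would probably want a cheaper frequency structure than a deque plus a map; the
point here is only that the contract has two methods.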
bq. Granted I could implement minFrequencyToCache and return Integer.MAX_VALUE.
Requiring that a filter has been seen Integer.MAX_VALUE times would indeed mean
it is never cached. However, this change goes a bit further in the case of term
filters: it also does not add them to the history, which makes other filters
more likely to be cached than they are today. To take an extreme example,
say you have a query with 100 term filters and 1 other filter (which is not a
term). Even if that other filter was the same in every query, it would never
get cached because term queries "pollute" the history (we only keep track of
the last 256 used filters) and that other filter would occur at most three times
in the history. If term filters were not put in the history of recently used
filters, Lucene would be more likely to notice that the other filter gets
reused all the time.
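The extreme example can be replayed with a small simulation. This is only a
sketch under stated assumptions: the history is modelled as a plain FIFO of the
last 256 entries, every query uses 100 distinct term filters, and the shared
filter is recorded before the term filters of each query. The class and method
names are made up for illustration and are not part of Lucene.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class HistoryPollutionSketch {
  // Size of the history of recently used filters, as stated in the comment.
  static final int HISTORY_SIZE = 256;

  // Replays `queries` queries, each made of one shared "other" filter followed
  // by `termFilters` distinct term filters, and returns how many times the
  // shared filter occurs in the history afterwards.
  static int occurrencesOfOther(int queries, int termFilters, boolean trackTermFilters) {
    Deque<String> history = new ArrayDeque<>();
    for (int q = 0; q < queries; q++) {
      record(history, "other");
      if (trackTermFilters) {
        for (int t = 0; t < termFilters; t++) {
          record(history, "term:" + q + ":" + t);
        }
      }
    }
    int count = 0;
    for (String filter : history) {
      if (filter.equals("other")) {
        count++;
      }
    }
    return count;
  }

  // Appends a filter and evicts the oldest entry once the history is full.
  private static void record(Deque<String> history, String filter) {
    history.addLast(filter);
    if (history.size() > HISTORY_SIZE) {
      history.removeFirst();
    }
  }

  public static void main(String[] args) {
    System.out.println("term filters tracked:     " + occurrencesOfOther(1000, 100, true));
    System.out.println("term filters not tracked: " + occurrencesOfOther(1000, 100, false));
  }
}
```

When term filters are tracked, the shared filter holds only a couple of slots
in the history; when they are skipped, it fills the whole history, so any
frequency threshold is reached quickly.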
bq. Curious; did you consider marking TermFilter as "cheap"?
What do you mean? Maybe it is the cause of the confusion, but when I say term
filter, I mean a TermQuery that is consumed with needsScores=false.
> Never cache term filters
> ------------------------
>
> Key: LUCENE-7680
> URL: https://issues.apache.org/jira/browse/LUCENE-7680
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Priority: Minor
> Attachments: LUCENE-7680.patch
>
>
> Currently we just require term filters to be used a lot in order to cache
> them. Maybe instead we should look into never caching them. This should not
> hurt performance since term filters are plenty fast, and would make other
> filters more likely to be cached since we would not "pollute" the history
> with filters that are not worth caching.