[ 
https://issues.apache.org/jira/browse/LUCENE-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-8058:
---------------------------------
    Attachment: LUCENE-8058.patch

[~jim.ferenczi] I slightly changed the approach:
 - I increased the memory usage that we assume for queries to 1024 bytes. I 
think this makes sense since this was initially computed as the memory usage 
of a term query, but we do not cache term queries anymore, so cached queries 
are more likely to be boolean queries with a couple of clauses.
 - I disabled caching on dismax and boolean queries that have more than 16 
clauses, so that users are not encouraged to switch to those queries to work 
around the fact that we no longer cache large term-in-set queries.
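
To make the two rules above concrete, here is a rough, self-contained sketch. 
It is not the code from the patch: the class, constants and method names are 
hypothetical and do not correspond to Lucene APIs. It only assumes what is 
stated above (1024 bytes assumed per cached query, no caching above 16 
clauses).

```java
/** Illustrative model of the two rules; not Lucene code, all names are hypothetical. */
class CachingPolicySketch {

    /** Assumed memory footprint, in bytes, of one cached query. */
    static final long ASSUMED_QUERY_RAM_BYTES = 1024;

    /** Boolean and dismax queries with more clauses than this are not cached. */
    static final int MAX_CACHEABLE_CLAUSES = 16;

    /** Returns true if a compound query with the given clause count may be cached. */
    static boolean shouldCache(int clauseCount) {
        return clauseCount <= MAX_CACHEABLE_CLAUSES;
    }

    /**
     * Estimates cache memory usage as the assumed per-query overhead plus the
     * bytes used by the cached doc id sets.
     */
    static long estimatedRamBytes(int cachedQueryCount, long docIdSetBytes) {
        return cachedQueryCount * ASSUMED_QUERY_RAM_BYTES + docIdSetBytes;
    }

    public static void main(String[] args) {
        System.out.println("16 clauses cacheable: " + shouldCache(16));
        System.out.println("17 clauses cacheable: " + shouldCache(17));
        System.out.println("3 queries + 100 bytes of doc id sets: "
            + estimatedRamBytes(3, 100L) + " bytes");
    }
}
```

The point of the clause limit is that the per-query estimate stays a 
reasonable upper bound: a query with at most 16 clauses is unlikely to use 
much more memory than the assumed 1024 bytes.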

What do you think?

> Never cache large TermInSetQuery instances
> ------------------------------------------
>
>                 Key: LUCENE-8058
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8058
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>             Fix For: master (8.0), 7.2
>
>         Attachments: LUCENE-8058.patch, LUCENE-8058.patch
>
>
> I have seen several cases in which the query cache significantly 
> underestimated its memory usage because it held references to large queries 
> that ended up using more memory than the associated doc id sets.
> We had a workaround for term-in-set queries by making TermInSetQuery 
> implement Accountable, but this information is lost when it is wrapped in 
> another query such as a BooleanQuery. So I would like to apply a safer fix 
> that just disables caching on large TermInSetQuery instances.
> I know it's a pity given that large queries are probably more expensive and 
> thus more cache-worthy, but I see such large queries as the result of a bad 
> design or a workaround for the fact that Lucene is not the right tool for the 
> job, so I think that disabling caching on large term-in-set queries is the 
> right trade-off, since it makes the query cache safer for the majority of our 
> users.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

