[
https://issues.apache.org/jira/browse/SOLR-17150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17815673#comment-17815673
]
Andrzej Bialecki commented on SOLR-17150:
-----------------------------------------
Here's the proposed approach to implement two thresholds:
* an absolute max limit to terminate any query that exceeds this allocation
* a relative dynamic limit to terminate queries that exceed "typical"
allocation
For the absolute limit: as with other implementations, {{memAllowed}} would set
the absolute limit per query (float value in megabytes?). To accommodate initial
queries, which may legitimately allocate many sizeable objects, this should be
set to a relatively high value. That value isn't optimal later for typical
queries: the higher limit will eventually catch runaway queries, but not before
they consume significant memory.
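A minimal sketch of what the absolute check could look like, to make the semantics of {{memAllowed}} concrete. The class name, the {{shouldExit()}} method and the injected allocation counter are hypothetical, not an existing Solr API. The sketch assumes some per-thread cumulative allocation counter is available (on HotSpot JVMs, {{com.sun.management.ThreadMXBean#getThreadAllocatedBytes}} might serve as one) and that {{memAllowed}} is a float value in megabytes:

```java
import java.util.function.LongSupplier;

/** Hypothetical sketch of an absolute per-query memory limit; not Solr's actual API. */
final class AbsoluteMemLimit {
    private final long limitBytes;             // memAllowed converted from megabytes to bytes
    private final LongSupplier allocatedBytes; // assumed cumulative allocation counter for the query thread
    private final long startBytes;             // counter value when the query started

    AbsoluteMemLimit(double memAllowedMb, LongSupplier allocatedBytes) {
        this.limitBytes = (long) (memAllowedMb * 1024 * 1024);
        this.allocatedBytes = allocatedBytes;
        this.startBytes = allocatedBytes.getAsLong();
    }

    /** Returns true once the query has allocated more than memAllowed since it started. */
    boolean shouldExit() {
        return allocatedBytes.getAsLong() - startBytes > limitBytes;
    }
}
```

Injecting the counter as a {{LongSupplier}} keeps the check independent of how allocation is actually measured, which is still undecided here.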
For the dynamic limit: a histogram would be added to the metrics to track the
recent memory usage per query (using an exponentially decaying reservoir). The
life-cycle of the histogram could be tied either to SolrCore or to
SolrIndexSearcher (the latter seems more appropriate because of the warmup
queries that would skew the longer-term stats in SolrCore's life-cycle).
After collecting a sufficient number of data points (e.g. {{N = 100}}) the
component could start enforcing a dynamic limit based on a formula that takes
into account the "typical" recent queries. For example: {{dynamicThreshold =
X * p99}}, where {{X = 2.0}} by default.
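To illustrate the formula, here is a rough sketch of the dynamic threshold. It is not the proposed implementation: instead of a metrics histogram backed by an exponentially decaying reservoir it uses a plain sliding window of the most recent samples, so it only approximates the time-weighting. The class name, method names and the {{windowSize}} parameter are hypothetical; {{minSamples}} and {{factor}} stand for N and X above.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch: dynamicThreshold = X * p99 over a sliding window of recent samples. */
final class DynamicMemThreshold {
    private final int minSamples;  // N: samples required before the limit is enforced
    private final double factor;   // X: multiplier applied to p99
    private final int windowSize;  // assumed bound on how many recent samples are kept
    private final Deque<Long> recent = new ArrayDeque<>();

    DynamicMemThreshold(int minSamples, double factor, int windowSize) {
        this.minSamples = minSamples;
        this.factor = factor;
        this.windowSize = windowSize;
    }

    /** Records the memory (in bytes) used by one completed query, evicting the oldest sample if full. */
    synchronized void record(long bytes) {
        if (recent.size() == windowSize) {
            recent.removeFirst();
        }
        recent.addLast(bytes);
    }

    /** Returns X * p99, or Long.MAX_VALUE (no enforcement) until N samples were collected. */
    synchronized long threshold() {
        int n = recent.size();
        if (n < minSamples) {
            return Long.MAX_VALUE;
        }
        long[] sorted = recent.stream().mapToLong(Long::longValue).sorted().toArray();
        int idx = (99 * n + 99) / 100 - 1; // nearest-rank p99: ceil(0.99 * n) - 1
        return (long) (factor * sorted[idx]);
    }

    /** True if a query using this many bytes exceeds the current dynamic threshold. */
    boolean exceeds(long bytes) {
        return bytes > threshold();
    }
}
```

With {{N = 100}} and {{X = 2.0}}, nothing is enforced for the first 99 queries; after that a query is flagged once it uses more than twice the 99th percentile of the retained samples.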
Open issues:
* does the dynamic threshold make sense? does the formula make sense?
* I think that both the static and dynamic limits should be optional, i.e. some
combination of query params should allow the user to skip the enforcement of
either or both.
* since the dynamic limit involves parameters (at least N and X above) that
determine long-term tracking, it can no longer be expressed just as short-lived
query params; it needs a configuration with a life-cycle of SolrCore or longer.
Where should we put this configuration?
> Create MemQueryLimit implementation
> -----------------------------------
>
> Key: SOLR-17150
> URL: https://issues.apache.org/jira/browse/SOLR-17150
> Project: Solr
> Issue Type: Sub-task
> Security Level: Public(Default Security Level. Issues are Public)
> Components: Query Limits
> Reporter: Andrzej Bialecki
> Assignee: Andrzej Bialecki
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> An implementation of {{QueryTimeout}} that terminates misbehaving queries
> that allocate too much memory for their execution.
> This is a bit more complicated than {{CpuQueryLimits}} because the first time
> a query is submitted it may legitimately allocate many sizeable objects
> (caches, field values, etc). So we want to catch and terminate queries that
> either exceed any reasonable threshold (eg. 2GB), or significantly exceed a
> time-weighted percentile of the recent queries.