[ https://issues.apache.org/jira/browse/SOLR-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17434475#comment-17434475 ]
Cassandra Targett commented on SOLR-15714:
------------------------------------------
This is a very old section of the Ref Guide. This (also old) blog post[1]
phrases it similarly but slightly differently, and both the blog post and this
section probably copied from what seems to be a now-defunct wiki page:
bq. The Solr Wiki recommends that you set the size of this cache to at least
<max_results> * <max_concurrent_queries>, to ensure that Solr does not need to
re-fetch a document during a request.
What this means is that the size should be _at least_ equal to *your* maximum
results per query times *your* maximum number of concurrent queries. So if you
peak at 10 concurrent queries and each query returns 20 results, you'd want the
size of this cache to be _at least_ 200. You wouldn't want to use averages
here, because performance would then degrade under higher loads.
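As a rough illustration of that arithmetic, here is a minimal sketch in Python. The function and parameter names are hypothetical and chosen only for this example; they are not Solr settings or metrics, and you would have to estimate both inputs from your own traffic.

```python
def min_document_cache_size(peak_concurrent_queries: int,
                            max_results_per_query: int) -> int:
    """Lower bound for the documentCache size under the rule of thumb above.

    Both arguments are values you estimate yourself (hypothetical names,
    not Solr parameters): the highest number of queries in flight at once
    and the largest number of results a single query returns.
    """
    if peak_concurrent_queries < 0 or max_results_per_query < 0:
        raise ValueError("both inputs must be non-negative")
    return peak_concurrent_queries * max_results_per_query


# The example from this comment: 10 concurrent queries, 20 results each.
print(min_document_cache_size(10, 20))  # 200
```

The result is only a floor for the size attribute of the documentCache entry in solrconfig.xml, not a recommended value.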
The section is phrased as if these were parameters or values you should somehow
have readily available, and it formats them that way too, when neither is
anything like a metric. So there is improvement to be done here; thanks for
pointing it out.
[1] https://lucidworks.com/post/scaling-lucene-and-solr/
> guide bug in documentCache recommendations
> ------------------------------------------
>
> Key: SOLR-15714
> URL: https://issues.apache.org/jira/browse/SOLR-15714
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: documentation
> Reporter: Matthew Sporleder
> Priority: Minor
>
> [https://solr.apache.org/guide/8_8/query-settings-in-solrconfig.html#documentcache]
>
> Advice given for sizing the documentCache is "The size for the documentCache
> should always be greater than max_results times the max_concurrent_queries"
> Neither max_results nor max_concurrent_queries are solr keywords or settings.
> They are not defined but they are highlighted as if they are keywords.
> Is max_results rows= parameter or total matched documents?
> Is max_concurrent_queries http threads who might do a /select or something
> else?
> Furthermore the advice to use the *max* of things for sizing a cache seems
> pretty aggressive when *average* might be better, although it's hard to know
> because the current *max* don't mean anything :)
--
This message was sent by Atlassian Jira
(v8.3.4#803005)