[
https://issues.apache.org/jira/browse/SOLR-17947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18035976#comment-18035976
]
Mark Robert Miller commented on SOLR-17947:
-------------------------------------------
Requests that must contact ZooKeeper before proceeding represent a serious
SolrCloud anti-pattern. The deeper problem, however, is that all collections
share only three global locks. Even with a single collection, when the client
is under high load and needs to refresh cluster state (often a refresh the
request does not actually need in order to succeed), those locks can become a
severe bottleneck. While
the state is being fetched, a large number of concurrent requests block on the
same lock, and once the state is retrieved, they all resume simultaneously.
This burst behavior can amplify load dramatically: a steady
80-requests-per-second pattern at the client can suddenly spike to more than
300 requests per second on the server.
> CloudSolrClient async state refresh
> -----------------------------------
>
> Key: SOLR-17947
> URL: https://issues.apache.org/jira/browse/SOLR-17947
> Project: Solr
> Issue Type: Bug
> Affects Versions: 9.9
> Reporter: Mark Robert Miller
> Priority: Major
>
> * replace striped locks with single-flight cache refresh futures to stop
> thundering herd
> * keep stale entries usable while background refresh runs, update retry
> semantics accordingly
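
The two bullets above can be illustrated with a minimal Java sketch. It is
not the actual CloudSolrClient change: the class name SingleFlightStateCache,
the loader function (standing in for a ZooKeeper state fetch), and the
refreshInBackground flag are all hypothetical names chosen for illustration.
The idea shown is that at most one refresh per collection is in flight, every
concurrent caller shares that one future, and a cached (possibly stale) entry
is returned immediately instead of blocking behind the refresh.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.function.Function;

/** Hypothetical single-flight, stale-while-refresh cache; not Solr's code. */
final class SingleFlightStateCache<V> {
    private final Map<String, V> cached = new ConcurrentHashMap<>();
    private final Map<String, CompletableFuture<V>> inFlight = new ConcurrentHashMap<>();
    private final Function<String, V> loader; // assumed: e.g. a ZooKeeper read
    private final Executor executor;          // runs refreshes off the caller thread

    SingleFlightStateCache(Function<String, V> loader, Executor executor) {
        this.loader = loader;
        this.executor = executor;
    }

    /** Starts a refresh unless one is already running; callers share one future. */
    CompletableFuture<V> refresh(String collection) {
        CompletableFuture<V> created = new CompletableFuture<>();
        CompletableFuture<V> existing = inFlight.putIfAbsent(collection, created);
        if (existing != null) {
            return existing; // join the refresh that is already in flight
        }
        try {
            executor.execute(() -> {
                try {
                    V value = loader.apply(collection);
                    cached.put(collection, value);
                    inFlight.remove(collection, created);
                    created.complete(value);
                } catch (Throwable t) {
                    inFlight.remove(collection, created);
                    created.completeExceptionally(t);
                }
            });
        } catch (RuntimeException rejected) {
            // The executor refused the task; do not leave a dead future behind.
            inFlight.remove(collection, created);
            created.completeExceptionally(rejected);
        }
        return created;
    }

    /**
     * Returns the cached value right away when one exists, optionally
     * triggering a background refresh. Blocks only when nothing is cached.
     */
    V get(String collection, boolean refreshInBackground) {
        V stale = cached.get(collection);
        if (stale != null) {
            if (refreshInBackground) {
                refresh(collection);
            }
            return stale;
        }
        return refresh(collection).join();
    }
}
```

With this shape, a burst of requests that all notice outdated state triggers
one fetch rather than queueing on a shared lock, and requests that can succeed
with the older state never wait at all. How retries should treat a stale entry
is a separate policy question that this sketch does not address.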