[
https://issues.apache.org/jira/browse/SOLR-8707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15155241#comment-15155241
]
Hoss Man commented on SOLR-8707:
--------------------------------
bq. For example, in case there are 6 cores and the auto commit time is 60 seconds,
the first core commits without delay, the second core does its first commit after 10
seconds and commits at a 60 second interval afterwards, and so on.
interesting ... a naive effort for individual cores to "space themselves out"
in time could probably be done fairly trivially when initializing the auto
commit timers on core load w/o a lot of continual coordination even if replicas
are added/removed over time:
if ZK mode:
* determine what shard we are
* request a list of all (known) replicas for our shard (even if they aren't
currently active)
* sort list of replicas by name, and locate our position N in the list and the
list size S
* assign "delayUnit = autoCommitTime / S"
* set an initial delay on the auto commit timer thread to "(delayUnit * N) +
rand(0, delayUnit)"
(The small amount of randomness seems like a good idea to me in case some
replica is replaced by a new replica with a different name, causing a different
existing replica (that doesn't yet know about the change to the list of all
replicas) to shift up/down one in the list and think it has the same N as the
new replica)
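The steps above can be sketched roughly as follows. This is only an illustration of the arithmetic, not existing Solr code: the class name, the method name, and the idea of passing the replica names in as a plain list are all assumptions made for the example (in ZK mode the list would come from the cluster state), and replica positions are taken to be 0-based.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of Solr: computes the initial delay for
// the auto commit timer so that replicas of a shard space themselves out.
class StaggeredCommitDelay {

    /**
     * @param replicaNames     all known replicas of our shard, active or not
     *                         (assumed to be fetched elsewhere)
     * @param myName           the name of this replica
     * @param autoCommitTimeMs the configured auto commit interval in ms
     * @param random           source of the small random offset
     * @return initial delay in ms: (delayUnit * N) + rand(0, delayUnit)
     */
    static long initialDelayMs(List<String> replicaNames, String myName,
                               long autoCommitTimeMs, Random random) {
        // Sort a copy by name so every replica derives the same ordering.
        List<String> sorted = new ArrayList<>(replicaNames);
        Collections.sort(sorted);

        int n = sorted.indexOf(myName); // our position N (0-based)
        if (n < 0) {
            throw new IllegalArgumentException("replica not in list: " + myName);
        }
        int s = sorted.size();          // list size S

        long delayUnit = autoCommitTimeMs / s;
        // Random offset in [0, delayUnit) to reduce collisions when two
        // replicas briefly believe they hold the same position.
        long jitter = (long) (random.nextDouble() * delayUnit);
        return delayUnit * n + jitter;
    }
}
```

With 6 replicas and a 60 second auto commit time, delayUnit works out to 10 seconds, so the replica that sorts second (N = 1) would get an initial delay somewhere between 10 and 20 seconds, and would then commit at the normal 60 second interval.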
> Distribute (auto)commit requests evenly over time in multi shard/replica
> collections
> ------------------------------------------------------------------------------------
>
> Key: SOLR-8707
> URL: https://issues.apache.org/jira/browse/SOLR-8707
> Project: Solr
> Issue Type: Improvement
> Components: update
> Reporter: Michael Sun
>
> In the current implementation, all Solr nodes start commits for all cores in a
> collection at almost the same time. As a result, it creates a load spike in
> the cluster at regular intervals, particularly when the collection is on HDFS.
> The main reason is that all cores for a collection are created at almost the
> same time and commit at a fixed interval afterwards.
> It would be good to distribute the commit load evenly to avoid load spikes. It
> helps improve performance and reliability in general.