[ https://issues.apache.org/jira/browse/SOLR-12730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16736131#comment-16736131 ]
ASF subversion and git services commented on SOLR-12730:
--------------------------------------------------------
Commit 6e745bd25007511266741b516cffdba757fa22a3 in lucene-solr's branch
refs/heads/master from Andrzej Bialecki
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=6e745bd ]
SOLR-12730: Implement staggered SPLITSHARD requests in IndexSizeTrigger.
> Implement staggered SPLITSHARD requests in IndexSizeTrigger
> -----------------------------------------------------------
>
> Key: SOLR-12730
> URL: https://issues.apache.org/jira/browse/SOLR-12730
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Components: AutoScaling
> Reporter: Andrzej Bialecki
> Assignee: Andrzej Bialecki
> Priority: Major
> Fix For: master (9.0)
>
>
> Simulated large-scale tests uncovered a scenario that also occurs in real
> clusters where {{IndexSizeTrigger}} is used to control the maximum shard
> size.
> As the index and the number of shards grow, if document assignment is more
> or less even, then at equal intervals (on a {{log2}} scale) there will be
> an avalanche of SPLITSHARD operations, because all shards will reach the
> critical size at approximately the same time.
> A hundred or more split shard operations running in parallel may severely
> affect the cluster performance.
> One possible way to reduce the likelihood of this situation is to split
> shards not exactly in half but in randomly varied proportions of around
> 60/40%, so that the resulting sub-sub-sub…shards would reach the
> thresholds at different times. This would require modifications to the
> SPLITSHARD command to allow this randomization.
> Another approach would be to simply limit the maximum number of parallel
> split shard operations. However, this would slow down the process of reaching
> the balance (increase lag) and possibly violate other operational constraints
> due to some shards waiting too long for the split and significantly exceeding
> their max size.
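
The avalanche effect described above, and the effect of uneven splits, can be illustrated with a toy model. This is only a sketch under stated assumptions and is not Solr code: each shard is assumed to own a fixed share of the hash range, its size is assumed to be that share times the total index size, and a split is triggered as soon as the size exceeds a threshold. The function name `simulate_splits`, its parameters, and the uniform 40-60% split range are all hypothetical choices made for illustration.

```python
import random


def simulate_splits(pick_fraction, max_shard_size=100.0,
                    total_growth_per_step=10.0, steps=2000):
    """Toy model (not Solr code): count split events triggered at each step.

    pick_fraction is a hypothetical callback returning the fraction of a
    shard's hash range that goes to the first sub-shard (0.5 = exact halves).
    """
    shares = [1.0]  # share of the hash range owned by each shard
    total = 0.0     # total index size across all shards
    splits_per_step = []
    for _ in range(steps):
        total += total_growth_per_step
        next_shares = []
        splits = 0
        for share in shares:
            # Even document assignment: shard size is share * total.
            if share * total > max_shard_size:
                f = pick_fraction()
                next_shares.extend([share * f, share * (1.0 - f)])
                splits += 1
            else:
                next_shares.append(share)
        shares = next_shares
        splits_per_step.append(splits)
    return splits_per_step


# Exact halves: every shard of a generation crosses the threshold together.
even = simulate_splits(lambda: 0.5)

# Assumed randomization: first sub-shard gets 40-60% of the range.
rng = random.Random(42)
fudged = simulate_splits(lambda: rng.uniform(0.4, 0.6))

print("max simultaneous splits, even halves:  ", max(even))
print("max simultaneous splits, fudged splits:", max(fudged))
```

In this model the even-halves run triggers its splits in bursts that double in size each time the total index size doubles, while the randomized run spreads them over many steps. The model ignores the cost and duration of a split, so it shows only when splits are requested, not how the cluster behaves while they run.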