[
https://issues.apache.org/jira/browse/SOLR-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16966805#comment-16966805
]
Yonik Seeley edited comment on SOLR-13884 at 11/4/19 4:34 PM:
--------------------------------------------------------------
OK, I updated the test to reproduce another serious bug with replica placement
and concurrent collection creation.
When collection-level policies are used, and the cluster is currently
unbalanced, it's relatively easy to get into a situation where multiple
replicas are assigned to the exact same node. In the wild, I've actually seen
all 5 replicas of a single shard be assigned to the same node, and I've been
able to reproduce that with my test case.
The test case is currently set up to reproduce the simplest case I could
manage. We start off with just 2 nodes, create a single replica on one node,
then issue 2 collection create commands concurrently (each with 1 shard and
replicationFactor=2). Pretty much 100% of the time, one of the shards ends up
with both replicas on the same node. This does not happen if the creations are
done serially. It also doesn't happen if an identical cluster-level policy is
specified.
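The failure condition described above (a shard with more than one replica on
the same node) can be checked with a small helper. What follows is only a
minimal sketch, not the actual test attached to the issue: the class
ReplicaPlacementCheck, the Placement type, and the node and shard names are
all hypothetical, and it assumes the placements have already been read out of
the cluster state as (shard, node) pairs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Solr: flags shards whose replicas share a node.
public class ReplicaPlacementCheck {

    /** One replica placement: the shard it belongs to and the node hosting it. */
    public static final class Placement {
        final String shard;
        final String node;

        public Placement(String shard, String node) {
            this.shard = shard;
            this.node = node;
        }
    }

    /** Returns the shards that have two or more replicas on the same node. */
    public static Set<String> shardsWithCoLocatedReplicas(List<Placement> placements) {
        Map<String, Set<String>> nodesByShard = new HashMap<>();
        Set<String> offenders = new TreeSet<>();
        for (Placement p : placements) {
            Set<String> nodes = nodesByShard.computeIfAbsent(p.shard, k -> new HashSet<>());
            // add() returns false when this shard already has a replica on that node
            if (!nodes.add(p.node)) {
                offenders.add(p.shard);
            }
        }
        return offenders;
    }

    public static void main(String[] args) {
        // Illustrative outcome of two concurrent creates on a 2-node cluster,
        // each with 1 shard and replicationFactor=2 (names are made up).
        List<Placement> placements = new ArrayList<>();
        placements.add(new Placement("collA/shard1", "node1"));
        placements.add(new Placement("collA/shard1", "node1")); // both replicas on node1
        placements.add(new Placement("collB/shard1", "node1"));
        placements.add(new Placement("collB/shard1", "node2"));
        System.out.println(shardsWithCoLocatedReplicas(placements));
    }
}
```

A test reproducing this bug would assert that the returned set is empty once
both concurrent create commands have completed.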
> Concurrent collection creation leads to unbalanced cluster
> ----------------------------------------------------------
>
> Key: SOLR-13884
> URL: https://issues.apache.org/jira/browse/SOLR-13884
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Yonik Seeley
> Priority: Major
> Time Spent: 40m
> Remaining Estimate: 0h
>
> When multiple collection creations are done concurrently, the cluster can end
> up very unbalanced, with many (or most) replicas going to a small set of
> nodes.
> This was observed on both 8.2 and master.