[
https://issues.apache.org/jira/browse/SOLR-12495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16519668#comment-16519668
]
Jerry Bao edited comment on SOLR-12495 at 6/21/18 6:40 PM:
-----------------------------------------------------------
{quote}well
{code:java}
{"replica": "#MINIMUM", "node": "#ANY"}
{code}
means it is applied on a per collection basis
{quote}
That seems confusing to me; the way I read it is: keep a minimum number of
replicas on every node. Just to clarify, when you say per-collection basis,
do you mean that each collection is balanced on its own? If so, will there be a
way to keep the entire cluster balanced irrespective of collection? Is that
covered by the core preference? My concern is that without a way to keep the
entire cluster balanced irrespective of collection, you'll end up with some
nodes holding one replica of every collection and other nodes holding 0
replicas. For example, if you had three collections with 30 replicas each and
45 nodes, you could end up with 30 nodes that each hold one replica of each
collection and 15 nodes with 0 replicas, which is unbalanced.
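To make the concern concrete, here is a minimal Python sketch of the three-collection example. It is not Solr code: the {{place}} helper and its offset scheme are hypothetical, chosen only to produce two placements that both keep every collection at no more than ceil(30/45) = 1 replica per node.

```python
from collections import Counter
from math import ceil

NODES = 45
COLLECTIONS = 3
REPLICAS_PER_COLLECTION = 30


def place(offset_step):
    """Hypothetical placement: collection c fills consecutive nodes starting at c * offset_step."""
    return {
        c: [(c * offset_step + i) % NODES for i in range(REPLICAS_PER_COLLECTION)]
        for c in range(COLLECTIONS)
    }


def per_collection_ok(placement):
    """True if no node holds more than ceil(replicas / nodes) replicas of any single collection."""
    limit = ceil(REPLICAS_PER_COLLECTION / NODES)
    return all(max(Counter(nodes).values()) <= limit for nodes in placement.values())


def cluster_load(placement):
    """Total replicas per node, counted across all collections."""
    load = [0] * NODES
    for nodes in placement.values():
        for n in nodes:
            load[n] += 1
    return load


stacked = place(0)     # every collection lands on nodes 0..29
staggered = place(30)  # each collection starts 30 nodes after the previous one

for name, placement in (("stacked", stacked), ("staggered", staggered)):
    histogram = dict(Counter(cluster_load(placement)))
    print(name, "per-collection rule ok:", per_collection_ok(placement), "load histogram:", histogram)
```

Both placements pass the per-collection check, but the stacked one leaves 15 nodes empty while the staggered one puts 2 replicas on every node.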
{quote}In reality, it works slightly differently. The value "<3" is not a
constant; it keeps changing as every replica is created. For instance, when
replica #40 is being created, the value is (40/40 = 1), which is like saying
{{replica:"<2"}}, whereas when replica #41 is created, it suddenly becomes
{{"replica" : "<3"}}. So allocations actually happen evenly.
{quote}
I understand that it's not constant, but what I'm saying is that the rule
itself can be satisfied while the cluster is still not balanced. If I have 42
replicas and 40 nodes, I would want 1 replica on every node before any node
gets 2. ceil(42/40) = 2 gives a "<3" rule, which allows 2 replicas on each of
21 nodes and 0 on the other 19; that satisfies the rule but is not balanced.
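The 42-replica case can be checked with a small Python sketch. The helper names ({{rule_limit}}, {{satisfies_rule}}, {{spread}}) are made up for illustration, and the check only models the ceil-based limit, not how Solr actually evaluates policies.

```python
from math import ceil


def rule_limit(total_replicas, nodes):
    """Upper bound on replicas per node implied by ceil(replicas / nodes)."""
    return ceil(total_replicas / nodes)


def satisfies_rule(counts):
    """True if every node is at or below the limit for this cluster state."""
    limit = rule_limit(sum(counts), len(counts))
    return all(c <= limit for c in counts)


def spread(counts):
    """Difference between the most and least loaded node."""
    return max(counts) - min(counts)


# The limit steps up as soon as replicas outnumber nodes.
print(rule_limit(40, 40), rule_limit(41, 40))  # 1 2

skewed = [2] * 21 + [0] * 19  # 42 replicas packed onto 21 of 40 nodes
even = [2] * 2 + [1] * 38     # 42 replicas, every node used

for name, counts in (("skewed", skewed), ("even", even)):
    print(name, "satisfies rule:", satisfies_rule(counts), "spread:", spread(counts))
```

Both layouts satisfy the "<3" limit, so the limit alone cannot tell the skewed layout (spread 2) from the even one (spread 1).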
> Make it possible to evenly distribute replicas
> ----------------------------------------------
>
> Key: SOLR-12495
> URL: https://issues.apache.org/jira/browse/SOLR-12495
> Project: Solr
> Issue Type: Sub-task
> Security Level: Public(Default Security Level. Issues are Public)
> Components: AutoScaling
> Reporter: Noble Paul
> Priority: Major
>
> Support a new function value for {{replica}}: {{"#MINIMUM"}}.
> {{#MINIMUM}} means the minimum value computed for the given configuration.
> The value of replica will be calculated as {{<=
> Math.ceil(number_of_replicas/number_of_valid_nodes)}}
> *example 1:*
> {code:java}
> {"replica" : "#MINIMUM" , "shard" : "#EACH" , "node" : "#ANY"}
> {code}
> *case 1* : nodes=3, replicationFactor=4
> the value of replica will be calculated as {{Math.ceil(4/3) = 2}}
> current state : nodes=3, replicationFactor=2
> this is equivalent to the hard coded rule
> {code:java}
> {"replica" : "<3" , "shard" : "#EACH" , "node" : "#ANY"}
> {code}
> *case 2* :
> current state : nodes=3, replicationFactor=2
> this is equivalent to the hard coded rule
> {code:java}
> {"replica" : "<2" , "shard" : "#EACH" , "node" : "#ANY"}
> {code}
> *example 2:*
> {code}
> {"replica" : "#MINIMUM" , "node" : "#ANY"}{code}
> case 1: numShards = 2, replicationFactor=3, nodes = 5
> this is equivalent to the hard coded rule
> {code:java}
> {"replica" : "<3" , "node" : "#ANY"}
> {code}
> *example 3:*
> {code}
> {"replica" : "#MINIMUM" , "shard" : "#EACH" , "port" : "8983"}{code}
> case 1: {{replicationFactor=3, nodes with port 8983 = 2}}
> this is equivalent to the hard coded rule
> {code}
> {"replica" : "<3" , "shard" : "#EACH" , "port" : "8983"}{code}
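As a rough check of the arithmetic in the description, the following Python sketch applies the stated formula, ceil(number_of_replicas/number_of_valid_nodes), to three of the listed cases. The function name {{equivalent_rule}} is hypothetical, and rendering "<= n" as "<n+1" is an assumption based on how the description writes its hard-coded rules.

```python
from math import ceil


def equivalent_rule(number_of_replicas, number_of_valid_nodes):
    """Render '<= ceil(replicas / nodes)' as the strict '<n' form used in the description."""
    limit = ceil(number_of_replicas / number_of_valid_nodes)
    return f"<{limit + 1}"


# example 1, case 1: per shard, replicationFactor=4 on 3 nodes
print(equivalent_rule(4, 3))
# example 2, case 1: no shard clause, so 2 shards * replicationFactor 3 = 6 replicas on 5 nodes
print(equivalent_rule(2 * 3, 5))
# example 3, case 1: per shard, replicationFactor=3 on the 2 nodes with port 8983
print(equivalent_rule(3, 2))
```

Under these assumptions, each of the three cases works out to "<3".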