[ 
https://issues.apache.org/jira/browse/SOLR-6220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-6220:
-----------------------------

    Summary: Replica placement startegy for solrcloud  (was: Replica placement 
startegy dor solrcloud)

> Replica placement startegy for solrcloud
> ----------------------------------------
>
>                 Key: SOLR-6220
>                 URL: https://issues.apache.org/jira/browse/SOLR-6220
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>
> h1.Objective
> Most cloud-based systems let users specify rules for how the replicas/nodes 
> of a cluster are allocated. Solr should have a flexible mechanism for 
> controlling the allocation of replicas, and for changing it later to suit 
> the needs of the system.
> All configuration is on a per-collection basis. The rules are applied 
> whenever a replica is created in any of the shards of a given collection 
> during
>  * collection creation
>  * shard splitting
>  * add replica
>  * create shard
> There are two aspects to how replicas are placed: snitch and placement. 
> h2.snitch 
> How to identify the tags of nodes. Snitches are configured through the 
> collection create command using the snitch prefix, e.g. 
> snitch.type=EC2Snitch.
> The system provides the following implicit tag names, which cannot be used 
> by other snitches:
>  * node : The solr nodename
>  * host : The hostname
>  * ip : The ip address of the host
>  * cores : A dynamic variable that gives the core count at any given point
>  * disk : A dynamic variable that gives the available disk space at any 
> given point
> The system will provide a few snitches, such as:
> h3.EC2Snitch
> Provides two tags, dc and rack, derived from the region and zone values in EC2
> h3.IPSnitch 
> Uses the IP address to infer the “dc” and “rack” values
> h3.NodePropertySnitch 
> This lets users pass system properties to each node to define tag names and 
> values.
> Example: -Dsolrcloud.snitch.vals=tag-x:val-a,tag-y:val-b. This means this 
> particular node will have two tags, “tag-x” and “tag-y”.
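A minimal sketch of how a value in the format above could be turned into a tag map. The function name `parse_snitch_vals` is hypothetical and not part of the proposal; the sketch assumes that pairs are separated by "," and that the first ":" in a pair separates the tag name from its value.

```python
def parse_snitch_vals(raw: str) -> dict[str, str]:
    """Parse 'tag-x:val-a,tag-y:val-b' into {'tag-x': 'val-a', 'tag-y': 'val-b'}."""
    tags = {}
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate empty segments such as a trailing comma
        # Split on the first ':' only (assumption: tag names contain no ':')
        name, sep, value = pair.partition(":")
        if not sep or not name.strip():
            raise ValueError(f"malformed tag pair: {pair!r}")
        tags[name.strip()] = value.strip()
    return tags
```

For the example property above, `parse_snitch_vals("tag-x:val-a,tag-y:val-b")` yields the two tags “tag-x” and “tag-y” with their values.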
>  
> h3.RestSnitch 
> Lets the user configure a URL that the server can invoke to get all the 
> tags for a given node. 
> This takes extra parameters in the create command.
> Example:  
> {{snitch.type=RestSnitch&snitch.url=http://snitchserverhost:port?nodename={}}}
> The response of the REST call   
> {{http://snitchserverhost:port/?nodename=192.168.1:8080_solr}}
> must be in either JSON format or properties format, 
> e.g.: 
> {code:JavaScript}
> {
> "tag-x":"x-val",
> "tag-y":"y-val"
> }
> {code}
> or
> {noformat}
> tag-x=x-val
> tag-y=y-val
> {noformat}
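A rough sketch of how a client could accept either response format. The function name `parse_snitch_response` is hypothetical; the sketch assumes a body starting with "{" is JSON and anything else is one key=value pair per line, and it does not handle the escaping rules of Java properties files.

```python
import json


def parse_snitch_response(body: str) -> dict[str, str]:
    """Return the tag map from a JSON or properties-style snitch response."""
    text = body.strip()
    if text.startswith("{"):
        # JSON object: keep every key/value as a string tag
        return {str(key): str(value) for key, value in json.loads(text).items()}
    tags = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"malformed properties line: {line!r}")
        tags[key.strip()] = value.strip()
    return tags
```

Both sample responses above would parse to the same map with the tags “tag-x” and “tag-y”.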
> h3.ManagedSnitch
> This snitch keeps a list of nodes and their tag-value pairs in ZooKeeper. The 
> user should be able to manage the tags and values of each node through a 
> collection API.
> h2.Placement 
> This specifies how many replicas of a given shard need to be assigned to 
> nodes with the given key-value pairs. These rules are passed to the 
> collection CREATE API as the "placement" parameter. The values will be saved 
> in the state of the collection as follows:
> {code:Javascript}
> {
>  "mycollection":{
>   "snitch": {
>       "type":"EC2Snitch"
>     },
>   "placement":{
>    "key1": "value1",
>    "key2": "value2"
>    }
>  }
> }
> {code}
> A rule consists of 2 parts:
>  * LHS or the qualifier : The format is \{shardname}.\{replicacount} . Use 
> the wildcard “*” to match all. Use the \(!) operator for exclusion
>  * RHS or conditions : The format is \{tagname}\{operator}\{value} . The tag 
> names and values are provided by the snitch. The supported operators are
>  ** -> : equals
>  ** > : greater than. Only applicable to numeric tags
>  ** < : less than. Only applicable to numeric tags
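The condition format can be illustrated with a small sketch that checks a node's tags against the RHS of a rule. All names here (`parse_conditions`, `node_matches`) are hypothetical. The sketch assumes that several conditions are joined with "&", as in the example rules of the description, and that a numeric value may carry a unit suffix such as "gb" which is simply ignored when comparing.

```python
import re

# Tag name, then one of the three operators (->, >, <), then the value
_CONDITION = re.compile(r"^(.+?)(->|>|<)(.+)$")


def _to_number(text) -> float:
    """Read the leading number of a value such as '5' or '20gb' (unit ignored)."""
    match = re.match(r"\s*(\d+(?:\.\d+)?)", str(text))
    if match is None:
        raise ValueError(f"not a numeric value: {text!r}")
    return float(match.group(1))


def parse_conditions(rhs: str) -> list[tuple[str, str, str]]:
    """Split 'dc->dc1&rack->168' into (tag, operator, value) triples."""
    conditions = []
    for part in rhs.split("&"):
        match = _CONDITION.match(part.strip())
        if match is None:
            raise ValueError(f"malformed condition: {part!r}")
        tag, op, value = (group.strip() for group in match.groups())
        conditions.append((tag, op, value))
    return conditions


def node_matches(node_tags: dict, rhs: str) -> bool:
    """Return True when the node's tags satisfy every condition in rhs."""
    for tag, op, value in parse_conditions(rhs):
        if tag not in node_tags:
            return False  # a node without the tag cannot satisfy the condition
        actual = node_tags[tag]
        if op == "->":
            ok = str(actual) == value
        elif op == ">":
            ok = _to_number(actual) > _to_number(value)
        else:  # op == "<"
            ok = _to_number(actual) < _to_number(value)
        if not ok:
            return False
    return True
```

For instance, a node tagged `{"dc": "dc1", "rack": "168"}` would satisfy `dc->dc1&rack->168`, while a node tagged `{"dc": "dc2", "rack": "168"}` would not.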
> Each collection can have any number of rules, as long as the rules do not 
> conflict with each other. If they conflict, an error is thrown.
> Example rules:
>  * “shard1.1”:“dc->dc1&rack->168” : Assigns exactly 1 replica of shard1 to 
> nodes having the tags “dc=dc1,rack=168”
>  * “shard1.1+”:“dc->dc1&rack->168” : Same as above, but assigns at least one 
> replica to the tag-value combination
>  * “*.1”:“dc->dc1” : For all shards, keep exactly one replica in dc:dc1
>  * “*.1+”:“dc->dc2” : At least one replica needs to be in dc:dc2
>  * “*.2-”:“dc->dc3” : Keep a maximum of 2 replicas in dc:dc3 for all shards
>  * “shard1.*”:“rack->730” : All replicas of shard1 will go to rack 730
>  * “shard1.1”:“node->192.167.1.2:8983_solr” : 1 replica of shard1 must go to 
> the node 192.167.1.2:8983_solr
>  * “!shard1.*”:“rack->738” : No replica of shard1 should go to rack 738
>  * “!shard1.*”:“host->192.168.89.91” : No replica of shard1 should go to 
> host 192.168.89.91
>  * “*.*”:“cores<5” : All replicas should be created on nodes with fewer than 
> 5 cores
>  * “*.*”:“disk>20gb” : All replicas must be created on nodes with more than 
> 20gb of available disk space
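The qualifier side of these examples could be parsed along the following lines. This is only a sketch: the `Qualifier` class and `parse_qualifier` function are hypothetical names, and the reading of the "+" suffix as "at least" and the "-" suffix as "at most" is inferred from the example rules rather than from a formal grammar.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Optional '!', shard name (or '*'), '.', then '*' or a count with optional +/-
_QUALIFIER = re.compile(r"^(!)?([^.!]+)\.(?:(\*)|(\d+)([+-])?)$")


@dataclass(frozen=True)
class Qualifier:
    shard: str            # shard name, or "*" for all shards
    count: Optional[int]  # replica count, or None when the count is "*"
    bound: str            # "exact", "min" (suffix +), "max" (suffix -) or "all"
    exclude: bool         # True when the qualifier starts with "!"


def parse_qualifier(lhs: str) -> Qualifier:
    """Parse qualifiers such as '*.1+', 'shard1.*' or '!shard1.*'."""
    match = _QUALIFIER.match(lhs.strip())
    if match is None:
        raise ValueError(f"malformed qualifier: {lhs!r}")
    bang, shard, star, count, suffix = match.groups()
    if star is not None:
        return Qualifier(shard, None, "all", bang is not None)
    bound = {"+": "min", "-": "max", None: "exact"}[suffix]
    return Qualifier(shard, int(count), bound, bang is not None)
```

With this reading, `parse_qualifier("*.2-")` describes "at most 2 replicas of every shard", and `parse_qualifier("!shard1.*")` describes an exclusion that covers all replicas of shard1.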
> In the collection create API, all the placement rules are provided in a 
> parameter called placement, and multiple rules are separated with "|".
> Example:
> {noformat}
> snitch.type=EC2Snitch&placement=*.1:dc->dc1|*.2-:dc->dc3|!shard1.*:rack->738
> {noformat}
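As a sketch of how the placement parameter might be split back into individual rules: the function name `parse_placement` is hypothetical. It assumes rules are separated by "|" and that the first ":" in a rule separates the qualifier from the conditions, so that a value such as a node name containing ":" stays intact.

```python
def parse_placement(param: str) -> list[tuple[str, str]]:
    """Split a placement value into (qualifier, conditions) pairs."""
    rules = []
    for rule in param.split("|"):
        rule = rule.strip()
        if not rule:
            continue  # ignore empty segments
        # Only the first ':' is the separator; later ones belong to the value
        qualifier, sep, conditions = rule.partition(":")
        if not sep or not qualifier.strip() or not conditions.strip():
            raise ValueError(f"malformed rule: {rule!r}")
        rules.append((qualifier.strip(), conditions.strip()))
    return rules
```

Splitting on the first ":" only is a design choice made for this sketch; it works because the qualifier format in the description uses "." between the shard name and the replica count.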



--
This message was sent by Atlassian JIRA
(v6.2#6252)
