[jira] [Commented] (SOLR-12993) Split the state.json into 2. a small frequently modified data + a large unmodified data
[ https://issues.apache.org/jira/browse/SOLR-12993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16819976#comment-16819976 ]

mosh commented on SOLR-12993:
-----------------------------

{quote}or alternately we can just add this data (status, leader) to the LIR term files. That way, we don't need to create any new files{quote}

ZkShardTerms (the class that generates the LIR files) resides in solr-core, while ZkStateReader is in solrJ. If the status and leader data moves out of state.json and into the LIR term files, solrJ would have no way to find out which replica is the leader, because that information would reside only inside the LIR term files.

I propose two possible courses of action:
# Move ZkShardTerms to solrJ, and combine LIR terms
# Create new files as proposed by [~noble.paul], which will contain a small subset of the split information.

[~noble.paul], [~gus_heck], WDYT?

> Split the state.json into 2. a small frequently modified data + a large unmodified data
> ---
>
>                 Key: SOLR-12993
>                 URL: https://issues.apache.org/jira/browse/SOLR-12993
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public (Default Security Level. Issues are Public)
>            Reporter: Noble Paul
>            Priority: Major
>
> This is just a proposal to minimize the ZK load and improve scalability of very large clusters.
> Every time a small state change occurs for a collection/replica, the following file needs to be updated + read * n times (where n = no. of replicas for this collection). The proposal is to split the main file into 2.
> {code}
> {"gettingstarted":{
>     "pullReplicas":"0",
>     "replicationFactor":"2",
>     "router":{"name":"compositeId"},
>     "maxShardsPerNode":"-1",
>     "autoAddReplicas":"false",
>     "nrtReplicas":"2",
>     "tlogReplicas":"0",
>     "shards":{
>       "shard1":{
>         "range":"8000-",
>         "replicas":{
>           "core_node3":{
>             "core":"gettingstarted_shard1_replica_n1",
>             "base_url":"http://10.0.0.80:8983/solr",
>             "node_name":"10.0.0.80:8983_solr",
>             "state":"active",
>             "type":"NRT",
>             "force_set_state":"false",
>             "leader":"true"},
>           "core_node5":{
>             "core":"gettingstarted_shard1_replica_n2",
>             "base_url":"http://10.0.0.80:7574/solr",
>             "node_name":"10.0.0.80:7574_solr",
>             "type":"NRT",
>             "force_set_state":"false"}}},
>       "shard2":{
>         "range":"0-7fff",
>         "state":"active",
>         "replicas":{
>           "core_node7":{
>             "core":"gettingstarted_shard2_replica_n4",
>             "base_url":"http://10.0.0.80:7574/solr",
>             "node_name":"10.0.0.80:7574_solr",
>             "type":"NRT",
>             "force_set_state":"false"},
>           "core_node8":{
>             "core":"gettingstarted_shard2_replica_n6",
>             "base_url":"http://10.0.0.80:8983/solr",
>             "node_name":"10.0.0.80:8983_solr",
>             "type":"NRT",
>             "force_set_state":"false",
>             "leader":"true"}}}}}}
> {code}
> The second is another file, {{status.json}}, which is small and frequently updated.
> {code}
> {
>   "shard1": {
>     "state": "ACTIVE",
>     "core_node3": {"state": "active", "leader" : true},
>     "core_node5": {"state": "active"}
>   },
>   "shard2": {
>     "state": "active",
>     "core_node7": {"state": "active"},
>     "core_node8": {"state": "active", "leader" : true}}
> }
> {code}
> Here the size of the file is roughly one tenth of the other file. This leads to a dramatic reduction in the amount of data written/read to/from ZK.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
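The concern about locating the leader after a split can be illustrated with a small sketch. It assumes the reader has access to both documents and overlays the small one onto the large one. The trimmed JSON below and the helper names `merge_view` and `find_leader` are illustrative only and are not part of Solr or solrJ.

```python
import json

# Trimmed, hypothetical copies of the two documents from the issue description.
STATE_JSON = json.loads("""
{"gettingstarted": {"shards": {
  "shard1": {"replicas": {
    "core_node3": {"core": "gettingstarted_shard1_replica_n1", "type": "NRT"},
    "core_node5": {"core": "gettingstarted_shard1_replica_n2", "type": "NRT"}}},
  "shard2": {"replicas": {
    "core_node7": {"core": "gettingstarted_shard2_replica_n4", "type": "NRT"},
    "core_node8": {"core": "gettingstarted_shard2_replica_n6", "type": "NRT"}}}}}}
""")

STATUS_JSON = json.loads("""
{"shard1": {"state": "active",
            "core_node3": {"state": "active", "leader": true},
            "core_node5": {"state": "active"}},
 "shard2": {"state": "active",
            "core_node7": {"state": "active"},
            "core_node8": {"state": "active", "leader": true}}}
""")


def merge_view(state, status, collection):
    """Overlay the small status document onto the large static document."""
    merged = {}
    for shard, shard_data in state[collection]["shards"].items():
        shard_status = status.get(shard, {})
        replicas = {}
        for name, props in shard_data["replicas"].items():
            # Static properties first, then the frequently changing ones.
            replicas[name] = {**props, **shard_status.get(name, {})}
        merged[shard] = {"state": shard_status.get("state"), "replicas": replicas}
    return merged


def find_leader(view, shard):
    """Return the name of the replica flagged as leader, or None."""
    for name, props in view[shard]["replicas"].items():
        if props.get("leader"):
            return name
    return None


view = merge_view(STATE_JSON, STATUS_JSON, "gettingstarted")
print(find_leader(view, "shard1"))
print(find_leader(view, "shard2"))
```

The point of the sketch is only that whichever module reads the static file also needs to be able to read the volatile one, which is what the two options above are trying to guarantee.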
[jira] [Commented] (SOLR-12993) Split the state.json into 2. a small frequently modified data + a large unmodified data
[ https://issues.apache.org/jira/browse/SOLR-12993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785978#comment-16785978 ]

Noble Paul commented on SOLR-12993:
-----------------------------------

bq. Error prone how? Apparently the cost of watching got better recently, though not entirely clear how much better:

Error prone on the Solr side. If Solr is watching 1's of nodes in ZK and some data is not in sync, it's hard to debug.

https://issues.apache.org/jira/browse/ZOOKEEPER-1416 can be a huge improvement, however.
[jira] [Commented] (SOLR-12993) Split the state.json into 2. a small frequently modified data + a large unmodified data
[ https://issues.apache.org/jira/browse/SOLR-12993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785641#comment-16785641 ]

Gus Heck commented on SOLR-12993:
---------------------------------

Error prone how? Apparently the cost of watching got better recently, though it is not entirely clear how much better: https://github.com/apache/zookeeper/pull/590

Comments on that ZK PR mention that Accumulo uses a lot of watches and that the PR will be a big win for them, but I know nothing about Accumulo or their experiences with that.
[jira] [Commented] (SOLR-12993) Split the state.json into 2. a small frequently modified data + a large unmodified data
[ https://issues.apache.org/jira/browse/SOLR-12993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785648#comment-16785648 ]

Gus Heck commented on SOLR-12993:
---------------------------------

Also interesting, but as yet unresolved: https://issues.apache.org/jira/browse/ZOOKEEPER-1416
[jira] [Commented] (SOLR-12993) Split the state.json into 2. a small frequently modified data + a large unmodified data
[ https://issues.apache.org/jira/browse/SOLR-12993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784973#comment-16784973 ]

Noble Paul commented on SOLR-12993:
-----------------------------------

We do that because of the explosion in the number of watchers. A large number of watchers is error prone: we don't know whether a particular node is out of sync, or what the cost of watching so many nodes is.
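To put rough numbers on the watcher explosion, here is a back-of-the-envelope sketch. It assumes, as a simplification, that every Solr node watches every znode of every collection; the cluster sizes are made up for illustration and `total_watches` is a hypothetical helper.

```python
def total_watches(watching_nodes: int, znodes_per_collection: int, collections: int) -> int:
    """Watches registered cluster-wide if every node watches every znode."""
    return watching_nodes * znodes_per_collection * collections


# Hypothetical cluster: 100 nodes, 50 collections, 200 replicas per collection.
blob_layout = total_watches(watching_nodes=100, znodes_per_collection=1, collections=50)
per_replica_layout = total_watches(watching_nodes=100, znodes_per_collection=200, collections=50)

print(blob_layout)
print(per_replica_layout)
```

Under these assumptions, one znode per replica multiplies the watch count by the number of replicas per collection, which is the growth being warned about.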
[jira] [Commented] (SOLR-12993) Split the state.json into 2. a small frequently modified data + a large unmodified data
[ https://issues.apache.org/jira/browse/SOLR-12993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784877#comment-16784877 ]

Gus Heck commented on SOLR-12993:
---------------------------------

In general I wonder about our tendency to store configuration as blobs of JSON in ZooKeeper, which is inherently tree-based itself (and more flexible and more powerful than JSON). We have lots of code that packs and unpacks these JSON files and then effectively accesses a path within the JSON, whereas we could just access a path in ZooKeeper directly and not bother with any JSON parsing/encoding. And if we wanted to watch something within the structure, it wouldn't have to be conflated with other code that wanted to watch something else.
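The idea of addressing values by ZooKeeper path instead of by a path inside a JSON blob can be sketched by flattening a status document into one path per leaf. The path prefix `/collections/gettingstarted/status` and the helper `flatten` are assumptions made for illustration; no ZooKeeper client is involved.

```python
def flatten(node, prefix=""):
    """Yield (path, value) pairs, one per leaf, as if each leaf were its own znode."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from flatten(child, f"{prefix}/{key}")
    else:
        yield prefix, node


status = {
    "shard1": {
        "state": "active",
        "core_node3": {"state": "active", "leader": True},
    }
}

# Hypothetical layout: every leaf of the JSON becomes a separately watchable path.
paths = dict(flatten(status, "/collections/gettingstarted/status"))
for path, value in sorted(paths.items()):
    print(path, value)
```

With such a layout a client interested only in `shard1/core_node3/leader` could watch that one path, at the price of many more znodes and watches.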
[jira] [Commented] (SOLR-12993) Split the state.json into 2. a small frequently modified data + a large unmodified data
[ https://issues.apache.org/jira/browse/SOLR-12993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784790#comment-16784790 ]

Noble Paul commented on SOLR-12993:
-----------------------------------

[~moshebla] we will need to change existing code. But as [~ab] mentioned, the changes are internal and not visible to APIs.
[jira] [Commented] (SOLR-12993) Split the state.json into 2. a small frequently modified data + a large unmodified data
[ https://issues.apache.org/jira/browse/SOLR-12993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783664#comment-16783664 ]

Andrzej Bialecki commented on SOLR-12993:
-----------------------------------------

bq. Would this be worth the overhead of changing existing code?

Very little code interacts directly with these files; in most places this status is accessed via ClusterState.
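Why an access layer keeps the change internal can be shown with a small facade sketch. All class names here (`SingleFileStore`, `SplitFileStore`, `ClusterStateView`) are hypothetical stand-ins and do not mirror Solr's actual ClusterState API; the data is trimmed from the issue description.

```python
class SingleFileStore:
    """Reads every replica property from one state.json-like dict."""

    def __init__(self, state):
        self._state = state

    def replica_props(self, shard, replica):
        return dict(self._state["shards"][shard]["replicas"][replica])


class SplitFileStore:
    """Reads static properties from one dict and volatile ones from another."""

    def __init__(self, state, status):
        self._state = state
        self._status = status

    def replica_props(self, shard, replica):
        props = dict(self._state["shards"][shard]["replicas"][replica])
        props.update(self._status.get(shard, {}).get(replica, {}))
        return props


class ClusterStateView:
    """Hypothetical facade: callers ask questions and never see the file layout."""

    def __init__(self, store):
        self._store = store

    def is_leader(self, shard, replica):
        flag = self._store.replica_props(shard, replica).get("leader", False)
        # state.json stores "true" as a string, status.json as a JSON boolean.
        return str(flag).lower() == "true"


old_layout = {"shards": {"shard1": {"replicas": {
    "core_node3": {"core": "gettingstarted_shard1_replica_n1", "leader": "true"},
    "core_node5": {"core": "gettingstarted_shard1_replica_n2"}}}}}
new_static = {"shards": {"shard1": {"replicas": {
    "core_node3": {"core": "gettingstarted_shard1_replica_n1"},
    "core_node5": {"core": "gettingstarted_shard1_replica_n2"}}}}}
new_status = {"shard1": {"core_node3": {"state": "active", "leader": True},
                         "core_node5": {"state": "active"}}}

before = ClusterStateView(SingleFileStore(old_layout))
after = ClusterStateView(SplitFileStore(new_static, new_status))
print(before.is_leader("shard1", "core_node3"), after.is_leader("shard1", "core_node3"))
```

Callers of the facade get the same answers from either layout, so only the store behind it would need to change.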
[jira] [Commented] (SOLR-12993) Split the state.json into 2. a small frequently modified data + a large unmodified data
[ https://issues.apache.org/jira/browse/SOLR-12993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783361#comment-16783361 ]

mosh commented on SOLR-12993:
-----------------------------

{quote}or alternately we can just add this data (status, leader) to the LIR term files. That way, we don't need to create any new files{quote}

Would this be worth the overhead of changing existing code?
[jira] [Commented] (SOLR-12993) Split the state.json into 2. a small frequently modified data + a large unmodified data
[ https://issues.apache.org/jira/browse/SOLR-12993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783039#comment-16783039 ]

mosh commented on SOLR-12993:
-----------------------------

{quote}LIR term files{quote}

Hey, I am not aware of what LIR is. Would this refer to the /terms/shardX files under ZooKeeper?
[jira] [Commented] (SOLR-12993) Split the state.json into 2. a small frequently modified data + a large unmodified data
[ https://issues.apache.org/jira/browse/SOLR-12993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692886#comment-16692886 ]

Noble Paul commented on SOLR-12993:
-----------------------------------

Yeah, the example had it wrong. I just corrected it.

bq. I would think the replica / shard "state", LIR term and "leader" flags would be good candidates

Yes, both are suitable candidates.

> Split the state.json into 2. a small frequently modified data + a large unmodified data
> ---
>
>                 Key: SOLR-12993
>                 URL: https://issues.apache.org/jira/browse/SOLR-12993
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public (Default Security Level. Issues are Public)
>            Reporter: Noble Paul
>            Priority: Major
>
> This is just a proposal to minimize the ZK load and improve scalability of very large clusters.
> Every time a small state change occurs for a collection/replica, the following file needs to be updated + read * n times (where n = no. of replicas for this collection). The proposal is to split the main file into 2.
> {code}
> {"gettingstarted":{
>     "pullReplicas":"0",
>     "replicationFactor":"2",
>     "router":{"name":"compositeId"},
>     "maxShardsPerNode":"-1",
>     "autoAddReplicas":"false",
>     "nrtReplicas":"2",
>     "tlogReplicas":"0",
>     "shards":{
>       "shard1":{
>         "range":"8000-",
>         "replicas":{
>           "core_node3":{
>             "core":"gettingstarted_shard1_replica_n1",
>             "base_url":"http://10.0.0.80:8983/solr",
>             "node_name":"10.0.0.80:8983_solr",
>             "state":"active",
>             "type":"NRT",
>             "force_set_state":"false",
>             "leader":"true"},
>           "core_node5":{
>             "core":"gettingstarted_shard1_replica_n2",
>             "base_url":"http://10.0.0.80:7574/solr",
>             "node_name":"10.0.0.80:7574_solr",
>             "type":"NRT",
>             "force_set_state":"false"}}},
>       "shard2":{
>         "range":"0-7fff",
>         "state":"active",
>         "replicas":{
>           "core_node7":{
>             "core":"gettingstarted_shard2_replica_n4",
>             "base_url":"http://10.0.0.80:7574/solr",
>             "node_name":"10.0.0.80:7574_solr",
>             "type":"NRT",
>             "force_set_state":"false"},
>           "core_node8":{
>             "core":"gettingstarted_shard2_replica_n6",
>             "base_url":"http://10.0.0.80:8983/solr",
>             "node_name":"10.0.0.80:8983_solr",
>             "type":"NRT",
>             "force_set_state":"false",
>             "leader":"true"}}}}}}
> {code}
> The second is another file, {{status.json}}, which is small and frequently updated.
> {code}
> {
>   "shard1": {
>     "state": "ACTIVE",
>     "core_node3": {"state": "active"},
>     "core_node5": {"state": "active"}
>   },
>   "shard2": {
>     "state": "active",
>     "core_node7": {"state": "active"},
>     "core_node8": {"state": "active"}}
> }
> {code}
> Here the size of the file is roughly one tenth of the other file. This leads to a dramatic reduction in the amount of data written/read to/from ZK.
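The "updated + read * n times" cost in the description can be turned into a rough estimate. The file sizes below are invented, chosen only to respect the stated one-tenth ratio, and `bytes_per_change` is a hypothetical helper; n = 4 matches the four replicas in the example.

```python
def bytes_per_change(file_size: int, n_replicas: int) -> int:
    """Bytes moved through ZK for one state change: one write plus n reads."""
    return file_size * (1 + n_replicas)


STATE_JSON_SIZE = 10_000   # assumed size of the full state.json, in bytes
STATUS_JSON_SIZE = 1_000   # assumed size of status.json, one tenth of the above
N_REPLICAS = 4             # replicas in the gettingstarted example

print(bytes_per_change(STATE_JSON_SIZE, N_REPLICAS))
print(bytes_per_change(STATUS_JSON_SIZE, N_REPLICAS))
```

Because the cost is linear in file size, the saving per state change follows the size ratio directly, whatever the replica count.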
[jira] [Commented] (SOLR-12993) Split the state.json into 2. a small frequently modified data + a large unmodified data
[ https://issues.apache.org/jira/browse/SOLR-12993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692830#comment-16692830 ]

Andrzej Bialecki commented on SOLR-12993:
---

What parts of state would be kept in the new file? I would think the replica / shard "state", LIR term and "leader" flags would be good candidates, but the example above is not clear: "state" and "status" occur in both files and seem to indicate the same thing.

> Split the state.json into 2. a small frequently modified data + a large
> unmodified data
> ---
>
> Key: SOLR-12993
> URL: https://issues.apache.org/jira/browse/SOLR-12993
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Noble Paul
> Priority: Major
>
> This is just a proposal to minimize the ZK load and improve scalability of
> very large clusters.
> Every time a small state change occurs for a collection/replica the following
> file needs to be updated + read * n times (where n = number of replicas for this
> collection). The proposal is to split the main file into 2.
> {code}
> {"gettingstarted":{
>     "pullReplicas":"0",
>     "replicationFactor":"2",
>     "router":{"name":"compositeId"},
>     "maxShardsPerNode":"-1",
>     "autoAddReplicas":"false",
>     "nrtReplicas":"2",
>     "tlogReplicas":"0",
>     "shards":{
>       "shard1":{
>         "range":"8000-",
>
>         "replicas":{
>           "core_node3":{
>             "core":"gettingstarted_shard1_replica_n1",
>             "base_url":"http://10.0.0.80:8983/solr",
>             "node_name":"10.0.0.80:8983_solr",
>             "state":"active",
>             "type":"NRT",
>             "force_set_state":"false",
>             "leader":"true"},
>           "core_node5":{
>             "core":"gettingstarted_shard1_replica_n2",
>             "base_url":"http://10.0.0.80:7574/solr",
>             "node_name":"10.0.0.80:7574_solr",
>
>             "type":"NRT",
>             "force_set_state":"false"}}},
>       "shard2":{
>         "range":"0-7fff",
>         "state":"active",
>         "replicas":{
>           "core_node7":{
>             "core":"gettingstarted_shard2_replica_n4",
>             "base_url":"http://10.0.0.80:7574/solr",
>             "node_name":"10.0.0.80:7574_solr",
>
>             "type":"NRT",
>             "force_set_state":"false"},
>           "core_node8":{
>             "core":"gettingstarted_shard2_replica_n6",
>             "base_url":"http://10.0.0.80:8983/solr",
>             "node_name":"10.0.0.80:8983_solr",
>
>             "type":"NRT",
>             "force_set_state":"false",
>             "leader":"true"}}}}}}
> {code}
> another file {{status.json}} which is frequently updated and small.
> {code}
> {
>   "shard1": {
>     "status": "ACTIVE",
>     "core_node3": {"status": "ACTIVE"},
>     "core_node5": {"status": "ACTIVE"}
>   },
>   "shard2": {
>     "status": "ACTIVE",
>     "core_node7": {"status": "ACTIVE"},
>     "core_node8": {"status": "ACTIVE"}}
> }
> {code}
> Here the size of the file is roughly one tenth of the other file. This leads
> to a dramatic reduction in the amount of data written/read to/from ZK.
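A reader that needs the full picture would have to combine the two files again. The following Python sketch shows one way that overlay could work. It is an assumption for illustration only, not how ZkStateReader behaves: merge_status is a hypothetical helper, and the "state" and "leader" keys are the candidates named in the comment above. The overlay copies whatever keys the small file carries, so the naming question ("state" vs "status") does not affect the logic.

```python
import copy


def merge_status(static, status):
    """Overlay the small status document onto a copy of the static state."""
    merged = copy.deepcopy(static)
    for shard_name, shard_status in status.items():
        shard = merged["shards"][shard_name]
        for key, value in shard_status.items():
            if isinstance(value, dict):
                # Dict values are per-replica entries keyed by core node name.
                shard["replicas"][key].update(value)
            else:
                # Scalar values are shard-level fields.
                shard[key] = value
    return merged


static = {"shards": {"shard2": {"replicas": {
    "core_node7": {"core": "gettingstarted_shard2_replica_n4", "type": "NRT"},
    "core_node8": {"core": "gettingstarted_shard2_replica_n6", "type": "NRT"}}}}}
status = {"shard2": {"state": "active",
                     "core_node7": {"state": "active"},
                     "core_node8": {"state": "active", "leader": "true"}}}

full = merge_status(static, status)
print(full["shards"]["shard2"]["replicas"]["core_node8"])
```

Because the static part is deep-copied, the cached large document stays untouched and can be reused each time a new status document arrives.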