[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16106028#comment-16106028 ]

Noble Paul commented on SOLR-5872:
----------------------------------

bq. Is it really true that every Solr node subscribes/watches to ZK state changes to all collections all the time

A node only watches those states if it has a replica of that collection.

> Eliminate overseer queue
> ------------------------
>
> Key: SOLR-5872
> URL: https://issues.apache.org/jira/browse/SOLR-5872
> Project: Solr
> Issue Type: Improvement
> Components: SolrCloud
> Reporter: Noble Paul
> Assignee: Noble Paul
>
> The overseer queue is one of the busiest points in the entire system. The
> raison d'être of the queue is
> * Provide batching of operations for the main clusterstate.json so that
> state updates are minimized
> * Avoid race conditions and ensure order
> Now, as we move the individual collection states out of the main
> clusterstate.json, the batching is not useful anymore.
> Race conditions can easily be solved by using a compare and set in Zookeeper.
> The proposed solution is, whenever an operation is required to be performed
> on the clusterstate, the same thread (and of course the same JVM)
> # read the fresh state and version of zk node
> # construct the new state
> # perform a compare and set
> # if compare and set fails go to step 1
> This should be limited to all operations performed on external collections
> because batching would be required for others

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
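The four-step loop in the issue description (read state and version, build the new state, compare and set, retry on failure) can be sketched as follows. This is only an illustration under stated assumptions: `VersionedNode`, `BadVersionError`, `update_state` and `mark_active` are hypothetical names, and the node is an in-memory stand-in rather than a real ZooKeeper client. In ZooKeeper itself the conditional write would be a `setData` call that passes the expected version.

```python
import threading


class BadVersionError(Exception):
    """Raised when the caller's expected version is stale."""


class VersionedNode:
    """In-memory stand-in (hypothetical) for a znode holding data plus a version."""

    def __init__(self, data):
        self._lock = threading.Lock()
        self._data = data
        self._version = 0

    def get(self):
        with self._lock:
            return self._data, self._version

    def set(self, data, expected_version):
        # Compare and set: write only if nobody else wrote since we read.
        with self._lock:
            if expected_version != self._version:
                raise BadVersionError(expected_version)
            self._data = data
            self._version += 1
            return self._version


def update_state(node, mutate, max_attempts=1000):
    for _ in range(max_attempts):
        state, version = node.get()        # 1. read fresh state and version
        new_state = mutate(dict(state))    # 2. construct the new state from a copy
        try:
            node.set(new_state, version)   # 3. perform a compare and set
            return new_state
        except BadVersionError:
            continue                       # 4. on failure, go back to step 1
    raise RuntimeError("gave up after too many conflicting updates")


def mark_active(replica):
    def mutate(state):
        state[replica] = "active"
        return state
    return mutate


# Eight writers race on the same node; no update is lost.
node = VersionedNode({})
threads = [
    threading.Thread(target=update_state, args=(node, mark_active(f"replica{i}")))
    for i in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
final_state, final_version = node.get()
```

The point of the sketch is that ordering is enforced by the version check at write time, not by funnelling every update through a single queue consumer.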
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16106025#comment-16106025 ]

David Smiley commented on SOLR-5872:
------------------------------------

Is it really true that every Solr node subscribes/watches to ZK state changes to all collections all the time, _even to Collections that have no replicas on the current node_? Albert's comment https://issues.apache.org/jira/browse/SOLR-5872?focusedCommentId=15890021&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15890021 indicates this is so. I could understand doing this to query collections on different nodes, but I think such watches should expire if not continuously utilized. Is there another JIRA issue about this?
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15927335#comment-15927335 ]

Noble Paul commented on SOLR-5872:
----------------------------------

bq. When you suggest partitioning the queue, do you mean multiple ZK queues?

It helps keep the size of any given queue minimal if you have a very large number of collections. I'm open to the idea of doing in-memory partitioning. Reading a large number of items (say 1.), putting them into in-memory buckets, and feeding them into the overseer would be just fine as well, and it could be an easy win.
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15927301#comment-15927301 ]

Joshua Humphries commented on SOLR-5872:
----------------------------------------

Right, but when a node comes up and changes replica states to active, it is highly likely that the number of events for a single collection will be ~1. So breaking batches at collection boundaries results in effectively no batching.

With the current code, there's no benefit to combining writes for multiple collections into the same batch. But if the code pipelined all of the writes for a batch (instead of issuing each one synchronously, blocking for each result) then combining writes across collections would reduce latency.

When you suggest partitioning the queue, do you mean multiple ZK queues? Seems simpler to just partition in memory: ingest the whole queue (or up to some limit) and push into in-memory queues (one per partition; could even explode a 'downnode' message into the multiple updates it implies and scatter those updates across partitions). After one of the in-memory partitions completes an item, it can delete the corresponding entry from ZK. So, from ZK's point of view, the operations can complete out of order instead of always polling the head of the queue. When partitions quiesce (or when some other policy allows more items to be polled -- so we don't necessarily have to wait on all partitions to complete before grabbing more items), ingest another batch of items from ZK.
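A rough sketch of the in-memory partitioning described in this comment: ingest up to a limit from the queue, scatter items into per-partition lists, and delete each queue entry as soon as its partition has applied it. Everything here is an assumption made for illustration: the `OrderedDict` stands in for the ZK queue, and `partition_of`, `ingest`, `drain` and the `qn-` entry names are hypothetical, not Solr code.

```python
import zlib
from collections import OrderedDict, defaultdict


def partition_of(collection, num_partitions):
    # Stable hash so a given collection always lands in the same partition.
    return zlib.crc32(collection.encode("utf-8")) % num_partitions


def ingest(zk_queue, limit, num_partitions):
    """Read up to `limit` items from the head of the queue and bucket them."""
    partitions = defaultdict(list)
    for item_id, (collection, update) in list(zk_queue.items())[:limit]:
        partitions[partition_of(collection, num_partitions)].append(
            (item_id, collection, update)
        )
    return partitions


def drain(items, zk_queue, applied):
    """Apply one partition's items, deleting each queue entry once it is done."""
    for item_id, collection, update in items:
        applied[collection].append(update)
        del zk_queue[item_id]  # may be anywhere in the queue, not just the head


events = [
    ("c1", "r1 down"), ("c2", "r1 down"), ("c1", "r1 recovering"),
    ("c3", "r1 active"), ("c1", "r1 active"), ("c2", "r1 active"),
]
zk_queue = OrderedDict((f"qn-{i:010d}", e) for i, e in enumerate(events))

partitions = ingest(zk_queue, limit=5, num_partitions=4)
applied = defaultdict(list)
# Partitions can finish in any order; here they are drained in reverse.
for p in sorted(partitions, reverse=True):
    drain(partitions[p], zk_queue, applied)
```

Because every update for a collection goes to the same partition and keeps its queue order there, per-collection ordering survives even though entries are removed from the queue out of order.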
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15927274#comment-15927274 ]

Noble Paul commented on SOLR-5872:
----------------------------------

We actually try to batch the writes at the overseer today if multiple subsequent update operations come in for the same collection. Because all the collections share the same queue, the benefits are not realized.

The solution is to have a larger number of queues, say 1000 buckets (and as many threads). Each collection must be hashed to one of these buckets. This will help us improve the batching because there is a much higher probability that two subsequent events are for the same collection.
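To illustrate why hashing collections into buckets can help batching, the sketch below counts ZK writes as "one write per run of consecutive events for the same collection", first over a single shared queue and then over hashed buckets. The workload (50 collections, three state changes each, arriving round-robin) and the helper names are made up for the example; only the bucket count of 1000 comes from the comment.

```python
import zlib
from itertools import groupby


def bucket_of(collection, num_buckets):
    # Stable hash of the collection name to a bucket index.
    return zlib.crc32(collection.encode("utf-8")) % num_buckets


def count_writes(events):
    """One write per run of consecutive events for the same collection."""
    return sum(1 for _ in groupby(events))


def count_writes_bucketed(events, num_buckets):
    buckets = {}
    for collection in events:
        buckets.setdefault(bucket_of(collection, num_buckets), []).append(collection)
    return sum(count_writes(b) for b in buckets.values())


# Hypothetical workload: events from 50 collections interleaved round-robin.
collections = [f"collection{i}" for i in range(50)]
events = [c for _ in range(3) for c in collections]

single_queue_writes = count_writes(events)
bucketed_writes = count_writes_bucketed(events, num_buckets=1000)
```

In the shared queue no two neighbouring events belong to the same collection, so every event becomes its own write. Within a bucket, a collection that does not share the bucket with another one sees its three events back to back, so they collapse into one write.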
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15926351#comment-15926351 ]

Joshua Humphries commented on SOLR-5872:
----------------------------------------

I identified one issue with slowness in processing the overseer queue: a 'downnode' message can result in far more updates to ZK than necessary -- mainly for clusters with many collections where any given collection only has shard-replicas on a small minority of the nodes.

Our cluster has many thousands of collections, most of which have only one shard and one replica. So 'downnode' was updating about 40x more collections in ZK than actually necessary. Furthermore, all of the writes are done synchronously/sequentially, which means we pay the RTT to ZK 40x more than necessary. (Also, writes across collections, even when state_format > 1, could be batched and pipelined, to further reduce latency here.)

See SOLR-10277 for more details.
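The saving described here can be illustrated with a small sketch that only touches collections which actually have a replica on the node that went down. This is not the SOLR-10277 patch; the data layout (`collection -> {replica: node}`), the function names and the cluster shape (400 single-replica collections spread over 40 nodes) are assumptions chosen so the ratio matches the "about 40x" figure above.

```python
def collections_to_update(cluster, down_node):
    """Collections that have at least one replica on the down node."""
    return sorted(
        name for name, replicas in cluster.items()
        if down_node in replicas.values()
    )


def mark_node_down(cluster, states, down_node):
    """Mark affected replicas down; return how many collections were written."""
    writes = 0
    for name in collections_to_update(cluster, down_node):
        for replica, node in cluster[name].items():
            if node == down_node:
                states[name][replica] = "down"
        writes += 1  # one state write per affected collection
    return writes


# Hypothetical cluster: 400 collections, one replica each, spread over 40 nodes.
cluster = {f"coll{i}": {"replica1": f"node{i % 40}"} for i in range(400)}
states = {name: {"replica1": "active"} for name in cluster}

writes = mark_node_down(cluster, states, "node7")
naive_writes = len(cluster)  # rewriting every collection regardless of placement
```

With this layout only 10 of the 400 collections live on `node7`, so filtering first cuts the number of sequential round trips to ZK by a factor of 40.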
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15905254#comment-15905254 ]

albert vico oton commented on SOLR-5872:
----------------------------------------

Yeah, exactly. Sorry if my comment was confusing, but the problem is with the way SolrCloud uses ZK, not with ZK itself.

{quote}
When the number of collections gets large enough, Solr has a tendency to run into ZOOKEEPER-1162, because entries can be added to the overseer queue at a much faster rate than the overseer can process them. During my testing on SOLR-7191 with version 5, Solr generated an overseer queue with 850,000 entries in it, resulting in a ZK packet size of 14 megabytes.
{quote}

I believe that this is exactly what we were experiencing.
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15905224#comment-15905224 ]

Shawn Heisey commented on SOLR-5872:
------------------------------------

I don't know how this issue escaped my attention, especially since it's been around a few years.

[~mewmewball] mentioned early on in this issue that each state change results in four ZK writes. When I opened SOLR-7191, I found that when any collection changed state, something was sent to the overseer queue for *every* collection. If I remember right, this happens even when adding a new collection, which seems completely insane to me.

When the number of collections gets large enough, Solr has a tendency to run into ZOOKEEPER-1162, because entries can be added to the overseer queue at a much faster rate than the overseer can process them. During my testing on SOLR-7191 with version 5, Solr generated an overseer queue with 850,000 entries in it, resulting in a ZK packet size of 14 megabytes.

I am not at all familiar with how SolrCloud's zookeeper code works. Exploring that rabbit hole will take a pretty major time investment. I've been reluctant to spend that time. Other people *do* understand it, so I mostly just bounce ideas off of those people and ask questions.

bq. I'll take a look but the problem we were seeing was in Zookeeper cluster not in solr

I don't see anything in your comment on 2017/03/01 that describes a problem with ZK. It sounds like problems with Solr using ZK. The overseer is a Solr component; it's not in ZK. If SOLR-10130 is occurring on your system, then an upgrade to 6.4.2 will help.
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15903961#comment-15903961 ]

albert vico oton commented on SOLR-5872:
----------------------------------------

Thanks for the advice. I'll take a look but the problem we were seeing was in Zookeeper cluster not in solr
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15903557#comment-15903557 ]

Shawn Heisey commented on SOLR-5872:
------------------------------------

bq. We were doing our tests with solr 6.4.1

That version has other problems that may be clouding the issue (pun intended). See SOLR-10130. I advise an immediate upgrade to 6.4.2, to be sure that any problems you're encountering are actually caused by SolrCloud, not high CPU usage.
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15903394#comment-15903394 ]

albert vico oton commented on SOLR-5872:
----------------------------------------

Already did that, but nodes still notify their state changes. Apparently collections need to know about other collections' status in order to reroute queries to them. This volume of state change messages was killing our IO in the ZK cluster, causing a lot of CPU wait time and effectively rendering the system unusable.

But honestly that's as far as we went. We moved away from SolrCloud and now we are using standalone Solr inside each container for each of our collections, doing the balancing between replicas through nginx.
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15903303#comment-15903303 ]

Erick Erickson commented on SOLR-5872:
--------------------------------------

bq: "Also, I do not see why collection A should be aware of collection B state."

What is your evidence of this? Because this was changed quite a while ago. Originally there was a single clusterstate.json that held all of the state information for all the collections, but that changed to having a state.json in each collection's znode.

So either you're misinterpreting something or somehow using an older style Zookeeper state. Or something really strange is happening. See the collections API MIGRATESTATEFORMAT.
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15902928#comment-15902928 ]

albert vico oton commented on SOLR-5872:
----------------------------------------

We were doing our tests with solr 6.4.1
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900838#comment-15900838 ]

Mark Miller commented on SOLR-5872:
-----------------------------------

bq. Hello, we are currently trying to do a deploy of around 200 collections

With which version?
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900731#comment-15900731 ]

Noble Paul commented on SOLR-5872:
----------------------------------

It's time to split the "state" from "state.json" into a separate file, replica-status.json:

{code}
{
  // 0: DOWN
  // 1: ACTIVE
  // 2: RECOVERING
  "replica1": 1,
  "replica2": 1
}
{code}

So every core watches 2 files instead of one, and 99% of changes happen to replica-status.json.

This can help us scale to a very large number of shards.
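One way a watcher might combine the two files is sketched below. Only the numeric codes (0, 1, 2) and the file names come from the comment; the layout of state.json used here is a simplified, hypothetical one, and `effective_state` is an invented helper.

```python
import json

# Codes as proposed in the comment above.
STATUS_NAMES = {0: "down", 1: "active", 2: "recovering"}


def effective_state(state_json, replica_status_json):
    """Overlay the frequently changing status codes on the mostly static layout."""
    state = json.loads(state_json)
    status = json.loads(replica_status_json)
    return {
        name: {**props, "state": STATUS_NAMES[status[name]]}
        for name, props in state["replicas"].items()
    }


# Simplified, hypothetical state.json content: the layout rarely changes.
state_json = json.dumps({
    "replicas": {
        "replica1": {"node_name": "node1:8983_solr"},
        "replica2": {"node_name": "node2:8983_solr"},
    }
})
# Small file that absorbs nearly all updates.
replica_status_json = json.dumps({"replica1": 1, "replica2": 2})

view = effective_state(state_json, replica_status_json)
```

A replica going from recovering to active would then rewrite only the small status file, leaving the larger layout file and its watchers untouched.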
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15890021#comment-15890021 ]

albert vico oton commented on SOLR-5872:
----------------------------------------

Hello, we are currently trying to do a deploy of around 200 collections and SolrCloud can't handle it; it just dies due to update_status message propagation every time we try to add a new collection. Each collection has 3 replicas, and sizes are not very large. Also, I do not see why collection A should be aware of collection B state.

But moving to the topic, the overseer node dies since it cannot handle the stress caused by the flood of messages. IMHO we have here a single point of failure in a distributed system, which is not recommended. Since it would be useful for big fat shards, my suggestion would be to make this optional behavior, so people like us who need a more distributed approach can make use of SolrCloud, since right now it is impossible. And I'm not talking about "thousands" of collections; with as few as 100 we are seeing very bad performance.
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699051#comment-14699051 ]

Shalin Shekhar Mangar commented on SOLR-5872:

bq. I see the lack of batching in stateFormat=2 is a potential blocker to its adoption. We need some benchmarks on a single collection with lots of cores (at least 1000), and see how it works with stateFormat=1, stateFormat=2, and this new approach

A single collection with lots of cores will perform similarly with both stateFormat=1 and stateFormat=2, because updates in stateFormat=2 are also batched as long as consecutive updates are for the same collection.
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700144#comment-14700144 ]

Scott Blum commented on SOLR-5872:

At the risk of creating two code paths, here's an idea.

1) We could improve batching *significantly* at the Overseer level, to be able to batch even when the same collection isn't updated twice in a row. We just need something like a dirty list instead of only tracking the last one and the shared clusterStateModified. This could be an independent improvement.

2) When performing updates on format=2, we could use a size heuristic to decide whether or not to go through the queue. For collections with fewer than N shards, we could just do a local CAS loop for state update ops. For collections with more than N shards, we'd just always go through the queue.
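As a rough sketch of point (1) in the comment above, the two functions below count ZooKeeper state writes for a stream of per-collection updates. Both are simplified models written for this note, not the Overseer's actual logic: `writes_consecutive_only` assumes a pending batch is flushed whenever the next update targets a different collection, and `writes_dirty_list` assumes every dirty collection is tracked and written once per batch window. The names and the `batch_size` parameter are hypothetical.

```python
def writes_consecutive_only(updates):
    """Model: a batch survives only while consecutive updates hit the
    same collection; switching collections forces a flush (one write)."""
    writes = 0
    last = None
    for collection in updates:
        if collection != last:
            if last is not None:
                writes += 1  # flush the batch pending for the previous collection
            last = collection
    if last is not None:
        writes += 1  # flush whatever is still pending at the end
    return writes


def writes_dirty_list(updates, batch_size):
    """Model: remember every collection touched in a batch window and
    write each dirty collection exactly once when the window closes."""
    writes = 0
    for start in range(0, len(updates), batch_size):
        dirty = set(updates[start:start + batch_size])
        writes += len(dirty)
    return writes


# Two collections whose updates interleave, the worst case for
# consecutive-only batching.
interleaved = ["A", "B", "A", "B", "A", "B"]
consecutive_writes = writes_consecutive_only(interleaved)
dirty_writes = writes_dirty_list(interleaved, batch_size=6)
```

With this interleaved input the consecutive-only model performs one write per update, while the dirty-list model performs one write per collection.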
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700181#comment-14700181 ]

Ramkumar Aiyengar commented on SOLR-5872:

(1) sounds like a good idea.

On (2), to either have people decide on the approach, or have Solr do it, we would need to know the perf characteristics of both approaches. So whether or not to maintain two implementations really comes down to what we are trading off against. Again, only some serious benchmarking can answer that.
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700191#comment-14700191 ]

Scott Blum commented on SOLR-5872:

Agreed. The idea behind #2 is to serve two very different kinds of configurations.

a) Avoiding the overseer queue for small shard counts is an optimization for deployments that have a huge number of collections, but where each collection has very few shards. The process of starting up and shutting down is more efficient because each collection can be updated in a distributed manner.

b) Using the overseer queue for big shard counts is an optimization for deployments that have few collections, but where each collection has a huge number of shards.
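The size heuristic from the comment above could look roughly like the following. This is a minimal sketch: the function name, the return labels, and the choice to send a collection with exactly N shards through the queue are all assumptions, since the comment only speaks of "less than N" and "more than N" and leaves N itself to be found by benchmarking.

```python
def choose_update_path(shard_count, threshold):
    """Route a state update based on collection size.

    Collections below the (hypothetical) threshold N update their own
    state with a local CAS loop; everything else goes through the
    overseer queue so that updates can be batched.
    """
    if shard_count < threshold:
        return "local_cas"
    return "overseer_queue"


# Configuration (a): many collections with few shards each.
small = choose_update_path(shard_count=3, threshold=50)
# Configuration (b): few collections with many shards each.
large = choose_update_path(shard_count=1000, threshold=50)
```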
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14698494#comment-14698494 ]

Noble Paul commented on SOLR-5872:

As [~andyetitmoves] said, batching offers serious benefits. stateFormat=2 really does not matter. A collection with a lot of shards is more likely than a large number of collections. Without batching, it will have the same bottleneck. The batching is not for writing to ZK, it is for reading from ZK. If there are 1000s of cores reading every single update to the {{state.json}}, we are back to square one.

We will need to do some serious benchmarking to prove that it is worth it performance-wise.
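The read-side argument in the comment above can be made concrete with a back-of-the-envelope model. It assumes, purely for illustration, that every watching core re-reads state.json once per write and that batching merges `batch_size` updates into one write; the function name and the numbers are hypothetical.

```python
import math


def state_reads(watching_cores, updates, batch_size=1):
    """Total state.json reads if each watcher re-reads after every write.

    Batching reduces the number of writes, and therefore the number of
    times every watcher has to fetch the state again.
    """
    writes = math.ceil(updates / batch_size)
    return watching_cores * writes


# 1000 cores watching one collection while 100 state updates arrive.
unbatched_reads = state_reads(watching_cores=1000, updates=100)
batched_reads = state_reads(watching_cores=1000, updates=100, batch_size=100)
```

In this model the unbatched case costs one full read per core per update, which is the "back to square one" scenario the comment warns about.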
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14698435#comment-14698435 ]

Ramkumar Aiyengar commented on SOLR-5872:

Though I haven't done serious experiments on this as yet, I see the lack of batching in stateFormat=2 as a potential blocker to its adoption. We need some benchmarks on a single collection with lots of cores (at least 1000), to see how it works with stateFormat=1, stateFormat=2, and this new approach.

Keep in mind that hundreds of cores might change state at the same time; that's the real benefit of batching. I fear that without a batching approach, the system might choke due to the contention at that point.
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14658807#comment-14658807 ]

Scott Blum commented on SOLR-5872:

Now that SOLR-5756 is close to landed, I want to take a serious stab at making updates to format2 collections not go through the overseer. I.e., anything that modifies clusterstate.json goes through the overseer, but anything that modifies a /collection/foo/state.json would be handled by the local node with a CAS loop.

I realize that for a collection with a huge number of shards+replicas, there could be contention on that single node. Worth noting that the current implementation doesn't batch format2 updates anyway; it ends up doing a (non-contended) write for every individual mutation.
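The local CAS loop mentioned above follows the four steps in the issue description: read state and version, build the new state, write conditionally, retry on failure. The sketch below runs against an in-memory `VersionedNode` that is a hypothetical stand-in for a ZooKeeper node; it mimics the behavior of ZooKeeper's version-conditional write, which rejects a write whose expected version is stale, but it is not the ZooKeeper client API and not Solr code.

```python
class BadVersionError(Exception):
    """Raised when the expected version no longer matches the node."""


class VersionedNode:
    """In-memory stand-in (assumption) for a ZK node with a data version."""

    def __init__(self, data):
        self._data = data
        self._version = 0

    def get(self):
        return self._data, self._version

    def set(self, data, expected_version):
        if expected_version != self._version:
            raise BadVersionError(expected_version)
        self._data = data
        self._version += 1


def cas_update(node, mutate, max_attempts=10):
    """Apply `mutate` to the node's state with a compare-and-set loop.

    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        data, version = node.get()       # 1. read fresh state and version
        new_data = mutate(data)          # 2. construct the new state
        try:
            node.set(new_data, version)  # 3. compare and set
            return attempt
        except BadVersionError:
            continue                     # 4. lost the race, go to step 1
    raise RuntimeError("gave up after %d attempts" % max_attempts)


node = VersionedNode({"replica1": "down"})
interfered = []


def mark_replica1_active(state):
    if not interfered:
        # Simulate another writer sneaking in between our read and our write.
        current, current_version = node.get()
        node.set({**current, "replica2": "active"}, current_version)
        interfered.append(True)
    return {**state, "replica1": "active"}


attempts = cas_update(node, mark_replica1_active)
```

The first attempt fails because the simulated competing writer bumped the version; the retry re-reads the state, so neither writer's change is lost. The `max_attempts` bound is a design choice of this sketch to keep a heavily contended node from spinning forever.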
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940787#comment-13940787 ]

Mark Miller commented on SOLR-5872:

bq. Is that dead in the water now?

No. It's got its own issue, and it seems likely to happen to me. Even this issue is not dead in the water. Things are generally determined via discussion and consensus. I'm arguing that we should look at simple performance bottlenecks and improvements to the current system; there seems to be a lot of low-hanging fruit.

{noformat}
Can you throw some light on how was the ZK schema for your initial impl? If all nodes of a given slice is under one zk directory , one watch on the parent should be fine, right?
{noformat}

It's been a long time and we had a few variations, so I'd have to go back into the code to refresh my memory. For now, from memory:

Initially I had it so that we simply watched the parent. Loggly ran into performance issues with this: even when only one entry changed, they had so many entries, with so many nodes reading so many entries to update the state, that performance was a big problem for them. They hacked around it initially, and then we eventually moved to watching each entry, which made updating state for small changes very efficient. But then another big early user was still hitting performance issues simply from having to read so many entries on startup and such. This is what prompted the move to a single clusterstate.json.

It's hard to remember it all perfectly; the info is spread across a lot of old JIRAs. None of the changes were taken lightly, and a variety of developers and contributors were generally involved in the discussion or motivated changes via their needs. There are tradeoffs with all of these approaches.
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938943#comment-13938943 ]

Shalin Shekhar Mangar commented on SOLR-5872:

{quote}
bq. as we move the individual collection states out of the main clusterstate.json [...]

This will make a difference on clusters with many smaller collections, but not on the single big collection. It seems like we still want scalability in both directions (wrt number of collections, and the size a single collection can be).
{quote}

The best solution that I see here is to move the replica states out into their own ZK nodes. This way the individual nodes can update them directly, without the overseer, via compare and set operations. The rest of the operations can continue to be processed in the overseer. If we do this, even the external collection changes may not be required. The leader information can also be read directly from the leader election nodes.
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939413#comment-13939413 ]

Mark Miller commented on SOLR-5872:

bq. is to move the replica states out into their own ZK nodes.

That is also how I first implemented the clusterstate. It was super slow to read the state and required a ridiculous number of watchers. Now that there are some options to read multiple nodes in one call, it may be that you can work around some of the issues we had, but it was really only good for the case where you had small changes in state to read. Users had real performance issues otherwise, and that is why we moved to clusterstate.json.

It's a similar issue: we have been there before, we moved because of tough issues, and it should be a high bar to go back.
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939422#comment-13939422 ]

Noble Paul commented on SOLR-5872:

bq. That is also how I first implemented the clusterstate

Can you throw some light on what the ZK schema was for your initial impl? If all nodes of a given slice are under one zk directory, one watch on the parent should be fine, right?
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939824#comment-13939824 ]

Ramkumar Aiyengar commented on SOLR-5872:

Wasn't one of the ideas considered in one of the other tickets to 'shard' the cluster state into N pieces so that we can hit a sweet spot between number of watchers and contention? Is that dead in the water now?
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13937842#comment-13937842 ]

Yonik Seeley commented on SOLR-5872:

bq. as we move the individual collection states out of the main clusterstate.json [...]

This will make a difference on clusters with many smaller collections, but not on the single big collection. It seems like we still want scalability in both directions (wrt number of collections, and the size a single collection can be).
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13937901#comment-13937901 ]

Mark Miller commented on SOLR-5872:

I'm not fully sold on this yet. Compare and set is how this was first implemented, and it has its own issues; hence the work Sami did to move to the queue.

Potter has noticed the overseer is fairly slow at working through state updates. I think that should be investigated first.
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938075#comment-13938075 ]

Jessica Cheng commented on SOLR-5872:

Seems like everyone is worried about batching. I think it'd be interesting to add logging/stats tracking and experiment on a large cluster to see how much batching is actually achieved.

There are a few things I worry about with the current implementation:
- With the overseer queues, each state update is 4+ ZooKeeper writes: 1 enqueue to stateUpdateQueue, 1 enqueue to workqueue, 1 state update write (potentially batched), 1 dequeue from stateUpdateQueue, and 1 dequeue from workqueue; not to mention that each core going through a restart could generate quite a few state updates (down, potentially isLeader switch, recovering, up) and each node can contain multiple cores.
- Empirically, we have definitely seen the workqueue back up with lots of items during a node bounce, but of course this can be due to some bug that's causing the slowness Potter noticed.
- If batching really is so important, there's no batching for external collection state updates.
- In a normal rolling bounce where instances are restarted one by one, in the same order each time, the Overseer is killed at each instance restart, thus hindering the recovery process by gating state transitions. (Here there are workarounds by playing with bounce orders, etc., but I would argue that in any organization that has a cluster large enough to worry about this, there is most likely a system that governs the machines and normally does instance 1 to N bounces, and a general-purpose ops team that eschews service-/app-specific bounce instructions.)

With all that said, I would really appreciate more background details about what problems Mark and Sami have seen in the old implementation, and exactly what that old implementation was.
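To put numbers on the write count in Jessica Cheng's first bullet, the sketch below tallies ZooKeeper writes for a node bounce. The four queue operations per update and the four state updates per restarting core come from the comment; the function name, the figure of 10 cores per node, and the `batch_size` parameter are assumptions made for this example.

```python
def zk_writes(state_updates, batch_size=1):
    """Count ZK writes for a number of state updates via the overseer.

    Each update costs an enqueue and a dequeue on both stateUpdateQueue
    and workqueue (4 writes), plus a state write that may be shared by
    up to `batch_size` updates.
    """
    queue_ops = 4 * state_updates
    state_writes = -(-state_updates // batch_size)  # ceiling division
    return queue_ops + state_writes


updates_per_core = 4   # down, isLeader switch, recovering, up
cores_per_node = 10    # hypothetical node size
updates = updates_per_core * cores_per_node

unbatched = zk_writes(updates)                          # every state write separate
fully_batched = zk_writes(updates, batch_size=updates)  # one shared state write
```

Even with perfect batching of the state write, the queue bookkeeping in this model dominates the total, which is why the comment suggests measuring how much batching is actually achieved.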
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938116#comment-13938116 ]

Mark Miller commented on SOLR-5872:
-----------------------------------

Thanks Jessica - your comment is more along the lines of what you need to argue to make a large change like this. Specifics.

I don't have time to write a detailed answer at the moment, but a lot of my reservations are around the large refactoring that is being attempted to support tons of collections. So far, I have not been super happy with a lot of the work that has been done. Much of it seems hurried, existing tests have not been beefed up in critical areas, new tests have been fairly minimal, and so I'm likely to push back on many of these issues. We have too many stability issues to tackle as it is. Abandoning code that has been getting hardened for over a year now for an approach that was already abandoned should not be done lightly.

If someone makes a thoughtful and clear argument with specifics and then makes a thoughtful, well-tested implementation, I'm much more likely to get on board.

I'll respond to the technical points when I get some time.
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938185#comment-13938185 ]

Noble Paul commented on SOLR-5872:
----------------------------------

People suggest new changes to the system when/where they think it is required. It is important that we counter suggestions on their own merits/demerits. I'm sure you [~markrmil...@gmail.com]/Sami would have abandoned the idea because of some real issues. I would love to hear them out (when you have time). The issues may not be insurmountable.

But the point is, looking at the code, the Overseer queue is seen as quite a bottleneck, and this is the solution that immediately comes to one's mind. A patch from anyone who can build one would be a good demonstration that such a solution is possible. People who are testing their systems in real test environments will be able to provide invaluable feedback on the viability of, and issues with, the solution.

As developers, we need to guide/handhold the users who are pushing the envelope. At some point when we develop enough confidence we can integrate it into the product itself.

bq. It seems like we still want scalability in both directions (wrt number of collections, and the size a single collection can be).

Yes, in the current system scaling with multiple collections is much simpler and a first baby step towards breaking up the monolithic clusterstate.json. Eventually we would like to go to a state per slice so that we can support very large collections. But these new experiments need to be tried out first before we venture into larger ones.
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938498#comment-13938498 ]

Mark Miller commented on SOLR-5872:
-----------------------------------

{noformat}
People suggest new changes to the system when/where they think it is required. It is important that we counter suggestions on their own merits/demerits.
{noformat}

Of course - and given this issue as presented, as I said I'm not fully sold on this yet.

The other background I gave also applies to all of these issues. We won't just rip out tons of code and replace it just because someone has identified an issue and proposed a solution. The bar for this type of change should be high. Given the history of these changes, I'm going to have to be sold more than if the history was better. Each contributor is also judged on their merit - what have they contributed so far, what's the quality of their contributions, how much are they helping with test fails, etc. The more merit you have on SolrCloud, the more likely these large scale refactorings will receive support.

{noformat}
As developers, we need to guide/handhold the users who are pushing the envelope. At some point when we develop enough confidence we can integrate it into the product itself.
{noformat}

We don't necessarily need to hold hands - we need to take that on a case by case basis. We need to walk before we can run. We should probably jog before we run as well. It's an issue by issue thing though, and Jessica has already begun providing a case for looking at this.

bq. At some point when we develop enough confidence we can integrate it into the product itself.

I've heard that before and I'm not totally sold that's how things will play out. Certainly they would not play out that way without some push back on these issues IMO.
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938539#comment-13938539 ]

Mark Miller commented on SOLR-5872:
-----------------------------------

bq. With the overseer queues, each state update is 4+ zookeeper writes

Given the numbers I've seen published for ZK performance, it seems like that should not be a big deal in typical cases?

bq. Empirically, we have definitely seen the workqueue back up with lots of items during a node bounce

I'm not surprised - most of this code has not been optimized or investigated thoroughly. The original author of a lot of the Overseer code has moved on and it likely has not seen as much attention as would be nice over the past year. Until someone looks into the current issues closely though, it seems hard to recommend rewriting this whole very important piece.

bq. If batching really is so important, there's no batching for external collection state updates.

I'm not really fully up on external collections but AFAIK it's part of some other work to support tons of collections that I'm not fully sold on yet either :)

bq. In a normal rolling bounce where instances are restarted one-by-one, in the same order each time, the Overseer is killed at each instance restart, thus hindering the recovery process by gating state transition.

This points out another issue that we might be able to address.

Without having looked closely at the issues brought up (and I don't see evidence anyone else has either), it's hard to draw the conclusion that the whole thing just has to be replaced yet.

A couple of issues around the old implementation:

* With every node updating the whole cluster state on state change, the clusterstate.json file is read far too much. The workaround you guys are proposing for that appears to be only having clients update the clusterstate when they run into an error - but I'm not sold that that is the best architecture for the future either. That's a complicated change to make, with many ramifications for future development.
* Some things that are in the clusterstate now and that could be in the future are not so easily handled with the non overseer strategy - like marking who is the leader. You have to have the Overseer running its own special thread to inject and remove information.
* As things are, on something like cluster startup, there will be tons of reads and writes of the clusterstate.json - a flood of attempts and retries to update it in ZooKeeper.

For further discussion around the change, there should be background if you search the archives.

There is a strong argument to be made that we should first investigate the performance issues with the current strategy. ZooKeeper is pretty fast - these state updates are tiny and batched. It seems like we should be able to do a lot better without throwing out code that has been getting hardened for a long time now.
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938670#comment-13938670 ]

Jessica Cheng commented on SOLR-5872:
-------------------------------------

{quote}For further discussion around the change, there should be background if you search the archives.{quote}

If you wouldn't mind terribly, will you please paste the link of a few relevant threads in the archive? (Sorry, I'm not familiar with all the keywords and archives, etc., yet.)

{quote}There is a strong argument to be made that we should first investigate the performance issues with the current strategy. ZooKeeper is pretty fast - these state updates are tiny and batched. It seems like we should be able to do a lot better without throwing out code that has been getting hardened for a long time now.{quote}

I see where your hesitation is now, and I can definitely agree. Sounds like there are a few points to be investigated for the current system before we attempt to change anything:
- Why is the Overseer so slow at updating cluster state? What's causing the build-up of queue messages during a restart?
- What can we do to generally solve the problem of the Overseer being killed on every instance restart in a rolling bounce?
- How much is actually batched?

My gut is that for external collections, batching won't be of that much benefit (except for that super-large collection case that Yoink mentioned), but I agree that if the current system can be hardened to work even for those, then the simplicity of one code path should be preferred over ultra-optimizing for a non-issue (assuming the first two points above can be fixed).
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938687#comment-13938687 ]

Mark Miller commented on SOLR-5872:
-----------------------------------

{noformat}
Some things that are in the clusterstate now and that could be in the future are not so easily handled with the non overseer strategy - like marking who is the leader. You have to have the Overseer running its own special thread to inject and remove information.
{noformat}

To expand on this one a bit - you can obviously have each node essentially do what the overseer does now. To know the true shard leader, though, that means things like going to ZooKeeper. So for a large cluster, each node takes on all the duties of the overseer, every node is now hitting ZooKeeper for this and that, and each node is trying to update the clusterstate.json at the same time and retrying, so you have this contentious herd piling onto this one ZooKeeper node.

The Overseer was seen as a fairly elegant way to avoid this herd effect and provide a less chatty solution. Rather than all the retries and reading the state on every state change, everyone writes to a non contentious zk node, and the Overseer batches up the info and writes out the state.

Now if we cannot make it fast enough because of fundamental limitations, that is one thing. But gosh, on the surface, these state updates are so small and ZK is fairly performant...

We should identify the bottlenecks. For startup, one random idea is to look at using zk's multi call support to read the whole queue in one request and then batch it all. I've got some other common sense ideas as well, but will have to find out the choke points before it makes a lot of sense brainstorming solutions.
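The batching idea described in this comment can be illustrated with a small single-threaded sketch. It is not Solr's Overseer code: the queue and the published map are in-memory stand-ins for the ZooKeeper queue and clusterstate.json, and all class and method names are invented for the example.

```java
import java.util.ArrayDeque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Queue;

class BatchingSketch {

    /** One queued message: "core X is now in state Y". */
    static final class StateUpdate {
        final String core;
        final String state;

        StateUpdate(String core, String state) {
            this.core = core;
            this.state = state;
        }
    }

    private final Queue<StateUpdate> queue = new ArrayDeque<>();     // stand-in for the ZK queue
    private Map<String, String> published = new LinkedHashMap<>();   // stand-in for clusterstate.json
    private int stateWrites = 0;                                      // how often the state was written out

    /** Nodes only append a small message; they never touch the shared state. */
    void enqueue(String core, String state) {
        queue.add(new StateUpdate(core, state));
    }

    /** Drains everything queued so far and publishes the result as one write. */
    int processBatch() {
        Map<String, String> working = new LinkedHashMap<>(published);
        int drained = 0;
        StateUpdate update;
        while ((update = queue.poll()) != null) {
            working.put(update.core, update.state);
            drained++;
        }
        if (drained > 0) {
            published = working;   // a single state write covers the whole batch
            stateWrites++;
        }
        return drained;
    }

    Map<String, String> published() {
        return published;
    }

    int stateWrites() {
        return stateWrites;
    }

    public static void main(String[] args) {
        BatchingSketch overseer = new BatchingSketch();
        overseer.enqueue("core1", "down");
        overseer.enqueue("core1", "recovering");
        overseer.enqueue("core2", "down");
        overseer.enqueue("core1", "active");
        int drained = overseer.processBatch();
        System.out.println(drained + " messages, " + overseer.stateWrites()
                + " state write(s), state=" + overseer.published());
    }
}
```

The contrast with a per-node compare-and-set loop is that writers never contend on the shared state node; the cost moves to the queue traffic and to the single consumer, which is where the bottleneck hunt above would start.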
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938710#comment-13938710 ]

Noble Paul commented on SOLR-5872:
----------------------------------

bq. There is a strong argument to be made that we should first investigate the performance issues with the current strategy.

The current implementation of DistributedQueue.peek() is extremely expensive. It reads all the children, sorts them, returns one item from the head, and discards all the others. There could be a new method DistributedQueue.peek(n), where n is the number of items, so that the Overseer can process them all in one batch.
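A minimal sketch of the difference, assuming the queue's children are already available as a list of names. This is not the real DistributedQueue code: the class, both method bodies, and the child names are illustrative only. The names imitate ZooKeeper sequential nodes, whose zero-padded counter makes a plain string sort match creation order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class PeekSketch {

    /** Mirrors the behaviour described above: sort every child, keep only the head. */
    static String peek(List<String> children) {
        List<String> sorted = new ArrayList<>(children);
        Collections.sort(sorted);
        return sorted.isEmpty() ? null : sorted.get(0);
    }

    /** Hypothetical peek(n): the same listing and sort, but up to n items are returned. */
    static List<String> peek(List<String> children, int n) {
        List<String> sorted = new ArrayList<>(children);
        Collections.sort(sorted);
        int count = Math.max(0, Math.min(n, sorted.size()));
        return new ArrayList<>(sorted.subList(0, count));
    }

    public static void main(String[] args) {
        // Children come back from a listing in no particular order.
        List<String> children = Arrays.asList(
                "qn-0000000003", "qn-0000000001", "qn-0000000002");

        System.out.println("peek()  -> " + peek(children));
        System.out.println("peek(2) -> " + peek(children, 2));
    }
}
```

With single-item peek, draining a queue of k items repeats the full listing and sort k times; a batched peek pays that cost once per batch.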
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938717#comment-13938717 ]

Noble Paul commented on SOLR-5872:
----------------------------------

bq. I'm not really fully up on external collections but AFAIK it's part of some other work to support tons of collections

There is more and more information getting added to the cluster state. I'm sure no one would object to the point that splitting the clusterstate.json would be a more scalable solution and the right direction to take. Of course, this is not to be done in haste, but eventually that should be the way.

The eventual goal should be to support a very large number of collections (say, thousands) and extremely large collections (with thousands of slices). Solr itself will not have any problem scaling like that, but the Overseer/clusterstate strategy will go through a revamp before we reach there.
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938751#comment-13938751 ]

Mark Miller commented on SOLR-5872:
-----------------------------------

If it's the issue about breaking up clusterstate.json per collection, I don't necessarily think that's a bad idea. I didn't realize that would make it something called an external collection though.
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938755#comment-13938755 ]

Noble Paul commented on SOLR-5872:
----------------------------------

The external collection is just a name. It really does not matter what we call them. The idea is to split the same data out to smaller state nodes.
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938775#comment-13938775 ]

Mark Miller commented on SOLR-5872:
-----------------------------------

It's obviously just a name :) I didn't know that it existed - that's all I was saying - I figured it meant something else. To me it doesn't make much sense. I think if we decide to split out the clusterstate.json per collection, that is the direction we should take, we should only support one clusterstate.json for back compat at most, and no such special name should exist. Solr 5.0 would no longer support the single clusterstate.json. Or, we might even decide to have the Overseer upgrade the format for you or something before 5.0.

Other thoughts on Overseer performance:

* Because only one process should be reading and removing items from the distributed queue at a time, it seems like there are many cases where we could read multiple nodes in one call.
* Perhaps 1500ms is not a great batch time - it would be interesting if we made it configurable as well.
* Seems there might be a lot of room for parallelism - we probably only need to order within a collection, if not simply per SolrCore.
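The last bullet, ordering only within a collection, could be sketched roughly as below. This is an assumption-laden illustration and not Overseer code: the `Update` type, the `lanes` helper, and the collection names are invented. It only shows that one globally ordered queue can be split into per-collection lanes without reordering anything inside a lane.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PerCollectionOrderSketch {

    /** One queued state change, tagged with the collection it belongs to. */
    static final class Update {
        final String collection;
        final String change;

        Update(String collection, String change) {
            this.collection = collection;
            this.change = change;
        }
    }

    /** Splits a globally ordered queue into lanes, keeping queue order inside each lane. */
    static Map<String, List<String>> lanes(List<Update> queue) {
        Map<String, List<String>> byCollection = new LinkedHashMap<>();
        for (Update u : queue) {
            byCollection.computeIfAbsent(u.collection, k -> new ArrayList<>()).add(u.change);
        }
        return byCollection;
    }

    public static void main(String[] args) {
        List<Update> queue = Arrays.asList(
                new Update("logs", "core1 down"),
                new Update("users", "core7 down"),
                new Update("logs", "core1 recovering"),
                new Update("users", "core7 active"),
                new Update("logs", "core1 active"));

        // Each lane could be handed to its own worker; lanes never share state.
        lanes(queue).forEach((collection, changes) ->
                System.out.println(collection + " -> " + changes));
    }
}
```

Whether this helps in practice depends on the state for each collection living in its own ZooKeeper node; with a single clusterstate.json the lanes would still converge on one write.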
[jira] [Commented] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938796#comment-13938796 ]

Noble Paul commented on SOLR-5872:
----------------------------------

bq. I think if we decide to split out the clusterstate.json per collection, that is the direction we should take

Yes, that is the plan; we would probably switch to that from 5.0 or so. But the challenge is to offer a smoother migration path.

* Initially, users would be able to switch to that mode when creating a collection (an opt-in). SOLR-5473 does that.
* Offer an API to migrate to the new format (SOLR-5756).
* Make it the default format (from, say, 5.0).
* Deprecate the old format.