[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2017-07-28 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16106028#comment-16106028
 ] 

Noble Paul commented on SOLR-5872:
--

bq. Is it really true that every Solr node subscribes/watches to ZK state 
changes to all collections all the time

A node watches a collection's state only if it hosts a replica of that 
collection.

> Eliminate overseer queue 
> -
>
> Key: SOLR-5872
> URL: https://issues.apache.org/jira/browse/SOLR-5872
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Noble Paul
>Assignee: Noble Paul
>
> The overseer queue is one of the busiest points in the entire system. The 
> raison d'être of the queue is to
>  * Provide batching of operations for the main clusterstate.json so that 
> state updates are minimized 
>  * Avoid race conditions and ensure order
> Now, as we move the individual collection states out of the main 
> clusterstate.json, the batching is not useful anymore.
> Race conditions can easily be solved by using a compare-and-set in ZooKeeper. 
> The proposed solution is: whenever an operation needs to be performed 
> on the clusterstate, the same thread (and of course the same JVM) should
>  # read the fresh state and version of the ZK node
>  # construct the new state 
>  # perform a compare-and-set
>  # if the compare-and-set fails, go to step 1
> This should be limited to operations performed on external collections, 
> because batching would still be required for the others.
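
The four steps in the description can be sketched as follows. This is only an illustration, assuming an in-memory stand-in for a single ZK node: `VersionedNode`, `BadVersion`, and `update_with_cas` are hypothetical names, not Solr or ZooKeeper APIs. In the ZooKeeper Java client, the corresponding conditional write is `setData(path, data, version)`, which fails with a `BadVersionException` when the expected version is stale.

```python
import threading


class BadVersion(Exception):
    """Raised when the expected version does not match the stored one."""


class VersionedNode:
    """Hypothetical in-memory stand-in for one ZK node (data plus version)."""

    def __init__(self, data):
        self._lock = threading.Lock()
        self._data = data
        self._version = 0

    def read(self):
        with self._lock:
            return self._data, self._version

    def compare_and_set(self, new_data, expected_version):
        with self._lock:
            if self._version != expected_version:
                raise BadVersion(expected_version)
            self._data = new_data
            self._version += 1
            return self._version


def update_with_cas(node, build_new_state, max_attempts=50):
    """Read, construct, compare-and-set, and retry on a version conflict."""
    for _ in range(max_attempts):
        state, version = node.read()                  # step 1
        new_state = build_new_state(state)            # step 2
        try:
            node.compare_and_set(new_state, version)  # step 3
            return new_state
        except BadVersion:
            continue                                  # step 4: start over
    raise RuntimeError("gave up after repeated version conflicts")
```

The retry cap (`max_attempts`) is an assumption added for the sketch; the description itself does not say how long to keep retrying.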



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2017-07-28 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16106025#comment-16106025
 ] 

David Smiley commented on SOLR-5872:


Is it really true that every Solr node subscribes/watches to ZK state changes 
to all collections all the time, _even to Collections that have no replicas on 
the current node_?  Albert's comment 
https://issues.apache.org/jira/browse/SOLR-5872?focusedCommentId=15890021=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15890021
 indicates this is so.  I could understand doing this to query collections on 
different nodes, but I think such watches should expire if not continuously 
utilized.  Is there another JIRA issue about this?




[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2017-03-15 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15927335#comment-15927335
 ] 

Noble Paul commented on SOLR-5872:
--

bq. When you suggest partitioning the queue, do you mean multiple ZK queues?

It helps keep the size of any given queue minimal if you have a very 
large number of collections. 

I'm open to the idea of doing in-memory partitioning. 

Reading a large number of items (say 1), putting them into in-memory buckets, 
and feeding them into the overseer would be just fine as well, and it could be 
an easy win. 





[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2017-03-15 Thread Joshua Humphries (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15927301#comment-15927301
 ] 

Joshua Humphries commented on SOLR-5872:


Right, but when a node comes up and changes replica states to active, it is 
highly likely that the number of events for a single collection will be ~1. So 
breaking batches at collection boundaries results in effectively no batching.

With the current code, there's no benefit to combining writes for multiple 
collections into the same batch. But if the code pipelined all of the writes 
for a batch (instead of issuing each one synchronously, blocking for each 
result) then combining writes across collections would reduce latency.

When you suggest partitioning the queue, do you mean multiple ZK queues? Seems 
simpler to just partition in memory: ingest the whole queue (or up to some 
limit) and push into in-memory queues (one per partition; could even explode a 
'downnode' message into the multiple updates it implies and scatter those 
updates across partitions). After one of the in-memory partitions completes an 
item, it can delete the corresponding entry from ZK. So, from ZK's point of 
view, the operations can complete out of order instead of always polling the 
head of the queue. When partitions quiesce (or when some other policy allows 
more items to be polled -- so we don't necessarily have to wait on all 
partitions to complete before grabbing more items), ingest another batch of 
items from ZK.
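
A rough sketch of the in-memory partitioning idea, under stated assumptions: the ZK queue is modeled as a plain dict of entry id to message, partitions are chosen by a CRC32 hash of the collection name, and every name here (`NUM_PARTITIONS`, `partition_for`, `ingest`, `drain_partition`) is hypothetical. It only shows that entries are deleted from the backing queue as each partition finishes them, so completion is out of order from the queue's point of view while order within a collection is kept.

```python
import zlib
from collections import deque

NUM_PARTITIONS = 4  # assumed value, for illustration only


def partition_for(collection):
    """Stable hash so a collection always lands in the same partition."""
    return zlib.crc32(collection.encode("utf-8")) % NUM_PARTITIONS


def ingest(zk_queue, limit):
    """Read up to `limit` entries (oldest first) and scatter them into partitions.

    Entries stay in `zk_queue` until a partition has finished processing them.
    """
    partitions = [deque() for _ in range(NUM_PARTITIONS)]
    for entry_id in sorted(zk_queue)[:limit]:
        message = zk_queue[entry_id]
        partitions[partition_for(message["collection"])].append((entry_id, message))
    return partitions


def drain_partition(partition, zk_queue, apply_update):
    """Process one partition in order, deleting each entry once it is applied."""
    while partition:
        entry_id, message = partition.popleft()
        apply_update(message)
        del zk_queue[entry_id]  # not necessarily the head of the backing queue
```

In a real implementation each partition would presumably be drained by its own thread; the sketch leaves scheduling out.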




[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2017-03-15 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15927274#comment-15927274
 ] 

Noble Paul commented on SOLR-5872:
--

We actually try to batch the writes at the overseer today if multiple 
consecutive update operations arrive for the same collection. Because all the 
collections share the same queue, the benefits are not realized. The solution 
is to have a larger number of queues, say 1000 buckets (and as many threads). 
Each collection must be hashed to one of these buckets. This will improve the 
batching because there is a much higher probability that two consecutive 
events are for the same collection.
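
To illustrate why per-bucket queues could improve batching, here is a small sketch. It assumes events are (collection, update) pairs and that one write is issued per run of consecutive events for the same collection; `bucket_of`, `count_writes`, and `count_writes_bucketed` are hypothetical helpers, not the overseer's actual code.

```python
import zlib
from itertools import groupby


def bucket_of(collection, num_buckets):
    """Map a collection name to a bucket with a stable hash."""
    return zlib.crc32(collection.encode("utf-8")) % num_buckets


def count_writes(events):
    """One write per run of consecutive events for the same collection."""
    return sum(1 for _ in groupby(events, key=lambda event: event[0]))


def count_writes_bucketed(events, num_buckets):
    """Split events into per-bucket queues first, then batch within each queue."""
    buckets = {}
    for event in events:
        buckets.setdefault(bucket_of(event[0], num_buckets), []).append(event)
    return sum(count_writes(queue) for queue in buckets.values())
```

With interleaved events for two collections (a, b, a, b), a single shared queue gives four runs, whereas two collections that hash to different buckets would give one run each. Bucketing never increases the count, since each bucket's queue is made of whole runs from the original queue.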





[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2017-03-15 Thread Joshua Humphries (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15926351#comment-15926351
 ] 

Joshua Humphries commented on SOLR-5872:


I identified one issue with slowness in processing the overseer queue: a 
'downnode' message can result in far more updates to ZK than necessary -- 
mainly for clusters with many collections where any given collection only has 
shard-replicas on a small minority of the nodes. Our cluster has many thousands 
of collections, most of which have only one shard and one replica. So 
'downnode' was updating about 40x more collections in ZK than actually 
necessary. Furthermore, all of the writes are done synchronously/sequentially 
which means we pay the RTT to ZK 40x more than necessary. (Also, writes across 
collections, even when state_format > 1, could be batched and pipelined, to 
further reduce latency here.)

See SOLR-10277 for more details.
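
The over-updating described above can be illustrated with a small sketch. It assumes a simplified cluster state (collection name mapped to replica name mapped to a dict with `node` and `state`); `collections_touching` and `mark_node_down` are hypothetical helpers and are not the code from SOLR-10277.

```python
def collections_touching(cluster_state, node_name):
    """Return only the collections that have at least one replica on the node."""
    return [
        name
        for name, replicas in cluster_state.items()
        if any(replica["node"] == node_name for replica in replicas.values())
    ]


def mark_node_down(cluster_state, node_name):
    """Mark replicas on `node_name` as down and return the collections rewritten.

    Collections without a replica on the node are skipped entirely, so no
    write (and no round trip) is spent on them.
    """
    written = []
    for name in collections_touching(cluster_state, node_name):
        for replica in cluster_state[name].values():
            if replica["node"] == node_name:
                replica["state"] = "down"
        written.append(name)
    return written
```

The list returned by `mark_node_down` is the set of per-collection writes that would still be needed; those are the writes that could then be pipelined instead of issued one at a time.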




[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2017-03-10 Thread albert vico oton (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905254#comment-15905254
 ] 

albert vico oton commented on SOLR-5872:


Yes, exactly. Sorry if my comment was confusing; the problem is with the way 
SolrCloud uses ZK, not with ZK itself.

{quote} When the number of collections gets large enough, Solr has a tendency 
to run into ZOOKEEPER-1162, because entries can be added to the overseer queue 
at a much faster rate than the overseer can process them. During my testing on 
SOLR-7191 with version 5, Solr generated an overseer queue with 850,000 entries 
in it, resulting in a ZK packet size of 14 megabytes. {quote}

I believe that this is exactly what we were experiencing. 




[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2017-03-10 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905224#comment-15905224
 ] 

Shawn Heisey commented on SOLR-5872:


I don't know how this issue escaped my attention, especially since it's been 
around a few years.

[~mewmewball] mentioned early on in this issue that each state change results 
in four ZK writes.  When I opened SOLR-7191, I found that when any collection 
changed state, something was sent to the overseer queue for *every* collection. 
 If I remember right, this happens even when adding a new collection, which 
seems completely insane to me.

When the number of collections gets large enough, Solr has a tendency to run 
into ZOOKEEPER-1162, because entries can be added to the overseer queue at a 
much faster rate than the overseer can process them.  During my testing on 
SOLR-7191 with version 5, Solr generated an overseer queue with 850,000 entries 
in it, resulting in a ZK packet size of 14 megabytes.

I am not at all familiar with how SolrCloud's zookeeper code works.  Exploring 
that rabbit hole will take a pretty major time investment.  I've been reluctant 
to spend that time.  Other people *do* understand it, so I mostly just bounce 
ideas off of those people and ask questions.

bq. I'll take a look but the problem we were seeing was in Zookeeper cluster 
not in solr

I don't see anything in your comment on 2017/03/01 that describes a problem 
with ZK.  It sounds like problems with Solr using ZK.  The overseer is a Solr 
component, it's not in ZK.  If SOLR-10130 is occurring on your system, then an 
upgrade to 6.4.2 will help.





[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2017-03-09 Thread albert vico oton (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903961#comment-15903961
 ] 

albert vico oton commented on SOLR-5872:


Thanks for the advice. I'll take a look, but the problem we were seeing was in 
the ZooKeeper cluster, not in Solr. 




[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2017-03-09 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903557#comment-15903557
 ] 

Shawn Heisey commented on SOLR-5872:


bq. We were doing our tests with solr 6.4.1

That version has other problems that may be clouding the issue (pun intended).  
See SOLR-10130.  I advise an immediate upgrade to 6.4.2, to be sure that any 
problems you're encountering are actually caused by SolrCloud, not high CPU 
usage.





[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2017-03-09 Thread albert vico oton (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903394#comment-15903394
 ] 

albert vico oton commented on SOLR-5872:


Already did that, but nodes still notify their state changes. Apparently 
collections need to know about other collections' status in order to reroute 
queries to them. This volume of state-change messages was killing our I/O in 
the ZK cluster, causing a lot of CPU wait time and effectively rendering the 
system unusable. But honestly, that's as far as we went: we moved away from 
SolrCloud, and now we run standalone Solr inside a container for each of our 
collections and balance between replicas through nginx. 




[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2017-03-09 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903303#comment-15903303
 ] 

Erick Erickson commented on SOLR-5872:
--

bq: "Also, I do not see why collection A should be aware of collection B state."

What is your evidence of this? This was changed quite a while ago. 
Originally there was a single clusterstate.json that held all of the state 
information for all the collections, but that changed to having a state.json in 
each collection's znode. So either you're misinterpreting something or somehow 
using an older-style ZooKeeper state. Or something really strange is happening.

See the collections API MIGRATESTATEFORMAT.






[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2017-03-09 Thread albert vico oton (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902928#comment-15902928
 ] 

albert vico oton commented on SOLR-5872:


We were doing our tests with solr 6.4.1




[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2017-03-07 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900838#comment-15900838
 ] 

Mark Miller commented on SOLR-5872:
---

bq. Hello, we are currently trying to do a deploy of around 200 collections

With which version?




[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2017-03-07 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900731#comment-15900731
 ] 

Noble Paul commented on SOLR-5872:
--

It's time to split the "state" out of "state.json" into a separate file, 
replica-status.json, which maps each replica to a numeric status code 
(0: DOWN, 1: ACTIVE, 2: RECOVERING):

{code}
{
  "replica1": 1,
  "replica2": 1
}
{code}

So every core watches 2 files instead of one, and 99% of changes happen to 
replica-status.json.

This can help us scale to a very large number of shards.
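
A minimal sketch of how the two files might be combined by a reader, assuming `state.json` keeps the mostly static layout and `replica-status.json` holds only the numeric codes listed above. The `ReplicaStatus` enum, the `merge_view` helper, and the sample layout fields are hypothetical and only illustrate the split.

```python
import json
from enum import IntEnum


class ReplicaStatus(IntEnum):
    DOWN = 0
    ACTIVE = 1
    RECOVERING = 2


def merge_view(state_json, status_json):
    """Combine the static layout with the frequently changing status codes."""
    layout = json.loads(state_json)
    status = json.loads(status_json)
    return {
        replica: {**info, "state": ReplicaStatus(status[replica]).name}
        for replica, info in layout.items()
    }
```

Under this split, a replica changing state would only rewrite the small status document, while the larger layout document stays untouched.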




[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2017-03-01 Thread albert vico oton (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15890021#comment-15890021
 ] 

albert vico oton commented on SOLR-5872:


Hello, we are currently trying to do a deploy of around 200 collections and 
SolrCloud can't handle it: it just dies due to update_status message 
propagation every time we try to add a new collection. Each collection has 3 
replicas, and sizes are not very large. Also, I do not see why collection A 
should be aware of collection B state.  

But moving to the topic: the overseer node dies because it cannot handle the 
stress caused by the flood of messages. IMHO we have a single point of failure 
in a distributed system here, which is not recommended. 

Since it would be useful for big fat shards, my suggestion would be to make 
this behavior optional, so people like us who need a more distributed approach 
can make use of SolrCloud, which right now is impossible. And I'm not talking 
about "thousands" of collections; with as few as 100 we are seeing very bad 
performance.






[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2015-08-17 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699051#comment-14699051
 ] 

Shalin Shekhar Mangar commented on SOLR-5872:
-

bq. I see the lack of batching in stateFormat=2 is a potential blocker to its 
adoption. We need some benchmarks on a single collection with lots of cores (at 
least 1000), and see how it works with stateFormat=1, stateFormat=2, and this 
new approach

A single collection with lots of cores will perform similarly with both 
stateFormat=1 and stateFormat=2 because updates in stateFormat=2 are also 
batched as long as consecutive updates are for the same collection.
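
To make the batching condition concrete, here is a simplified model (not the Overseer's actual code) that counts state writes when updates are coalesced only while consecutive updates target the same collection. The function name and the input shape are assumptions for illustration; real batching is also bounded by time and by what is in the queue.

```python
def count_state_writes(update_stream):
    """Count writes when only runs of same-collection updates are coalesced.

    `update_stream` is a sequence of collection names, one per state update.
    """
    writes = 0
    previous = None
    for collection in update_stream:
        if collection != previous:
            writes += 1  # a new run starts, so a new write is needed
            previous = collection
    return writes


# 1000 core updates for one collection collapse into a single write,
# while interleaved updates across two collections do not coalesce at all.
print(count_state_writes(["big"] * 1000))
print(count_state_writes(["a", "b"] * 3))
```

Under this model the single-big-collection case behaves the same in both formats, and the difference only shows up when updates for several collections interleave.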




[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2015-08-17 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700144#comment-14700144
 ] 

Scott Blum commented on SOLR-5872:
--

At the risk of creating two code paths, here's an idea.

1) We could improve batching *significantly* at the Overseer level, to be able 
to batch even when the same collection isn't updated twice in a row.  We just 
need something like a dirty list instead of only tracking the last one and the 
shared clusterStateModified.  This could be an independent improvement.

2) When performing updates on format=2, we could use a size heuristic to decide 
whether or not to go through the queue.  For collections with fewer than N 
shards, we could just do a local CAS loop for state update ops.  For 
collections with more than N shards we'd just always go through the queue.
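
A rough sketch of the "dirty list" idea in (1), assuming pending changes can be merged per collection before anything is written. `DirtyListBatcher`, `record`, and `flush` are hypothetical names and do not correspond to existing Overseer classes.

```python
class DirtyListBatcher:
    """Coalesce pending updates per collection, regardless of arrival order."""

    def __init__(self):
        self._dirty = {}  # collection name -> {replica: latest state}

    def record(self, collection, replica, state):
        """Remember the latest state for a replica and mark its collection dirty."""
        self._dirty.setdefault(collection, {})[replica] = state

    def flush(self, write):
        """Call write(collection, changes) once per dirty collection.

        Returns the number of writes issued.
        """
        flushed = len(self._dirty)
        for collection, changes in self._dirty.items():
            write(collection, changes)
        self._dirty = {}
        return flushed


batcher = DirtyListBatcher()
for i in range(3):
    # Updates for two collections arrive interleaved.
    batcher.record("a", f"replica{i}", "active")
    batcher.record("b", f"replica{i}", "active")
print(batcher.flush(lambda collection, changes: None))
```

Six interleaved updates end up as two writes here, one per dirty collection, which is the gain over tracking only the last updated collection.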




[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2015-08-17 Thread Ramkumar Aiyengar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700181#comment-14700181
 ] 

Ramkumar Aiyengar commented on SOLR-5872:
-

(1) sounds like a good idea..

On (2), to either have people decide on the approach, or have Solr do it, we 
would need to know the perf characteristics of both approaches. So maintaining 
two implementations or not really comes down to what we are trading off 
against. Again, only some serious benchmarking can answer that..




[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2015-08-17 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700191#comment-14700191
 ] 

Scott Blum commented on SOLR-5872:
--

Agreed.  The idea behind #2 is to serve two very different kinds of 
configurations.

a) Avoiding the overseer queue for small shard count is an optimization for 
deployments that have a huge number of collections, but each collection has 
very few shards.  The process of starting up and shutting down is more 
efficient because each collection can be updated in a distributed manner.

b) Using the overseer queue for big shard count is an optimization for 
deployments that have few collections, but each collection has a huge number of 
shards.
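
The routing described in (a) and (b) could look roughly like the following. The function name, the returned labels, and the choice to send a collection with exactly `threshold` shards through the queue are all assumptions; the thread leaves the value of N and the boundary case open.

```python
def choose_update_path(shard_count: int, threshold: int) -> str:
    """Pick the state-update path for a collection based on its shard count."""
    if shard_count < threshold:
        # Many small collections: each node updates state itself via CAS.
        return "local-cas"
    # Few huge collections: keep going through the overseer queue for batching.
    return "overseer-queue"


print(choose_update_path(shard_count=2, threshold=16))
print(choose_update_path(shard_count=500, threshold=16))
```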




[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2015-08-15 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14698494#comment-14698494
 ] 

Noble Paul commented on SOLR-5872:
--

As [~andyetitmoves] said, batching offers serious benefits. stateFormat=2 
really does not matter. A collection with a lot of shards is more likely than 
a large number of collections. Without batching, it will have the same 
bottleneck. The batching is not for writing to ZK, it is for reading from ZK. 
If there are 1000s of cores reading every single update to the {{state.json}}, 
we are back to square one. 

We will need to do some serious benchmarking to prove that it is worth it.




[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2015-08-15 Thread Ramkumar Aiyengar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14698435#comment-14698435
 ] 

Ramkumar Aiyengar commented on SOLR-5872:
-

Though I haven't done serious experiments on this as yet, I see the lack of 
batching in stateFormat=2 is a potential blocker to its adoption. We need some 
benchmarks on a single collection with lots of cores (at least 1000), and see 
how it works with stateFormat=1, stateFormat=2, and this new approach. Keep in 
mind that hundreds of cores might change state at the same time, that's the 
real benefit to batching. I fear that without a batching approach, the system 
might choke due to the contention at that point.




[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2015-08-05 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14658807#comment-14658807
 ] 

Scott Blum commented on SOLR-5872:
--

Now that SOLR-5756 is close to landing, I want to take a serious stab at making 
updates to format2 collections not go through the overseer.  I.e., anything 
that modifies clusterstate.json goes through the overseer, but anything that 
modifies a /collection/foo/state.json would be handled by the local node with a 
CAS loop.  I realize that for a collection with a huge number of 
shards+replicas, there could be contention on that single node.  Worth noting 
that the current implementation doesn't batch format2 updates anyway; it ends 
up doing a (non-contended) write for every individual mutation.
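
For readers unfamiliar with the pattern, a CAS loop of the kind described here can be sketched as follows. `VersionedNode` is an in-memory stand-in for a ZooKeeper znode with a data version, and `cas_update`, `BadVersionError`, and `max_attempts` are hypothetical names; this is not Solr or ZooKeeper client code.

```python
import threading


class BadVersionError(Exception):
    """Raised when a conditional write sees a version other than the expected one."""


class VersionedNode:
    """In-memory stand-in for a znode: data plus a monotonically increasing version."""

    def __init__(self, data):
        self._lock = threading.Lock()
        self._data = data
        self._version = 0

    def get(self):
        with self._lock:
            return self._data, self._version

    def set(self, data, expected_version):
        with self._lock:
            if expected_version != self._version:
                raise BadVersionError(expected_version)
            self._data = data
            self._version += 1


def cas_update(node, mutate, max_attempts=100):
    """Read state and version, build the new state, write conditionally, retry on conflict."""
    for _ in range(max_attempts):
        data, version = node.get()
        try:
            node.set(mutate(data), version)  # mutate must return a new object
            return version + 1
        except BadVersionError:
            continue  # someone else won the race; start again from a fresh read
    raise RuntimeError("gave up after repeated version conflicts")


state = VersionedNode({})
cas_update(state, lambda s: {**s, "replica1": "active"})
print(state.get())
```

Every failed attempt means another writer succeeded, so the loop always makes global progress, but under heavy contention on one node individual writers may retry many times, which is the concern raised above for collections with a huge number of replicas.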




[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2014-03-19 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940787#comment-13940787
 ] 

Mark Miller commented on SOLR-5872:
---

bq. Is that dead in the water now?

No. It has its own issue, and to me it seems likely to happen.

Even this issue is not dead in the water. Things are generally determined via 
discussion and consensus. I'm arguing that we should look at simple performance 
bottlenecks and improvements to the current system - there seems to be a lot of 
low hanging fruit.

{noformat}
Can you throw some light on how was the ZK schema for your initial impl? If all 
nodes of a given slice is under one zk directory , one watch on the parent 
should be fine, right?
{noformat}

It's been a long time and we had a few variations, so I'd have to go back in 
the code to refresh my memory. For now, from my memory:

Initially I had it so that we simply watched the parent - Loggly ran into 
performance issues with this - even when only one entry changed, they had so 
many entries that, with so many nodes reading so many entries to update the 
state, performance was a big problem for them. They hacked around it 
initially, and then we eventually moved to watching each entry - this made 
updating state for small changes very efficient. But then another big 
early user was still hitting performance issues simply from having to read so 
many entries on startup and such. This is what prompted the move to a single 
clusterstate.json.

It's hard to remember it all perfectly - the info is spread across and around a 
lot of old JIRAs. None of the changes were taken lightly, and a variety of 
developers and contributors were generally involved in the discussion or 
motivating changes via their needs.

There are tradeoffs with all of these approaches.




[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2014-03-18 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938943#comment-13938943
 ] 

Shalin Shekhar Mangar commented on SOLR-5872:
-

{quote}
bq. as we move the individual collection states out of the main 
clusterstate.json [...]

This will make a difference on clusters with many smaller collections, but not 
on the single big collection.
It seems like we still want scalability in both directions (wrt number of 
collections, and the size a single collection can be).
{quote}

The best solution that I see here is to move the replica states out into their 
own ZK nodes. This way the individual nodes can update them directly without 
the overseer via compare and set operations. The rest of the operations can 
continue to be processed in the overseer. If we do this, even the external 
collection changes may not be required. The leader information can also be read 
directly from the leader election nodes.




[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2014-03-18 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939413#comment-13939413
 ] 

Mark Miller commented on SOLR-5872:
---

bq.  is to move the replica states out into their own ZK nodes.

That is also how I first implemented the clusterstate - it was super slow to 
read the state and required a ridiculous number of watchers. Now that they have 
some options to read multiple nodes in one call, it may be that you can work 
around some of the issues we had, but it was really only good for the case 
where you had small changes in state to read - users had real issues with 
performance otherwise and that is why we moved to clusterstate.json.

It's a similar issue - we have been there before, we moved because of tough 
issues, it should be a high bar to go back.




[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2014-03-18 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939422#comment-13939422
 ] 

Noble Paul commented on SOLR-5872:
--

bq.That is also how I first implemented the clusterstate

Can you throw some light on what the ZK schema was for your initial impl? If 
all nodes of a given slice are under one zk directory, one watch on the parent 
should be fine, right?




[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2014-03-18 Thread Ramkumar Aiyengar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939824#comment-13939824
 ] 

Ramkumar Aiyengar commented on SOLR-5872:
-

Wasn't one of the ideas considered in one of the other tickets to 'shard' the 
cluster state into N pieces so that we can hit a sweet spot between number of 
watchers and contention? Is that dead in the water now?




[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2014-03-17 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13937842#comment-13937842
 ] 

Yonik Seeley commented on SOLR-5872:


bq. as we move the individual collection states out of the main 
clusterstate.json [...]

This will make a difference on clusters with many smaller collections, but not 
on the single big collection.
It seems like we still want scalability in both directions (wrt number of 
collections, and the size a single collection can be).




[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2014-03-17 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13937901#comment-13937901
 ] 

Mark Miller commented on SOLR-5872:
---

I'm not fully sold on this yet. Compare and set is how this was first 
implemented and it has its own issues - hence the work Sami did to move to the 
queue. 

Potter has noticed the overseer is fairly slow at working through state 
updates. I think that should be investigated first. 




[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2014-03-17 Thread Jessica Cheng (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938075#comment-13938075
 ] 

Jessica Cheng commented on SOLR-5872:
-

Seems like everyone is worried about batching. I think it'd be interesting to 
add logging/stats tracking and experiment on a large cluster to see how much 
batching is actually achieved.

There are a few things I worry about with the current implementation:
- With the overseer queues, each state update is 4+ zookeeper writes: 1 enqueue 
to stateUpdateQueue, 1 enqueue to workqueue, 1 state update write (potentially 
batched), 1 dequeue from stateUpdateQueue, and 1 dequeue from workqueue--not to 
mention that each core going through a restart could generate quite a few state 
updates (down, potentially isLeader switch, recovering, up) and each node can 
contain multiple cores.
- Empirically, we have definitely seen the workqueue back up with lots of items 
during a node bounce--but of course this can be due to some bug that's causing 
Potter to notice the slowness.
- If batching really is so important, there's no batching for external 
collection state updates.
- In a normal rolling bounce where instances are restarted one-by-one, in the 
same order each time, the Overseer is killed at each instance restart, thus 
hindering the recovery process by gating state transition. (Here there are 
workarounds by playing with bounce orders, etc., but I would argue that in any 
organization that would have a cluster large enough to worry about this, there 
is most likely a system that governs the machines and normally does instance 1 
to N bounces, and a general-purpose ops team that eschews service-/app-specific 
bounce instructions.)
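
To put rough numbers on the first point in the list above, here is a small back-of-the-envelope sketch. It counts the five writes listed there (two enqueues, one state write, two dequeues) as an upper bound with no batching of the state write; the class name and the example figures of 10 cores and 4 state changes per core are made-up assumptions, not measurements.

```java
class QueueWriteEstimate {
    // Writes listed above for one state update: enqueue to stateUpdateQueue,
    // enqueue to workqueue, state update write, dequeue from stateUpdateQueue,
    // dequeue from workqueue.
    static final int WRITES_PER_UPDATE = 5;

    // Upper bound on ZooKeeper writes caused by restarting one node,
    // assuming the state update write is never batched.
    static long writesForNodeRestart(int coresOnNode, int stateChangesPerCore) {
        return (long) coresOnNode * stateChangesPerCore * WRITES_PER_UPDATE;
    }
}
```

With the hypothetical 10 cores and 4 state changes each (down, leader switch, recovering, up), a single node restart would come to at most 10 * 4 * 5 = 200 writes.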

With all that said, I would really appreciate more background 
details about what problems Mark and Sami have seen in the old implementation, 
and exactly what that old implementation was.




[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2014-03-17 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938116#comment-13938116
 ] 

Mark Miller commented on SOLR-5872:
---

Thanks Jessica - your comment is more along the lines of what you need to argue 
to make a large change like this. Specifics.

I don't have time to write a detailed answer at the moment, but a lot of my 
reservations are around the large refactoring that is being attempted to 
support tons of collections. So far, I have not been super happy with a lot of 
the work that has been done. Much of it seems hurried, existing tests have not 
been beefed up in critical areas, new tests have been fairly minimal, and so 
I'm likely to push back on many of these issues. We have too many stability 
issues to tackle as it is. Abandoning code that has been getting hardened for 
over a year now for an approach that was already abandoned should not be done 
lightly. 

If someone makes a thoughtful and clear argument with specifics and then makes 
a thoughtful, well tested implementation, I'm much more likely to get on board.

I'll respond to the technical points when I get some time.




[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2014-03-17 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938185#comment-13938185
 ] 

Noble Paul commented on SOLR-5872:
--

People suggest new changes to the system when/where they think it is required. 
It is important that we counter suggestions on their own merits/demerits. 

I'm sure you [~markrmil...@gmail.com]/Sami would have abandoned the idea 
because of some real issues. I would love to hear them out (when you have time). 
The issues may not be insurmountable. But the point is, looking at the 
code, the Overseer queue is seen as quite a bottleneck and this is the solution 
that immediately comes to one's mind. 

A patch from anyone who can build one would be a good demonstration of the 
possibility of such a solution. People who are testing out their systems in a 
real test environment will be able to provide invaluable feedback on the 
viability of, and issues with, the solution. As developers, we need to 
guide/handhold the users who are pushing the envelope. At some point, when we 
develop enough confidence, we can integrate it into the product itself. 

bq. It seems like we still want scalability in both directions (wrt number of 
collections, and the size a single collection can be).

Yes, in the current system scaling with multiple collections is much simpler 
and a first baby step towards breaking up the monolithic clusterstate.json. 
Eventually we would like to go to a state per slice so that we can support very 
large collections. But these new experiments need to be tried out first before 
we venture into larger ones.




[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2014-03-17 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938498#comment-13938498
 ] 

Mark Miller commented on SOLR-5872:
---

{noformat}
People suggest new changes to the system when /where they think it is required. 
It is important that we counter suggestions on their own merits/demerits.
{noformat}

Of course - and given this issue as presented, as I said I'm not fully sold on 
this yet. The other background I gave also applies to all of these issues. We 
won't just rip out tons of code and replace it just because someone has 
identified an issue and proposed a solution. The bar for this type of change 
should be high. Given the history of these changes, I'm going to have to be 
sold more than if the history was better. Each contributor is also judged on 
their merit - what have they contributed so far, what's the quality of their 
contributions, how much are they helping with test fails, etc. The more merit 
you have on SolrCloud, the more likely these large scale refactorings will 
receive support.

{noformat}
As developers, we need to guide/handhold the users who are pushing the envelope 
. At some point when we develop enough confidence we can integrate it into the 
product itself .
{noformat}

We don't necessarily need to hold hands - we need to take that on a case by 
case basis. We need to walk before we can run. We should probably jog before we 
run as well.

It's an issue by issue thing though, and Jessica has already begun providing a 
case for looking at this.

bq. At some point when we develop enough confidence we can integrate it into 
the product itself .

I've heard that before and I'm not totally sold that's how things will play 
out. Certainly they would not play out that way without some push back on these 
issues IMO.




[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2014-03-17 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938539#comment-13938539
 ] 

Mark Miller commented on SOLR-5872:
---

bq. With the overseer queues, each state update is 4+ zookeeper writes

Given the numbers I've seen published for ZK performance, it seems like that 
should not be a big deal in typical cases?

bq. Empirically, we have definitely seen the workqueue back up with lots of 
items during a node bounce

I'm not surprised - most of this code has not been optimized or investigated 
thoroughly. The original author of a lot of the Overseer code has moved on and 
it likely has not seen as much attention as would be nice over the past year. 
Until someone looks into the current issues closely though, it seems hard to 
recommend rewriting this whole very important piece.

bq. If batching really is so important, there's no batching for external 
collection state updates.

I'm not really fully up on external collections but AFAIK it's part of some 
other work to support tons of collections that I'm not fully sold on yet either 
:)

bq. In a normal rolling bounce where instances are restarted one-by-one, in 
the same order each time, the Overseer is killed at each instance restart, thus 
hindering the recovery process by gating state transition.

This points out another issue that we might be able to address.

Without having looked closely at the issues brought up (and I don't see 
evidence anyone else has either), it's hard to draw the conclusion the whole 
thing just has to be replaced yet.

A couple issues around the old implementation:

* With every node updating the whole cluster state on state change, the 
clusterstate.json file is read far too much. The workaround you guys are 
proposing for that appears to be only having clients update the clusterstate 
when they run into an error - but I'm not sold that that is the best 
architecture for the future either. That's a complicated change to make, with 
many ramifications for future development.

* Some things that are in the clusterstate now and that could be in the future 
are not so easily handled with the non overseer strategy - like marking who is 
the leader. You have to have the Overseer running its own special thread to 
inject and remove information.

* As things are, on something like cluster startup, there will be tons of reads 
and writes of the clusterstate.json - a flood of attempts and retries to update 
it in ZooKeeper.

For further discussion around the change, there should be background if you 
search the archives.

There is a strong argument to be made that we should first investigate the 
performance issues with the current strategy. ZooKeeper is pretty fast - these 
state updates are tiny and batched. It seems like we should be able to do a lot 
better without throwing out code that has been getting hardened for a long time 
now.






[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2014-03-17 Thread Jessica Cheng (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938670#comment-13938670
 ] 

Jessica Cheng commented on SOLR-5872:
-

{quote}For further discussion around the change, there should be background if 
you search the archives.{quote}
If you wouldn't mind terribly, would you please paste links to a few relevant 
threads in the archive? (Sorry, I'm not familiar with all the keywords and 
archives, etc., yet.)

{quote}There is a strong argument to be made that we should first investigate 
the performance issues with the current strategy. ZooKeeper is pretty fast - 
these state updates are tiny and batched. It seems like we should be able to do 
a lot better without throwing out code that has been getting hardened for a 
long time now.{quote}
I see where your hesitation is now, and I can definitely agree. Sounds like 
there are a few points to be investigated for the current system before we 
attempt to change anything:

- Why is the Overseer so slow at updating cluster state? What's causing the 
build-up of queue messages during a restart?
- What can we do to generally solve the problem of the Overseer being killed on 
every instance restart in a rolling bounce?
- How much is actually batched?

My gut is that for external collections, batching won't be of that much benefit 
(except for that super-large collection case that Yonik mentioned), but I agree 
that if the current system can be hardened to work even for those, then the 
simplicity of one code path should be preferred over ultra-optimizing for a 
non-issue (assuming the first two points above can be fixed).




[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2014-03-17 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938687#comment-13938687
 ] 

Mark Miller commented on SOLR-5872:
---

{noformat}
Some things that are in the clusterstate now and that could be in the future 
are not so easily handled with the non overseer strategy - like marking who is 
the leader. You have to have the Overseer running its own special thread to 
inject and remove information.
{noformat}

To expand on this one a bit - you can obviously have each node essentially do 
what the overseer does now. To know the true shard leader, though, that means 
things like going to ZooKeeper. So for a large cluster, each node takes on 
all the duties of the overseer, every node is now hitting ZooKeeper for this 
and that, and each node is trying to update the clusterstate.json at the same 
time and retrying - and you have this contentious herd piling onto this one 
ZooKeeper node.
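
A toy model of that herd effect, as a sketch only: it assumes the worst case where all writers are perfectly in step, so in every round each pending writer reads the same version, exactly one versioned write succeeds, and everyone else has to re-read and retry. HerdSimulation is a hypothetical name and nothing here talks to ZooKeeper.

```java
class HerdSimulation {
    // Counts versioned-write attempts when `writers` nodes all try to update
    // one shared znode at the same time, in lock-step rounds.
    static long totalWriteAttempts(int writers) {
        long attempts = 0;
        int version = 0;
        int pending = writers;
        while (pending > 0) {
            int versionSeenByAll = version;       // every pending writer reads
            for (int i = 0; i < pending; i++) {
                attempts++;                       // every pending writer tries
                if (versionSeenByAll == version) {
                    version++;                    // only the first write lands
                }
            }
            pending--;                            // one writer is done per round
        }
        return attempts;
    }
}
```

Under these assumptions the total is n(n+1)/2 attempts for n writers, so 100 nodes changing state together would make 5050 write attempts (plus as many reads) to land 100 updates.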

The Overseer was seen as a fairly elegant way to avoid this herd effect and 
provide a less chatty solution. Rather than all the retries and reading the 
state on every state change, everyone writes to a non contentious zk node, the 
Overseer batches up the info and writes out the state.
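
The queue-plus-single-writer idea in the paragraph above might be sketched like this. It is a single-threaded, in-memory illustration with hypothetical names (BatchingOverseerSketch, StateMessage), not the actual Overseer code; the point is only that many enqueued messages collapse into one write of the shared state.

```java
import java.util.ArrayDeque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Queue;

class BatchingOverseerSketch {
    // A queued message meaning "core X is now in state Y".
    static final class StateMessage {
        final String core;
        final String state;

        StateMessage(String core, String state) {
            this.core = core;
            this.state = state;
        }
    }

    private final Queue<StateMessage> queue = new ArrayDeque<>();
    private final Map<String, String> clusterState = new LinkedHashMap<>();
    private int stateWrites = 0;

    // Nodes only append to the queue; they never read or rewrite shared state.
    void offer(String core, String state) {
        queue.add(new StateMessage(core, state));
    }

    // The single Overseer drains everything queued so far, applies it in
    // order, and counts one write of the resulting state for the whole batch.
    int processBatch() {
        int applied = 0;
        StateMessage m;
        while ((m = queue.poll()) != null) {
            clusterState.put(m.core, m.state);
            applied++;
        }
        if (applied > 0) {
            stateWrites++;
        }
        return applied;
    }

    int stateWrites() {
        return stateWrites;
    }

    String stateOf(String core) {
        return clusterState.get(core);
    }
}
```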

Now if we cannot make it fast enough because of fundamental limitations, that 
is one thing. But gosh, on the surface, these state updates are so small and ZK 
is fairly performant...

We should identify the bottlenecks.

For startup, one random idea is to look at using zk's multi call support to 
read the whole queue in one request and then batch it all.

I've got some other common sense ideas as well, but will have to find out the 
choke points before it makes a lot of sense brainstorming solutions.




[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2014-03-17 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938710#comment-13938710
 ] 

Noble Paul commented on SOLR-5872:
--

bq. There is a strong argument to be made that we should first investigate the 
performance issues with the current strategy.

The current implementation of DistributedQueue.peek() is extremely expensive. 
It reads all the children, sorts them, returns one item from the head, and 
discards all the others. There could be a new method, DistributedQueue.peek(n), 
where n is the number of items, so that the Overseer can process them all in 
one batch.
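
A rough sketch of what such a peek(n) could look like, working on a plain list of child names rather than a live ZooKeeper connection. QueuePeekSketch is a hypothetical helper, and the sketch assumes child names end in a zero-padded sequence number so that lexicographic order is queue order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class QueuePeekSketch {
    // Returns up to n child names from the head of the queue, in queue order,
    // paying for the list-and-sort once instead of once per item.
    static List<String> peek(List<String> children, int n) {
        List<String> sorted = new ArrayList<>(children); // leave input untouched
        Collections.sort(sorted);
        int count = Math.max(0, Math.min(n, sorted.size()));
        return new ArrayList<>(sorted.subList(0, count));
    }
}
```

The Overseer would then process the returned items as one batch and only list the children again once the batch is exhausted.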






[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2014-03-17 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938717#comment-13938717
 ] 

Noble Paul commented on SOLR-5872:
--

bq. I'm not really fully up on external collections but AFAIK it's part of 
some other work to support tons of collections

More and more information is getting added to the cluster state. I'm 
sure no one would object to the point that splitting the clusterstate.json 
would be a more scalable solution and the right direction to take. Of course, 
this is not to be done in haste, but eventually that should be the way. The 
eventual goal should be to support a very large number of collections (say 
1000s) and extremely large collections (with 1000s of slices). Solr itself 
will not have any problem scaling like that, but the Overseer/clusterstate 
strategy will go through a revamp before we get there.




[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2014-03-17 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938751#comment-13938751
 ] 

Mark Miller commented on SOLR-5872:
---

If it's the issue about breaking up clusterstate.json per collection, I don't 
necessarily think that's a bad idea. I didn't realize that would make it 
something called an external collection though.




[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2014-03-17 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938755#comment-13938755
 ] 

Noble Paul commented on SOLR-5872:
--

"External collection" is just a name. It really does not matter what we 
call them. The idea is to split the same data out into smaller state nodes. 




[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2014-03-17 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938775#comment-13938775
 ] 

Mark Miller commented on SOLR-5872:
---

It's obviously just a name :) I didn't know that it existed - that's all I was 
saying - I figured it meant something else. To me it doesn't make much sense. I 
think if we decide to split out the clusterstate.json per collection, that is 
the direction we should take, we should only support one clusterstate.json for 
back compat at most, and no such special name should exist. Solr 5.0 would no 
longer support the single clusterstate.json. Or, we might even decide to have 
the Overseer upgrade the format for you or something before 5.0.

Other thoughts on Overseer performance:

* Because only one process should be reading and removing items from the 
distributed queue at a time, seems like there are many cases we could read 
multiple nodes in one call.

* Perhaps 1500ms is not a great batch time - would be interesting if we made it 
configurable as well.

* Seems there might be a lot of room for parallelism - we probably only need to 
order within a collection if not simply per SolrCore. 
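
As a sketch of the last point: if ordering only matters within a collection, a globally ordered queue could be split into one ordered list per collection, and those lists could then be handled independently. The class and field names below are hypothetical and the sketch is purely in-memory.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PerCollectionPartitioner {
    // One queued state update, tagged with the collection it belongs to.
    static final class Update {
        final String collection;
        final String payload;

        Update(String collection, String payload) {
            this.collection = collection;
            this.payload = payload;
        }
    }

    // Splits a globally ordered list into one list per collection, keeping
    // the relative order of updates inside each collection.
    static Map<String, List<String>> partition(List<Update> ordered) {
        Map<String, List<String>> byCollection = new LinkedHashMap<>();
        for (Update u : ordered) {
            byCollection
                .computeIfAbsent(u.collection, k -> new ArrayList<>())
                .add(u.payload);
        }
        return byCollection;
    }
}
```

Each per-collection list could then be given to its own worker without reordering anything that needs to stay ordered.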




[jira] [Commented] (SOLR-5872) Eliminate overseer queue

2014-03-17 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938796#comment-13938796
 ] 

Noble Paul commented on SOLR-5872:
--

bq. I think if we decide to split out the clusterstate.json per collection, 
that is the direction we should take

Yes, that is the plan

We would probably switch to that from 5.0 or so. But the challenge is to 
offer a smoother migration path:
* Initially, users would be able to switch to that mode when creating a 
collection (opt-in); SOLR-5473 does that
* Offer an API to migrate to the new format (SOLR-5756)
* Make it the default format (from, say, 5.0)
* Deprecate the old format


