[ 
https://issues.apache.org/jira/browse/SOLR-14927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229252#comment-17229252
 ] 

Mark Robert Miller edited comment on SOLR-14927 at 11/10/20, 2:31 PM:
----------------------------------------------------------------------

It’s the bad implementation that limits the Overseer (due to tech debt and a 
variety of other reasons), not the design. You are right that ZooKeeper already 
owns the state, and that is why our Overseer is so silly. 

The solution is not to embrace ZooKeeper more; that is actually the 
non-scalable solution and the root of a lot of our base instability. 

The Overseer actually has the advantage for state updates; the CAS approach 
with ZooKeeper as the state owner is the non-scalable one. If you compare 
against our implementation, anything beats it; if you compare the designs, CAS 
updates of state.json are a few steps back. 

The disadvantage of this approach is actually what’s stated as its advantage: 
with ZooKeeper owning the state, state updates and cluster information 
distribution are not scalable.

This approach will not compete well with the Overseer approach in multiple 
areas, including cluster scalability. I have a branch you can compare against 
once some code is ready. This approach will have a hard time making it. 


was (Author: markrmiller):
It’s the bad impl that limits the overseer (due to tech debt and variety of 
reasons), not the design. You are right that the zookeeper already owns the 
state, and that is why our overseer is so silly. The solution is not to embrace 
zk more, that’s actually the non scalable solution. The Overseer actually has 
the advantage for state updates, the cas approach with zk as the state owner is 
actually the non scalable approach. The disadvantage is actually what’s stated 
as the advantage. Zookeeper owning the state, state updates are not scalable.

The approach will not compete well with an overseer approach in multiple areas, 
including cluster scalability. 

> Remove Overseer
> ---------------
>
>                 Key: SOLR-14927
>                 URL: https://issues.apache.org/jira/browse/SOLR-14927
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrCloud
>            Reporter: Ilan Ginzburg
>            Assignee: Ilan Ginzburg
>            Priority: Major
>              Labels: cluster, collection-api, overseer, solrcloud, zookeeper
>
> This Jira is intended to capture sub jiras on the path to remove the Overseer 
> component from SolrCloud and move to all nodes being able to do the work 
> currently done by Overseer.
> See detailed description in [this 
> doc|https://docs.google.com/document/d/1u4QHsIHuIxlglIW6hekYlXGNOP0HjLGVX5N6inkj6Ok/].
> Copying (edited) from the above doc:
> The motivations for removing Overseer include:
>  * Mono-threaded state change is slow and doesn’t scale,
>  * Communication between cluster nodes and the Overseer uses ZooKeeper as a 
> queueing mechanism, which is not a good idea,
>  * Nodes talking to Overseer (then Overseer talking to itself) via ZooKeeper 
> is inefficient and adds latency,
>  * Collection API scalability is poor, because not only does a single node 
> process commands for all Collections, but it also depends on the 
> mono-threaded state change queue consumption,
>  * The code supporting Overseer in SolrCloud is complex (election, queue 
> management, recovery, etc.).
> The general idea is that there’s already a central point in the SolrCloud 
> cluster, and it’s ZooKeeper. It might not be necessary to have a second 
> central point (Overseer) because nodes can interact directly with ZooKeeper 
> and synchronize more efficiently by optimistic locking using “conditional 
> updates” (a.k.a. compare-and-swap, or CAS).
>  
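
For readers unfamiliar with the "conditional update" pattern the description refers to, here is a minimal sketch in Python. It does not use ZooKeeper or Solr code: `VersionedStore`, `BadVersionError`, `cas_update`, and `run_writers` are hypothetical names, and the store is an in-memory stand-in for a single versioned znode such as state.json. The real ZooKeeper client offers a comparable conditional write through `setData(path, data, version)`, which fails with a bad-version error when the version is stale. The sketch also counts retries, since contention on a single shared object is the scalability concern raised in the comment above; it makes no claim about how either approach performs in a real cluster.

```python
import threading


class BadVersionError(Exception):
    """Raised when the caller's expected version is stale."""


class VersionedStore:
    """Hypothetical in-memory stand-in for one versioned znode."""

    def __init__(self, data):
        self._lock = threading.Lock()
        self._data = data
        self._version = 0

    def get(self):
        # Return the current data together with its version.
        with self._lock:
            return self._data, self._version

    def set(self, data, expected_version):
        # Conditional update: succeeds only if nobody wrote in between.
        with self._lock:
            if expected_version != self._version:
                raise BadVersionError(expected_version)
            self._data = data
            self._version += 1
            return self._version


def cas_update(store, transform, max_retries=1000):
    """Read, transform, conditionally write; retry on conflict.

    Returns the number of retries that were needed.
    """
    for attempt in range(max_retries):
        data, version = store.get()
        try:
            store.set(transform(data), version)
            return attempt
        except BadVersionError:
            continue  # another writer won the race; re-read and try again
    raise RuntimeError("gave up after too many conflicts")


def run_writers(n_writers, updates_each):
    """Run concurrent writers that all update the same shared state."""
    store = VersionedStore({"replicas_up": 0})
    retries = [0] * n_writers

    def worker(i):
        for _ in range(updates_each):
            retries[i] += cas_update(
                store, lambda s: {"replicas_up": s["replicas_up"] + 1}
            )

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_writers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store.get(), sum(retries)


if __name__ == "__main__":
    (state, version), total_retries = run_writers(n_writers=4, updates_each=50)
    print(state, version, total_retries)
```

No update is lost, because every write is checked against the version that was read. The cost is that each conflict forces a re-read and another write attempt, so the retry count depends on how many writers target the same object at once.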



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
