[jira] [Commented] (SOLR-14927) Remove Overseer

David Smiley (Jira) Wed, 09 Nov 2022 18:49:04 -0800


    [ https://issues.apache.org/jira/browse/SOLR-14927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17631404#comment-17631404 ]


David Smiley commented on SOLR-14927:
-------------------------------------

An observation -- these new settings are remarkably under-documented; users 
won't know about them.  I had to read the code to figure out how to set them 
and their inter-relationship.  On the bright side, we can change the settings 
with less care for backwards compatibility :-)

I think a next step here is to separate out the ConfigSet aspects so that we 
can sunset the Overseer's processing of them -- SOLR-15157.  I wish I had 
reviewed more closely at the time so that this could have been done for 9.0.  
I see no reason why ConfigSets should be centrally processed only on the 
Overseer.  For that matter, there is no need for the facilities in 
{{DistributedCollectionConfigSetCommandRunner}} either, so I created 
SOLR-16543 as a follow-on for that.

> Remove Overseer
> ---------------
>
>                 Key: SOLR-14927
>                 URL: https://issues.apache.org/jira/browse/SOLR-14927
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrCloud
>            Reporter: Ilan Ginzburg
>            Assignee: Ilan Ginzburg
>            Priority: Major
>              Labels: cluster, collection-api, overseer, solrcloud, zookeeper
>
> This Jira is intended to capture sub jiras on the path to remove the Overseer 
> component from SolrCloud and move to all nodes being able to do the work 
> currently done by Overseer.
> See detailed description in [this 
> doc|https://docs.google.com/document/d/1u4QHsIHuIxlglIW6hekYlXGNOP0HjLGVX5N6inkj6Ok/].
> Copying (edited) from the above doc:
> The motivations for removing Overseer include:
>  * Mono threaded state change is slow and doesn’t scale,
>  * Communication between cluster nodes and the Overseer uses ZooKeeper as a 
> queueing mechanism, which is not a good idea,
>  * Nodes talking to Overseer (then Overseer talking to itself) via ZooKeeper 
> is inefficient and adds latency,
>  * Collection API scalability is poor, because not only does a single node 
> process commands for all Collections, but it also depends on the mono 
> threaded state change queue consumption,
>  * The code supporting Overseer in SolrCloud is complex (election, queue 
> management, recovery etc).
> The general idea is that there’s already a central point in the SolrCloud 
> cluster and it’s ZooKeeper. It might not be necessary to have a second 
> central point (Overseer) because nodes can interact directly with ZooKeeper 
> and synchronize more efficiently by optimistic locking using “conditional 
> updates” (a.k.a. compare and swap or CAS).
>  
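
For readers unfamiliar with the "conditional update" idea in the quoted 
description, here is a minimal sketch of an optimistic read-modify-write loop. 
It is not Solr code and does not use the ZooKeeper client: {{VersionedNode}}, 
{{setIfVersion}} and {{incrementCounter}} are hypothetical names, and the node 
is an in-memory stand-in that only imitates a versioned write (one that is 
rejected when the expected version is stale).

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class CasSketch {

    /** Hypothetical in-memory stand-in for one versioned node (not the ZooKeeper API). */
    static final class VersionedNode {
        private byte[] data;
        private int version = 0;

        VersionedNode(byte[] initial) {
            this.data = initial.clone();
        }

        /** Returns the current data together with the version it was read at. */
        synchronized Snapshot read() {
            return new Snapshot(data.clone(), version);
        }

        /** Conditional update: applied only if nobody has written since expectedVersion. */
        synchronized boolean setIfVersion(byte[] newData, int expectedVersion) {
            if (version != expectedVersion) {
                return false; // stale read, the caller must re-read and retry
            }
            data = newData.clone();
            version++;
            return true;
        }
    }

    /** Immutable pair of data and the version observed when reading it. */
    static final class Snapshot {
        final byte[] data;
        final int version;

        Snapshot(byte[] data, int version) {
            this.data = data;
            this.version = version;
        }
    }

    /** Optimistic loop: read, compute, attempt the conditional write, retry on conflict. */
    static int incrementCounter(VersionedNode node) {
        while (true) {
            Snapshot s = node.read();
            int next = Integer.parseInt(new String(s.data, StandardCharsets.UTF_8)) + 1;
            byte[] encoded = Integer.toString(next).getBytes(StandardCharsets.UTF_8);
            if (node.setIfVersion(encoded, s.version)) {
                return next;
            }
        }
    }

    /** Several threads update the same node without any central coordinator. */
    static int runConcurrently(int threads, int perThread) throws InterruptedException {
        VersionedNode node = new VersionedNode("0".getBytes(StandardCharsets.UTF_8));
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < perThread; i++) {
                    incrementCounter(node);
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.SECONDS);
        return Integer.parseInt(new String(node.read().data, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runConcurrently(4, 1000));
    }
}
```

Each writer only retries when another writer got in first, so no update is 
lost and no single node has to serialize the changes. Against a real ZooKeeper 
ensemble the same pattern would presumably rely on its versioned writes, but 
that wiring is outside this sketch.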



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

