[
https://issues.apache.org/jira/browse/SOLR-14927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17631404#comment-17631404
]
David Smiley commented on SOLR-14927:
-------------------------------------
An observation -- these new settings are remarkably under-documented; users
won't know about them. I had to read the code to figure out how to set them
and their inter-relationship. On the bright side, we can change the settings
with less care for backwards compatibility :-)
I think a next step here is to separate off the ConfigSet aspects so that we
can sunset the Overseer processing of them -- SOLR-15157. I wish I had reviewed
more closely at the time so this could have been done for 9.0. I see no reason
why ConfigSets should be centrally processed only on the Overseer. And for that
matter, no need for the facilities in
{{DistributedCollectionConfigSetCommandRunner}} either, so I created SOLR-16543
as a follow-on for that.
> Remove Overseer
> ---------------
>
> Key: SOLR-14927
> URL: https://issues.apache.org/jira/browse/SOLR-14927
> Project: Solr
> Issue Type: Improvement
> Components: SolrCloud
> Reporter: Ilan Ginzburg
> Assignee: Ilan Ginzburg
> Priority: Major
> Labels: cluster, collection-api, overseer, solrcloud, zookeeper
>
> This Jira is intended to capture sub jiras on the path to remove the Overseer
> component from SolrCloud and move to all nodes being able to do the work
> currently done by Overseer.
> See detailed description in [this
> doc|https://docs.google.com/document/d/1u4QHsIHuIxlglIW6hekYlXGNOP0HjLGVX5N6inkj6Ok/].
> Copying (edited) from the above doc:
> The motivations for removing Overseer include:
> * Mono-threaded state change is slow and doesn’t scale,
> * Communication between cluster nodes and the Overseer uses Zookeeper as a
> queueing mechanism, which is not a good idea,
> * Nodes talking to Overseer (then Overseer talking to itself) via Zookeeper
> is inefficient and adds latency,
> * Collection API scalability is poor, because not only does a single node
> process commands for all Collections, but it also depends on the
> mono-threaded state change queue consumption,
> * The code supporting Overseer in SolrCloud is complex (election, queue
> management, recovery etc.).
> The general idea is that there’s already a central point in the SolrCloud
> cluster and it’s Zookeeper. It might not be necessary to have a second
> central point (Overseer) because nodes can interact directly with Zookeeper
> and synchronize more efficiently by optimistic locking using “conditional
> updates” (a.k.a. compare-and-swap, or CAS).
>
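To make the "conditional update" idea concrete, here is a minimal sketch of an optimistic read-modify-write loop. It does not use Solr or ZooKeeper code: {{VersionedNode}}, {{Snapshot}} and {{update}} are hypothetical names, and the in-memory node only imitates the versioned-write semantics that ZooKeeper exposes through {{setData(path, data, version)}}, where a write carrying a stale version is rejected (with a {{BadVersionException}} in the real client). How the distributed mode in Solr actually structures its updates may well differ.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.IntUnaryOperator;

public class CasSketch {

    /** Value plus the version it was read at (hypothetical helper type). */
    static final class Snapshot {
        final int value;
        final int version;

        Snapshot(int value, int version) {
            this.value = value;
            this.version = version;
        }
    }

    /** Hypothetical in-memory stand-in for one versioned node holding an int. */
    static final class VersionedNode {
        private int value;
        private int version;

        synchronized Snapshot read() {
            return new Snapshot(value, version);
        }

        /** Conditional update: applied only if nobody wrote since expectedVersion. */
        synchronized boolean compareAndSet(int expectedVersion, int newValue) {
            if (expectedVersion != version) {
                return false; // stale version, the caller must re-read and retry
            }
            value = newValue;
            version++;
            return true;
        }
    }

    /** Optimistic loop: read, compute, attempt the write, retry on conflict. */
    static int update(VersionedNode node, IntUnaryOperator change) {
        while (true) {
            Snapshot seen = node.read();
            int next = change.applyAsInt(seen.value);
            if (node.compareAndSet(seen.version, next)) {
                return next;
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        VersionedNode node = new VersionedNode();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        // Four threads play the role of cluster nodes updating shared state
        // directly, with no central queue consumer serializing their writes.
        for (int i = 0; i < 1000; i++) {
            pool.execute(() -> update(node, v -> v + 1));
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println(node.read().value); // prints 1000
    }
}
```

No writer holds a lock across its read and its write; conflicting writers simply lose the race and retry, which is the property the description relies on when it suggests nodes can synchronize through Zookeeper without a second central point.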