Master/slave is not just two roles, but a kind of cluster. I really don’t think
“Standalone” captures the non-Cloud cluster. Nobody at Chegg would
have any idea that “standalone” meant “no Zookeeper”.

I’ve never thought that master/slave accurately described the traditional
replication model, but I can’t remember what terms I preferred because 
that was ten years ago. A master gives commands. That isn’t how Solr
masters work. It is closer to how an NRT or TLOG leader works, actually.

A Solr master just sits there and waits for other nodes to copy the index.
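
A minimal sketch of that difference, assuming nothing about Solr's actual
code: every class and method name below is hypothetical and only models
the two styles described in this thread. The pull source never tracks who
copies from it; the push leader keeps a list of followers and updates them.

```python
class PullSource:
    """Traditional model: holds an index and answers copy requests.

    It keeps no record of which nodes copy from it (hypothetical model).
    """

    def __init__(self):
        self.version = 0
        self.docs = []

    def add(self, doc):
        self.docs.append(doc)
        self.version += 1

    def fetch(self, have_version):
        """Return (version, docs) if the caller is behind, else None."""
        if have_version >= self.version:
            return None
        return self.version, list(self.docs)


class PullCopy:
    """Polls the source and copies the whole index when it is stale."""

    def __init__(self, source):
        self.source = source
        self.version = 0
        self.docs = []

    def poll(self):
        result = self.source.fetch(self.version)
        if result is None:
            return False  # already up to date
        self.version, self.docs = result
        return True


class PushFollower:
    """Receives updates that the leader sends to it."""

    def __init__(self):
        self.docs = []


class PushLeader:
    """Cloud-style model: knows each follower and forwards every update."""

    def __init__(self):
        self.docs = []
        self.followers = []

    def register(self, follower):
        self.followers.append(follower)

    def add(self, doc):
        self.docs.append(doc)
        for follower in self.followers:
            follower.docs.append(doc)
```

In this model, a PullCopy sees nothing new until it calls poll(), while a
PushFollower sees each document as soon as the leader's add() returns.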

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Jun 17, 2020, at 3:03 PM, Trey Grainger <solrt...@gmail.com> wrote:
> 
> Hi Walter,
> 
>> In Solr Cloud, the leader knows about each follower and updates them.
> Respectfully, I think you're mixing the "TYPE" of replica with the role of
> the "leader" and "follower".
> 
> In SolrCloud, only if the TYPE of a follower is NRT or TLOG does the leader
> push updates to those followers.
> 
> When the TYPE of a follower is PULL, then it does not.  In Standalone mode,
> the type of a (currently) master would be NRT, and the type of the
> (currently) slaves is always PULL.
> 
> As such, this behavior is consistent across both SolrCloud and Standalone
> mode. It is true that Standalone mode does not currently have support for
> two of the replica TYPES that SolrCloud mode does, but I maintain that
> leader vs. follower behavior is consistent here.
> 
> Trey Grainger
> Founder, Searchkernel
> https://searchkernel.com
> 
> 
> 
> On Wed, Jun 17, 2020 at 5:41 PM Walter Underwood <wun...@wunderwood.org>
> wrote:
> 
>> But they are not the same. In Solr Cloud, the leader knows about each
>> follower and updates them. In standalone, the master has no idea that
>> slaves exist until a replication request arrives.
>> 
>> In Solr Cloud, the leader is elected. In standalone, that role is fixed at
>> config load time.
>> 
>> Looking ahead in my email inbox, publisher/subscriber is an excellent
>> choice.
>> 
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/  (my blog)
>> 
>>> On Jun 17, 2020, at 2:21 PM, Trey Grainger <solrt...@gmail.com> wrote:
>>> 
>>> I guess I don't see it as polysemous, but instead simplifying.
>>> 
>>> In my proposal, the terms "leader" and "follower" would have the exact
>>> same meaning in both SolrCloud and standalone mode. The only difference
>>> would be that SolrCloud automatically manages the leaders and followers,
>>> whereas in standalone mode you have to manage them manually (as is the
>>> case with most things in SolrCloud vs. Standalone).
>>> 
>>> My view is that having an entirely different set of terminology
>>> describing the same thing is way more cognitive overhead than having
>>> consistent terminology.
>>> 
>>> Trey Grainger
>>> Founder, Searchkernel
>>> https://searchkernel.com
>>> 
>>> On Wed, Jun 17, 2020 at 4:50 PM Walter Underwood <wun...@wunderwood.org>
>>> wrote:
>>> 
>>>> I strongly disagree with using the Solr Cloud leader/follower
>>>> terminology for non-Cloud clusters. People in my company are confused
>>>> enough without using polysemous terminology.
>>>> 
>>>> “This node is the leader, but it means something different than the
>>>> leader in this other cluster.” I’m dreading that conversation.
>>>> 
>>>> I like “principal”. How about “clone” for the slave role? That suggests
>>>> that it does not accept updates and that it is loosely-coupled, only
>>>> depending on the state of the no-longer-called-master.
>>>> 
>>>> Chegg has five production Solr Cloud clusters and one production
>>>> master/slave cluster, so this is not a hypothetical for us. We have
>>>> 100+ Solr hosts in production.
>>>> 
>>>> wunder
>>>> Walter Underwood
>>>> wun...@wunderwood.org
>>>> http://observer.wunderwood.org/  (my blog)
>>>> 
>>>>> On Jun 17, 2020, at 1:36 PM, Trey Grainger <solrt...@gmail.com> wrote:
>>>>> 
>>>>> Proposal:
>>>>> "A Solr COLLECTION is composed of one or more SHARDS, which each have
>>>>> one or more REPLICAS. Each replica can have a ROLE of either:
>>>>> 1) A LEADER, which can process external updates for the shard
>>>>> 2) A FOLLOWER, which receives updates from another replica"
>>>>> 
>>>>> (Note: I prefer "role" but if others think it's too overloaded due to
>>>>> the overseer role, we could replace it with "mode" or something similar)
>>>>> -------------------------------------------
>>>>> 
>>>>> To be explicit with the above definitions:
>>>>> 1) In SolrCloud, the roles of leaders and followers can dynamically
>>>>> change based upon the status of the cluster. In standalone mode, they
>>>>> can be changed by manual intervention.
>>>>> 2) A leader does not have to have any followers (i.e. only one active
>>>>> replica)
>>>>> 3) Each shard always has one leader.
>>>>> 4) A follower can also pull updates from another follower instead of a
>>>>> leader (traditionally known as a REPEATER). A repeater is still a
>>>>> follower, but would not be considered a leader because it can't process
>>>>> external updates.
>>>>> 5) A replica cannot be both a leader and a follower.
>>>>> 
>>>>> In addition to the above roles, each replica can have a TYPE of one of:
>>>>> 1) NRT - which can serve in the role of leader or follower
>>>>> 2) TLOG - which can only serve in the role of follower
>>>>> 3) PULL - which can only serve in the role of follower
>>>>> 
>>>>> A replica's type may be changed automatically in the event that its
>>>>> role changes.
>>>>> 
>>>>> I think this terminology is consistent with the current Leader/Follower
>>>>> usage while also being able to easily accommodate a rename of the
>>>>> historical master/slave terminology without mental gymnastics or the
>>>>> introduction of more cognitive load through new terminology. I think
>>>>> adopting the Primary/Replica terminology will be incredibly confusing
>>>>> given the already specific and well established meaning of "replica"
>>>>> within Solr.
>>>>> 
>>>>> All the Best,
>>>>> 
>>>>> Trey Grainger
>>>>> Founder, Searchkernel
>>>>> https://searchkernel.com
>>>>> 
>>>>> 
>>>>> 
>>>>> On Wed, Jun 17, 2020 at 3:38 PM Anshum Gupta <ans...@anshumgupta.net>
>>>>> wrote:
>>>>> 
>>>>>> Hi everyone,
>>>>>> 
>>>>>> Moving a conversation that was happening on the PMC list to the public
>>>>>> forum. Most of the following is just me recapping the conversation
>>>>>> that has happened so far.
>>>>>> 
>>>>>> Some members of the community have been discussing getting rid of the
>>>>>> master/slave nomenclature from Solr.
>>>>>> 
>>>>>> While this may require a non-trivial effort, a general consensus so
>>>>>> far seems to be to start this process and switch over incrementally,
>>>>>> if a single change ends up being too big.
>>>>>> 
>>>>>> There have been a lot of suggestions around what the new nomenclature
>>>>>> might look like. A few people don’t want to overlap the naming here
>>>>>> with what already exists in SolrCloud, i.e. leader/follower.
>>>>>> 
>>>>>> Primary/Replica was an option that was suggested based on what other
>>>>>> vendors are moving towards, according to Wikipedia:
>>>>>> https://en.wikipedia.org/wiki/Master/slave_(technology)
>>>>>> However, there were concerns around the use of “replica”, as that
>>>>>> denotes a very specific concept in SolrCloud. Current terminology
>>>>>> clearly differentiates the traditional replication model from
>>>>>> SolrCloud, and reusing the names would make that difficult.
>>>>>> 
>>>>>> There were similar concerns around using Leader/follower.
>>>>>> 
>>>>>> Let’s continue this conversation here while making sure that we
>>>>>> converge without much bike-shedding.
>>>>>> 
>>>>>> -Anshum
>>>>>> 
