In my setup, once DC1 comes back up, make sure you start only two of its nodes.

Then bring down the node that was originally the observer and restart it as
an observer again.

Finally, bring back the third DC1 node.



It seems like a lot of work compared to Jan's setup, but you get 5 voting
members instead of 3 in the normal situation.
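
For anyone who wants to check the numbers, here is a minimal sketch of the
quorum arithmetic quoted from Shawn further down, int((N/2)+1), applied to
the 3 + 2 voting members in this setup. The function names and the dict
layout are mine, purely for illustration; nothing here talks to ZooKeeper.

```python
def quorum_size(voters: int) -> int:
    """Smallest majority of the voting members: int(N/2) + 1."""
    return voters // 2 + 1


def has_quorum(voters_up: int, voters_total: int) -> bool:
    """True if the reachable voters still form a majority."""
    return voters_up >= quorum_size(voters_total)


# Voting members per data center as described in this thread. The
# observer in DC2 is left out because observers do not vote.
voters = {"DC1": 3, "DC2": 2}
total = sum(voters.values())

# Take each data center down in turn and see whether quorum survives.
for lost in voters:
    up = total - voters[lost]
    print(f"{lost} down: {up}/{total} voters up, quorum={has_quorum(up, total)}")
```

Losing DC2 leaves 3 of 5 voters, which is still a majority. Losing DC1
leaves only 2 of 5, which is why the observer in DC2 needs manual
intervention before the ensemble can accept writes again.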

On May 26, 2017 10:56 AM, "Susheel Kumar" <susheel2...@gmail.com> wrote:

> Thanks, Pushkar. Makes sense.  Trying to understand the difference between
> your setup and Jan's proposed setup.
>
> - Seems like when DC1 goes down, in your setup we have to bounce *one* node
> from observer to non-observer, while in Jan's setup we bounce *two* observers
> to non-observers.  Anything else I am missing?
>
> - When DC1 comes back, with your setup we need to bounce the one
> non-observer back to observer to have a 5-node quorum, otherwise there are
> 3 + 3 observers, while with Jan's setup, if we don't take any action when DC1
> comes back, we are still operational with a 5-node quorum.  Is that right, or
> am I missing something?
>
>
>
> On Fri, May 26, 2017 at 10:07 AM, Pushkar Raste <pushkar.ra...@gmail.com>
> wrote:
>
> > Damn,
> > Math is hard
> >
> > DC1 : 3 non observers
> > DC2 : 2 non observers
> >
> > 3 + 2 = 5 non observers
> >
> > Observers don't participate in voting = non observers participate in
> > voting
> >
> > 5 non observers = 5 votes
> >
> > In addition to the 2 non observers, DC2 also has an observer, which as you
> > pointed out does not participate in the voting.
> >
> > We still have 5 voting nodes.
> >
> >
> > Think of the observer as a standby name node in Hadoop 1.x, where some
> > intervention was needed if the primary name node went down.
> >
> >
> > I hope my math makes sense
> >
> > On May 26, 2017 9:04 AM, "Susheel Kumar" <susheel2...@gmail.com> wrote:
> >
> > From the ZK documentation, observers do not participate in voting, so
> > Pushkar, when you said 5 nodes participate in voting, what exactly did you
> > mean?
> >
> > -- Observers are non-voting members of an ensemble which only hear the
> > results of votes, not the agreement protocol that leads up to them.
> >
> > Per the ZK documentation, 3.4 includes observers.  Does that mean Jan's
> > thought experiment is practically possible?
> >
> >
> > On Fri, May 26, 2017 at 3:53 AM, Rick Leir <rl...@leirtech.com> wrote:
> >
> > > Jan, Shawn, Susheel
> > >
> > > First steps first. First, let's do a fault-tolerant cluster, then
> > > maybe a _geographically_ fault-tolerant cluster.
> > >
> > > Add another server in either DC1 or DC2, in a separate rack, with
> > > independent power etc. As Shawn says below, install the third ZK there.
> > > You would satisfy most of your requirements that way.
> > >
> > > cheers -- Rick
> > >
> > >
> > > On 2017-05-23 12:56 PM, Shawn Heisey wrote:
> > >
> > >> On 5/23/2017 10:12 AM, Susheel Kumar wrote:
> > >>
> > >>> Hi Jan, FYI - Since last year, I have been running a Solr 6.0 cluster
> > >>> in one of the lower envs with 6 shards/replicas in dc1 & 6
> > >>> shards/replicas in dc2 (each shard replicated across data centers) with
> > >>> 3 ZK in dc1 and 2 ZK in dc2. (I didn't have a 3rd data center available
> > >>> for ZK, so went with only 2 data centers with the above configuration)
> > >>> and so far no issues. It's been running fine, indexing, replicating
> > >>> data, serving queries etc. So in my test, setting up a single cluster
> > >>> across two zones/data centers works without any issue when there is no
> > >>> or very minimal latency (in my case around 30ms one way
> > >>>
> > >>
> > >> With that setup, if dc2 goes down, you're all good, but if dc1 goes
> > >> down, you're not.
> > >>
> > >> There aren't enough ZK servers in dc2 to maintain quorum when dc1 is
> > >> unreachable, and SolrCloud is going to go read-only.  Queries would
> > >> most likely work, but you would not be able to change the indexes at all.
> > >>
> > >> ZooKeeper with N total servers requires int((N/2)+1) servers to be
> > >> operational to maintain quorum.  This means that with five total
> > >> servers, three must be operational and able to talk to each other, or
> > >> ZK cannot guarantee that there is no split-brain, so quorum is lost.
> > >>
> > >> ZK in two data centers will never be fully fault-tolerant. There is no
> > >> combination of servers that will work properly.  You must have three
> > >> data centers for a geographically fault-tolerant cluster.  Solr would be
> > >> optional in the third data center.  ZK must be installed in all three.
> > >>
> > >> Thanks,
> > >> Shawn
> > >>
> > >>
> > >
> >
>
