> In my setup, once DC1 comes back up, make sure you start only two nodes.

And if you start all three in DC1, you have 3+3 voting; what would then
happen? Any chance of state corruption? I believe my solution isolates the
manual change to two ZK nodes in DC2, while yours requires a config change to
one node in DC2 and a manual start/stop of one node in DC1. (For reference, a
rough zoo.cfg sketch of the layout we are discussing is at the bottom of this
mail.)

> Add another server in either DC1 or DC2, in a separate rack, with independent
> power etc. As Shawn says below, install the third ZK there. You would satisfy
> most of your requirements that way.

Well, that's not up to me to decide; it's the customer environment that sets
the constraints, and they currently have two independent geo locations. And
Solr is just a dependency of some other app they need to install, so I doubt
they are keen to start adding racks or independent power/network for this
alone. Of course, if they already have such redundancy within one of the DCs,
placing a 3rd ZK there is an ideal solution with probably good enough HA. If
not, I'm looking for the second-best, low-friction, software-only approach.

Thanks for the input all!

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> On 26 May 2017 at 21:27, Pushkar Raste <pushkar.ra...@gmail.com> wrote:
>
> In my setup, once DC1 comes back up, make sure you start only two nodes.
> Now bring down the original observer and make it an observer again.
> Bring back the third node.
>
> It seems like a lot of work compared to Jan's setup, but you get 5 voting
> members instead of 3 in the normal situation.
>
> On May 26, 2017 10:56 AM, "Susheel Kumar" <susheel2...@gmail.com> wrote:
>
>> Thanks, Pushkar, that makes sense. I am trying to understand the
>> difference between your setup and Jan's proposed setup.
>>
>> - Seems like when DC1 goes down, in your setup we have to bounce *one*
>> node from observer to non-observer, while in Jan's setup we bounce *two*
>> observers to non-observers. Anything else I am missing?
>>
>> - When DC1 comes back, with your setup we need to bounce the one
>> non-observer back to observer to keep a 5-node quorum, otherwise there
>> are 3 + 3 voting members; while with Jan's setup, if we don't take any
>> action when DC1 comes back, we are still operational with a 5-node
>> quorum. Isn't it? Or am I missing something?
>>
>> On Fri, May 26, 2017 at 10:07 AM, Pushkar Raste <pushkar.ra...@gmail.com>
>> wrote:
>>
>>> Damn,
>>> Math is hard
>>>
>>> DC1 : 3 non observers
>>> DC2 : 2 non observers
>>>
>>> 3 + 2 = 5 non observers
>>>
>>> Observers don't participate in voting = non observers participate in
>>> voting
>>>
>>> 5 non observers = 5 votes
>>>
>>> In addition to the 2 non observers, DC2 also has an observer, which as
>>> you pointed out does not participate in the voting.
>>>
>>> We still have 5 voting nodes.
>>>
>>> Think of the observer as a standby name node in Hadoop 1.x, where some
>>> intervention is needed if the primary name node goes down.
>>>
>>> I hope my math makes sense.
>>>
>>> On May 26, 2017 9:04 AM, "Susheel Kumar" <susheel2...@gmail.com> wrote:
>>>
>>> From the ZK documentation, observers do not participate in votes, so
>>> Pushkar, when you said 5 nodes participate in voting, what exactly did
>>> you mean?
>>>
>>> -- Observers are non-voting members of an ensemble which only hear the
>>> results of votes, not the agreement protocol that leads up to them.
>>>
>>> Per the ZK documentation, 3.4 includes observers; does that mean Jan's
>>> thought experiment is practically possible, correct?
>>>
>>> On Fri, May 26, 2017 at 3:53 AM, Rick Leir <rl...@leirtech.com> wrote:
>>>
>>>> Jan, Shawn, Susheel
>>>>
>>>> First steps first.
>>>> First, let's do a fault-tolerant cluster, then maybe a
>>>> _geographically_ fault-tolerant cluster.
>>>>
>>>> Add another server in either DC1 or DC2, in a separate rack, with
>>>> independent power etc. As Shawn says below, install the third ZK there.
>>>> You would satisfy most of your requirements that way.
>>>>
>>>> cheers -- Rick
>>>>
>>>> On 2017-05-23 12:56 PM, Shawn Heisey wrote:
>>>>
>>>>> On 5/23/2017 10:12 AM, Susheel Kumar wrote:
>>>>>
>>>>>> Hi Jan, FYI - Since last year I have been running a Solr 6.0 cluster
>>>>>> in one of our lower environments with 6 shards/replicas in dc1 and 6
>>>>>> shards/replicas in dc2 (each shard replicated across data centers),
>>>>>> with 3 ZK in dc1 and 2 ZK in dc2. (I didn't have a 3rd data center
>>>>>> available for ZK, so I went with only 2 data centers in the above
>>>>>> configuration.) So far no issues: it has been running fine, indexing,
>>>>>> replicating data, serving queries, etc. So in my test, setting up a
>>>>>> single cluster across two zones/data centers works without any issue
>>>>>> when there is no or very minimal latency (in my case around 30 ms one
>>>>>> way).
>>>>>
>>>>> With that setup, if dc2 goes down, you're all good, but if dc1 goes
>>>>> down, you're not.
>>>>>
>>>>> There aren't enough ZK servers in dc2 to maintain quorum when dc1 is
>>>>> unreachable, and SolrCloud is going to go read-only. Queries would
>>>>> most likely work, but you would not be able to change the indexes at
>>>>> all.
>>>>>
>>>>> ZooKeeper with N total servers requires int((N/2)+1) servers to be
>>>>> operational to maintain quorum. This means that with five total
>>>>> servers, three must be operational and able to talk to each other, or
>>>>> ZK cannot guarantee that there is no split-brain, so quorum is lost.
>>>>>
>>>>> ZK in two data centers will never be fully fault-tolerant. There is
>>>>> no combination of servers that will work properly. You must have
>>>>> three data centers for a geographically fault-tolerant cluster. Solr
>>>>> would be optional in the third data center. ZK must be installed in
>>>>> all three.
>>>>>
>>>>> Thanks,
>>>>> Shawn
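
To make the layout we are discussing concrete, here is a minimal sketch of
the zoo.cfg membership lines for the 3 + 2 + 1 variant Pushkar describes.
Hostnames are just placeholders, and the usual dataDir/clientPort/tickTime
settings are left out:

  # Same server list in zoo.cfg on every node:
  # three voters in DC1, two voters plus one observer in DC2.
  server.1=zk1.dc1.example.com:2888:3888
  server.2=zk2.dc1.example.com:2888:3888
  server.3=zk3.dc1.example.com:2888:3888
  server.4=zk4.dc2.example.com:2888:3888
  server.5=zk5.dc2.example.com:2888:3888
  server.6=zk6.dc2.example.com:2888:3888:observer

  # In addition, in zoo.cfg on zk6 only:
  peerType=observer

That gives 5 voting members, so per Shawn's formula the quorum is
int(5/2)+1 = 3, and losing DC1's three voters stalls writes until someone
intervenes. Since ZK 3.4 has no dynamic reconfiguration, "bouncing" a node
between observer and voter means editing these lines and restarting the
affected nodes by hand, which is exactly the manual step we are trying to
minimize.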