Thanks Camille for the reply and for the article.
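One thing your answer clarified for me is that both the failover and the
ephemeral re-registration have to live in the client, since the clusters
are completely independent. For anyone who finds this thread later, here
is a rough, untested sketch of what I think that looks like with the
stock Java client. The FailoverRegistrar class, the connect strings, and
the znode path are all made up for illustration -- this is not code from
Camille's actual setup.

import java.io.IOException;
import java.util.List;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

/** Sketch: register an ephemeral znode, hopping clusters on session loss. */
public class FailoverRegistrar implements Watcher {

    // One connect string per independent cluster (ABC, DEF, GHI).
    private final List<String> clusterConnectStrings;
    private final String servicePath;  // e.g. "/services/my-service/instance-1"
    private final byte[] serviceData;

    private volatile ZooKeeper zk;
    private volatile int activeCluster = 0;

    public FailoverRegistrar(List<String> connectStrings,
                             String servicePath, byte[] data) {
        this.clusterConnectStrings = connectStrings;
        this.servicePath = servicePath;
        this.serviceData = data;
    }

    /** Connect to the active cluster and (re-)create the ephemeral node. */
    public synchronized void connectAndRegister()
            throws IOException, KeeperException, InterruptedException {
        zk = new ZooKeeper(clusterConnectStrings.get(activeCluster),
                           15000, this);
        // A real client would wait for SyncConnected before creating;
        // skipped here for brevity. Ephemeral nodes live only in the
        // cluster that owns the session, so registration must be
        // repeated after every failover.
        zk.create(servicePath, serviceData,
                  ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
    }

    @Override
    public void process(WatchedEvent event) {
        if (event.getState() == Event.KeeperState.Expired) {
            // Session gone: rotate to the next cluster and re-register.
            activeCluster = (activeCluster + 1) % clusterConnectStrings.size();
            try {
                connectAndRegister();
            } catch (Exception e) {
                // Real code would need retry/backoff here.
            }
        }
    }
}

If I read the client semantics right, one caveat is that Expired is only
delivered after the client manages to reconnect to the same ensemble, so
if an entire cluster is unreachable this sketch would also need its own
timeout on the Disconnected state to trigger the switch.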
On Wed, Mar 9, 2016 at 5:16 AM, Camille Fournier <[email protected]> wrote:

> If you're referring to my setup, I explicitly don't keep the data in sync
> across separate zk deployments. The logic for handling lookup to a
> different zk is in the client. Trying to keep data in sync across multiple
> deployments of zk is probably not a great plan, but I'm sure you can think
> of workarounds that will possibly address your issue if it is really
> important.
>
> On Mar 8, 2016 8:19 PM, "s influxdb" <[email protected]> wrote:
>
> > I am referring to a setup that has different clusters, for example
> > 3 zk clusters:
> >
> > cluster ABC: DC1 { node 1, node 2 }  DC2 { node 3, node 4 }  DC3 { node 5 }
> > cluster DEF: DC2 { node 6, node 7 }  DC1 { node 8, node 9 }  DC3 { node 10 }
> > cluster GHI: DC3 { node 11, node 12 }  DC2 { node 13, node 14 }  DC1 { node 15 }
> >
> > This survives any single DC being unavailable.
> >
> > My question was: how is the data kept in sync among the 3 different zk
> > clusters, for example between cluster ABC and DEF? And how does the
> > client fail over to DEF when ABC is unavailable?
> >
> > On Tue, Mar 8, 2016 at 4:10 PM, Shawn Heisey <[email protected]> wrote:
> >
> > > On 3/8/2016 3:40 PM, s influxdb wrote:
> > > > How does the client fail over to DC2 if DC1 is down? Do the
> > > > services registered on DC1, for example with ephemeral nodes, have
> > > > to re-register with DC2?
> > >
> > > Even though Flavio and Camille have both said this, I'm not sure
> > > whether the posters on this thread are hearing it:
> > >
> > > If you only have two datacenters, you cannot set up a reliable
> > > zookeeper ensemble. It's simply not possible. There are NO
> > > combinations of servers that will achieve fault tolerance with only
> > > two datacenters.
> > >
> > > The reason this won't work is the same reason that you cannot set up
> > > a reliable ensemble with only two servers. If either data center goes
> > > down, half of your ZK nodes will be gone, and neither data center
> > > will have enough nodes to achieve quorum.
> > >
> > > When you have three datacenters that are all capable of directly
> > > reaching each other, you only need one ZK node in each location. If
> > > any single DC goes down, the other two will be able to keep the
> > > ensemble running.
> > >
> > > Data is replicated among the DCs in exactly the same way that it is
> > > if all the servers are in one place. I don't know enough about
> > > internal ZK operation to comment further.
> > >
> > > =============
> > >
> > > Some TL;DR information to follow:
> > >
> > > If you want to be able to take a node down for maintenance in a
> > > multi-DC situation and *still* survive an entire DC going down, you
> > > need three nodes in each of three data centers -- nine total. This
> > > ensemble is able to survive any four servers going down, so you can
> > > take down a node in one DC for maintenance, and if one of the other
> > > DCs fails entirely, there will be five functioning servers that can
> > > maintain quorum.
> > >
> > > Detailed information for the specific situation outlined by Kaushal:
> > >
> > > DC-1: 1 Leader, 2 Followers
> > > DC-2: 1 Follower, 2 Observers
> > >
> > > A six-node ensemble requires at least four operational nodes to
> > > maintain quorum. If either of those data centers fails, there are
> > > only three nodes left, which is not enough.
> > >
> > > Thanks,
> > > Shawn

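P.S. Writing out Shawn's quorum arithmetic as runnable code helped me
check his numbers. This is just majority math, nothing ZK-specific, and
the QuorumMath class name is mine:

/** Majority-quorum math for the ensemble sizes discussed above. */
public class QuorumMath {
    public static void main(String[] args) {
        int[] ensembleSizes = {3, 5, 6, 9};
        for (int n : ensembleSizes) {
            int quorum = n / 2 + 1;      // strict majority of voters
            int tolerated = n - quorum;  // failures the ensemble survives
            System.out.printf("ensemble=%d quorum=%d tolerates=%d%n",
                              n, quorum, tolerated);
        }
    }
}

That prints quorum=4 / tolerates=2 for six nodes and quorum=5 /
tolerates=4 for nine, which matches what Shawn said: a six-node ensemble
cannot lose a three-node DC, while nine nodes split across three DCs can
lose an entire DC plus one more node taken down for maintenance.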