Flavio/Patrick/Mahadev -

Thanks for your support to date. As I understand it, the sticking points
w/ respect to WAN deployments are:

1. Leader Election: 

Leader election in the WAN config (pod ZK server weight = 0) is a bit
troublesome (ZOOKEEPER-498).
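
For reference, here's roughly what I mean by that config, using the
hierarchical quorum group/weight options from zoo.cfg (hostnames and
server ids are made up):

server.1=dc-zk1:2888:3888
server.2=dc-zk2:2888:3888
server.3=dc-zk3:2888:3888
server.4=pod-zk1:2888:3888
group.1=1:2:3
group.2=4
weight.1=1
weight.2=1
weight.3=1
# the pod server carries no voting weight, so it never decides elections
weight.4=0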

2. Network Connectivity Required: 

ZooKeeper clients cannot read from or write to a ZK server if that
server does not have network connectivity to the quorum. In short, there
is a hard requirement on network connectivity in order for the clients
to access the shared memory graph in ZK.
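
To make the failure mode concrete, here's a minimal sketch of what a
client sees when its server is partitioned from the quorum: even a plain
read fails rather than serving local data. The host and path are
illustrative:

import org.apache.zookeeper.*;

public class DisconnectedRead {
    public static void main(String[] args) throws Exception {
        // Connect to the pod-local server (host is illustrative).
        ZooKeeper zk = new ZooKeeper("pod-zk1:2181", 30000, new Watcher() {
            public void process(WatchedEvent event) { }
        });
        try {
            // If pod-zk1 is cut off from the quorum, this read does not
            // serve stale local data; the call fails instead.
            byte[] data = zk.getData("/root/pods/1/data/item1", false, null);
            System.out.println(new String(data));
        } catch (KeeperException.ConnectionLossException e) {
            // No reads or writes until the server rejoins the quorum.
            System.err.println("no quorum connectivity: " + e);
        } finally {
            zk.close();
        }
    }
}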

Alternative
-----------

I have seen some discussion in the past re: multi-ensemble solutions.
Essentially, put one ensemble in each physical location (POD) and
another in your DC, and have a fairly simple process synchronize the
various ensembles. If the POD writes can be confined to a sub-tree in
the master graph, then this should be fairly simple. I'm imagining the
following:

DC (master) graph:
/root/pods/1/data/item1
/root/pods/1/data/item2
/root/pods/1/data/item3
/root/pods/2
/root/pods/3
...etc
/root/shared/allpods/readonly/data/item1
/root/shared/allpods/readonly/data/item2
...etc

This has the advantage of minimizing cross-pod traffic, which could be a
real perf killer in a WAN. It also provides transacted writes in the
PODs, even in the disconnected state. Clearly, another portion of the
business logic has to reconcile the DC (master) graph such that each of
the pods' data items is processed, etc.
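
As a strawman, here's a rough sketch of what that synchronizing process
might look like: a one-way copy of a pod's local sub-tree into its slot
in the master graph. The paths, the pod id, the single-level sub-tree,
and the lack of delete handling are all illustrative assumptions, not a
worked implementation:

import org.apache.zookeeper.*;
import org.apache.zookeeper.data.Stat;

public class PodSync {

    // One-way sync of a pod's local sub-tree into the DC (master)
    // ensemble. Assumes /root/pods/<id>/data already exists in the
    // master graph and the sub-tree is one level deep.
    public static void syncPod(ZooKeeper podZk, ZooKeeper dcZk, int podId)
            throws KeeperException, InterruptedException {
        String src = "/data";                          // pod-local sub-tree
        String dst = "/root/pods/" + podId + "/data";  // slot in master graph

        for (String child : podZk.getChildren(src, false)) {
            byte[] bytes = podZk.getData(src + "/" + child, false, null);
            String target = dst + "/" + child;

            Stat stat = dcZk.exists(target, false);
            if (stat == null) {
                // New item: create it in the master graph.
                dcZk.create(target, bytes, ZooDefs.Ids.OPEN_ACL_UNSAFE,
                            CreateMode.PERSISTENT);
            } else {
                // Existing item: overwrite, using the version as a guard
                // against concurrent writers.
                dcZk.setData(target, bytes, stat.getVersion());
            }
        }
    }
}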

Does anyone have any experience with this (pitfalls, suggestions, etc.)?

-Todd
