One quick comment. We do not require majority quorums in ZooKeeper, and one reason we implemented this feature was exactly to enable more flexibility in deployments with multiple data centers. Flexible quorums are not supposed to give you the ability to always have all voting replicas in a single data center, but depending on the number of data centers you're using, they could give you fewer cross-DC messages per transaction.
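For reference, flexible (hierarchical) quorums are configured in zoo.cfg with group.N and weight.N entries. The hostnames and layout below are hypothetical — a sketch of a three-data-center deployment, not a recommendation:

```
# zoo.cfg sketch (hypothetical hosts): hierarchical quorums across 3 DCs.
# With groups defined, a quorum needs a majority of groups, and within
# each group a majority of the weight assigned to its servers.
server.1=dc1-zk1:2888:3888
server.2=dc1-zk2:2888:3888
server.3=dc2-zk1:2888:3888
server.4=dc2-zk2:2888:3888
server.5=dc3-zk1:2888:3888

group.1=1:2
group.2=3:4
group.3=5

# weight.N assigns a voting weight to server N.
weight.1=1
weight.2=1
weight.3=1
weight.4=1
weight.5=1
```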
I was actually wondering whether, with the new reconfiguration feature coming up, we will be able to change the weights of servers in an online fashion.
-Flavio
On Sep 22, 2011, at 10:53 PM, Mahadev Konar wrote:
Better still, put it up on the wiki at
https://cwiki.apache.org/confluence/display/ZOOKEEPER/Index
thanks
mahadev
On Sep 22, 2011, at 1:45 PM, Vishal Kher wrote:
Hi Camille,
This is very interesting.
Can you give more info on your setup?
- Network connectivity (bandwidth and latency) between the data centers? How much of that bandwidth is available for ZK?
- What timeout values (server and client session timeouts) do you use? How much latency are the applications willing to tolerate?
We are thinking of running ZK across data centers as well, and it would be great to see how others are resolving some of these problems.
Thanks.
-Vishal
On Thu, Sep 22, 2011 at 11:03 AM, Fournier, Camille F. <
[email protected]> wrote:
We spread our ZKs across 3 data centers, and in fact these data centers are split across global regions (2 or 4 in one region, one in a remote region). To keep throughput up (and note that the throughput you have to worry about is only write throughput), we always ensure that the master is in one of the "local" data centers.
If you have a very write-heavy and write-latency-sensitive load, this might affect your performance. It won't affect reads at all, because reads are serviced from the memory of the ZK server you connect to. For a mostly read-intensive load, splitting across data centers is unlikely to cause you problems.
There is one exception: monitoring. Even across data centers in the same region, we sometimes see the zk dashboard unable to properly monitor the leader of a heavily-utilized cluster. This is due to the way the four-letter-word (4lw) connections are managed, and it is something I'm trying to fix.
If you have the machines to test, I would recommend running zk-smoketest (https://github.com/phunt/zk-smoketest) on the proposed config.
C
-----Original Message-----
From: Damu R [mailto:[email protected]]
Sent: Thursday, September 22, 2011 10:50 AM
To: [email protected]
Subject: zookeeper cluster spanning datacenters
Hi,
I would like to know the downsides of having a zookeeper cluster that spans multiple datacenters. The requirement is that a datacenter failure should not bring down the zookeeper cluster. From my understanding, it is not possible to have a hot/cold cluster kind of setup. So we are thinking of putting zk servers in 3 colos (1+1+1 or 2+2+3). One of the major drawbacks I can think of is that the throughput of the system is affected by latency. The system does not require high throughput and can accept some latency. How much effect will the latency have on the throughput of the system? What are the other downsides of spreading the cluster across datacenters?
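As a quick sanity check on the 1+1+1 and 2+2+3 layouts, here is a small sketch of the majority-quorum arithmetic (plain majority quorums, ignoring the flexible/hierarchical quorum feature):

```python
# Majority-quorum arithmetic for ZK servers spread across data centers.
# dc_sizes gives the number of servers in each data center.

def survives_dc_loss(dc_sizes):
    """Return True if losing any single data center still leaves a
    majority (floor(n/2) + 1) of all voting servers."""
    total = sum(dc_sizes)
    majority = total // 2 + 1
    return all(total - lost >= majority for lost in dc_sizes)

# The layouts from the question: both tolerate one DC failure.
print(survives_dc_loss((1, 1, 1)))  # True: 2 of 3 remain, majority is 2
print(survives_dc_loss((2, 2, 3)))  # True: at worst 4 of 7 remain, majority is 4

# A two-DC split cannot survive losing the larger (or equal) half.
print(survives_dc_loss((2, 2)))     # False: 2 of 4 remain, majority is 3
```

This is why three colos is the usual minimum: with only two, whichever datacenter holds half or more of the ensemble is a single point of failure for the quorum.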
Regards
Damu
flavio junqueira
research scientist
[email protected]
direct +34 93-183-8828
avinguda diagonal 177, 8th floor, barcelona, 08018, es
phone (408) 349 3300 fax (408) 349 3301