it is a bit confusing but initLimit is the timer that is used when a
follower connects to a leader. there may be some state transfers
involved to bring the follower up to speed so we need to be able to
allow a little extra time for the initial connection.
after that we use syncLimit to figure out how far a follower may lag
behind the leader during normal operation.
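For reference, here is how those knobs look in zoo.cfg; the values are illustrative defaults, not recommendations for this deployment:

```
# zoo.cfg (illustrative values)
tickTime=2000      # length of one tick, in ms
initLimit=10       # follower may take initLimit ticks (20 s here) to connect and sync
syncLimit=5        # follower may lag at most syncLimit ticks (10 s here) afterwards
server.1=zk1:2888:3888
server.2=zk2:2888:3888
server.3=zk3:2888:3888
```

Both limits are expressed in ticks, so raising tickTime raises them too.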
Michael Bauland wrote:
- When I connect with a client to the Zookeeper ensemble I provide the
three IP addresses of my three Zookeeper servers. Does the client then
choose one of them arbitrarily or will it always try to connect to the
first one first? I'm asking since I would like to have my
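As far as I know, the stock client does not prefer the first address: it shuffles the host list and tries servers in that random order, so load spreads across the ensemble. A minimal sketch of that behaviour (not the actual client code; the host addresses are made up):

```python
import random

# Hypothetical addresses of the three ZooKeeper servers
hosts = ["10.0.0.1:2181", "10.0.0.2:2181", "10.0.0.3:2181"]

# The client shuffles its host list once at creation and then walks it
# round-robin on reconnect, so no single server is systematically preferred.
order = hosts[:]
random.shuffle(order)
first_try = order[0]  # any of the three, with equal probability
```

So if you need clients to stick to particular servers, you have to control the list you hand them rather than rely on ordering.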
On top of Ben's description, you probably need to set initLimit to
several minutes to transfer 700MB (worst case). The value of
syncLimit, however, does not need to be that large.
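Back-of-the-envelope for that sizing, using the 700MB figure above and the 2Mb link speed mentioned elsewhere in the thread (my arithmetic, worst case, ignoring protocol overhead):

```python
import math

snapshot_mb = 700   # worst-case state transfer, from the thread
link_mbps = 2       # slowest inter-site link mentioned
tick_ms = 2000      # a common tickTime setting

transfer_s = snapshot_mb * 8 / link_mbps              # 2800 s, i.e. ~47 minutes
init_limit_ticks = math.ceil(transfer_s * 1000 / tick_ms)  # 1400 ticks
```

So over the slow link, initLimit would indeed need to be on the order of a thousand-plus ticks; over the 100Mb links the same transfer is under a minute.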
On Mar 15, 2010, at 7:24 PM, Benjamin Reed wrote:
If the links do not work for us for zk, then they are unlikely to work with
any other solution - such as trying to stretch Pacemaker or Red Hat Cluster
with their multicast protocols across the links.
If the links are not good enough, we might have to spend some more money to
IMO latency is the primary issue you will face, but also keep in mind
reliability w/in a colo.
Say you have 3 colos (obv can't be 2), if you only have 3 servers, one
in each colo, you will be reliable but clients w/in each colo will have
to connect to a remote colo if the local fails. You
Thanks for your input.
I am planning on having 3 zk servers per data centre, with perhaps only 2 in
the tie-breaker site.
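One thing worth double-checking in that layout: 3 + 3 + 2 gives an 8-member ensemble, and ZooKeeper needs a strict majority of voters. A quick sketch of the arithmetic (my own, not from the thread):

```python
def quorum(ensemble_size):
    # ZooKeeper requires a strict majority of the voting members
    return ensemble_size // 2 + 1

# 3 servers in each of two data centres plus 2 in the tie-breaker site
total = 3 + 3 + 2                 # 8 voters, so quorum is 5
survivors_after_dc_loss = total - 3   # losing a 3-server site leaves 5
```

Losing either 3-server site leaves exactly a quorum, which works but with no margin. Note also that an 8-member ensemble tolerates 3 failures, the same as 7 members, so odd sizes are usually preferred.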
The traffic between zk and the applications will be lots of local reads -
who is the primary database? Changes to the config will be rare (server
The results would be really nice information to have on ZooKeeper wiki.
Would be very helpful for others considering the same kind of deployment.
So, do send out your results on the list.
On 3/8/10 11:18 AM, Martin Waite email@example.com wrote:
That's controlled by the tickTime/syncLimit/initLimit settings; see more
about this in the admin guide: http://bit.ly/c726DC
You'll want to increase from the defaults since those are typically for
high performance interconnect (ie within colo). You are correct though,
much will depend on your
As Ted rightly mentions, ZooKeeper is usually run within a colo because
of the low latency requirements of the applications that it supports.
It's definitely reasonable to use it in multi data center environments, but
you should realize the implications of it. The high latency/low
The inter-site links are a nuisance. We have two data-centres with 100Mb
links which I hope would be good enough for most uses, but we need a 3rd
site - and currently that only has 2Mb links to the other sites. This might
be a problem.
The ensemble would have a lot of read traffic
The 2Mb link might certainly be a problem. We can refer to these nodes as
ZooKeeper servers; znodes are the data elements in the ZooKeeper data tree.
The Zookeeper ensemble has minimal traffic, which is basically health checks
between the members of the ensemble. We call one of the members the leader;
the rest are followers.
If you can stand the latency for updates then zk should work well for
you. It is unlikely that you will be able to do better than zk does and
still maintain correctness.
Do note that you can probably bias clients to use a local server.
That should make things more efficient.
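Since the client tries the hosts it is given in random order, the simplest way to bias clients toward local servers is to hand each colo's clients a connection string listing only its local servers. A sketch, with made-up colo and host names:

```python
# Hypothetical colo -> local-server mapping (names are illustrative)
colo_servers = {
    "dc1": ["dc1-zk1:2181", "dc1-zk2:2181", "dc1-zk3:2181"],
    "dc2": ["dc2-zk1:2181", "dc2-zk2:2181", "dc2-zk3:2181"],
    "tie": ["tie-zk1:2181", "tie-zk2:2181"],
}

def connect_string(colo):
    # Clients in each colo see only their local servers, so their
    # reads stay on the local network.
    return ",".join(colo_servers[colo])
```

The trade-off: if all local servers go down, these clients have no fallback. Appending the remote servers to the list restores fallback, but since the client shuffles the whole list, it also dilutes the local bias.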
What you describe is relatively reasonable, even though Zookeeper is not
normally distributed across multiple data centers with all members getting
full votes. If you account for the limited throughput that this will impose
on your applications that use ZK, then I think that this can work well.
I take your point about reliability, but I have no option other than finding
a multi-site solution.
Unfortunately, in my experience sites are much less reliable than individual
machines, and so in a way coping with site failure is more important than
individual machine failure. I imagine that