Hey Ted,

The client library already handles all of this for you. It's not very clear 
from the docs, but the `host` parameter to:
> ZooKeeper(String host, int sessionTimeout, Watcher watcher)
... takes a comma-separated list of server:port pairs, which should be the full 
list of servers in your quorum.
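
As a rough sketch of what that string looks like, the snippet below joins a list of server:port pairs into the comma-separated form. The hostnames, the `ConnectStringSketch` class and the `connectString` helper are made up for illustration, and the 30000 ms timeout in the comment is just an assumed value. The snippet does not need the ZooKeeper jar; the constructor call is shown only as a comment.

```java
import java.util.Arrays;
import java.util.List;

public class ConnectStringSketch {
    // Hypothetical helper: joins server:port pairs into the
    // comma-separated form described above.
    static String connectString(List<String> servers) {
        return String.join(",", servers);
    }

    public static void main(String[] args) {
        // Made-up hostnames; list every server in your quorum here.
        List<String> quorum = Arrays.asList(
                "zk1.example.com:2181",
                "zk2.example.com:2181",
                "zk3.example.com:2181");

        String hosts = connectString(quorum);
        System.out.println(hosts);

        // With the ZooKeeper jar on the classpath, the string would be
        // passed as the first constructor argument, for example:
        //   new ZooKeeper(hosts, 30000, watcher);
    }
}
```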

Assuming that the crashed server doesn't bring you below the minimum number of 
servers required for quorum, the client library will connect to another server 
in the list, and all of your Watchers will receive a 'connected' event, 
indicating that they changed servers. See: 
http://hadoop.apache.org/zookeeper/docs/current/zookeeperProgrammers.html#ch_gotchas

Also, the new Apache ZooKeeper mailing list is here: 
http://hadoop.apache.org/zookeeper/mailing_lists.html

Thanks,
Stu


-----Original Message-----
From: "Ted Dunning" <[EMAIL PROTECTED]>
Sent: Wednesday, October 15, 2008 5:01am
To: [EMAIL PROTECTED]
Subject: [Zookeeper-user] how to handle zookeeper server crash from client

-------------------------------------------------------------------------

What is the received wisdom about how to handle the crash of a ZK server
from the client?

Clearly, reconnecting is necessary.  Should that be done by the client by
just using a watcher that probes a list of servers until it finds a live
one?

For that matter, what is best practice relative to initial connection?  Are
people using a load balancer to abstract away how many servers are in the
zookeeper cluster?  Or are they writing application code to probe the
cluster until a live server is found?

-- 
ted

