Hi Folks

I am debugging an issue where the C client library takes a while to reconnect 
to the ZooKeeper server (version 3.5.1-alpha). Any help in understanding what 
the problem is would be highly appreciated.

I tried 2 cases:

1) The server is restarted with no change in server configuration.

The client enters the connecting state when the socket to the ZooKeeper server 
closes and enters the connected state again within 5 seconds.

2) The server is restarted with some config changes (we go from 2 ZooKeeper 
servers to 1).

The client enters the connecting state when the socket to the ZooKeeper server 
closes, but enters the connected state only after a while (I have seen up to 
2/3*session_timeout). The session timeout I use is 180 seconds.
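
For scale, the worst case above can be worked out with a small helper. This is only a sketch of the arithmetic; the function name is hypothetical and is not part of the ZooKeeper C API, and it says nothing about why the delay occurs.

```c
/* Hypothetical helper (not part of the ZooKeeper C API): returns two thirds
 * of a session timeout, in milliseconds, using integer arithmetic. */
long two_thirds_of_timeout_ms(long session_timeout_ms)
{
    return session_timeout_ms * 2 / 3;
}
```

With the 180-second timeout used here, two_thirds_of_timeout_ms(180000) gives 120000, i.e. a reconnect delay of up to two minutes.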

I am not able to understand why the client takes so long to enter the connected 
state in case 2. Is there a way to nudge the ZooKeeper client library to 
reconnect faster?

Regards,
Pramod

https://zookeeper.apache.org/doc/r3.5.1-alpha/zookeeperProgrammers.html

When a client (session) becomes partitioned from the ZK serving cluster it will 
begin searching the list of servers that were specified during session 
creation. Eventually, when connectivity between the client and at least one of 
the servers is re-established, the session will either again transition to the 
"connected" state (if reconnected within the session timeout value) or it will 
transition to the "expired" state (if reconnected after the session timeout). 
It is not advisable to create a new session object (a new ZooKeeper.class or 
zookeeper handle in the c binding) for disconnection. The ZK client library 
will handle reconnect for you. In particular we have heuristics built into the 
client library to handle things like "herd effect", etc... Only create a new 
session when you are notified of session expiration (mandatory).
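
The guidance quoted above can be sketched as a small decision function. This is a minimal illustration, not the library's own logic: the enum and function names below are stand-ins invented for the example. In the real C binding the session state is delivered to the watcher callback as constants such as ZOO_CONNECTING_STATE, ZOO_CONNECTED_STATE and ZOO_EXPIRED_SESSION_STATE.

```c
/* Stand-ins for the session states described in the quoted documentation.
 * These are illustrative names, not the constants from zookeeper.h. */
enum session_state { SESSION_CONNECTING, SESSION_CONNECTED, SESSION_EXPIRED };

/* What the application should do in response to each state. */
enum client_action { ACTION_WAIT, ACTION_RESUME, ACTION_NEW_SESSION };

/* Maps a session state to the action the quoted text recommends:
 * keep the handle while disconnected, and only create a new session
 * once expiration has been reported. */
enum client_action action_for_state(enum session_state state)
{
    switch (state) {
    case SESSION_CONNECTED:
        return ACTION_RESUME;       /* reconnected within the timeout */
    case SESSION_EXPIRED:
        return ACTION_NEW_SESSION;  /* the old handle is no longer usable */
    case SESSION_CONNECTING:
    default:
        return ACTION_WAIT;         /* let the library retry the server list */
    }
}
```

An application watcher would call something like action_for_state() on each session event and open a new handle only when it returns ACTION_NEW_SESSION.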
