Hi All,

I am using version 3.6.2 of the ZooKeeper C client library to connect to 
ZooKeeper servers.

The problem I am facing is that the library gets stuck in a pthread_join() 
call and never returns.

The scenario is as follows:

  *   The ZooKeeper C client connects to ZooKeeper over an mTLS connection.
  *   The client loses network connectivity to the ZooKeeper servers.
  *   During this time, the client code calls zookeeper_close().
  *   zookeeper_close() never returns.
  *   The state of the zh handle during this time is ZOO_SSL_CONNECTING_STATE.

I made the program dump core while it was stuck in this state. The backtrace 
shows that zookeeper_close() calls adaptor_finish(), which gets stuck in the 
pthread_join() call for the IO thread.
This indicates that the IO thread itself was stuck and never exited.

The backtrace for the IO thread shows the following call chain:
do_io() -> zookeeper_process() -> check_events() -> init_ssl_for_handler() -> 
init_ssl_for_socket().

Looking at the code, there is a while(1) loop in init_ssl_for_socket(). I 
added the close_requested check shown below inside that loop, and it seems to 
have fixed the problem for me. Can anybody tell me whether this fix is correct, 
or whether the problem has already been fixed in another release?

while(1) {
        int rc;
        int sock = fd->sock;
        struct timeval tv;
        fd_set s_rfds, s_wfds;
        tv.tv_sec = 1;
        tv.tv_usec = 0;
        FD_ZERO(&s_rfds);
        FD_ZERO(&s_wfds);
        /* Added: leave the loop once zookeeper_close() has been called. */
        if(zh->close_requested)
        {
            return ZSSLCONNECTIONERROR;
        }
        /* ... */
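
To illustrate the idea outside of ZooKeeper, here is a minimal standalone 
sketch of the same pattern: a worker thread retries a bounded select() in a 
loop and checks a close flag on every pass, so that pthread_join() in the 
closing thread can return. Everything in it (struct demo_handle, 
wait_ready_or_closed(), run_demo(), the DEMO_* codes) is hypothetical and made 
up for this example; it is not the library's code, and the pipe only stands in 
for a socket that never becomes ready.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/select.h>
#include <time.h>
#include <unistd.h>

enum { DEMO_READY = 0, DEMO_CLOSED = 1, DEMO_ERROR = -1 };

/* Hypothetical stand-in for the client handle; NOT the real zhandle_t. */
struct demo_handle {
    atomic_int close_requested; /* set by the thread that wants to close */
    int sock;                   /* descriptor the IO thread waits on */
    int result;                 /* how the wait loop ended */
};

/* Retry loop with a bounded select(); re-checks the close flag each pass. */
static int wait_ready_or_closed(struct demo_handle *h)
{
    assert(h != NULL);
    for (;;) {
        struct timeval tv;
        fd_set rfds;
        int rc;

        /* Bail out as soon as shutdown has been requested. */
        if (atomic_load(&h->close_requested))
            return DEMO_CLOSED;

        tv.tv_sec = 0;
        tv.tv_usec = 100 * 1000; /* 100 ms, so the flag is polled often */
        FD_ZERO(&rfds);
        FD_SET(h->sock, &rfds);

        rc = select(h->sock + 1, &rfds, NULL, NULL, &tv);
        if (rc > 0)
            return DEMO_READY;
        if (rc < 0 && errno != EINTR)
            return DEMO_ERROR;
        /* Timeout or EINTR: go around again and re-check the flag. */
    }
}

static void *io_thread(void *arg)
{
    struct demo_handle *h = arg;
    h->result = wait_ready_or_closed(h);
    return NULL;
}

/* Starts the worker, requests close after 250 ms, then joins it. */
int run_demo(void)
{
    struct demo_handle h;
    struct timespec pause = { 0, 250 * 1000 * 1000 };
    pthread_t tid;
    int fds[2];

    if (pipe(fds) != 0) /* read end never becomes readable in this demo */
        return DEMO_ERROR;
    atomic_init(&h.close_requested, 0);
    h.sock = fds[0];
    h.result = DEMO_ERROR;

    if (pthread_create(&tid, NULL, io_thread, &h) != 0) {
        close(fds[0]);
        close(fds[1]);
        return DEMO_ERROR;
    }
    nanosleep(&pause, NULL);
    atomic_store(&h.close_requested, 1); /* analogous to requesting close */
    pthread_join(tid, NULL);             /* returns because the loop exits */

    close(fds[0]);
    close(fds[1]);
    return h.result;
}
```

Calling run_demo() should return DEMO_CLOSED. If the flag check is removed 
from the loop, the worker keeps cycling through select() timeouts and the 
pthread_join() call never returns, which matches the hang described above.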

Many Thanks,
-Parag
