Hi Parag,

I saw a crash in init_ssl_for_socket() with the 3.6.2 C client when using 
mTLS. The backtrace from my case is below.
The root cause was that the `SSL_library_init` call is not thread safe. It is 
called from init_ssl_for_socket(). I am not sure whether you are hitting the 
same issue. If so, you could protect the init call with a lock.
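
As a rough illustration of guarding a one-time init call, here is a minimal sketch. It uses pthread_once() rather than an explicit mutex, which is one common way to get the same effect. The names fake_ssl_library_init, ensure_ssl_initialized and run_workers are made up for this example; the stand-in only counts calls and does not touch OpenSSL or the ZooKeeper client code.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the non-thread-safe library call
 * (SSL_library_init() in the trace); it only counts invocations. */
static int init_calls = 0;
static void fake_ssl_library_init(void)
{
    init_calls++;
}

static pthread_once_t ssl_init_once = PTHREAD_ONCE_INIT;

/* Safe to call from any thread: pthread_once() runs the initializer
 * exactly once, and concurrent callers wait until it has finished. */
static void ensure_ssl_initialized(void)
{
    int rc = pthread_once(&ssl_init_once, fake_ssl_library_init);
    assert(rc == 0);
    (void)rc; /* silence unused warning when NDEBUG is defined */
}

static void *worker(void *arg)
{
    (void)arg;
    ensure_ssl_initialized();
    return NULL;
}

/* Starts up to 16 threads that all race to initialize, then returns
 * how many times the initializer actually ran. */
static int run_workers(int n)
{
    pthread_t threads[16];
    int started = 0;

    if (n > 16)
        n = 16;
    for (int i = 0; i < n; i++) {
        if (pthread_create(&threads[started], NULL, worker, NULL) == 0)
            started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);
    return init_calls;
}
```

In a real client the stand-in would be replaced by the actual init call, and every code path that can reach it would have to go through the guarded wrapper.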


(gdb-9.1-490) bt
#0  0x00007f15324bb7bb in raise () from ...lib/x86_64-linux-gnu/libc.so.6
#1  0x00007f1532456535 in abort () from .../lib/x86_64-linux-gnu/libc.so.6
#2  0x00007f1533073abf in OpenSSLDie () from ...lib/x86_64-linux-gnu/libcrypto.so.1.0.0
#3  0x00007f1532a97a68 in ?? () from .../lib/x86_64-linux-gnu/libssl.so.1.0.0
#4  0x00007f3432d9af2b in SSL_library_init () from .../x86_64-linux-gnu/libssl.so.1.0.0
…

-Thanks

From: Mulay, Parag Bhausaheb (Parag) <para...@avaya.com>
Date: Monday, April 4, 2022 at 7:40 AM
To: dev@zookeeper.apache.org <dev@zookeeper.apache.org>
Subject: RE: [External]Zookeeper C-client library getting stuck on 
pthread_join() call.

Hi All,

I am sorry for flooding your mailboxes. I am not sure whether I am sending 
this to the wrong group.
Any help or suggestions on this would be much appreciated.

Thanks
-Parag

-----Original Message-----
From: Mulay, Parag Bhausaheb (Parag) <para...@avaya.com>
Sent: Wednesday, March 30, 2022 8:04 AM
To: dev@zookeeper.apache.org
Subject: Re: [External]Zookeeper C-client library getting stuck on 
pthread_join() call.

Hi All,

Any suggestions about this? It would be a great help.

Thanks in advance,
-Parag
________________________________
From: Mulay, Parag Bhausaheb (Parag)
Sent: Monday, March 28, 2022 11:18:33 AM
To: dev@zookeeper.apache.org <dev@zookeeper.apache.org>
Subject: RE: [External]Zookeeper C-client library getting stuck on 
pthread_join() call.

It seems the mail lost its formatting. The code I added is the check for 
"close_requested"; the rest is existing code.

Thanks
-Parag

-----Original Message-----
From: Mulay, Parag Bhausaheb (Parag) <para...@avaya.com>
Sent: Monday, March 28, 2022 11:12 AM
To: dev@zookeeper.apache.org
Subject: [External]Zookeeper C-client library getting stuck on pthread_join() 
call.

Hi All,

I am using the ZooKeeper C client library, version 3.6.2, to connect to ZK 
servers.

The problem I am facing is that the library gets stuck in a pthread_join() 
call and never returns.

The scenario is as follows:

  *   The ZooKeeper C client connects to ZooKeeper over an mTLS connection.
  *   The client loses network connectivity to the ZooKeeper servers.
  *   During this time the client code calls zookeeper_close().
  *   zookeeper_close() never returns.
  *   The state of the zh handle during this time is ZOO_SSL_CONNECTING_STATE.

I made the program dump core while it was stuck in this state. The backtrace 
shows that zookeeper_close() calls adaptor_finish(), which gets stuck in the 
pthread_join() call for the IO thread.
This indicates that the IO thread itself was stuck and never exited.

The backtrace for the IO thread shows this call chain:
do_io() -> zookeeper_process() -> check_events() -> init_ssl_for_handler() -> 
init_ssl_for_socket().

Looking at the code, there is a while(1) loop in init_ssl_for_socket(). I 
added the close_requested check to it, as shown in the excerpt below, and it 
seems to have fixed the problem for me. Can anybody confirm whether this is 
correct, or whether this problem has already been fixed in another release?
while(1) {
        int rc;
        int sock = fd->sock;
        struct timeval tv;
        fd_set s_rfds, s_wfds;
        tv.tv_sec = 1;
        tv.tv_usec = 0;
        FD_ZERO(&s_rfds);
        FD_ZERO(&s_wfds);
        /* Added check: give up once a close has been requested. */
        if (zh->close_requested)
        {
            return ZSSLCONNECTIONERROR;
        }
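
To illustrate the idea outside the ZooKeeper sources, here is a minimal, self-contained sketch of a polling wait loop that rechecks a close flag on every timeout, so that a thread joining it is not blocked forever. It is not the library's code: struct conn, wait_until_ready, the WAIT_* return codes and the 100 ms timeout are all assumptions made for this example, and a pipe that never receives data stands in for the unreachable server.

```c
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical return codes for this sketch. */
#define WAIT_READY   0
#define WAIT_CLOSED  (-1)
#define WAIT_ERROR   (-2)

struct conn {
    int sock;                   /* descriptor we are waiting on */
    atomic_int close_requested; /* set by the thread that wants to shut down */
    int result;                 /* filled in by the IO thread */
};

/* Waits until sock is readable, but rechecks close_requested at the top
 * of every iteration so a pending close ends the loop within one timeout. */
static int wait_until_ready(struct conn *c)
{
    for (;;) {
        fd_set rfds;
        struct timeval tv = { .tv_sec = 0, .tv_usec = 100 * 1000 };
        int rc;

        if (atomic_load(&c->close_requested))
            return WAIT_CLOSED;

        FD_ZERO(&rfds);
        FD_SET(c->sock, &rfds);
        rc = select(c->sock + 1, &rfds, NULL, NULL, &tv);
        if (rc > 0)
            return WAIT_READY;
        if (rc < 0 && errno != EINTR)
            return WAIT_ERROR;
        /* rc == 0 (timeout) or EINTR: loop and recheck the flag */
    }
}

static void *io_thread(void *arg)
{
    struct conn *c = arg;
    c->result = wait_until_ready(c);
    return NULL;
}

/* Simulates the hang scenario: the read end of an idle pipe never becomes
 * readable, so only the close flag can end the wait. */
static int demo_close_unblocks_wait(void)
{
    int fds[2];
    struct conn c;
    pthread_t t;

    if (pipe(fds) != 0)
        return WAIT_ERROR;
    c.sock = fds[0];
    atomic_init(&c.close_requested, 0);
    c.result = WAIT_READY;

    if (pthread_create(&t, NULL, io_thread, &c) != 0) {
        close(fds[0]);
        close(fds[1]);
        return WAIT_ERROR;
    }
    atomic_store(&c.close_requested, 1); /* what a close call would do */
    pthread_join(t, NULL);               /* returns within one timeout */

    close(fds[0]);
    close(fds[1]);
    return c.result;
}
```

Without the flag check at the top of the loop, the join in demo_close_unblocks_wait() would block indefinitely, which matches the symptom described above.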

Many Thanks,
-Parag
