[jira] [Commented] (ZOOKEEPER-981) Hang in zookeeper_close() in the multi-threaded C client

helei (Commented) (JIRA) Fri, 18 Nov 2011 05:33:22 -0800

    [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152858#comment-13152858
 ]


helei commented on ZOOKEEPER-981:
---------------------------------

Sorry for the late response. I saw another problem with this patch applied: 
a hang in zookeeper_close() again. Here is the stack:
(gdb) bt
#0  0x000000302b80adfb in __lll_mutex_lock_wait () from /lib64/tls/libpthread.so.0
#1  0x000000302b1307a8 in main_arena () from /lib64/tls/libc.so.6
#2  0x000000302b910230 in stack_used () from /lib64/tls/libpthread.so.0
#3  0x000000302b808dde in pthread_cond_broadcast@@GLIBC_2.3.2 () from /lib64/tls/libpthread.so.0
#4  0x00000000006b4ce8 in adaptor_finish (zh=0x6902060) at src/mt_adaptor.c:217
#5  0x00000000006b0fd0 in zookeeper_close (zh=0x6902060) at src/zookeeper.c:2297
(gdb) p zh->ref_counter 
$5 = 1
(gdb) p zh->close_requested 
$6 = 1
(gdb) p *zh
$7 = {fd = 110112576, hostname = 0x6903620 "", addrs = 0x0, addrs_count = 1, 
  watcher = 0x62e5dc 
<doris::meta_register_mgr_t::register_mgr_watcher(_zhandle*, int, int, char 
const*, void*)>, last_recv = {tv_sec = 1321510694, 
    tv_usec = 552835}, last_send = {tv_sec = 1321510694, tv_usec = 552886}, 
last_ping = {tv_sec = 1321510685, tv_usec = 774869}, next_deadline = {
    tv_sec = 1321510704, tv_usec = 547831}, recv_timeout = 30000, input_buffer 
= 0x0, to_process = {head = 0x0, last = 0x0, lock = {__m_reserved = 0, 
      __m_count = 0, __m_owner = 0x0, __m_kind = 0, __m_lock = {__status = 0, 
__spinlock = 0}}}, to_send = {head = 0x0, last = 0x0, lock = {
      __m_reserved = 0, __m_count = 0, __m_owner = 0x0, __m_kind = 1, __m_lock 
= {__status = 0, __spinlock = 0}}}, sent_requests = {head = 0x0, last = 0x0, 
    cond = {__c_lock = {__status = 1, __spinlock = -1}, __c_waiting = 0x0, 
__padding = '\0' <repeats 15 times>, __align = 0}, lock = {__m_reserved = 0, 
      __m_count = 0, __m_owner = 0x0, __m_kind = 0, __m_lock = {__status = 0, 
__spinlock = 0}}}, completions_to_process = {head = 0x2aefbff800, 
    last = 0x2af0e05f40, cond = {__c_lock = {__status = 592705486850, 
__spinlock = -1}, __c_waiting = 0x45, 
      __padding = "E\000\000\000\000\000\000\000\220\006\000\000\000", __align 
= 296352743424}, lock = {__m_reserved = 1, __m_count = 0, 
      __m_owner = 0x1000026ca, __m_kind = 0, __m_lock = {__status = 0, 
__spinlock = 0}}}, connect_index = 0, client_id = {client_id = 
86551148676999146, 
    passwd = "G懵擀\233\213\f闬202筴\002錪\034"}, last_zxid = 82057372, 
outstanding_sync = 0, primer_buffer = {buffer = 0x6902290 "", len = 40, 
    curr_offset = 44, next = 0x0}, primer_storage = {len = 36, protocolVersion 
= 0, timeOut = 30000, sessionId = 86551148676999146, passwd_len = 16, 
    passwd = "G懵擀\233\213\f闬202筴\002錪\034"}, 
  primer_storage_buffer = 
"\000\000\000$\000\000\000\000\000\000u0\0013}惜薵闬000\000\000\020G懵擀\233\213\f闬202筴\002錪\034",
 state = 0, context = 0x0, 
  auth_h = {auth = 0x0, lock = {__m_reserved = 0, __m_count = 0, __m_owner = 
0x0, __m_kind = 0, __m_lock = {__status = 0, __spinlock = 0}}}, 
  ref_counter = 1, close_requested = 1, adaptor_priv = 0x0, socket_readable = 
{tv_sec = 0, tv_usec = 0}, active_node_watchers = 0x6901520, 
  active_exist_watchers = 0x69015d0, active_child_watchers = 0x6902ef0, chroot 
= 0x0}
I think the ref_counter is supposed to be 2 or 3 here; 1 does not seem correct. 
Thanks again.
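
A minimal sketch of why a count of 2 or 3 would be expected at this point, assuming the closing caller holds one reference and each helper thread holds one more until it exits. This is not the ZooKeeper client code: every name here (demo_handle_t, demo_add_ref, demo_worker, demo_run, DEMO_WORKERS) is hypothetical, and the number of helper threads is an assumption made for illustration.

```c
#include <assert.h>
#include <pthread.h>

#define DEMO_WORKERS 2  /* assumption: two helper threads share the handle */

/* Hypothetical stand-in for a client handle; NOT the real zhandle_t. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;        /* broadcast when close is requested */
    int ref_counter;             /* threads currently holding the handle */
    int close_requested;
} demo_handle_t;

typedef struct { int before_close; int after_join; } demo_counts_t;

/* Adjust the reference count under the lock and return the new value.
 * Passing delta == 0 just reads the current count. */
static int demo_add_ref(demo_handle_t *h, int delta) {
    pthread_mutex_lock(&h->lock);
    h->ref_counter += delta;
    int now = h->ref_counter;
    pthread_mutex_unlock(&h->lock);
    return now;
}

/* Helper thread: sleeps until close is requested, then drops its reference. */
static void *demo_worker(void *arg) {
    demo_handle_t *h = arg;
    pthread_mutex_lock(&h->lock);
    while (!h->close_requested)
        pthread_cond_wait(&h->cond, &h->lock);
    pthread_mutex_unlock(&h->lock);
    demo_add_ref(h, -1);
    return NULL;
}

/* Runs one open/close cycle and reports the counts seen along the way. */
static demo_counts_t demo_run(void) {
    demo_handle_t h;
    pthread_t tid[DEMO_WORKERS];
    demo_counts_t out;
    int i;

    pthread_mutex_init(&h.lock, NULL);
    pthread_cond_init(&h.cond, NULL);
    h.ref_counter = 0;
    h.close_requested = 0;

    demo_add_ref(&h, +1);                 /* the caller's own reference */
    for (i = 0; i < DEMO_WORKERS; i++) {
        demo_add_ref(&h, +1);             /* taken on the worker's behalf */
        int rc = pthread_create(&tid[i], NULL, demo_worker, &h);
        assert(rc == 0);
        (void)rc;
    }
    out.before_close = demo_add_ref(&h, 0);

    /* Request close and wake the workers while the condition variable
     * is still valid and the workers still hold their references. */
    pthread_mutex_lock(&h.lock);
    h.close_requested = 1;
    pthread_cond_broadcast(&h.cond);
    pthread_mutex_unlock(&h.lock);

    for (i = 0; i < DEMO_WORKERS; i++)
        pthread_join(tid[i], NULL);
    out.after_join = demo_add_ref(&h, 0);

    /* Destroy the primitives only after the last reference is dropped. */
    if (demo_add_ref(&h, -1) == 0) {
        pthread_cond_destroy(&h.cond);
        pthread_mutex_destroy(&h.lock);
    }
    return out;
}
```

Calling demo_run() (built with -pthread) reports the count at the moment the broadcast is issued and again after the workers are joined. Under the stated assumptions, a count of 1 at broadcast time would mean the helper threads had already released their references, or never took them.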
                
> Hang in zookeeper_close() in the multi-threaded C client
> --------------------------------------------------------
>
>                 Key: ZOOKEEPER-981
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-981
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: c client
>    Affects Versions: 3.3.2
>         Environment: Debian Squeeze, Linux 2.6.32-5, x86_64
>            Reporter: Jeremy Stribling
>            Assignee: Jeremy Stribling
>            Priority: Critical
>             Fix For: 3.4.0
>
>         Attachments: ZOOKEEPER-981-v1.patch, ZOOKEEPER-981.tar.gz, 
> zookeeper-981.patch
>
>
> I saw a hang once when my C++ application called the zookeeper_close() method 
> of the multi-threaded Zookeeper client library.  The stack trace of the hung 
> thread was the following:
> {quote}
> Thread 8 (Thread 5644):
> #0  0x00007f5d7bb5bbe4 in __lll_lock_wait () from /lib/libpthread.so.0
> #1  0x00007f5d7bb59ad0 in pthread_cond_broadcast@@GLIBC_2.3.2 () from 
> /lib/libpthread.so.0
> #2  0x00007f5d793628f6 in unlock_completion_list (l=0x32b4d68) at 
> .../zookeeper/src/c/src/mt_adaptor.c:66
> #3  0x00007f5d79354d4b in free_completions (zh=0x32b4c80, callCompletion=1, 
> reason=-116) at .../zookeeper/src/c/src/zookeeper.c:1069
> #4  0x00007f5d79355008 in cleanup_bufs (zh=0x32b4c80, callCompletion=1, 
> rc=-116) at .../thirdparty/zookeeper/src/c/src/zookeeper.c:1125
> #5  0x00007f5d79353200 in destroy (zh=0x32b4c80) at 
> .../thirdparty/zookeeper/src/c/src/zookeeper.c:366
> #6  0x00007f5d79358e0e in zookeeper_close (zh=0x32b4c80) at 
> .../zookeeper/src/c/src/zookeeper.c:2326
> #7  0x00007f5d79356d18 in api_epilog (zh=0x32b4c80, rc=0) at 
> .../zookeeper/src/c/src/zookeeper.c:1661
> #8  0x00007f5d79362f2f in adaptor_finish (zh=0x32b4c80) at 
> .../zookeeper/src/c/src/mt_adaptor.c:205
> #9  0x00007f5d79358c8c in zookeeper_close (zh=0x32b4c80) at 
> .../zookeeper/src/c/src/zookeeper.c:2297 
> ...
> {quote}
> The omitted part of the stack trace is entirely within my application, and 
> contains no other calls to/from the Zookeeper client.  In particular, I am 
> not calling zookeeper_close() from within a completion handler or any of the 
> library's threads.
> I haven't been able to reproduce this, and when I encountered this I wasn't 
> capturing logging from the client library, so unfortunately I don't have any 
> more information at this time.  But I will update this JIRA if I see it again.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
