[
https://issues.apache.org/jira/browse/ZOOKEEPER-1720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207555#comment-16207555
]
lawrence114 commented on ZOOKEEPER-1720:
----------------------------------------
Hi Kevin,
I have run into the same issue recently. Has there been any progress on
this problem on your side?
> Race in zookeeper_close() leads to hang
> ---------------------------------------
>
> Key: ZOOKEEPER-1720
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1720
> Project: ZooKeeper
> Issue Type: Bug
> Components: c client
> Affects Versions: 3.5.0
> Environment: Ubuntu 12.04.1
> Reporter: Kevin Jamieson
>
> Using ZK 3.5.4, zookeeper_close() occasionally hangs with a backtrace of the
> form:
> {noformat}
> #0 0x00002b255fab489c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
> #1 0x00002b255fab26b0 in pthread_cond_broadcast@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
> #2 0x00002b2560568ced in unlock_completion_list (l=0x13f5430) at src/mt_adaptor.c:69
> #3 0x00002b256055b9ec in free_completions (zh=0x13f5270, callCompletion=1, reason=-116) at src/zookeeper.c:1521
> #4 0x00002b256055d3bc in zookeeper_close (zh=0x13f5270) at src/zookeeper.c:2954
> {noformat}
> At which point the zhandle_t struct appears to have already been freed, as it
> contains garbage:
> {noformat}
> (gdb) p zh->sent_requests.cond
> $19 = {
>   __data = {
>     __lock = 2,
>     __futex = 0,
>     __total_seq = 18446744073709551615,
>     __wakeup_seq = 0,
>     __woken_seq = 0,
>     __mutex = 0x0,
>     __nwaiters = 0,
>     __broadcast_seq = 0
>   },
>   __size = "\002\000\000\000\000\000\000\000\377\377\377\377\377\377\377\377", '\000' <repeats 31 times>,
>   __align = 2
> }
> {noformat}
> There appears to be a race condition in the following code:
> {noformat}
> int api_epilog(zhandle_t *zh,int rc)
> {
>     if(inc_ref_counter(zh,-1)==0 && zh->close_requested!=0)
>         zookeeper_close(zh);
>     return rc;
> }
> int zookeeper_close(zhandle_t *zh)
> {
>     int rc=ZOK;
>     if (zh==0)
>         return ZBADARGUMENTS;
>     zh->close_requested=1;
>     if (inc_ref_counter(zh,1)>1) {
> {noformat}
> api_epilog(), running on another thread, may free zh after zookeeper_close()
> sets zh->close_requested=1 but before it increments the reference count.
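
The interleaving described in the report can be replayed deterministically in a small single-threaded model. This is only a sketch under stated assumptions: `model_handle`, `model_inc_ref`, `close_step_set_flag`, `close_step_take_ref`, `model_api_epilog` and `replay_race` are hypothetical names, not ZooKeeper code. The real `free()` is replaced by a flag so that the access after the free can be observed, and the close path is reduced to the two steps quoted in the report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for zhandle_t; only the fields relevant to the race. */
typedef struct {
    int ref_counter;
    int close_requested;
    bool freed;           /* set instead of calling free(), so the model stays inspectable */
    bool use_after_free;  /* set when the counter is touched after the handle was "freed" */
} model_handle;

/* Same shape as inc_ref_counter(): adjust the count and return the new value. */
static int model_inc_ref(model_handle *h, int delta)
{
    assert(h != NULL);
    if (h->freed)
        h->use_after_free = true;
    h->ref_counter += delta;
    return h->ref_counter;
}

/* First step of zookeeper_close(): announce the close. */
static void close_step_set_flag(model_handle *h)
{
    h->close_requested = 1;
}

/* Second step of zookeeper_close(): take a reference, tear down if it is the only one.
 * Assumption: the "other references exist" branch is reduced to a plain return. */
static void close_step_take_ref(model_handle *h)
{
    if (model_inc_ref(h, 1) > 1)
        return;
    h->freed = true;
}

/* Same condition as api_epilog(): the last caller out performs a requested close. */
static void model_api_epilog(model_handle *h)
{
    if (model_inc_ref(h, -1) == 0 && h->close_requested != 0) {
        close_step_set_flag(h);
        close_step_take_ref(h);
    }
}

/* Replays the bad interleaving; returns true if the closer touched a freed handle. */
static bool replay_race(void)
{
    model_handle h = { .ref_counter = 1 };  /* thread A is inside an API call */
    close_step_set_flag(&h);   /* thread B: zh->close_requested = 1 */
    model_api_epilog(&h);      /* thread A: drops the last ref, sees the flag, frees */
    close_step_take_ref(&h);   /* thread B: increments the counter of a freed handle */
    return h.use_after_free;
}
```

In this model `replay_race()` returns true: thread B's increment runs after thread A has already marked the handle as freed, which matches the garbage seen in `zh->sent_requests.cond`.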
> The following patch should fix the problem:
> {noformat}
> diff --git a/src/c/src/zookeeper.c b/src/c/src/zookeeper.c
> index 6943243..61a263a 100644
> --- a/src/c/src/zookeeper.c
> +++ b/src/c/src/zookeeper.c
> @@ -1051,6 +1051,7 @@ zhandle_t *zookeeper_init(const char *host, watcher_fn watcher,
> goto abort;
> }
>
> + api_prolog(zh);
> return zh;
> abort:
> errnosave=errno;
> @@ -2889,7 +2890,7 @@ int zookeeper_close(zhandle_t *zh)
> return ZBADARGUMENTS;
>
> zh->close_requested=1;
> - if (inc_ref_counter(zh,1)>1) {
> + if (inc_ref_counter(zh,0)>1) {
> /* We have incremented the ref counter to prevent the
> * completions from calling zookeeper_close before we have
> * completed the adaptor_finish call below. */
> {noformat}
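
As a rough illustration of what the proposed patch changes, here is a single-threaded model of the same interleaving with a baseline reference taken at init time and a read-only check in close. It is a sketch, not ZooKeeper code: `patched_handle` and the `patched_*` helpers are hypothetical names, the deferred branch of close is reduced to a plain return, and `free()` is replaced by a flag. It covers only the one interleaving from the report and says nothing about whether the patch is complete.

```c
#include <stdbool.h>

/* Hypothetical stand-in for zhandle_t, reduced to what the patch touches. */
typedef struct {
    int ref_counter;
    int close_requested;
    bool freed;           /* set instead of calling free() */
    bool use_after_free;  /* set when the counter is touched after the "free" */
} patched_handle;

/* Same shape as inc_ref_counter(): adjust the count and return the new value. */
static int patched_inc_ref(patched_handle *h, int delta)
{
    if (h->freed)
        h->use_after_free = true;
    h->ref_counter += delta;
    return h->ref_counter;
}

/* Patched zookeeper_init(): the added api_prolog() leaves a baseline count of 1. */
static void patched_init(patched_handle *h)
{
    h->ref_counter = 0;
    h->close_requested = 0;
    h->freed = false;
    h->use_after_free = false;
    patched_inc_ref(h, 1);
}

/* Patched check in zookeeper_close(): read the count (delta 0) instead of bumping it. */
static void patched_close_check(patched_handle *h)
{
    if (patched_inc_ref(h, 0) > 1)
        return;        /* an API call is still in flight; assumed to defer the teardown */
    h->freed = true;   /* only the baseline reference is left */
}

/* Same condition as api_epilog(); with the baseline reference, an in-flight
 * call that finishes drops the count to 1, not 0. */
static void patched_api_epilog(patched_handle *h)
{
    if (patched_inc_ref(h, -1) == 0 && h->close_requested != 0)
        patched_close_check(h);
}

/* Replays the interleaving from the report against the patched scheme;
 * returns true if any step touched the handle after it was freed. */
static bool replay_with_patch(void)
{
    patched_handle h;
    patched_init(&h);
    patched_inc_ref(&h, 1);    /* thread A: api_prolog() for an in-flight call, count 2 */
    h.close_requested = 1;     /* thread B: zh->close_requested = 1 */
    patched_api_epilog(&h);    /* thread A: count 2 -> 1, so no close is triggered here */
    patched_close_check(&h);   /* thread B: count is 1, B performs the teardown itself */
    return h.use_after_free;
}
```

Here `replay_with_patch()` returns false: thread A's epilog can no longer reach zero while thread B is between setting the flag and checking the count, so the handle is still valid when B inspects it.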