yihao yang created ZOOKEEPER-2802:
-------------------------------------
Summary: Zookeeper C client hang @wait_sync_completion
Key: ZOOKEEPER-2802
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2802
Project: ZooKeeper
Issue Type: Bug
Components: c client
Affects Versions: 3.4.6
Environment: DISTRIB_DESCRIPTION="Ubuntu 14.04.2 LTS"
Reporter: yihao yang
Priority: Critical
I was using zookeeper 3.4.6 c client to access one zookeeper server in a VM.
The VM environment is not stable and I get a lot of EXPIRED_SESSION_STATE
events. I will create another session to ZK when I get an expired event. I also
have a read/write lock to protect session read (get/list/... on zk) and
write(connect, close, reconnect zhandle).
The problem is the session got an EXPIRED_SESSION_STATE event and when it tried
to hold the write lock and reconnect the session, it found there is a thread
was holding the read lock (which was operating sync list on zk). See the stack
below:
GDBStack:
Thread 7 (Thread 0x7f838a43a700 (LWP 62845)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at
../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x0000000000636033 in wait_sync_completion (sc=sc@entry=0x7f8344000af0) at
src/mt_adaptor.c:85
#2 0x0000000000633248 in zoo_wget_children2_ (zh=<optimized out>,
path=0x7f83440677a8 "/graphsql/dict/objects/__services/RLS-GSE/_static_nodes",
watcher=0x0, watcherCtx=0x13e6310, strings=0x7f838a4397b0, stat=0x7f838a4398d0)
at src/zookeeper.c:3630
#3 0x000000000045e6ff in ZooKeeperContext::getChildren (this=0x13e6310,
path=..., children=children@entry=0x7f838a439890,
stat=stat@entry=0x7f838a4398d0) at zookeeper_context.cpp:xxx
This sync list didn't return a ZINVALIDSTAT but hung. Anyone know the problem?
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)