[
https://issues.apache.org/jira/browse/HBASE-22079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16799237#comment-16799237
]
Yu Li commented on HBASE-22079:
-------------------------------
bq. I see at least one leak with client ZK quorum stuff, but we don't use that.
The {{clientZKWatcher}} will not be created unless
{{hbase.client.zookeeper.quorum}} is set, so it's strange to me that this
watcher shows up there by default. It may be worth double-checking the
configuration.
bq. And what is the client zk watcher used for?
After HBASE-20159, if {{hbase.client.zookeeper.quorum}} is set but
{{hbase.client.zookeeper.observer.mode}} is not, {{HMaster}} takes care of
watching the master/meta addresses and synchronizing them to the client
ZooKeeper.
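To make the two conditions above concrete, here is a minimal sketch of the decision logic. It is not HBase code: the class {{ClientZkConfigCheck}}, its method names, and the use of a plain {{Map}} in place of the real configuration object are all assumptions for illustration, as are the example host names. Only the two property names come from the discussion above, and an unset observer mode is assumed to mean "not in observer mode".

```java
import java.util.HashMap;
import java.util.Map;

public class ClientZkConfigCheck {
    // Property names as quoted in the comment above.
    static final String CLIENT_QUORUM = "hbase.client.zookeeper.quorum";
    static final String OBSERVER_MODE = "hbase.client.zookeeper.observer.mode";

    /** A client ZK watcher is only expected when a client quorum is configured. */
    static boolean expectsClientZkWatcher(Map<String, String> conf) {
        String quorum = conf.get(CLIENT_QUORUM);
        return quorum != null && !quorum.trim().isEmpty();
    }

    /**
     * The master syncs master/meta addresses to the client ZK only when the
     * quorum is set and observer mode is not enabled (assumed default: false).
     */
    static boolean masterSyncsToClientZk(Map<String, String> conf) {
        boolean observerMode =
            Boolean.parseBoolean(conf.getOrDefault(OBSERVER_MODE, "false"));
        return expectsClientZkWatcher(conf) && !observerMode;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println("default, watcher expected: " + expectsClientZkWatcher(conf));

        // Hypothetical quorum value, for illustration only.
        conf.put(CLIENT_QUORUM, "zk1.example.com,zk2.example.com");
        System.out.println("quorum set, master syncs: " + masterSyncsToClientZk(conf));

        conf.put(OBSERVER_MODE, "true");
        System.out.println("observer mode, master syncs: " + masterSyncsToClientZk(conf));
    }
}
```

With a check like this, a watcher showing up when {{expectsClientZkWatcher}} is false would point at the configuration actually loaded by the master rather than at the defaults.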
bq. We do not close the MetaLocationSyncer?
Correct. In {{ClientZKSyncer$ClientZkUpdater#run}} the {{while}} loop exits
once the server is stopped, so we didn't add an explicit {{stop}} method for it.
However, after a second look, it's true that the {{clientZKWatcher}} is leaked,
and I think the fix here is necessary. The only strange thing is why this
client ZK watcher is started without {{hbase.client.zookeeper.quorum}} being
set, as Sergey mentioned...
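The distinction between the updater loop ending and the watcher being closed can be sketched as follows. This is a simplified model, not the HBase implementation: {{FakeWatcher}} and {{FakeServer}} are hypothetical stand-ins, and the polling loop only imitates the "exit when the server is stopped" behaviour described above.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class WatcherShutdownSketch {
    /** Hypothetical stand-in for a ZK watcher that owns a connection and threads. */
    static class FakeWatcher implements AutoCloseable {
        private final AtomicBoolean closed = new AtomicBoolean(false);
        @Override public void close() { closed.set(true); }
        boolean isClosed() { return closed.get(); }
    }

    /** Hypothetical stand-in for the server that owns the updater and the watcher. */
    static class FakeServer {
        private final AtomicBoolean stopped = new AtomicBoolean(false);
        final FakeWatcher clientZkWatcher = new FakeWatcher();
        // The updater loop ends by itself once the stopped flag is set.
        final Thread updater = new Thread(() -> {
            while (!stopped.get()) {
                try {
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }, "updater");

        void start() {
            updater.setDaemon(true);
            updater.start();
        }

        /** Stops the updater; the watcher is released only if closeWatcher is true. */
        void stop(boolean closeWatcher) {
            stopped.set(true);
            try {
                updater.join(1000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            if (closeWatcher) {
                clientZkWatcher.close();
            }
        }
    }

    public static void main(String[] args) {
        FakeServer leaky = new FakeServer();
        leaky.start();
        leaky.stop(false);
        System.out.println("updater alive: " + leaky.updater.isAlive()
            + ", watcher closed: " + leaky.clientZkWatcher.isClosed());

        FakeServer fixed = new FakeServer();
        fixed.start();
        fixed.stop(true);
        System.out.println("updater alive: " + fixed.updater.isAlive()
            + ", watcher closed: " + fixed.clientZkWatcher.isClosed());
    }
}
```

In the first case the loop has exited but the watcher is still open, which is the shape of the leak discussed here; the second case closes it explicitly during shutdown.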
> master leaks ZK on shutdown and gets stuck because of netty threads if netty
> socket is used
> -------------------------------------------------------------------------------------------
>
> Key: HBASE-22079
> URL: https://issues.apache.org/jira/browse/HBASE-22079
> Project: HBase
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Priority: Major
> Attachments: HBASE-22079.patch
>
>
> {noformat}
> "master/...:17000:becomeActiveMaster-SendThread(...1)" #311 daemon prio=5 os_prio=0 tid=0x0000000058c61800 nid=0x2dd0 waiting on condition [0x0000000c477fe000]
> java.lang.Thread.State: TIMED_WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x00000000c4a5b3c0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
> at java.util.concurrent.LinkedBlockingDeque.pollFirst(LinkedBlockingDeque.java:522)
> at java.util.concurrent.LinkedBlockingDeque.poll(LinkedBlockingDeque.java:684)
> at org.apache.zookeeper.ClientCnxnSocketNetty.doTransport(ClientCnxnSocketNetty.java:232)
> at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1146)
> {noformat}
> It looks like this also causes a bunch of netty threads to leak, and these
> are not daemon threads (by design, apparently).