[
https://issues.apache.org/jira/browse/ZOOKEEPER-4235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andor Molnar reassigned ZOOKEEPER-4235:
---------------------------------------
Assignee: Andor Molnar (was: Ravi Kishore Valeti)
> Java Client SendThread does not clean up created objects during constructor
> of SaslClient and Login
> ---------------------------------------------------------------------------------------------------
>
> Key: ZOOKEEPER-4235
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4235
> Project: ZooKeeper
> Issue Type: Bug
> Components: java client
> Reporter: Daniel Wong
> Assignee: Andor Molnar
> Priority: Major
> Labels: pull-request-available
> Time Spent: 5h 20m
> Remaining Estimate: 0h
>
> Hi, I am an Apache Phoenix committer and I help manage a large number of
> ZooKeeper clusters at my employer, primarily using ZK for HBase use cases.
> We recently had a production incident in which some of our ACLs were not
> set up, preventing connectivity from the client to the ZK nodes, and the
> failure path exposed 2 issues to fix: this Jira and
> https://issues.apache.org/jira/browse/ZOOKEEPER-4236 . This Jira is the more
> important of the 2 and covers the failure we observed: an FD/thread leak
> from the ZK Java client send thread. We had hundreds of threads per JVM
> with the following stack trace.
> {code:java}
> java.lang.Thread.State: RUNNABLE
>   at java.net.PlainSocketImpl.socketConnect([email protected]/Native Method)
>   at java.net.AbstractPlainSocketImpl.doConnect([email protected]/AbstractPlainSocketImpl.java:399)
>   - locked <0x00000015004fde20> (a java.net.SocksSocketImpl)
>   at java.net.AbstractPlainSocketImpl.connectToAddress([email protected]/AbstractPlainSocketImpl.java:242)
>   at java.net.AbstractPlainSocketImpl.connect([email protected]/AbstractPlainSocketImpl.java:224)
>   at java.net.SocksSocketImpl.connect([email protected]/SocksSocketImpl.java:403)
>   at java.net.Socket.connect([email protected]/Socket.java:609)
>   at sun.security.krb5.internal.TCPClient.<init>([email protected]/NetClient.java:62)
>   at sun.security.krb5.internal.NetClient.getInstance([email protected]/NetClient.java:42)
>   at sun.security.krb5.KdcComm$KdcCommunication.run([email protected]/KdcComm.java:401)
>   at sun.security.krb5.KdcComm$KdcCommunication.run([email protected]/KdcComm.java:364)
>   at java.security.AccessController.doPrivileged([email protected]/Native Method)
>   at sun.security.krb5.KdcComm.send([email protected]/KdcComm.java:348)
>   at sun.security.krb5.KdcComm.sendIfPossible([email protected]/KdcComm.java:253)
>   at sun.security.krb5.KdcComm.send([email protected]/KdcComm.java:234)
>   at sun.security.krb5.KdcComm.send([email protected]/KdcComm.java:200)
>   at sun.security.krb5.KrbAsReqBuilder.send([email protected]/KrbAsReqBuilder.java:326)
>   at sun.security.krb5.KrbAsReqBuilder.action([email protected]/KrbAsReqBuilder.java:371)
>   at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication([email protected]/Krb5LoginModule.java:754)
>   at com.sun.security.auth.module.Krb5LoginModule.login([email protected]/Krb5LoginModule.java:592)
>   at javax.security.auth.login.LoginContext.invoke([email protected]/LoginContext.java:726)
>   at javax.security.auth.login.LoginContext$4.run([email protected]/LoginContext.java:665)
>   at javax.security.auth.login.LoginContext$4.run([email protected]/LoginContext.java:663)
>   at java.security.AccessController.doPrivileged([email protected]/Native Method)
>   at javax.security.auth.login.LoginContext.invokePriv([email protected]/LoginContext.java:663)
>   at javax.security.auth.login.LoginContext.login([email protected]/LoginContext.java:574)
>   at org.apache.zookeeper.Login.login(Login.java:304)
>   - locked <0x000000151c477148> (a org.apache.zookeeper.Login)
>   at org.apache.zookeeper.Login.<init>(Login.java:106)
>   at org.apache.zookeeper.client.ZooKeeperSaslClient.createSaslClient(ZooKeeperSaslClient.java:249)
>   - locked <0x000000151c476f68> (a org.apache.zookeeper.client.ZooKeeperSaslClient)
>   at org.apache.zookeeper.client.ZooKeeperSaslClient.<init>(ZooKeeperSaslClient.java:141)
>   at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:972)
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1031)
> {code}
> Note that today both ZooKeeperSaslClient and Login allocate resources in
> their constructors. Because the parent's reference to the object is still
> null while the constructor runs, those resources cannot be cleaned up or
> interrupted via close/shutdown/disconnect of the parent. This leaves the
> threads/sockets at the mercy of the configured KDC retry/timeout settings.
> This Jira is intended to split the constructor and the initialization path
> into separate methods and to properly clean up the resulting objects.
>
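For illustration only, the constructor/initialization split described in the
issue might look roughly like the sketch below. This is not the ZooKeeper
patch: TwoPhaseLogin, Connector and TwoPhaseDemo are hypothetical names, and
Thread.sleep stands in for the blocking KDC round trip. It assumes the
blocking step responds to Thread.interrupt; a plain socket connect may not,
in which case the real cleanup would probably have to close the underlying
socket instead.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a Login-like object; not the ZooKeeper class. */
final class TwoPhaseLogin implements AutoCloseable {
    private final CountDownLatch started = new CountDownLatch(1);
    private volatile Thread worker;   // thread currently inside initialize()
    private volatile boolean closed;

    // Phase 1: the constructor allocates nothing and cannot block.
    TwoPhaseLogin() { }

    // Phase 2: the slow work, kept out of the constructor.
    void initialize() throws InterruptedException {
        worker = Thread.currentThread();
        started.countDown();
        try {
            if (closed) {
                throw new InterruptedException("closed before initialize");
            }
            Thread.sleep(60_000); // simulated slow network call (assumption)
        } finally {
            worker = null;
        }
    }

    boolean awaitStarted(long timeout, TimeUnit unit) throws InterruptedException {
        return started.await(timeout, unit);
    }

    // Safe to call at any time, including while initialize() is blocked.
    @Override
    public void close() {
        closed = true;
        Thread t = worker;
        if (t != null) {
            t.interrupt();
        }
    }
}

/** Hypothetical parent that owns the login object. */
final class Connector implements AutoCloseable {
    // Construction is cheap, so the reference is never null when close() runs.
    final TwoPhaseLogin login = new TwoPhaseLogin();

    void connect() throws InterruptedException {
        login.initialize();
    }

    @Override
    public void close() {
        login.close();
    }
}

final class TwoPhaseDemo {
    /** Returns true if close() unblocked a thread stuck in initialize(). */
    static boolean closeUnblocksInitialize() {
        Connector connector = new Connector();
        AtomicBoolean interrupted = new AtomicBoolean(false);
        Thread sendThread = new Thread(() -> {
            try {
                connector.connect();
            } catch (InterruptedException e) {
                interrupted.set(true);
            }
        });
        sendThread.setDaemon(true);
        sendThread.start();
        try {
            if (!connector.login.awaitStarted(5, TimeUnit.SECONDS)) {
                return false;
            }
            connector.close();
            sendThread.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return interrupted.get() && !sendThread.isAlive();
    }
}
```

Because the parent holds a non-null reference before any blocking work
starts, close() on the parent can always reach the object, which is the
property the constructor-based code in the stack trace above lacks.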
--
This message was sent by Atlassian Jira
(v8.20.10#820010)