[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Norbert Kalmár reassigned ZOOKEEPER-4367:
-----------------------------------------

    Assignee: Norbert Kalmár

> Zookeeper#Login thread leak in case of Sasl AuthFailed.
> -------------------------------------------------------
>
>                 Key: ZOOKEEPER-4367
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4367
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: java client, kerberos
>    Affects Versions: 3.4.13
>            Reporter: Rushabh Shah
>            Assignee: Norbert Kalmár
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 3.5.10, 3.8.0, 3.7.1
>
>          Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> We are seeing thousands of Zookeeper#Login threads leak in our production 
> clusters.
> [ZooKeeperSaslClient#createSaslClient|https://github.com/apache/zookeeper/blob/branch-3.4.13/src/java/main/org/apache/zookeeper/client/ZooKeeperSaslClient.java#L205]
>  creates the Login thread.
> [ZooKeeperSaslClient#createSaslToken 
> |https://github.com/apache/zookeeper/blob/branch-3.4.13/src/java/main/org/apache/zookeeper/client/ZooKeeperSaslClient.java#L310]
>  throws a SaslException, which propagates all the way back to the 
> [ClientCnxn#SendThread#run|https://github.com/apache/zookeeper/blob/branch-3.4.13/src/java/main/org/apache/zookeeper/ClientCnxn.java#L1074]
>  method.
> [ClientCnxn#SendThread#run|https://github.com/apache/zookeeper/blob/branch-3.4.13/src/java/main/org/apache/zookeeper/ClientCnxn.java#L1075-L1078]
>  handles the SaslException by setting the state to AUTH_FAILED, queueing the 
> eventOfDeath for the EventThread, and exiting/cleaning up the SendThread, but 
> it does NOT close the zookeeperSaslClient, which in turn would shut down the 
> Login thread.
> Logs are added below for one failed connection.
> {noformat}
> `20210831053800.393 jute.maxbuffer value is 4194304 Bytes
> `20210831053800.393 Initiating client connection, 
> connectString=<zookeeper-ensemble string> sessionTimeout=4000 
> watcher=org.apache.curator.ConnectionState@7b974f93
> `20210831053800.401 zookeeper.request.timeout value is 10000. feature enabled=
> `20210831053800.404 Client successfully logged in.
> `20210831053800.405 Client will use GSSAPI as SASL mechanism.
> `20210831053800.405 TGT refresh sleeping until: Wed Sep 01 00:59:06 GMT 2021
> `20210831053800.405 TGT refresh thread started.
> `20210831053800.405 TGT valid starting at:        Tue Aug 31 05:38:00 GMT 2021
> `20210831053800.405 TGT expires:                  Wed Sep 01 05:38:00 GMT 2021
> `20210831053800.407 Opening socket connection to server <zookeeper-server-1>. 
> Will attempt to SASL-authenticate using Login Context section 'Client'
> `20210831053800.419 Socket connection established to <zookeeper-server-1>, 
> initiating session
> `20210831053800.435 Session establishment complete on server 
> <zookeeper-server-1>, sessionid = 0x1000004066cc52b, negotiated timeout = 6000
> `20210831053800.438 An error: (java.security.PrivilegedActionException: 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Server not 
> found in Kerberos database (7) - UNKNOWN_SERVER)]) occurred when evaluating 
> Zookeeper Quorum Member's  received SASL token. This may be caused by Java's 
> being unable to resolve the Zookeeper Quorum Member's hostname correctly. You 
> may want to try to adding '-Dsun.net.spi.nameservice.provider.1=dns,sun' to 
> your client's JVMFLAGS environment. Zookeeper Client will go to AUTH_FAILED 
> state.
> `20210831053800.438 EventThread shut down for session: 0x1000004066cc52b
> `20210831053800.438 SASL authentication with Zookeeper Quorum member failed: 
> javax.security.sasl.SaslException: An error: 
> (java.security.PrivilegedActionException: javax.security.sasl.SaslException: 
> GSS initiate failed [Caused by GSSException: No valid credentials provided 
> (Mechanism level: Server not found in Kerberos database (7) - 
> UNKNOWN_SERVER)]) occurred when evaluating Zookeeper Quorum Member's  
> received SASL token. This may be caused by Java's being unable to resolve the 
> Zookeeper Quorum Member's hostname correctly. You may want to try to adding 
> '-Dsun.net.spi.nameservice.provider.1=dns,sun' to your client's JVMFLAGS 
> environment. Zookeeper Client will go to AUTH_FAILED state.
> {noformat}
> What is the correct way to shut down the Login thread in case of a 
> SaslException?
> We use the Curator framework to connect to ZooKeeper.
> We fixed a similar bug, ZOOKEEPER-3059, where we were leaking EventThreads.
> This one is similar, except that it leaks Login threads. Please help.
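
The leak pattern described in the report, and one possible way to avoid it, can be sketched in isolation. This is a minimal, self-contained illustration and not ZooKeeper code: `RefreshingLogin`, `SaslClientSketch`, and `SendLoopSketch` are hypothetical stand-ins for the Login thread owner, the SASL client, and the send loop, and the simulated failure only mimics the SaslException from the logs. The assumption is that whoever handles the authentication failure also shuts down the object that owns the background thread.

```java
import javax.security.sasl.SaslException;

// Hypothetical stand-in for a login helper that owns a background refresh thread.
final class RefreshingLogin {
    private final Thread refreshThread;

    RefreshingLogin(String name) {
        refreshThread = new Thread(() -> {
            try {
                // Stand-in for "sleep until the next ticket refresh".
                Thread.sleep(Long.MAX_VALUE);
            } catch (InterruptedException e) {
                // Interrupted by shutdown(): fall through so the thread ends.
            }
        }, name);
        refreshThread.setDaemon(true);
        refreshThread.start();
    }

    // Stops the refresh thread and waits briefly for it to finish.
    void shutdown() {
        refreshThread.interrupt();
        try {
            refreshThread.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    boolean isAlive() {
        return refreshThread.isAlive();
    }
}

// Hypothetical stand-in for a SASL client that creates the login on construction.
final class SaslClientSketch {
    final RefreshingLogin login = new RefreshingLogin("Login-sketch");

    // Simulates the token exchange failing, as in the logs above.
    byte[] createToken() throws SaslException {
        throw new SaslException("GSS initiate failed (simulated)");
    }

    void shutdown() {
        login.shutdown();
    }
}

// Hypothetical stand-in for the send loop that handles the failure.
final class SendLoopSketch {
    // Runs one failed authentication attempt and returns the client for inspection.
    static SaslClientSketch runOnce() {
        SaslClientSketch sasl = new SaslClientSketch();
        try {
            sasl.createToken();
        } catch (SaslException e) {
            // Here the real client would move to AUTH_FAILED and queue the
            // event of death; this sketch only swallows the simulated error.
        } finally {
            // Without this call the refresh thread would outlive the failed
            // connection, which is the leak described in the report.
            sasl.shutdown();
        }
        return sasl;
    }
}
```

After `SendLoopSketch.runOnce()` returns, `login.isAlive()` on the returned client should report that the refresh thread has ended. Removing the `finally` block leaves one such thread behind per failed attempt.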



