[
https://issues.apache.org/jira/browse/ZOOKEEPER-2323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15013605#comment-15013605
]
Bob commented on ZOOKEEPER-2323:
--------------------------------
From the code flow, the exception thrown by ZooKeeperSaslClient in the above
scenario appears to have the following cause:
During the network disconnection, a relogin happened, which cleared the
contents of the Subject object. After the network recovers, the client checks
for the TGT first when it tries to set up a connection, but the TGT will not be
available until the next relogin operation happens.
So I think that when the client sets up a connection after network recovery,
it should do a relogin first if a relogin happened while the network was
disconnected.
[~arshad.mohammad], any thoughts?
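
To make the proposal above concrete, here is a minimal sketch of the idea using
only the standard JAAS classes. It is not the ZooKeeper code and not the
attached patch: the class TgtCheck, the methods hasUsableTgt and
ensureTgtBeforeConnect, and the relogin callback are hypothetical names used
for illustration only.

```java
import javax.security.auth.Subject;
import javax.security.auth.kerberos.KerberosTicket;

// Hypothetical helper, not part of ZooKeeper.
final class TgtCheck {

    /** Returns true if the Subject still holds a current krbtgt ticket. */
    static boolean hasUsableTgt(Subject subject) {
        for (KerberosTicket ticket
                : subject.getPrivateCredentials(KerberosTicket.class)) {
            // A destroyed ticket has its fields cleared, so skip it.
            if (ticket.isDestroyed() || ticket.getServer() == null) {
                continue;
            }
            boolean isTgt = ticket.getServer().getName().startsWith("krbtgt/");
            if (isTgt && ticket.isCurrent()) {
                return true;
            }
        }
        return false;
    }

    /**
     * Assumed hook to be called before connection setup: if the Subject has
     * no usable TGT (for example because a relogin during the disconnection
     * cleared it), run the supplied relogin action first.
     * Returns true if a relogin was triggered.
     */
    static boolean ensureTgtBeforeConnect(Subject subject, Runnable relogin) {
        if (hasUsableTgt(subject)) {
            return false;
        }
        relogin.run();
        return true;
    }
}
```

In this sketch the relogin action is passed in by the caller, so the check
itself does not depend on how the client performs its login.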
> ZooKeeper client enters into infinite AuthFailedException cycle if its unable
> to recreate Kerberos ticket
> ---------------------------------------------------------------------------------------------------------
>
> Key: ZOOKEEPER-2323
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2323
> Project: ZooKeeper
> Issue Type: Bug
> Components: java client
> Affects Versions: 3.4.7, 3.5.1
> Reporter: Arshad Mohammad
> Assignee: Arshad Mohammad
> Fix For: 3.4.8, 3.5.2, 3.6.0
>
> Attachments: ZOOKEEPER-2323-01.patch
>
>
> The ZooKeeper client enters an infinite AuthFailedException cycle. For every
> operation it throws AuthFailedException.
> Here is the exception from the create operation:
> {code}
> org.apache.zookeeper.KeeperException$AuthFailedException: KeeperErrorCode =
> AuthFailed for /continuousRunningZKClient
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1753)
> {code}
> This can be reproduced easily with the following steps:
> # Reduce the maximum ticket life of the ZooKeeper client principal, for
> example to 2 minutes, using the command
> {color:blue} modprinc -maxlife 2min zkcli {color} in kadmin. (This is done to
> reduce the time needed to reproduce the issue.)
> # Connect the Client to the ZooKeeper quorum, let it get connected, and
> complete some operations successfully.
> # Disconnect the Client's network, by pulling out the Ethernet cable or by
> any other means. Now the Client is in the disconnected state, no operation
> is expected, and the Client tries to reconnect to different servers in the
> ZooKeeper quorum.
> # After two minutes the Client tries to get a new Kerberos ticket and fails.
> # Connect the Client to the network. The Client enters the connected state
> but throws AuthFailedException for every operation.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)