Arun Suresh created HADOOP-11722:
------------------------------------
Summary: Some Instances of Services using
ZKDelegationTokenSecretManager go down when old token cannot be deleted
Key: HADOOP-11722
URL: https://issues.apache.org/jira/browse/HADOOP-11722
Project: Hadoop Common
Issue Type: Bug
Reporter: Arun Suresh
Assignee: Arun Suresh
The node deletion code in {{ZKDelegationTokenSecretManager}} is as follows:
{noformat}
while (zkClient.checkExists().forPath(nodeRemovePath) != null) {
  zkClient.delete().guaranteed().forPath(nodeRemovePath);
}
{noformat}
When several instances of a service using {{ZKDelegationTokenSecretManager}} try
to delete the same node simultaneously, all of them can pass the
{{checkExists()}} test and enter the while loop, in which case every peer tries
to delete the node. Only one delete succeeds; the rest throw an exception, which
brings down the instance that hit it.
The exception is as follows:
{noformat}
2015-03-15 10:24:54,000 ERROR org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: ExpiredTokenRemover thread received unexpected exception
java.lang.RuntimeException: Could not remove Stored Token ZKDTSMDelegationToken_28
    at org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.removeStoredToken(ZKDelegationTokenSecretManager.java:770)
    at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.removeExpiredToken(AbstractDelegationTokenSecretManager.java:605)
    at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.access$400(AbstractDelegationTokenSecretManager.java:54)
    at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager$ExpiredTokenRemover.run(AbstractDelegationTokenSecretManager.java:656)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /zkdtsm/ZKDTSMRoot/ZKDTSMTokensRoot/DT_28
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:873)
    at org.apache.curator.framework.imps.DeleteBuilderImpl$5.call(DeleteBuilderImpl.java:238)
    at org.apache.curator.framework.imps.DeleteBuilderImpl$5.call(DeleteBuilderImpl.java:233)
    at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
    at org.apache.curator.framework.imps.DeleteBuilderImpl.pathInForeground(DeleteBuilderImpl.java:230)
    at org.apache.curator.framework.imps.DeleteBuilderImpl.forPath(DeleteBuilderImpl.java:214)
    at org.apache.curator.framework.imps.DeleteBuilderImpl.forPath(DeleteBuilderImpl.java:41)
    at org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.removeStoredToken(ZKDelegationTokenSecretManager.java:764)
    ... 4 more
{noformat}
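One possible way to tolerate this race is to treat "node already gone" as a successful delete instead of an error. The sketch below only illustrates that idea and is not the committed fix. It uses only the JDK so it can run without a ZooKeeper ensemble: {{FakeStore}}, the local {{NoNodeException}}, {{removeTolerant}} and {{raceRemovals}} are hypothetical stand-ins. In the real code the catch clause would presumably target {{org.apache.zookeeper.KeeperException.NoNodeException}}, the cause shown in the trace above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class IdempotentDeleteSketch {

    /** Hypothetical stand-in for KeeperException.NoNodeException. */
    static final class NoNodeException extends Exception {
        NoNodeException(String path) {
            super("NoNode for " + path);
        }
    }

    /** Hypothetical in-memory stand-in for the shared ZooKeeper namespace. */
    static final class FakeStore {
        private final Set<String> nodes = ConcurrentHashMap.newKeySet();

        void create(String path) {
            nodes.add(path);
        }

        boolean exists(String path) {
            return nodes.contains(path);
        }

        /** Like a ZooKeeper delete: fails if the node is not there. */
        void delete(String path) throws NoNodeException {
            if (!nodes.remove(path)) {
                throw new NoNodeException(path);
            }
        }
    }

    /** Deletes the node, treating "already deleted by a peer" as success. */
    static void removeTolerant(FakeStore store, String path) {
        try {
            store.delete(path);
        } catch (NoNodeException e) {
            // Another peer won the race; the node is gone, which is the goal.
        }
    }

    /** Lets several peers delete the same node at once; returns how many failed. */
    static int raceRemovals(int peers) {
        FakeStore store = new FakeStore();
        String path = "/zkdtsm/ZKDTSMRoot/ZKDTSMTokensRoot/DT_28";
        store.create(path);

        CountDownLatch start = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(peers);
        List<Future<?>> results = new ArrayList<>();
        for (int i = 0; i < peers; i++) {
            results.add(pool.submit(() -> {
                start.await();               // release all peers together
                removeTolerant(store, path);
                return null;
            }));
        }
        start.countDown();

        int failures = 0;
        for (Future<?> result : results) {
            try {
                result.get();
            } catch (ExecutionException e) {
                failures++;                  // a peer's removal threw
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                failures++;
            }
        }
        pool.shutdown();
        if (store.exists(path)) {
            failures++;                      // node should be gone afterwards
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.println("failed removals: " + raceRemovals(8));
    }
}
```

With this shape the check-then-delete loop is no longer needed for correctness, since a lost race is absorbed in the catch block rather than propagated to the {{ExpiredTokenRemover}} thread.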