[ https://issues.apache.org/jira/browse/HDFS-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747031#comment-13747031 ]
Jing Zhao commented on HDFS-5124:
---------------------------------
Analysis from [~cnauroth]:
"From a very quick scan, it looks to me like it's related to HADOOP-9880. With
this patch, we now have a lock ordering conflict around the namesystem lock and
synchronized methods on the DelegationTokenSecretManager. Example:

RPC handler thread 1 is running a cancelDelegationToken:
  1. Acquire FSNamesystem write lock in FSNamesystem.cancelDelegationToken.
  2. Call DelegationTokenSecretManager.cancelToken, which is synchronized.

RPC handler thread 2 is negotiating SASL for a message:
  1. Call DelegationTokenSecretManager.retrievePassword, which is synchronized.
  2. Acquire FSNamesystem read lock in
     DelegationTokenSecretManager.retrievePassword.

(Same instance of FSNamesystem lock and DelegationTokenSecretManager accessed
in both threads, with different locking orders.)"
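
The interleaving described above can be reproduced in isolation. The following
is a minimal sketch, not the Hadoop code: namesystemLock and
secretManagerMonitor are hypothetical stand-ins for the FSNamesystem lock and
the DelegationTokenSecretManager monitor, and the latches exist only to force
the problematic ordering. Thread 2 uses a timed tryLock so the sketch
terminates; with a plain lock() call, as in the scenario above, both threads
would presumably block forever.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LockOrderSketch {
    // Hypothetical stand-in for the FSNamesystem read/write lock.
    static final ReentrantReadWriteLock namesystemLock = new ReentrantReadWriteLock();
    // Hypothetical stand-in for the DelegationTokenSecretManager monitor.
    static final Object secretManagerMonitor = new Object();

    // Waits on a latch, converting interruption into an unchecked exception.
    static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Returns true if thread 2 could not get the read lock while holding the monitor. */
    static boolean demonstrateConflict() {
        CountDownLatch writeLockHeld = new CountDownLatch(1);
        CountDownLatch monitorHeld = new CountDownLatch(1);
        AtomicBoolean readLockTimedOut = new AtomicBoolean(false);

        // Thread 1 (cancelDelegationToken path): write lock first, then monitor.
        Thread cancel = new Thread(() -> {
            namesystemLock.writeLock().lock();
            try {
                writeLockHeld.countDown();
                await(monitorHeld);
                synchronized (secretManagerMonitor) {
                    // cancelToken work would happen here
                }
            } finally {
                namesystemLock.writeLock().unlock();
            }
        });

        // Thread 2 (retrievePassword path): monitor first, then read lock.
        Thread sasl = new Thread(() -> {
            synchronized (secretManagerMonitor) {
                monitorHeld.countDown();
                await(writeLockHeld);
                try {
                    // Timed attempt so the sketch terminates instead of hanging.
                    if (namesystemLock.readLock().tryLock(200, TimeUnit.MILLISECONDS)) {
                        namesystemLock.readLock().unlock();
                    } else {
                        readLockTimedOut.set(true);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });

        cancel.start();
        sasl.start();
        try {
            cancel.join();
            sasl.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return readLockTimedOut.get();
    }

    public static void main(String[] args) {
        System.out.println("lock-order conflict observed: " + demonstrateConflict());
    }
}
```

In this sketch the timed attempt fails because thread 1 cannot release the
write lock until thread 2 gives up the monitor. Acquiring the two locks in the
same order on both paths, or not taking the second lock while holding the
first, would avoid the cycle.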
> Namenode in secure cluster deadlocks
> ------------------------------------
>
> Key: HDFS-5124
> URL: https://issues.apache.org/jira/browse/HDFS-5124
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.1.1-beta
> Environment: Secure Hadoop 2 cluster
> Reporter: Deepesh Khandelwal
> Assignee: Jing Zhao
> Priority: Blocker
> Attachments: nn_jstack.out
>
>
> Namenode deadlocks after a while in use.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira