[ 
https://issues.apache.org/jira/browse/HDFS-5322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13794288#comment-13794288
 ] 

Jing Zhao commented on HDFS-5322:
---------------------------------

bq. Isn't it wrong for the NN to claim it's active/writable when it's not? It 
seems like another state is needed to indicate a transition is in progress - 
and that state indicates the namespace isn't writable.

Agreed. I think the current code tries to achieve this through the FSNamesystem 
R/W lock: the startActiveService method holds the write lock and blocks other 
methods. But since we remove the FSNamesystem lock in retrievePassword, the 
original mechanism does not work for the delegation token part. We should file 
a separate jira to track this.
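
To illustrate the point about the lock, here is a minimal sketch. It is not the
actual FSNamesystem code: the class TransitionLockSketch, the methods
startActive, retrievePasswordLocked and retrievePasswordUnlocked, and the
in-memory token cache are all hypothetical stand-ins. It only assumes that the
transition holds a write lock while tokens are loaded, and that a lookup which
skips the lock can run in the middle of that transition.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch only; names do not mirror the real Hadoop classes.
class TransitionLockSketch {
    private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();
    private final Map<Integer, String> tokenCache = new ConcurrentHashMap<>();
    private volatile boolean active = false;

    // Stand-in for the transition to active. The write lock is held for the
    // whole transition, so any method that takes the lock waits until the end.
    void startActive(Map<Integer, String> tokensToLoad, Runnable midTransitionHook) {
        fsLock.writeLock().lock();
        try {
            active = true;             // assumption: state is advertised early
            midTransitionHook.run();   // a window where tokens are not loaded yet
            tokenCache.putAll(tokensToLoad);
        } finally {
            fsLock.writeLock().unlock();
        }
    }

    boolean isActive() {
        return active;
    }

    // Lookup under the read lock: blocks while a transition is in progress.
    String retrievePasswordLocked(int tokenId) {
        fsLock.readLock().lock();
        try {
            return lookup(tokenId);
        } finally {
            fsLock.readLock().unlock();
        }
    }

    // Lookup without the lock: can observe the half-finished transition.
    String retrievePasswordUnlocked(int tokenId) {
        return lookup(tokenId);
    }

    private String lookup(int tokenId) {
        String password = tokenCache.get(tokenId);
        if (password == null) {
            throw new IllegalStateException(
                "token " + tokenId + " can't be found in cache");
        }
        return password;
    }

    public static void main(String[] args) {
        TransitionLockSketch nn = new TransitionLockSketch();
        nn.startActive(Map.of(11, "secret"), () -> {
            try {
                nn.retrievePasswordUnlocked(11);
            } catch (IllegalStateException e) {
                System.out.println("unlocked lookup during transition: " + e.getMessage());
            }
        });
        System.out.println("locked lookup after transition: "
            + nn.retrievePasswordLocked(11));
    }
}
```

In this sketch the lock is the only thing that hides the in-progress
transition from callers, so a code path that drops the lock needs some other
signal (for example an explicit "transitioning" state) to avoid reporting a
missing token.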

> HDFS delegation token not found in cache errors seen on secure HA clusters
> --------------------------------------------------------------------------
>
>                 Key: HDFS-5322
>                 URL: https://issues.apache.org/jira/browse/HDFS-5322
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha
>    Affects Versions: 2.1.1-beta
>            Reporter: Arpit Gupta
>            Assignee: Jing Zhao
>             Fix For: 2.2.1
>
>         Attachments: HDFS-5322.000.patch, HDFS-5322.000.patch, 
> HDFS-5322.001.patch, HDFS-5322.002.patch, HDFS-5322.003.patch, 
> HDFS-5322.004.patch, HDFS-5322.005.patch, HDFS-5322.006.patch
>
>
> While running HA tests we have seen HDFS delegation token not found in cache 
> errors that cause running jobs to fail.
> {code}
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
> |2013-10-06 20:14:51,193 INFO  [main] mapreduce.Job: Task Id : 
> attempt_1381090351344_0001_m_000007_0, Status : FAILED
> Error: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (HDFS_DELEGATION_TOKEN token 11 for hrt_qa) can't be found in cache
> at org.apache.hadoop.ipc.Client.call(Client.java:1347)
> at org.apache.hadoop.ipc.Client.call(Client.java:1300)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy10.getBlockLocations(Unknown Source)
> {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)
