[ 
https://issues.apache.org/jira/browse/HDFS-5322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13794166#comment-13794166
 ] 

Daryn Sharp commented on HDFS-5322:
-----------------------------------

bq. During the transition (Standby -> Active), the current code first sets the 
state of the NN to Active, then starts the active service, during which the NN 
still needs to tail the remaining editlog

This is what I was questioning.  Isn't it wrong for the NN to claim it's 
active/writable when it's not?  It seems like another state is needed to 
indicate a transition is in progress - and that state indicates the namespace 
isn't writable.

Otherwise, Kerberos and known-token connections are going to block all the 
handler threads during the transition.  That means HA admin commands may 
also become blocked during the transition, which may be a serious problem.
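
To make the suggestion concrete, here is a rough sketch of what I mean by an 
extra state.  All names (HaTransitionSketch, TRANSITIONING_TO_ACTIVE, 
checkWritable, adminStatus) are hypothetical and are not the existing NN HA 
classes or any of the attached patches; it only illustrates failing fast on 
writes while the transition is in progress instead of tying up handlers.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch only; not the actual NameNode HA state classes. */
class HaTransitionSketch {
    /** TRANSITIONING_TO_ACTIVE is the assumed extra state: not yet writable. */
    enum State { STANDBY, TRANSITIONING_TO_ACTIVE, ACTIVE }

    /** Thrown right away instead of letting the caller wait on the transition. */
    static class NotWritableException extends RuntimeException {
        NotWritableException(String msg) { super(msg); }
    }

    private final AtomicReference<State> state =
        new AtomicReference<>(State.STANDBY);

    State state() { return state.get(); }

    /** Step 1: record that a transition has started; still not writable. */
    boolean beginTransitionToActive() {
        return state.compareAndSet(State.STANDBY, State.TRANSITIONING_TO_ACTIVE);
    }

    /** Step 2: to be called only after the remaining edits have been tailed. */
    boolean finishTransitionToActive() {
        return state.compareAndSet(State.TRANSITIONING_TO_ACTIVE, State.ACTIVE);
    }

    /** Write operations are rejected unless the state is fully ACTIVE. */
    void checkWritable() {
        State s = state.get();
        if (s != State.ACTIVE) {
            throw new NotWritableException("Namespace not writable in state " + s);
        }
    }

    /** Admin-style status queries are answered in every state. */
    String adminStatus() { return state.get().name(); }
}
```

With something along these lines, a write arriving mid-transition would get 
an immediate error the client can retry on, while a status query from an HA 
admin command could still be answered.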

> HDFS delegation token not found in cache errors seen on secure HA clusters
> --------------------------------------------------------------------------
>
>                 Key: HDFS-5322
>                 URL: https://issues.apache.org/jira/browse/HDFS-5322
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha
>    Affects Versions: 2.1.1-beta
>            Reporter: Arpit Gupta
>            Assignee: Jing Zhao
>             Fix For: 2.2.1
>
>         Attachments: HDFS-5322.000.patch, HDFS-5322.000.patch, 
> HDFS-5322.001.patch, HDFS-5322.002.patch, HDFS-5322.003.patch, 
> HDFS-5322.004.patch, HDFS-5322.005.patch, HDFS-5322.006.patch
>
>
> While running HA tests we have seen "HDFS delegation token not found in 
> cache" errors that cause running jobs to fail.
> {code}
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
> |2013-10-06 20:14:51,193 INFO  [main] mapreduce.Job: Task Id : attempt_1381090351344_0001_m_000007_0, Status : FAILED
> Error: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token 11 for hrt_qa) can't be found in cache
> at org.apache.hadoop.ipc.Client.call(Client.java:1347)
> at org.apache.hadoop.ipc.Client.call(Client.java:1300)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy10.getBlockLocations(Unknown Source)
> {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)
