[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14293809#comment-14293809
 ] 

Jason Lowe commented on MAPREDUCE-6230:
---------------------------------------

There are a couple of reasons this doesn't work:

1) The token handed to the AM from the RM does not have a service set, since 
the RM does not know the service name the client uses to connect to the RM.  
Unfortunately, the RMContainerAllocator never updates the service name of the 
token, so the RPC layer does not select the token when reconnecting to the RM.
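
To illustrate point 1, here is a minimal sketch of service-based token 
selection.  It uses simplified stand-in classes, not the Hadoop ones: 
SelToken, ServiceSelectionSketch, the "AMRM" kind label and the 
"rm-host:8030" address are all hypothetical, and the assumption is that the 
selector requires both kind and service to match.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for a security token; not a Hadoop class.
class SelToken {
    final String kind;
    String service; // stays empty until someone sets it

    SelToken(String kind, String service) {
        this.kind = kind;
        this.service = service;
    }
}

class ServiceSelectionSketch {
    // Returns the first token whose kind and service both match, else null.
    static SelToken select(List<SelToken> tokens, String kind, String service) {
        for (SelToken t : tokens) {
            if (t.kind.equals(kind) && t.service.equals(service)) {
                return t;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String rmService = "rm-host:8030"; // hypothetical client-side service name
        List<SelToken> tokens = new ArrayList<>();
        SelToken fresh = new SelToken("AMRM", ""); // handed out with no service set
        tokens.add(fresh);

        // No service on the token, so nothing is selected.
        System.out.println(select(tokens, "AMRM", rmService) != null); // prints false

        // Setting the service is the step the allocator never performs.
        fresh.service = rmService;
        System.out.println(select(tokens, "AMRM", rmService) != null); // prints true
    }
}
```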

2) When security is enabled, the RMContainerAllocator for some reason updates 
the token on the login user rather than on the current user.  This ends up 
updating a token in a UGI that is _not_ being used by the IPC layer when 
reconnecting to the RM on a secure cluster.
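
Point 2 can be sketched the same way.  The classes below are simplified 
stand-ins and not Hadoop's UserGroupInformation; the names SketchUser, 
UserMismatchSketch and tokenSeenByIpc, and the token payload strings, are 
hypothetical.  The only assumption is that the connection layer reads tokens 
from the current user.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for a user identity that carries tokens.
class SketchUser {
    final String name;
    final Map<String, String> tokens = new HashMap<>(); // alias -> token payload

    SketchUser(String name) {
        this.name = name;
    }
}

class UserMismatchSketch {
    // What the connection layer would present: the token held by this user.
    static String tokenSeenByIpc(SketchUser currentUser, String alias) {
        return currentUser.tokens.get(alias);
    }

    public static void main(String[] args) {
        SketchUser loginUser = new SketchUser("login");
        SketchUser currentUser = new SketchUser("job-owner");
        currentUser.tokens.put("amrm", "token-from-secret-X");

        // The update lands on the login user, which the connection never reads.
        loginUser.tokens.put("amrm", "token-from-secret-Y");
        System.out.println(tokenSeenByIpc(currentUser, "amrm")); // prints token-from-secret-X

        // Updating the current user is what makes the new token visible.
        currentUser.tokens.put("amrm", "token-from-secret-Y");
        System.out.println(tokenSeenByIpc(currentUser, "amrm")); // prints token-from-secret-Y
    }
}
```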

Also note that the credentials map the AMRM token under an empty service 
name.  If the token's service is updated before the token is added to the 
credentials, we can end up with two AMRM tokens (one with an empty service key 
and one with a non-empty service key).  So when adding the new token to the 
credentials we need to make sure it actually clobbers the old token, which 
means using a consistent service/alias that we cannot obtain directly from the 
credentials.
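
The duplicate-token hazard can be sketched with a plain map standing in for 
the credentials.  This is an illustration only: AliasClobberSketch, its add 
method, the payload strings and the "rm-host:8030" service are hypothetical, 
and it assumes the alias used at add time is whatever service the token 
carries at that moment.

```java
import java.util.HashMap;
import java.util.Map;

class AliasClobberSketch {
    // Stores the payload under the alias, replacing any existing entry,
    // and returns how many entries the credentials now hold.
    static int add(Map<String, String> creds, String alias, String payload) {
        creds.put(alias, payload);
        return creds.size();
    }

    public static void main(String[] args) {
        // The original AMRM token sits under an empty alias.
        Map<String, String> creds = new HashMap<>();
        creds.put("", "old-amrm-token");

        // Service set first, then used as the alias: the old entry survives.
        System.out.println(add(creds, "rm-host:8030", "new-amrm-token")); // prints 2

        // Reusing the original empty alias clobbers the old token instead.
        Map<String, String> creds2 = new HashMap<>();
        creds2.put("", "old-amrm-token");
        System.out.println(add(creds2, "", "new-amrm-token")); // prints 1
        System.out.println(creds2.get("")); // prints new-amrm-token
    }
}
```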

> MR AM does not survive RM restart if RM activated a new AMRM secret key
> -----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6230
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6230
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mr-am
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Blocker
>
> A MapReduce AM will fail to reconnect to an RM that performed a restart in 
> the following scenario:
> # MapReduce job launched with AMRM token generated from AMRM secret X
> # RM rolls new AMRM secret Y and activates the new key
> # RM performs a work-preserving restart
> # MapReduce job AM now unable to connect to RM with "Invalid AMRMToken" 
> exception



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
