[ https://issues.apache.org/jira/browse/HADOOP-13381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384730#comment-15384730 ]
Xiao Chen commented on HADOOP-13381:
------------------------------------

I had an offline discussion with [~asuresh]; here are the minutes:
- Arun pointed out that KMSCP has {{authRetry}}: when {{authToken}} has expired, a new {{DelegationTokenAuthenticatedURL.Token}} is created and the call is retried. This doesn't help in our case, because [(code inside the call)|https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/web/DelegationTokenAuthenticatedURL.java#L290-L296] uses the UGI's credentials to get the kms-dt, which would be the same expired token.
- Regarding YARN log aggregation, I explained that MR jobs get tokens and run, and at the end the NM uses that job's tokens to do YARN log aggregation as a final MR job. So this part should be done as the MR user (as opposed to the NM user, yarn), since it writes to the MR user's dir {{/tmp/logs/user/....}}. cc [~rkanter] in case anything I said is not accurate.
- To minimize impact, we should only update {{kms-dt}} in the call.
- Arun has a general concern about updating the actualUgi's token, since the normal use case is doAs / proxy user. This could be enhanced in another jira.

(My thought after the discussion): to counter the race when multiple threads call the same cached KMSCP, we should create a new UGI object and update the tokens.

Will update the patch with more details.

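To illustrate the "create a new object and update the tokens" idea, here is a minimal sketch in plain Java. It is not the patch and does not use Hadoop's UGI or Credentials classes: {{TokenSnapshotHolder}}, {{snapshot()}} and {{replaceToken()}} are hypothetical names, and tokens are modeled as plain strings keyed by kind. The assumption is only that a cached client shared by several threads holds a set of tokens, and that one kind ({{kms-dt}}) needs to be swapped without other threads observing a half-updated set.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for a cached client that is shared across threads. */
class TokenSnapshotHolder {
    // Each snapshot is immutable; an update publishes a whole new map
    // instead of mutating the one that other threads may be reading.
    private final AtomicReference<Map<String, String>> tokens;

    TokenSnapshotHolder(Map<String, String> initial) {
        tokens = new AtomicReference<>(
            Collections.unmodifiableMap(new HashMap<>(initial)));
    }

    /** Returns the current immutable snapshot of tokens, keyed by kind. */
    Map<String, String> snapshot() {
        return tokens.get();
    }

    /** Replaces only the token of the given kind; all others are carried over. */
    void replaceToken(String kind, String newToken) {
        tokens.updateAndGet(old -> {
            Map<String, String> copy = new HashMap<>(old);
            copy.put(kind, newToken);
            return Collections.unmodifiableMap(copy);
        });
    }

    public static void main(String[] args) {
        Map<String, String> initial = new HashMap<>();
        initial.put("kms-dt", "expired-token");
        initial.put("hdfs-dt", "hdfs-token");
        TokenSnapshotHolder holder = new TokenSnapshotHolder(initial);

        // A thread that grabbed the snapshot earlier keeps a consistent view.
        Map<String, String> before = holder.snapshot();
        holder.replaceToken("kms-dt", "renewed-token");

        System.out.println("old view: " + before.get("kms-dt"));
        System.out.println("new view: " + holder.snapshot().get("kms-dt"));
    }
}
```

Only the {{kms-dt}} entry changes, which matches the "minimize impact" point above; whether the real fix should swap a whole UGI or just its credentials is left to the patch.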
> KMS clients running in the same JVM should use updated KMS Delegation Token
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-13381
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13381
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: kms
>    Affects Versions: 2.6.0
>            Reporter: Xiao Chen
>            Assignee: Xiao Chen
>            Priority: Critical
>         Attachments: HADOOP-13381.01.patch
>
>
> When {{/tmp}} is set up as an EZ, one may experience YARN log aggregation
> failure after the very first KMS token expires. The MR job itself runs
> fine though.
> When this happens, the YARN NodeManager's log will show
> {{AuthenticationException}} with {{token is expired}} / {{token can't be
> found in cache}}, depending on whether the expired token has been removed
> by the background or not.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)