[ https://issues.apache.org/jira/browse/HADOOP-13381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385165#comment-15385165 ]
Xiao Chen commented on HADOOP-13381:
------------------------------------

Thanks [~asuresh] for the quick response!

The flow you mentioned would work, assuming we loosen the retry check on the response message ([these lines|https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/kms/KMSClientProvider.java#L584-L587]) and add the remove-token method to UGI.

On the multi-threading side, did I miss anything? If many threads running in {{LogAggregationService}} try to do log aggregation and end up with the same cached KMSCP, would this cause a race? IMO this problem existed before this patch, but maybe I missed something... I don't think the cached {{authToken}} works under this scenario.

> KMS clients running in the same JVM should use updated KMS Delegation Token
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-13381
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13381
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: kms
>    Affects Versions: 2.6.0
>            Reporter: Xiao Chen
>            Assignee: Xiao Chen
>            Priority: Critical
>         Attachments: HADOOP-13381.01.patch
>
>
> When {{/tmp}} is set up as an EZ, one may experience YARN log aggregation
> failure after the very first KMS token has expired. The MR job itself runs
> fine though.
> When this happens, YARN NodeManager's log will show
> {{AuthenticationException}} with {{token is expired}} / {{token can't be
> found in cache}}, depending on whether the expired token has been removed
> in the background or not.