[ 
https://issues.apache.org/jira/browse/HADOOP-13381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385314#comment-15385314
 ] 

Xiao Chen commented on HADOOP-13381:
------------------------------------

Thank you for the continued discussion, Arun.

Sorry, I missed one point in your proposal: I don't think it would work as we hoped.
bq. 4. Then, we let the retry happen, at which point it will get a new 
delegation token.
IIUC, the {{authToken}} only caches past successful authentications (so we 
don't have to authenticate every time). It does not 'get a new delegation 
token'. Instead, the {{kms-dt}} is read from the UGI's current user inside 
{{DelegationTokenAuthenticatedURL#openConnection}}, which runs inside the 
{{actualUgi.doAs}} in {{KMSCP#createConnection}}. So retries will still see the 
same expired DT (or no DT at all if we remove it). We would have to get the DT 
from the UGI's current user before {{actualUgi.doAs}}... right?
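
To make the point above concrete, here is a minimal plain-Java sketch. It does not use the Hadoop classes: {{FakeUgi}}, {{tokenSeenInsideDoAs}} and {{tokenAfterRetries}} are hypothetical stand-ins, and the token strings are made up. It only models the claim that a lookup done inside the doAs keeps returning whatever the actual UGI holds, no matter how often we retry.

```java
import java.util.HashMap;
import java.util.Map;

public class RetrySeesSameToken {
    /** Hypothetical stand-in for a UGI: a user name plus a mutable token map. */
    static final class FakeUgi {
        final String user;
        final Map<String, String> tokens = new HashMap<>();

        FakeUgi(String user) {
            this.user = user;
        }
    }

    /** Models the lookup inside the doAs: the token comes from the UGI we run as. */
    static String tokenSeenInsideDoAs(FakeUgi actualUgi, String kind) {
        return actualUgi.tokens.get(kind);
    }

    /** Repeats the lookup without touching actualUgi, as a plain retry would. */
    static String tokenAfterRetries(FakeUgi actualUgi, int attempts) {
        String seen = null;
        for (int i = 0; i < attempts; i++) {
            seen = tokenSeenInsideDoAs(actualUgi, "kms-dt");
        }
        return seen;
    }

    public static void main(String[] args) {
        FakeUgi actualUgi = new FakeUgi("mapred");
        actualUgi.tokens.put("kms-dt", "expired-dt");
        FakeUgi currentUgi = new FakeUgi("impala");
        currentUgi.tokens.put("kms-dt", "fresh-dt");

        // Retrying alone changes nothing: prints "expired-dt".
        System.out.println(tokenAfterRetries(actualUgi, 3));

        // Only copying the token from the current user before the doAs helps:
        // prints "fresh-dt".
        actualUgi.tokens.put("kms-dt", currentUgi.tokens.get("kms-dt"));
        System.out.println(tokenAfterRetries(actualUgi, 1));
    }
}
```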

Let me elaborate on the race I was thinking of.
I ran a test as follows:
# set up {{/tmp}} as an encryption zone (EZ)
# run an MR job (wordcount) as user {{mapred}}, over {{/tmp}}. Let's call this 
job1.
# run an MR job (wordcount) as user {{impala}}, over {{/tmp}}. Let's call this 
job2.
# collect the logs below from my custom logging in {{KMSCP#createConnection}}

{noformat}
2016-07-19 14:35:18,306 INFO 
org.apache.hadoop.crypto.key.kms.KMSClientProvider: ==== currentUGI:impala 
(auth:SIMPLE) creds: [Kind: kms-dt, Service: 172.31.9.35:16000, Ident: 00 06 69 
6d 70 61 6c 61 04 79 61 72 6e 00 8a 01 56 05 15 10 22 8a 01 56 05 17 cf 42 02 
02, Kind: mapreduce.job, Service: job_1468963667277_0002, Ident: 
(org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@2e951fb5), Kind: 
HDFS_DELEGATION_TOKEN, Service: 172.31.9.72:8020, Ident: (token for impala: 
HDFS_DELEGATION_TOKEN owner=imp...@gce.cloudera.com, renewer=yarn, realUser=, 
issueDate=1468964081478, maxDate=1468964381478, sequenceNumber=216, 
masterKeyId=20)]
2016-07-19 14:35:18,307 INFO 
org.apache.hadoop.crypto.key.kms.KMSClientProvider: ==== actualUGI: mapred 
(auth:SIMPLE) creds: [Kind: kms-dt, Service: 172.31.9.35:16000, Ident: 00 06 6d 
61 70 72 65 64 04 79 61 72 6e 00 8a 01 56 05 11 b5 db 8a 01 56 05 14 74 fb 01 
02, Kind: mapreduce.job, Service: job_1468963667277_0001, Ident: 
(org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@7fdacda0), Kind: 
HDFS_DELEGATION_TOKEN, Service: 172.31.9.72:8020, Ident: (token for mapred: 
HDFS_DELEGATION_TOKEN owner=map...@gce.cloudera.com, renewer=yarn, realUser=, 
issueDate=1468963861782, maxDate=1468964161782, sequenceNumber=215, 
masterKeyId=20)]
{noformat}
Note that here the actual UGI is entirely mapred's, while the current UGI is 
impala's. If job1 is about to call {{actualUgi.doAs}} while job2 updates the 
credentials in {{actualUgi}}, job1 will then see job2's DT when the invocation 
reaches {{DelegationTokenAuthenticatedURL}} (DTAURL)... right?
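
The interleaving I have in mind can be replayed deterministically in plain Java. This is only a sketch under assumptions: {{FakeUgi}}, {{publishToken}} and {{doAsAndReadToken}} are hypothetical names, no real Hadoop API is involved, and "publishing" stands for whatever update we would make to the credentials in the shared actual UGI.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedUgiRace {
    /** Hypothetical stand-in for a UGI: a user name plus a mutable token map. */
    static final class FakeUgi {
        final String user;
        final Map<String, String> tokens = new HashMap<>();

        FakeUgi(String user, String kmsDt) {
            this.user = user;
            if (kmsDt != null) {
                tokens.put("kms-dt", kmsDt);
            }
        }
    }

    /** Models a job copying its own kms-dt into the shared actual UGI. */
    static void publishToken(FakeUgi sharedActualUgi, FakeUgi currentUgi) {
        sharedActualUgi.tokens.put("kms-dt", currentUgi.tokens.get("kms-dt"));
    }

    /** Models the token lookup that happens inside the doAs. */
    static String doAsAndReadToken(FakeUgi ugi) {
        return ugi.tokens.get("kms-dt");
    }

    /** Replays: job1 publishes, job2 publishes, then job1 enters its doAs. */
    static String job1TokenWithSharedUgi(FakeUgi shared, FakeUgi job1User, FakeUgi job2User) {
        publishToken(shared, job1User);
        publishToken(shared, job2User); // job2 gets in before job1's doAs
        return doAsAndReadToken(shared);
    }

    public static void main(String[] args) {
        FakeUgi shared = new FakeUgi("mapred", null);
        FakeUgi job1User = new FakeUgi("mapred", "dt-for-mapred");
        FakeUgi job2User = new FakeUgi("impala", "dt-for-impala");

        // job1 ends up with job2's token: prints "dt-for-impala".
        System.out.println(job1TokenWithSharedUgi(shared, job1User, job2User));
    }
}
```

With real threads the outcome would depend on timing; the fixed ordering here just shows the bad case.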

My thinking on the drive home is that we should doAs the current UGI in this 
specific case (or retry with currentUGI), namely when 
[this|https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/kms/KMSClientProvider.java#L535]
 is null.
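
Roughly, the selection I am proposing could look like the sketch below. It is plain Java with hypothetical names ({{FakeUgi}}, {{chooseUgiForDoAs}}), not the actual {{KMSClientProvider}} code, and {{linkedValue}} is just a placeholder for the value at the linked line.

```java
public class UgiChoice {
    /** Hypothetical stand-in for a UGI; only the user name matters here. */
    static final class FakeUgi {
        final String user;

        FakeUgi(String user) {
            this.user = user;
        }
    }

    /**
     * Hypothetical selection rule: when the linked value is null, run the
     * doAs as the caller's own (current) UGI; otherwise keep the actual UGI.
     */
    static FakeUgi chooseUgiForDoAs(FakeUgi actualUgi, FakeUgi currentUgi, Object linkedValue) {
        return linkedValue == null ? currentUgi : actualUgi;
    }

    public static void main(String[] args) {
        FakeUgi actualUgi = new FakeUgi("mapred");
        FakeUgi currentUgi = new FakeUgi("impala");

        // Null case: the caller's own UGI is used, prints "impala".
        System.out.println(chooseUgiForDoAs(actualUgi, currentUgi, null).user);

        // Non-null case: behaviour is unchanged, prints "mapred".
        System.out.println(chooseUgiForDoAs(actualUgi, currentUgi, "some-value").user);
    }
}
```

That way each caller only ever reads its own credentials in the null case, so the shared actual UGI is no longer something two jobs can race on.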

> KMS clients running in the same JVM should use updated KMS Delegation Token
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-13381
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13381
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: kms
>    Affects Versions: 2.6.0
>            Reporter: Xiao Chen
>            Assignee: Xiao Chen
>            Priority: Critical
>         Attachments: HADOOP-13381.01.patch
>
>
> When {{/tmp}} is set up as an EZ, one may experience YARN log aggregation 
> failure after the very first KMS token has expired. The MR job itself runs 
> fine though.
> When this happens, the YARN NodeManager's log will show 
> {{AuthenticationException}} with {{token is expired}} / {{token can't be 
> found in cache}}, depending on whether the expired token has already been 
> removed in the background or not.


