[ 
https://issues.apache.org/jira/browse/HADOOP-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12678899#action_12678899
 ] 

Kan Zhang commented on HADOOP-4343:
-----------------------------------

More details on the delegation token design.

h4. Overview

After initial authentication to the NameNode (NN) using Kerberos credentials, a user may 
obtain a delegation token, which can be given to user jobs for subsequent 
authentication to NN as the user. The token is in fact a secret key shared 
between the user and NN and should be protected when passed over insecure 
channels. Anyone who gets it can impersonate the user on NN. Note that *a user 
can only obtain new tokens after authenticating using Kerberos*.

When a user obtains a delegation token from NN, the user should tell NN who is 
the designated token renewer. The designated renewer should authenticate to NN 
as itself when renewing the token for the user. Renewing a token means 
extending the validity period of that token on NN. No new token is issued. The 
old token continues to work. To let a Map/Reduce job use a delegation token, 
the user needs to designate the JobTracker (JT) as the token renewer. All the 
Tasks of the same job use the same token. JT is responsible for keeping the 
token valid until the job is finished. After that, JT may optionally cancel the 
token. 

h4. Design

Here is the format of a delegation token.

{noformat}
TokenID = {ownerID, renewerID, issueDate, maxDate}
TokenAuthenticator = HMAC(masterKey, TokenID)
Delegation Token = {TokenID, TokenAuthenticator}
{noformat}

NN chooses {{masterKey}} randomly and uses it to generate and verify delegation 
tokens. NN keeps all active tokens in memory and associates each token with an 
{{expiryDate}}. If {{currentTime > expiryDate}}, the token is considered 
expired and any client authentication request using the token will be rejected. 
Expired tokens will be deleted from memory. A token is also deleted from memory 
when the owner or the renewer cancels the token.
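
A minimal sketch of the token format and the in-memory bookkeeping described above, in Python for brevity. Everything not named in the design is an assumption made for illustration: the newline-joined serialization of {{TokenID}}, HMAC-SHA256 as the HMAC function, the hypothetical {{TokenStore}} class, and setting the initial {{expiryDate}} to {{min(now+renewPeriod, maxDate)}} at issue time.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenID:
    owner_id: str
    renewer_id: str
    issue_date: int  # seconds since the epoch
    max_date: int    # the token can never be valid past this time

    def to_bytes(self) -> bytes:
        # Hypothetical serialization; the design does not fix a wire format.
        # Assumes the IDs themselves contain no newline characters.
        fields = (self.owner_id, self.renewer_id,
                  str(self.issue_date), str(self.max_date))
        return "\n".join(fields).encode("utf-8")


def token_authenticator(master_key: bytes, token_id: TokenID) -> bytes:
    # TokenAuthenticator = HMAC(masterKey, TokenID); SHA-256 is an assumption.
    return hmac.new(master_key, token_id.to_bytes(), hashlib.sha256).digest()


class TokenStore:
    """In-memory table of active tokens, as kept by NN (illustrative only)."""

    def __init__(self, master_key: bytes, renew_period: int):
        self.master_key = master_key
        self.renew_period = renew_period
        self.expiry = {}  # TokenID -> expiryDate

    def issue(self, owner_id, renewer_id, now, max_lifetime):
        token_id = TokenID(owner_id, renewer_id, now, now + max_lifetime)
        # Assumed initial expiryDate; the design only defines it for renewal.
        self.expiry[token_id] = min(now + self.renew_period, token_id.max_date)
        return token_id, token_authenticator(self.master_key, token_id)

    def purge_expired(self, now):
        # A token is expired when currentTime > expiryDate.
        for tid in [t for t, exp in self.expiry.items() if now > exp]:
            del self.expiry[tid]

    def cancel(self, token_id, caller):
        # Only the owner or the designated renewer may cancel a token.
        if caller not in (token_id.owner_id, token_id.renewer_id):
            raise PermissionError("only the owner or the renewer may cancel")
        self.expiry.pop(token_id, None)
```

Note that NN stores only the {{TokenID}} and its {{expiryDate}}; the {{TokenAuthenticator}} can always be recomputed from {{masterKey}}.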

*Using Delegation Token* When a client (e.g., a Task) uses a delegation token 
to authenticate, it first sends {{TokenID}} to NN (but never sends the 
associated {{TokenAuthenticator}} to NN). {{TokenID}} identifies the token the 
client intends to use. Using {{TokenID}} and {{masterKey}}, NN can re-compute 
{{TokenAuthenticator}} and hence the whole token. NN then checks whether the 
token is valid: a token is valid if and only if it exists in memory and 
{{currentTime < expiryDate}} holds for its {{expiryDate}}. If the token is valid, the client and 
NN will try to authenticate each other using their own {{TokenAuthenticator}} 
as the secret key and [DIGEST-MD5|http://www.ietf.org/rfc/rfc2831.txt] as the 
protocol. Note that during authentication, one party never reveals its own 
{{TokenAuthenticator}} to the other party. If authentication fails (which means 
the client and NN do not share the same {{TokenAuthenticator}}), they don't get 
to know each other's {{TokenAuthenticator}}.
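
The exchange above could be sketched as follows. This is not DIGEST-MD5: a much simpler HMAC challenge-response stands in for it, only to show that each side can prove knowledge of {{TokenAuthenticator}} without sending it. The function names, the nonce sizes, the byte-string {{TokenID}} and the use of SHA-256 are all assumptions for illustration.

```python
import hashlib
import hmac
import os


def authenticator(master_key: bytes, token_id: bytes) -> bytes:
    # NN re-computes TokenAuthenticator from the TokenID the client sent.
    return hmac.new(master_key, token_id, hashlib.sha256).digest()


def is_valid(expiry_by_token: dict, token_id: bytes, now: int) -> bool:
    # Valid iff the token is in memory and currentTime < expiryDate.
    expiry = expiry_by_token.get(token_id)
    return expiry is not None and now < expiry


def prove(secret: bytes, role: bytes, client_nonce: bytes, server_nonce: bytes) -> bytes:
    # The proof covers both nonces and the sender's role, so one side's
    # proof cannot simply be reflected back as the other side's proof.
    return hmac.new(secret, role + client_nonce + server_nonce,
                    hashlib.sha256).digest()


def mutual_auth(client_secret: bytes, server_secret: bytes) -> bool:
    # Simplified stand-in for DIGEST-MD5; both parties run in one function.
    client_nonce, server_nonce = os.urandom(16), os.urandom(16)

    # Client proves itself; the server checks against its own secret.
    client_proof = prove(client_secret, b"client", client_nonce, server_nonce)
    server_ok = hmac.compare_digest(
        client_proof, prove(server_secret, b"client", client_nonce, server_nonce))

    # Server proves itself; the client checks against its own secret.
    server_proof = prove(server_secret, b"server", client_nonce, server_nonce)
    client_ok = hmac.compare_digest(
        server_proof, prove(client_secret, b"server", client_nonce, server_nonce))

    return server_ok and client_ok
```

Only the proofs and nonces cross the wire in this sketch, so a failed attempt leaves neither party with the other's secret.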

*Token Renewal* Delegation tokens need to be renewed periodically to keep them 
valid. Suppose JT is the designated renewer for a token. During renewal, JT 
authenticates to NN as JT. After successful authentication, JT sends the token 
to be renewed to NN. NN verifies that 1) JT is the renewer specified in 
{{TokenID}}, 2) {{TokenAuthenticator}} is correct, and 3) {{currentTime < 
maxDate}} specified in {{TokenID}}. Upon successful verification, if the token 
exists in memory, which means the token is currently valid, NN sets its new 
{{expiryDate}} to {{min(currentTime+renewPeriod, maxDate)}}. If the token 
doesn't exist in memory, which indicates NN has restarted and therefore lost 
memory of all previously stored tokens, NN adds the token to memory and sets 
its {{expiryDate}} similarly. The latter case allows jobs to survive NN 
restarts. All JT has to do is to renew all tokens with NN after NN restarts and 
before relaunching failed Tasks.
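
The renewal checks can be sketched roughly as below. The {{renew}} function, its parameters, the namedtuple layout and the HMAC-SHA256 choice are hypothetical; {{caller}} is assumed to be the identity NN has already authenticated (e.g., JT via Kerberos).

```python
import hashlib
import hmac
from collections import namedtuple

TokenID = namedtuple("TokenID", "owner_id renewer_id issue_date max_date")


def authenticator(master_key, token_id):
    # Hypothetical serialization of TokenID; SHA-256 is an assumption.
    data = "\n".join(str(f) for f in token_id).encode("utf-8")
    return hmac.new(master_key, data, hashlib.sha256).digest()


def renew(active, master_key, renew_period, caller, token_id, token_auth, now):
    """Return the new expiryDate, or raise PermissionError if a check fails."""
    # 1) The caller must be the renewer named in TokenID.
    if caller != token_id.renewer_id:
        raise PermissionError("caller is not the designated renewer")
    # 2) TokenAuthenticator must match what NN re-computes from masterKey.
    if not hmac.compare_digest(token_auth, authenticator(master_key, token_id)):
        raise PermissionError("TokenAuthenticator does not match")
    # 3) currentTime < maxDate.
    if not now < token_id.max_date:
        raise PermissionError("token is past maxDate")
    # Same update whether the token is still in memory or was lost when NN
    # restarted; in the latter case this re-adds it to the table.
    active[token_id] = min(now + renew_period, token_id.max_date)
    return active[token_id]
```

For the restart case to work, NN must come back up with the same {{masterKey}}, otherwise check 2 fails for every previously issued token.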

Note that the designated renewer can revive an expired (or canceled) token by 
simply renewing it, if {{currentTime < maxDate}} specified in the token. This 
is because NN can't tell the difference between a token that has expired (or 
has been canceled) and a token that is missing from memory because NN restarted. 
Since only the designated renewer can revive an expired (or canceled) token, 
this doesn't seem to be a security problem. An attacker who steals the token 
can't renew or revive it.

The {{masterKey}} needs to be updated periodically. NN only needs to persist 
the {{masterKey}} on disk, not the tokens.



> Adding user and service-to-service authentication to Hadoop
> -----------------------------------------------------------
>
>                 Key: HADOOP-4343
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4343
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>
> Currently, Hadoop services do not authenticate users or other services. As a 
> result, Hadoop is subject to the following security risks.
> 1. A user can access an HDFS or M/R cluster as any other user. This makes it 
> impossible to enforce access control in an uncooperative environment. For 
> example, file permission checking on HDFS can be easily circumvented.
> 2. An attacker can masquerade as Hadoop services. For example, user code 
> running on a M/R cluster can register itself as a new TaskTracker.
> This JIRA is intended to be a tracking JIRA, where we discuss requirements, 
> agree on a general approach and identify subtasks. Detailed design and 
> implementation are the subject of those subtasks.
