[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094738#comment-13094738
 ] 

Daryn Sharp commented on MAPREDUCE-2764:
----------------------------------------

Yes, the patch was hurried to be a demonstration of the latest approach I am 
pursuing.    As posted, tests also don't pass.

bq. When delegation tokens were added, there was agreement that adding those 
methods to FileSystem wasn't appropriate. We can re-open that discussion, but 
it needs to be discussed more widely.

I removed {{FileSystem.getDelegationToken}} after you expressed displeasure 
with it ever being there.  I can re-add it for now if you feel a larger debate 
is required?

bq. I also assume that the change to fixFontPaths is accidental.

Yes, I didn't mean to include it in the patch.  It allows forrest to run on 
java 6.

bq. I also like the isRenewable method, although it also controls whether it is 
cancelable so probably needs a better name like isManaged.

Sure.

bq. Please don't remove InterruptedException from APIs. At some point we'll fix 
RPC to properly throw InterruptedException when it is waiting.

Ok.

bq. Changing the semantics of JobClient.addAll isn't ok.

Granted, I forget to use the boolean replace field, but it's not changing the 
semantics of the existing method.  The new {{addAll(Credentials, boolean)}} 
method prevents overwriting existing credentials.  This fixes an edge case in 
the JT where the client has already acquired valid renewable tokens for a 
service, but they are replaced by bad credentials on disk.

bq. We've been burned badly before by the idiom of using static blocks to 
register types. The ServiceLookup is much better.

I was concerned about that too, so I made the operation be a simple map 
insertion.  I found a number of other places that have similar static blocks 
for registering things so I thought it might be ok.  I'd like to get the rest 
of the implementation nailed down, then switch it to a SL.  Is that ok?

bq. Adding a transient field that is lost when going over RPC is problematic 
from a maintenance point of view.

The issuer is intentionally transient so: a) no binary compat issues b) it's 
only relevant to the client that requested the token.  The NN, DN, etc that 
uses the token doesn't care how the client acquired the token.

bq. It also doesn't feel justified, just so that we can continue to intermingle 
HFTP delegation tokens with HFTP delegation tokens.

The issuer is much larger than that.  Http(s) is merely the transport used to 
acquire/renew/cancel the token.  However, the token acquired is not authorizing 
future http(s) connections.  The token authorizes hdfs rpc connections.  +The 
protocol to manage the token is completely orthogonal to what the token is 
authorizing+.  Having the client stomp on the kind and service fields 
completely changes the token's semantics and renders it useless unless the 
original kind and service fields are swapped back.

The options are then:
* (your proposal) The client does hack-ish swapping of kind and service within 
the token so it knows the transport used to manage the token.  The client is no 
longer allowed to use the token for what it is actually authorizing.  Ex. The 
token now looks like it's for http (which it's not!), so the client can't use 
it for rpc.  The original values will need to be marshaled into existing fields 
during serialization, or the token becomes useless.

* (my early approach) The client creates an hftp token that is a wrapper for 
the real token that is acquired.  This prevents the swapping of kind and 
service within the token, but I found it unexpectedly difficult to implement.  
Note that the client still cannot directly use the token for what it 
authorizes, which was an earlier concern by Jitendra.

* (my current approach) The client tracks how it acquired the token via an 
issuer field.  The issuer field is only persistent within the client's 
credentials.  The client uses the issuer to determine the transport to manage 
the token.  Unlike the other approaches, the token is completely binary 
compatible and, the client can use the token for what it authorizes, and 
there's no guesswork in how to renew it!

> Fix renewal of dfs delegation tokens
> ------------------------------------
>
>                 Key: MAPREDUCE-2764
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2764
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>             Fix For: 0.20.205.0
>
>         Attachments: MAPREDUCE-2764-2.patch, MAPREDUCE-2764.patch, 
> delegation.patch
>
>
> The JT may have issues renewing hftp tokens which disrupt long distcp jobs.  
> The problem is the JT's delegation token renewal code is built on brittle 
> assumptions.  The token's service field contains only the "ip:port" pair.  
> The renewal process assumes that the scheme must be hdfs.  If that fails due 
> to a {{VersionMismatchException}}, it tries https based on another assumption 
> that it must be hftp if it's not hdfs.  A number of other exceptions, most 
> commonly {{IOExceptions}}, can be generated which fouls up the renewal since 
> it won't fallback to https.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to