[ 
https://issues.apache.org/jira/browse/HADOOP-14445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16343754#comment-16343754
 ] 

Daryn Sharp commented on HADOOP-14445:
--------------------------------------

This fell off my radar.  Quick recap since conversation has been fragmented 
across multiple jiras:
The LB provider requests 1 token, like it should, but it’s used only for that 
specific kms.  Ironic the load balancer increased load since it only works by 
retries cycling back to that kms, doesn't tolerate if that kms goes down, and 
it went unnoticed.  This Jira originally proposed obtaining n-many 
tokens from each subordinate kms, even though a token from 1 will work for all. 
 The RM would have to unnecessarily renew n-many tokens and if one renew fails, 
job submission fails.  Not good.

Rushabh's original goal addresses a huge kms token renewal issue: it always 
uses the conf.  A server like the RM cannot support a multi-kms environment.  
The fix is to use the kms provider's uri as the token service so the same provider 
can later be instantiated for renewal.  This also elegantly allows the LB 
provider to use a single token for all subordinate providers by using its own 
uri.  But it poses compatibility issues for jobs submitted by a new client that 
run old tasks.

––

The semantics for getDelegationTokenService are oddly cyclical.  I'd expect it, 
like other hadoop clients, to predetermine the service name.  The latest patch 
is looking at the creds to decide the service based on whether a token exists 
so it can attempt to look up a token for that service – which it already looked 
up.

I’d prefer for the compatibility to be cleaner, and easier to revoke in the 
future.  The patch falls back to conf by assuming URISyntaxException means old 
service; however, a malformed new service should fail to avoid surprises.  If it 
looks like a uri, it must be a valid uri.  The simplest approach is to check 
whether it contains ://.
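
To make the rule above concrete, here is a minimal sketch in plain Java.  It is 
not code from the patch: the class TokenServiceClassifier and its method names 
are hypothetical, and it relies only on java.net.URI.  A service without :// is 
treated as an old host:port service (the caller would fall back to the conf), 
while a service that contains :// but does not parse is rejected rather than 
silently treated as old.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper for illustration; not part of Hadoop.
final class TokenServiceClassifier {

  /** True if the service string claims to be a uri, i.e. it contains "://". */
  static boolean looksLikeUri(String service) {
    return service.contains("://");
  }

  /**
   * Returns the provider uri for a new-style service, or null for an
   * old-style host:port service (the caller falls back to the conf).
   * A string that looks like a uri but fails to parse is an error,
   * never a fallback.
   */
  static URI providerUriOrNull(String service) {
    if (!looksLikeUri(service)) {
      return null; // old-style service, e.g. "10.0.0.1:9600"
    }
    try {
      return new URI(service);
    } catch (URISyntaxException e) {
      throw new IllegalArgumentException(
          "Malformed token service: " + service, e);
    }
  }
}
```

The point of the split is that URISyntaxException is no longer the signal for 
"old service"; only the absence of :// is.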

I'm also uneasy about a client-side config to control compatibility since 
clients are notoriously hard to upgrade.

An alternative could remove the service guesswork, client conf, and be a bit 
more compatible by using a new token kind.  The current one is “kms-dt” whereas 
the standard naming convention should be “KMS_DELEGATION_TOKEN”.  The old token 
kind could continue using the conf, as today, while the new kind requires a 
service uri.  Effectively the current/old code remains unchanged.
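
A rough sketch of that dispatch, assuming nothing beyond the two kind names 
mentioned above.  The class KmsTokenKindDispatch, the enum ProviderSource, and 
the method sourceFor are hypothetical names used only for illustration.

```java
// Hypothetical dispatch on token kind; not part of Hadoop.
final class KmsTokenKindDispatch {
  static final String OLD_KIND = "kms-dt";
  static final String NEW_KIND = "KMS_DELEGATION_TOKEN";

  /** Where the renewer learns which kms provider to instantiate. */
  enum ProviderSource { CONF, SERVICE_URI }

  static ProviderSource sourceFor(String kind, String service) {
    if (OLD_KIND.equals(kind)) {
      // Old kind: behave exactly as today and read the provider from conf.
      return ProviderSource.CONF;
    }
    if (NEW_KIND.equals(kind)) {
      // New kind: the service must be a uri; no guessing, no fallback.
      if (!service.contains("://")) {
        throw new IllegalArgumentException(
            NEW_KIND + " requires a uri service, got: " + service);
      }
      return ProviderSource.SERVICE_URI;
    }
    throw new IllegalArgumentException("Unknown token kind: " + kind);
  }
}
```

With the kind carrying the decision, neither the service string nor a 
client-side conf has to be inspected to choose between old and new behavior.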

There are tradeoffs to support old clients that must use the host:port.  I know 
I objected to duplicating tokens, but I’ll acquiesce if it provides a cleaner 
approach.  Duplicating a new KMS_DELEGATION_TOKEN/uri token into a single 
kms-dt/host:port is "no worse than today":
* Pro: Old client finds kms-dt from old client.
* Pro: Old client finds kms-dt from new client.
* Pro: New client finds kms-dt from old client.
* Pro: New client finds KMS_DELEGATION_TOKEN from new client.
* Pro: Old RM renews the kms-dt for both old/new clients.
* *Con*: New RM renews KMS_DELEGATION_TOKEN from new clients, effectively a 
double renew for the same token as kms-dt.

If we are willing to sacrifice a bit for new client + old RM:  Abuse the fact 
that old kms clients look for a host:port service regardless of kind.  We can 
trick the RM into not renewing the duplicate by giving it an unknown kind, 
e.g. “kms-dt-deprecated”, to avoid the double renew.
* Pro: Old client finds kms-dt from old client.
* Pro: Old client finds kms-dt-deprecated from new client (remember, doesn't 
care about kind)
* Pro: New client finds kms-dt from old client.
* Pro: New client finds KMS_DELEGATION_TOKEN from new client.
* Pro: Old RM renews the kms-dt for old clients (all it knows about)
* *Con*: Old RM renews nothing for new clients (doesn't know 
KMS_DELEGATION_TOKEN or kms-dt-deprecated)
* Pro: New RM renews kms-dt for old clients.
* Pro: New RM renews KMS_DELEGATION_TOKEN for new clients (not 
kms-dt-deprecated)
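
The matrix above can be checked with a toy model in plain Java.  This is only a 
sketch of my reading of the list, not Hadoop code: the class 
KmsTokenCompatModel and everything in it are hypothetical, the host:port and 
uri values are made-up examples, and it assumes a new client prefers the 
uri-service token and falls back to host:port, and that an RM simply skips 
kinds it does not know.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of the "kms-dt-deprecated" scheme; not part of Hadoop.
final class KmsTokenCompatModel {
  // Example values only.
  static final String HOST_PORT = "10.0.0.1:9600";
  static final String PROVIDER_URI = "kms://https@kms.example.com:9600/kms";

  static final class Token {
    final String kind;
    final String service;
    Token(String kind, String service) {
      this.kind = kind;
      this.service = service;
    }
  }

  /** An old client stores a single kms-dt keyed by host:port. */
  static List<Token> oldClientCreds() {
    return Arrays.asList(new Token("kms-dt", HOST_PORT));
  }

  /** A new client stores the new token plus a duplicate under host:port. */
  static List<Token> newClientCreds() {
    return Arrays.asList(
        new Token("KMS_DELEGATION_TOKEN", PROVIDER_URI),
        new Token("kms-dt-deprecated", HOST_PORT));
  }

  static Token findByService(List<Token> creds, String service) {
    for (Token t : creds) {
      if (t.service.equals(service)) {
        return t;
      }
    }
    return null;
  }

  /** Old clients match on the host:port service only, ignoring kind. */
  static Token oldClientFind(List<Token> creds) {
    return findByService(creds, HOST_PORT);
  }

  /** Assumed: new clients try the uri service first, then host:port. */
  static Token newClientFind(List<Token> creds) {
    Token t = findByService(creds, PROVIDER_URI);
    return t != null ? t : findByService(creds, HOST_PORT);
  }

  static final Set<String> OLD_RM_KINDS =
      new HashSet<>(Arrays.asList("kms-dt"));
  static final Set<String> NEW_RM_KINDS =
      new HashSet<>(Arrays.asList("kms-dt", "KMS_DELEGATION_TOKEN"));

  /** Kinds the RM would renew: only those it knows about. */
  static List<String> renewedKinds(Set<String> known, List<Token> creds) {
    List<String> renewed = new ArrayList<>();
    for (Token t : creds) {
      if (known.contains(t.kind)) {
        renewed.add(t.kind);
      }
    }
    return renewed;
  }
}
```

In this model, renewedKinds(OLD_RM_KINDS, newClientCreds()) comes back empty, 
which is the one *Con* in the list, and the new RM renews only 
KMS_DELEGATION_TOKEN for new clients, so there is no double renew.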

Thoughts?

> Delegation tokens are not shared between KMS instances
> ------------------------------------------------------
>
>                 Key: HADOOP-14445
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14445
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: kms
>    Affects Versions: 2.8.0, 3.0.0-alpha1
>         Environment: CDH5.7.4, Kerberized, SSL, KMS-HA, at rest encryption
>            Reporter: Wei-Chiu Chuang
>            Assignee: Rushabh S Shah
>            Priority: Major
>         Attachments: HADOOP-14445-branch-2.8.002.patch, 
> HADOOP-14445-branch-2.8.patch, HADOOP-14445.002.patch, HADOOP-14445.003.patch
>
>
> As discovered in HADOOP-14441, KMS HA using LoadBalancingKMSClientProvider 
> does not share delegation tokens. (A client uses the KMS address/port as the 
> key for the delegation token.)
> {code:title=DelegationTokenAuthenticatedURL#openConnection}
> if (!creds.getAllTokens().isEmpty()) {
>         InetSocketAddress serviceAddr = new InetSocketAddress(url.getHost(),
>             url.getPort());
>         Text service = SecurityUtil.buildTokenService(serviceAddr);
>         dToken = creds.getToken(service);
> {code}
> But KMS doc states:
> {quote}
> Delegation Tokens
> Similar to HTTP authentication, KMS uses Hadoop Authentication for delegation 
> tokens too.
> Under HA, A KMS instance must verify the delegation token given by another 
> KMS instance, by checking the shared secret used to sign the delegation 
> token. To do this, all KMS instances must be able to retrieve the shared 
> secret from ZooKeeper.
> {quote}
> We should either update the KMS documentation, or fix this code to share 
> delegation tokens.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
