[GitHub] spark issue #16788: [SPARK-16742] Kerberos impersonation support

vanzin Tue, 14 Mar 2017 13:26:54 -0700

Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/16788
  
    > I take it you mean that the driver logs in via Kerberos, and submits the 
resulting token (TGT?) via amContainer.setTokens
    
    No. `amContainer.setTokens` is used to distribute delegation tokens; the 
TGT remains only on the "gateway" node (the driver in client mode, or the 
"launcher" in cluster mode).
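    
    To illustrate the distinction, here is a toy model (not Spark or YARN 
code). `Gateway`, `LaunchContext`, `DelegationToken` and `serialize` are 
hypothetical names invented for this sketch, and the tokens are placeholder 
strings. It only shows which credential travels and which one stays put.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class DelegationToken:
    service: str   # hypothetical service label, e.g. "hdfs"
    renewer: str
    secret: str    # placeholder; real tokens are opaque byte blobs


@dataclass
class Gateway:
    """Node that did the Kerberos login; the TGT never leaves this object."""
    tgt: str

    def obtain_delegation_tokens(self, services, renewer):
        # A real client would authenticate to each service using the TGT.
        # This toy version just builds placeholder tokens.
        return [DelegationToken(s, renewer, f"token-for-{s}") for s in services]


@dataclass
class LaunchContext:
    """Stand-in for the AM container launch context."""
    tokens: bytes = b""

    def set_tokens(self, payload: bytes) -> None:
        self.tokens = payload


def serialize(tokens) -> bytes:
    # Only the delegation tokens are serialized; the TGT is not an input.
    return json.dumps([asdict(t) for t in tokens]).encode("utf-8")


gateway = Gateway(tgt="TGT-SECRET-stays-on-gateway")
tokens = gateway.obtain_delegation_tokens(["hdfs", "hive"], renewer="yarn")
context = LaunchContext()
context.set_tokens(serialize(tokens))
```

    The payload handed to `set_tokens` contains the delegation tokens only; 
the TGT stays on the `Gateway` object.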
    
    > Whereas the Hadoop delegation tokens are distributed via HDFS itself.
    
    That's separate: it's Spark-specific code that distributes new delegation 
tokens and doesn't really depend on YARN. That's one reason I suggested the 
refactoring: once you solve the initial token distribution, that code 
should work for Mesos without any changes.
    
    To put it differently: if Spark had its own secure method for 
distributing the initial set of delegation tokens needed by the executors (+ AM 
in case of YARN), then the YARN backend wouldn't need to use 
`amContainer.setTokens` at all. What I'm suggesting here is that this method be 
the base of the Mesos / Kerberos integration, and later we could change YARN to 
also use it.
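    
    A rough sketch of the shape this could take, assuming a pluggable 
interface for the initial distribution step. Every name here 
(`InitialTokenDistributor`, `YarnLaunchContextDistributor`, 
`SparkChannelDistributor`, `Executor`, `launch`) is hypothetical and does not 
exist in Spark; the "channel" is a plain in-memory assignment, so the security 
part is entirely elided.

```python
from abc import ABC, abstractmethod


class Executor:
    def __init__(self, name):
        self.name = name
        self.tokens = None


class InitialTokenDistributor(ABC):
    """Hypothetical hook: how the first set of tokens reaches executors."""

    @abstractmethod
    def distribute(self, payload: bytes, executors) -> None:
        ...


class YarnLaunchContextDistributor(InitialTokenDistributor):
    """Models today's path: the cluster manager carries the tokens."""

    def distribute(self, payload, executors):
        launch_context = {"tokens": payload}  # stand-in for setTokens
        for ex in executors:
            ex.tokens = launch_context["tokens"]


class SparkChannelDistributor(InitialTokenDistributor):
    """Models the proposal: Spark ships the tokens itself."""

    def distribute(self, payload, executors):
        # A real channel would need authentication and encryption; elided.
        for ex in executors:
            ex.tokens = payload


def launch(distributor, payload, names):
    executors = [Executor(n) for n in names]
    distributor.distribute(payload, executors)
    return executors


payload = b"serialized-delegation-tokens"
on_yarn = launch(YarnLaunchContextDistributor(), payload, ["exec-1", "exec-2"])
on_mesos = launch(SparkChannelDistributor(), payload, ["exec-1", "exec-2"])
```

    Whichever distributor is plugged in, the executors end up holding the 
same payload, so the code that consumes the tokens does not need to know 
which backend delivered them.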
    
    This particular code is pretty self-contained and is the base of what you 
need here to bootstrap things. Moving it to "core" wouldn't be that hard, I 
think. The main thing would be to work on how the initial set of tokens is sent 
to executors, since *that* is the only thing YARN does for Spark right now.
    
    > I'm worried it's going to a) be quite a chore to factor out the YARN 
Kerberos code
    
    I'm not saying that it will be a walk in the park, but it's a much better 
solution than creating a completely separate way of dealing with Kerberos just 
for Mesos.
    
    > I've never set up a YARN cluster. How difficult is it?
    
    Manually, probably complicated; I've only ever done it using our internal 
tools (based on Cloudera Manager).



