Github user mgummelt commented on the issue:
https://github.com/apache/spark/pull/16788
@vanzin When you say "distributing the principal's credentials", I take it
you mean that the driver logs in via Kerberos and submits the resulting token
(TGT?) via `amContainer.setTokens`. That's my understanding from reading the
code, whereas the Hadoop delegation tokens are distributed via HDFS itself.
Is this correct?
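For reference, here's roughly the `amContainer.setTokens` pattern I'm referring to, sketched with the standard Hadoop/YARN APIs. The method name and surrounding structure are illustrative, not the actual Spark code:

```scala
import java.nio.ByteBuffer

import org.apache.hadoop.io.DataOutputBuffer
import org.apache.hadoop.security.UserGroupInformation
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext

// Serialize whatever credentials the submitting user currently holds
// (delegation tokens and/or Kerberos-derived tokens) and attach them to the
// AM's container launch context. `amContainer` is assumed to already exist.
def setupSecurityTokens(amContainer: ContainerLaunchContext): Unit = {
  val credentials = UserGroupInformation.getCurrentUser.getCredentials
  val dob = new DataOutputBuffer()
  credentials.writeTokenStorageToStream(dob)
  amContainer.setTokens(ByteBuffer.wrap(dob.getData, 0, dob.getLength))
}
```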
I think this is necessary for YARN because the `ApplicationMaster` runs
remotely in both client and cluster mode, correct? In Mesos client mode, the
scheduler runs in the same process as `spark-submit` (the driver), so there's
no need to distribute Kerberos credentials: the scheduler can simply use the
`UserGroupInformation` the user initially logged in with, as sketched below.
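A minimal sketch of what I mean by reusing the existing login in client mode; the `doAs` wrapper and comments are illustrative, not code from this PR:

```scala
import java.security.PrivilegedExceptionAction

import org.apache.hadoop.security.UserGroupInformation

// In client mode the Mesos scheduler lives in the driver JVM, so it can reuse
// the credentials the user already has locally (e.g. from `kinit` or an
// explicit keytab login). No token shipping is needed for the driver itself.
val ugi = UserGroupInformation.getCurrentUser // the login the user started with

ugi.doAs(new PrivilegedExceptionAction[Unit] {
  override def run(): Unit = {
    // Talk to secure HDFS here using the local Kerberos credentials.
  }
})
```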
We would need some method of Kerberos token distribution in cluster mode,
but we can punt on that. Users have many ways of running Spark jobs
asynchronously, and we'll have to take those one by one. I think we can just
focus on solving this in client mode for now.