GitHub user harishreedharan commented on the pull request:
https://github.com/apache/spark/pull/4106#issuecomment-75848850
From what I can see, the approach is to log in on every machine using the same
principal and keytab. This has a couple of issues (at least):
* The KDC will block the credentials, treating so many near-simultaneous logins
as a DDoS attack, which results in the whole thing failing. To avoid this,
Hadoop uses delegation tokens (for running user apps) and principals specific
to each host (for headless users).
* Distributing the keytabs is an issue as well.
* Another issue is that all applications run as the same user (`spark`) on each
machine, and that user can read all keytabs, so one user would be able to write
an app that reads another user's keytab. This is a major security issue, and I
don't see a way to avoid it. (Credit to @vanzin for finding this one.)
I think the best way to fix this would be to find a way to ensure that the
delegation tokens can be transferred securely (with authentication within the
same app, and encryption to avoid snooping). This would fix all of these issues.
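
To make "authentication within the same app, and encryption to avoid snooping"
concrete, here is a minimal sketch using only the JDK. It is not how Spark or
Hadoop transfers tokens. It assumes each application already holds a per-app
secret that was distributed securely (out of scope here), and the class name
`TokenTransferSketch`, the methods `seal`/`open`, and the app id string are all
hypothetical. AES-GCM gives both properties at once: only holders of the app
secret can produce or read a message, and the app id is bound to it as
associated data.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;

// Hypothetical sketch: seal serialized delegation-token bytes with a per-app secret.
class TokenTransferSketch {
    private static final int NONCE_BYTES = 12; // recommended GCM nonce size
    private static final int TAG_BITS = 128;   // authentication tag length
    private static final SecureRandom RANDOM = new SecureRandom();

    // Assumption: one fresh secret per application, shared only with that app's processes.
    static SecretKey newAppSecret() throws GeneralSecurityException {
        KeyGenerator gen = KeyGenerator.getInstance("AES");
        gen.init(128);
        return gen.generateKey();
    }

    // Encrypts the token and binds it to appId; output layout is nonce || ciphertext+tag.
    static byte[] seal(SecretKey appSecret, String appId, byte[] token)
            throws GeneralSecurityException {
        byte[] nonce = new byte[NONCE_BYTES];
        RANDOM.nextBytes(nonce);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, appSecret, new GCMParameterSpec(TAG_BITS, nonce));
        cipher.updateAAD(appId.getBytes(StandardCharsets.UTF_8));
        byte[] ciphertext = cipher.doFinal(token);
        return ByteBuffer.allocate(nonce.length + ciphertext.length)
                .put(nonce).put(ciphertext).array();
    }

    // Decrypts and verifies; throws if the secret or appId differs or the message was altered.
    static byte[] open(SecretKey appSecret, String appId, byte[] message)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, appSecret,
                new GCMParameterSpec(TAG_BITS, message, 0, NONCE_BYTES));
        cipher.updateAAD(appId.getBytes(StandardCharsets.UTF_8));
        return cipher.doFinal(message, NONCE_BYTES, message.length - NONCE_BYTES);
    }

    public static void main(String[] args) throws Exception {
        SecretKey appSecret = newAppSecret();
        byte[] token = "serialized-delegation-token".getBytes(StandardCharsets.UTF_8);

        byte[] sealed = seal(appSecret, "app-1", token);
        System.out.println(Arrays.equals(open(appSecret, "app-1", sealed), token));

        try {
            // A different application's secret must not be able to read the token.
            open(newAppSecret(), "app-1", sealed);
            System.out.println("unexpected: opened with another app's secret");
        } catch (GeneralSecurityException expected) {
            System.out.println("rejected with another app's secret");
        }
    }
}
```

With something along these lines, no keytab has to leave the machine that
performed the single login; only short-lived tokens travel, and a process
belonging to a different application cannot read or forge them without that
application's secret.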