Github user mccheah commented on the pull request:
https://github.com/apache/spark/pull/4106#issuecomment-75852773
When @pwendell and I originally discussed the feature, we wanted to design
it to be simple and usable for small dedicated Spark clusters. We also
explicitly wanted to avoid the approach of transferring delegation tokens;
every attempt to do so in the past was prone to security flaws. So in response
to these concerns:
(1) With a cluster of ~10 machines, will the credentials blocking still
happen? How many is "many"?
(2) I assumed that the user will be responsible for securely transferring
the keytabs to every machine. I don't think it is necessarily Spark's
responsibility to transfer the keytabs securely and automatically. If the user
wants to use this feature, I would expect them to already be aware of the
security measures needed when transferring keys between hosts. At any rate,
we can document this explicitly.
(3) I'm confused here - Spark can't read a keytab if the permissions on the
keytab file deny access. Again, this comes down to configuration - the Spark
user should be configured to be able to access keytabs on a need-to-know basis.
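To illustrate point (3), here is a minimal sketch of a pre-flight check an operator could run on each machine before starting Spark. It is only an illustration: the helper name `keytab_access_problems` is hypothetical and not part of Spark, and the policy it assumes (the keytab should be readable by the current user and closed to group and other) is one reasonable reading of "need-to-know", not something decided in this PR.

```python
import os
import stat


def keytab_access_problems(path):
    """Return a list of problems found with the keytab at `path`.

    Hypothetical helper for illustration only. An empty list means the
    file exists, is readable by the current user, and grants no
    permissions to group or other users.
    """
    if not os.path.isfile(path):
        return ["keytab not found: " + path]

    problems = []

    # The process (e.g. the Spark user) must be able to read the file.
    if not os.access(path, os.R_OK):
        problems.append("current user cannot read the keytab")

    # Assumed policy: no permission bits for group or other.
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        problems.append(
            "keytab is accessible to group or other users (mode %o)" % mode
        )

    return problems
```

A keytab with mode 600 owned by the Spark user would pass this check, while one with mode 644 would be flagged as too permissive.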
I think the bottom line is that we should still offer this to users, who have
the ultimate responsibility for using it wisely. If the Spark cluster
is a dedicated set of machines, then only the keytabs needed by
Spark's work will be on those machines. If they want a cluster that runs
Spark alongside other things, then YARN is probably already a better solution
for them.