GitHub user vanzin commented on the issue:
https://github.com/apache/spark/pull/16788
> I take it you mean that the driver logs in via Kerberos, and submits the
> resulting token (TGT?) via amContainer.setTokens
No. `amContainer.setTokens` is used to distribute delegation tokens; the
TGT remains only on the "gateway" node (the driver in client mode, or the
"launcher" in cluster mode).
> Whereas the Hadoop delegation tokens are distributed via HDFS itself.
That's separate: it is Spark-specific code that distributes new delegation
tokens, and it doesn't really depend on YARN. That is one reason why I
suggested the refactoring: once you solve the initial token distribution, that
code should work for Mesos without needing any changes.
To put it differently: if Spark had its own, secure method for distributing
the initial set of delegation tokens needed by the executors (+ AM in case of
YARN), then the YARN backend wouldn't need to use `amContainer.setTokens` at
all. What I'm suggesting here is that this Spark-owned method be the base of
the Mesos / Kerberos integration; later we could change YARN to use it as
well.
This particular code is pretty self-contained and is the base of what you
need here to bootstrap things. Moving it to "core" wouldn't be that hard, I
think. The main thing would be to work on how the initial set of tokens is sent
to executors, since *that* is the only thing YARN does for Spark right now.
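
To make the shape of that bootstrap step concrete, here is a rough sketch in
plain Java. It is a toy model only: `TokenBootstrapSketch`,
`GatewayCredentials`, `exportDelegationTokens` and `importDelegationTokens` are
made-up names, not Spark, Hadoop, or YARN APIs, and tokens are modeled as
opaque byte arrays keyed by a service name. The point it illustrates is that
only delegation tokens are packed into the blob that leaves the gateway node;
the TGT is never serialized.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model only: none of these names exist in Spark, Hadoop, or YARN. */
public class TokenBootstrapSketch {

    /** Gateway-side state (assumed): the TGT stays here, only delegation tokens are exported. */
    static final class GatewayCredentials {
        private final byte[] tgt; // held on the gateway node, never written to the blob
        private final Map<String, byte[]> delegationTokens = new LinkedHashMap<>();

        GatewayCredentials(byte[] tgt) {
            this.tgt = tgt;
        }

        void addDelegationToken(String service, byte[] token) {
            delegationTokens.put(service, token);
        }

        /** Serialize only the delegation tokens into one opaque blob. */
        byte[] exportDelegationTokens() {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                out.writeInt(delegationTokens.size());
                for (Map.Entry<String, byte[]> e : delegationTokens.entrySet()) {
                    out.writeUTF(e.getKey());
                    out.writeInt(e.getValue().length);
                    out.write(e.getValue());
                }
            } catch (IOException ex) {
                throw new UncheckedIOException(ex);
            }
            return bytes.toByteArray();
        }
    }

    /** Executor side (assumed): rebuild the token map from the blob it was handed. */
    static Map<String, byte[]> importDelegationTokens(byte[] blob) {
        Map<String, byte[]> tokens = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(blob))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                String service = in.readUTF();
                byte[] token = new byte[in.readInt()];
                in.readFully(token);
                tokens.put(service, token);
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return tokens;
    }
}
```

The sketch deliberately leaves out the hard part, which is how the blob travels
from the gateway to the executors securely; that transport is exactly what YARN
provides today and what a backend-independent mechanism would have to supply.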
> I'm worried it's going to a) be quite a chore to factor out the YARN
> Kerberos code
I'm not saying that it will be a walk in the park, but it's a much better
solution than creating a completely separate way of dealing with Kerberos just
for Mesos.
> I've never setup a YARN cluster. How difficult is it?
Manually, probably complicated; I've only ever done it using our internal
tools (based on Cloudera Manager).