Github user pwendell commented on the pull request:
https://github.com/apache/spark/pull/265#issuecomment-54778609
@dkanoafry with this patch, the main issue I see is that it distributes the
delegation tokens insecurely (through sc.addFile), so anyone could just read
the tokens over the network and impersonate the user who is running the Spark
job. In fact we start an HTTP file server, so you wouldn't even need to observe
the traffic; you could just make a request against it. I'm guessing this is
fine for the company submitting the patch, but it's too weak a security model
IMO to merge upstream.
Since we've more recently added support for securing the HTTP file server
through a shared secret, I think this might be okay to pull in now. @tgravescs
would you mind taking a quick look? I think the idea here is that in standalone
mode a user would just log in with a keytab and send delegation tokens to the
executors, with the main goal being to provide access to a secured HDFS
deployment. Is there a way now for them to set a shared secret to authenticate
this HTTP request? (I think it's fine to assume that they just set something in
a conf file on all of the worker nodes, i.e. we don't need to disseminate that
secret).
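
To make the question concrete, here is a rough sketch of the kind of per-node
setting I have in mind. It assumes the existing spark.authenticate and
spark.authenticate.secret properties are what would cover the HTTP file server
in standalone mode, which is exactly the part to confirm. The file location and
the secret value are placeholders, not something taken from this patch.

```
# conf/spark-defaults.conf, placed identically on the driver and every worker
# (assumption: these properties also protect the HTTP file server requests)
spark.authenticate         true
# placeholder value; use the same secret on all nodes
spark.authenticate.secret  <shared-secret>
```

With something like this in place the secret never has to travel with the job;
it only has to match across the nodes.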