GitHub user aarondav commented on the pull request:
https://github.com/apache/spark/pull/607#issuecomment-42081936
Ah, your point about making sure we clean up the UGIs if we don't cache
them is definitely correct; there's no reason not to do that. As for your
last question, note that my solution is deliberately conservative: I don't
know Hadoop security well, and for that reason I wanted to avoid changing
its semantics.
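
For concreteness, here's a rough sketch (not the actual change) of what the
cleanup path might look like when we don't cache the UGI. `runAs` is a
hypothetical helper; `FileSystem.closeAllForUGI` is the Hadoop call that
drops the FileSystem instances cached for a given UGI:

```scala
import java.security.PrivilegedExceptionAction

import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.security.UserGroupInformation

object UgiCleanupSketch {
  // Hypothetical helper: run `body` as `user` under a throwaway UGI,
  // then release the FileSystem instances Hadoop cached for it.
  def runAs(user: String)(body: => Unit): Unit = {
    val ugi = UserGroupInformation.createRemoteUser(user)
    try {
      ugi.doAs(new PrivilegedExceptionAction[Unit] {
        override def run(): Unit = body
      })
    } finally {
      // Hadoop's FileSystem cache is keyed by UGI, so every uncached
      // UGI we drop would otherwise strand its FileSystem objects.
      FileSystem.closeAllForUGI(ugi)
    }
  }
}
```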
Do you have a workaround for the fact that the Executor is created in one
place but used by Akka in a different place? I'd rather not leak the whole
UGI/Hadoop-security concern through to the actual SchedulerBackend
abstraction if possible.
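
One option I could imagine (just a sketch, with hypothetical names):
establish the UGI once in the executor process's entry point, before the
actor system comes up. Threads created inside `doAs` inherit the login
context, so everything Akka later runs already executes under the right
identity:

```scala
import java.security.PrivilegedExceptionAction

import org.apache.hadoop.security.UserGroupInformation

object ExecutorEntryPointSketch {
  def main(args: Array[String]): Unit = {
    // Fix the identity once at process startup, before any actor exists.
    // SPARK_USER falls back to the OS user if unset.
    val user = sys.env.getOrElse("SPARK_USER", System.getProperty("user.name"))
    val ugi = UserGroupInformation.createRemoteUser(user)
    ugi.doAs(new PrivilegedExceptionAction[Unit] {
      override def run(): Unit = {
        // ... bootstrap the actor system and construct the Executor here;
        // every thread spawned in this scope inherits the login context ...
      }
    })
  }
}
```

That would keep SchedulerBackend itself security-agnostic, since the `doAs`
lives entirely in the backend process's `main()`.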
On Fri, May 2, 2014 at 9:27 AM, Tom Graves <[email protected]> wrote:
> This brings up a good question: which setups are we going to support?
>
> on yarn:
>
> - secure yarn cluster
> - non-secure yarn cluster - daemons run as a superuser (like yarn),
> users access hdfs as themselves
> - yarn cluster - daemons run as the same user as the running applications
>
> standalone:
>
> - daemons run as the user who also owns hdfs (no security)
> - daemons run as a superuser, with SPARK_USER set to access hdfs (no
> security)
> - daemons run as a superuser logged in via keytab and proxying as the
> user to access secure hdfs (not sure about this one; I thought someone
> was using this setup)?
>
> mesos:
>
> - I assume this is the same as standalone?
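
For what it's worth, if someone really is running the keytab-plus-proxy
standalone setup Tom describes above, I'd expect it to look roughly like the
sketch below. The principal, keytab path, and user name are placeholders,
and the superuser would also need matching `hadoop.proxyuser.*` grants in
core-site.xml:

```scala
import java.security.PrivilegedExceptionAction

import org.apache.hadoop.security.UserGroupInformation

object ProxyUserSketch {
  def main(args: Array[String]): Unit = {
    // Log the daemon in from its keytab (placeholder principal and path),
    // then impersonate the end user for secure HDFS access.
    UserGroupInformation.loginUserFromKeytab(
      "spark/[email protected]", "/etc/security/keytabs/spark.keytab")
    val proxy = UserGroupInformation.createProxyUser(
      "alice", UserGroupInformation.getLoginUser)
    proxy.doAs(new PrivilegedExceptionAction[Unit] {
      override def run(): Unit = {
        // ... access secure hdfs as "alice" here ...
      }
    })
  }
}
```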