GitHub user aarondav commented on the pull request:
https://github.com/apache/spark/pull/607#issuecomment-42081936
Ah, your point about making sure we clean up the UGIs if we don't cache
them is definitely correct; there's no reason not to do that. As for your
last question, note that my solution is deliberately conservative: I don't
know Hadoop security well, and for that reason I wanted to avoid changing
its semantics.
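
For concreteness, here's a rough sketch (not the actual change) of what the
cleanup path might look like when we don't cache the UGI. `runAs` is a
hypothetical helper; `FileSystem.closeAllForUGI` is the Hadoop call that
drops the FileSystem instances cached for a given UGI:

```scala
import java.security.PrivilegedExceptionAction

import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.security.UserGroupInformation

object UgiCleanupSketch {
  // Hypothetical helper: run `body` as `user` under a throwaway UGI,
  // then release the FileSystem instances Hadoop cached for it.
  def runAs(user: String)(body: => Unit): Unit = {
    val ugi = UserGroupInformation.createRemoteUser(user)
    try {
      ugi.doAs(new PrivilegedExceptionAction[Unit] {
        override def run(): Unit = body
      })
    } finally {
      // Hadoop's FileSystem cache is keyed by UGI, so every uncached
      // UGI we drop would otherwise strand its FileSystem objects.
      FileSystem.closeAllForUGI(ugi)
    }
  }
}
```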
Do you have a workaround for the fact that the Executor is created in one
place but used by Akka in a different place? I'd rather not leak the whole
UGI/Hadoop-security concern through to the actual SchedulerBackend
abstraction if possible.
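
One option I could imagine (just a sketch, with hypothetical names):
establish the UGI once in the executor process's entry point, before the
actor system comes up. Threads created inside `doAs` inherit the login
context, so everything Akka later runs already executes under the right
identity:

```scala
import java.security.PrivilegedExceptionAction

import org.apache.hadoop.security.UserGroupInformation

object ExecutorEntryPointSketch {
  def main(args: Array[String]): Unit = {
    // Fix the identity once at process startup, before any actor exists.
    // SPARK_USER falls back to the OS user if unset.
    val user = sys.env.getOrElse("SPARK_USER", System.getProperty("user.name"))
    val ugi = UserGroupInformation.createRemoteUser(user)
    ugi.doAs(new PrivilegedExceptionAction[Unit] {
      override def run(): Unit = {
        // ... bootstrap the actor system and construct the Executor here;
        // every thread spawned in this scope inherits the login context ...
      }
    })
  }
}
```

That would keep SchedulerBackend itself security-agnostic, since the `doAs`
lives entirely in the backend process's `main()`.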
On Fri, May 2, 2014 at 9:27 AM, Tom Graves <[email protected]> wrote:
> This brings up a good question: which setups are we going to support?
>
> on yarn:
>
> - secure yarn cluster
> - non-secure yarn cluster - daemons run as a superuser (like yarn),
> users access hdfs as themselves
> - yarn cluster - daemons run as the same user as the running applications
>
> standalone:
>
> - daemons run as the user who also owns hdfs (no security)
> - daemons run as a superuser, with SPARK_USER set to access hdfs (no
> security)
> - daemons run as a superuser logged in via keytab and proxying as the
> user to access secure hdfs (not sure about this one; I thought someone
> was using this setup)?
>
> mesos:
>
> - I assume this is the same as standalone?
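
For what it's worth, if someone really is running the keytab-plus-proxy
standalone setup Tom describes above, I'd expect it to look roughly like the
sketch below. The principal, keytab path, and user name are placeholders,
and the superuser would also need matching `hadoop.proxyuser.*` grants in
core-site.xml:

```scala
import java.security.PrivilegedExceptionAction

import org.apache.hadoop.security.UserGroupInformation

object ProxyUserSketch {
  def main(args: Array[String]): Unit = {
    // Log the daemon in from its keytab (placeholder principal and path),
    // then impersonate the end user for secure HDFS access.
    UserGroupInformation.loginUserFromKeytab(
      "spark/[email protected]", "/etc/security/keytabs/spark.keytab")
    val proxy = UserGroupInformation.createProxyUser(
      "alice", UserGroupInformation.getLoginUser)
    proxy.doAs(new PrivilegedExceptionAction[Unit] {
      override def run(): Unit = {
        // ... access secure hdfs as "alice" here ...
      }
    })
  }
}
```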