Github user jongyoul commented on the pull request:

    https://github.com/apache/spark/pull/2126#issuecomment-53443546
  
    Yes, I'm not using secure HDFS, for a few reasons. Mesos is just a resource 
manager, so it doesn't care about the running program's user id. Mesos with the 
switch_user option changes the running program's id to the account that ran 
spark-submit, but that raises another issue: every slave machine must know the 
account id of the user running spark-submit. So Spark changes its user id 
regardless of the switch_user option on Mesos.
    
    HADOOP_USER_NAME is only valid in non-secure mode. In secure mode, that 
property is meaningless and we must use the switch_user option.
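    As a sketch of that override (non-secure mode only; "alice" is a placeholder 
account name, not one from this discussion):

```shell
# Non-secure mode only: override the user name the HDFS client reports.
# "alice" is a hypothetical account used for illustration.
export HADOOP_USER_NAME=alice
# Subsequent HDFS client operations, e.g. `hdfs dfs -ls /user/alice`,
# would now act as "alice" instead of the OS user.
echo "$HADOOP_USER_NAME"
```

    In secure (Kerberos) mode this variable is ignored and the Kerberos 
principal decides the user.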
    
    logDebug("running as user: " + user) shows that the remote user is changed 
to SPARK_USER, and the Spark application runs as that user. But HDFS does not 
work like that in non-secure mode. The user of the FileSystem is decided by the 
following steps: first, check whether HDFS runs in secure mode (KERBEROS); if 
it is not in secure mode, check whether HADOOP_USER_NAME is set in 
System.getenv or System.getProperty; finally, HDFS falls back to the system 
user (UserGroupInformation.commit()).
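    The steps above can be sketched like this (a simplified model of the 
decision order, not Hadoop's actual implementation; the function and parameter 
names are illustrative):

```python
def resolve_hdfs_user(secure_mode, env, sys_props, system_user):
    """Hypothetical sketch of how the HDFS client picks its user name,
    following the steps described above."""
    if secure_mode:
        # Secure (KERBEROS) mode: HADOOP_USER_NAME is ignored;
        # the authenticated principal / login user is used.
        return system_user
    # Non-secure mode: HADOOP_USER_NAME wins if set in the environment
    # or in the system properties...
    user = env.get("HADOOP_USER_NAME") or sys_props.get("HADOOP_USER_NAME")
    if user:
        return user
    # ...otherwise fall back to the OS-level system user.
    return system_user
```

    With this model, an executor whose environment lacks HADOOP_USER_NAME 
resolves to the system user (the Mesos account), not SPARK_USER.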
    
    Spark on Mesos runs against HDFS in non-secure mode, so the HDFS client 
uses the system user if HADOOP_USER_NAME is not set, and the system user is 
the Mesos account, not SPARK_USER. Thus the HDFS user name of the driver 
running spark-submit is not the same as the HDFS client user name on the 
executors. This causes a permission problem.

