[
https://issues.apache.org/jira/browse/SPARK-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jongyoul Lee updated SPARK-3223:
--------------------------------
Description: When Mesos runs with the --no-switch_user option, the driver and
the executors use different HDFS account names, which causes a permission error
in the last stage. An executor runs as the Mesos user, while the driver runs as
whoever invoked spark-submit. As a result, moving output from
_temporary/path/to/output/part-xxxx to /output/path/part-xxxx fails with a
permission error. The fix is simply to set HADOOP_USER_NAME to the value of
SPARK_USER when MesosExecutorBackend calls runAsSparkUser, because
HADOOP_USER_NAME is what FileSystem uses to determine the user. (was: While
running mesos with --no-switch_user option, HDFS account name is different from
driver and executor. It makes a permission error at last stage. Executor's id
is mesos' user id and driver's id is who runs spark-submit. So, moving output
from _temporary/path/to/output/part-xxxx to /output/path/part-xxxx fails
because of permission error. The solution for this is only setting SPARK_USER
to HADOOP_USER_NAME when MesosExecutorBackend calls runAsSparkUser.)
> runAsSparkUser cannot change HDFS write permission properly in mesos cluster
> mode
> ---------------------------------------------------------------------------------
>
> Key: SPARK-3223
> URL: https://issues.apache.org/jira/browse/SPARK-3223
> Project: Spark
> Issue Type: Bug
> Components: Input/Output, Mesos
> Affects Versions: 1.0.2
> Reporter: Jongyoul Lee
> Fix For: 1.0.3
>
>
> When Mesos runs with the --no-switch_user option, the driver and the
> executors use different HDFS account names, which causes a permission error
> in the last stage. An executor runs as the Mesos user, while the driver runs
> as whoever invoked spark-submit. As a result, moving output from
> _temporary/path/to/output/part-xxxx to /output/path/part-xxxx fails with a
> permission error. The fix is simply to set HADOOP_USER_NAME to the value of
> SPARK_USER when MesosExecutorBackend calls runAsSparkUser, because
> HADOOP_USER_NAME is what FileSystem uses to determine the user.
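
The mismatch described above can be illustrated with a small model. This is
only a sketch, not Spark or Hadoop code: it assumes that, under simple
(non-Kerberos) authentication, the HDFS client uses HADOOP_USER_NAME when it is
set and otherwise falls back to the OS user of the process. The function names
and the user names "mesos" and "alice" are hypothetical.

```python
# Toy model of the user mismatch; not the actual Spark or Hadoop implementation.

def resolve_hdfs_user(env, os_user):
    """Assumed lookup order: HADOOP_USER_NAME if set, else the process's OS user."""
    return env.get("HADOOP_USER_NAME") or os_user


def executor_env(base_env, spark_user):
    """Return a copy of the executor environment with HADOOP_USER_NAME
    set to the value of SPARK_USER (the fix proposed in this issue)."""
    env = dict(base_env)
    if spark_user:
        env["HADOOP_USER_NAME"] = spark_user
    return env


def can_move_output(output_dir_owner, hdfs_user):
    """Simplified permission check: only the owner of the output directory
    may move part files out of _temporary into it."""
    return output_dir_owner == hdfs_user


if __name__ == "__main__":
    driver_user = "alice"    # hypothetical: whoever ran spark-submit
    executor_os_user = "mesos"  # hypothetical: Mesos user under --no-switch_user

    # Without the fix, the executor talks to HDFS as the Mesos user.
    before = resolve_hdfs_user({}, executor_os_user)
    print("without fix:", before, can_move_output(driver_user, before))

    # With the fix, the executor talks to HDFS as SPARK_USER.
    fixed_env = executor_env({}, spark_user=driver_user)
    after = resolve_hdfs_user(fixed_env, executor_os_user)
    print("with fix:", after, can_move_output(driver_user, after))
```

In this model the executor's OS user never changes, which matches running
Mesos with --no-switch_user; only the name presented to HDFS does.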