Tomasz Früboes created SPARK-7791:
-------------------------------------
Summary: Set user for executors in standalone-mode
Key: SPARK-7791
URL: https://issues.apache.org/jira/browse/SPARK-7791
Project: Spark
Issue Type: New Feature
Components: Spark Core
Reporter: Tomasz Früboes
I'm opening this following a discussion in
https://www.mail-archive.com/[email protected]/msg28633.html
Our setup was as follows. Spark (1.3.1, prebuilt for Hadoop 2.6, also 2.4) was
installed in standalone mode and started manually from the root account.
Everything worked properly apart from operations such as
rdd.saveAsPickleFile(ofile)
which end with an exception:
py4j.protocol.Py4JJavaError: An error occurred while calling o27.save.
: java.io.IOException: Failed to rename
DeprecatedRawLocalFileStatus{path=file:/mnt/lustre/bigdata/med_home/tmp/test19EE/namesAndAges.parquet2/_temporary/0/task_201505191540_0009_r_000001/part-r-00002.parquet;
isDirectory=false; length=534; replication=1; blocksize=33554432;
modification_time=1432042832000; access_time=0; owner=; group=;
permission=rw-rw-rw-; isSymlink=false} to
file:/mnt/lustre/bigdata/med_home/tmp/test19EE/namesAndAges.parquet2/part-r-00002.parquet
    at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.mergePaths(FileOutputCommitter.java:346)
(The files created in _temporary were owned by user root.) It would be great if
Spark could set the user for the executors in standalone mode as well. Setting
SPARK_USER has no effect here.
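
For anyone who wants to confirm the same symptom on their own output directory,
here is a minimal sketch that lists entries whose owner differs from the user
running the check. The function name find_foreign_files and its parameters are
hypothetical, not part of Spark; it only uses the Python standard library and
assumes a POSIX file system.

```python
import os


def find_foreign_files(root, expected_uid=None):
    """Return (path, uid) pairs under `root` not owned by `expected_uid`.

    Hypothetical helper, not a Spark API. If `expected_uid` is omitted,
    the effective UID of the current process is used.
    """
    if expected_uid is None:
        expected_uid = os.geteuid()
    mismatched = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat so that symlinks are reported, not followed
            uid = os.lstat(path).st_uid
            if uid != expected_uid:
                mismatched.append((path, uid))
    return mismatched
```

Running it as the submitting user on the job's output directory should return
the _temporary entries if they were created by a different account.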
BTW, it may be a good idea to add a warning (e.g. during Spark startup) that
running from the root account is not a healthy idea. E.g. mapping this function
def test(x):
    # Opening in 'w' mode creates the file even if nothing is written
    f = open('/etc/testTMF.txt', 'w')
    f.close()
    return 0
on an RDD creates a file in /etc/ (surprisingly, calls like f.Write("text") end
with an exception)
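
The kind of warning I have in mind could look roughly like the sketch below.
This is only an illustration in Python, not how Spark's startup scripts work;
the function name warn_if_root and the message text are made up, and the check
assumes a POSIX system where root has effective UID 0.

```python
import os
import warnings


def warn_if_root(euid=None):
    """Emit a warning and return True when running as root.

    Hypothetical helper, not a Spark API. `euid` can be passed
    explicitly for testing; by default the effective UID of the
    current process is used.
    """
    if euid is None:
        euid = os.geteuid()
    if euid == 0:
        warnings.warn(
            "Running as root: user code executed by workers will have "
            "root privileges and output files will be owned by root.",
            RuntimeWarning,
        )
        return True
    return False
```

A check like this would run once when the master or worker process starts.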
Thanks,
Tomasz