Andrew Or created SPARK-1900:
--------------------------------
Summary: Fix running PySpark files on YARN
Key: SPARK-1900
URL: https://issues.apache.org/jira/browse/SPARK-1900
Project: Spark
Issue Type: Bug
Reporter: Andrew Or
Priority: Blocker
Running PySpark files on YARN currently fails because of a mismatch in paths.
On a YARN cluster, spark-submit automatically assumes the file is on HDFS, even
if it is a relative path that refers to a local file. A natural workaround is
to specify the "file:" prefix explicitly. However, python does not understand
this prefix: it treats the whole string as a literal file name and fails with
the following:
{code}
python: can't open file 'file:path/to/my/file.py': [Errno 2] No such file or directory
{code}
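For illustration only, here is a minimal sketch of the kind of normalization that would avoid this error, assuming the remedy is to strip the "file:" scheme before the path is handed to python. The helper name to_local_path is hypothetical and is not part of Spark; the actual fix may take a different approach.

```python
from urllib.parse import urlparse


def to_local_path(path_or_uri: str) -> str:
    """Return a plain filesystem path for 'file:' URIs.

    Hypothetical helper for illustration; not Spark's implementation.
    Strings with another scheme (e.g. hdfs) or no scheme are returned unchanged.
    """
    parsed = urlparse(path_or_uri)
    if parsed.scheme == "file":
        # Drop the scheme so the interpreter receives an ordinary path.
        return parsed.path
    return path_or_uri


if __name__ == "__main__":
    print(to_local_path("file:path/to/my/file.py"))
```

With this sketch, the path from the error above would reach python as path/to/my/file.py, while a bare relative path or an hdfs:// URI passes through untouched.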
--
This message was sent by Atlassian JIRA
(v6.2#6252)