[ 
https://issues.apache.org/jira/browse/SPARK-21945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16489367#comment-16489367
 ] 

Hyukjin Kwon commented on SPARK-21945:
--------------------------------------

To be more precise: from my investigation so far, the paths are added as-is. 
That's fine for a zip archive, but for a .py file the path itself shouldn't be 
added (its parent directory should be instead), 
so for .py files, yes, we should copy them too.

It's odd, but I think this is all because we happened to support .py files in 
the same option, whereas PYTHONPATH doesn't expect a file.
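
To illustrate the PYTHONPATH point, here is a minimal sketch in plain Python. 
It is not Spark code, and the file and module names (demo_helper_mod, 
demo_deps.zip, demo_zipped_mod) are made up for the demo. It only checks 
which kinds of sys.path entries the interpreter can import from: a zip 
archive, a bare .py file path, and that file's parent directory.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def can_import(module_name, path_entry):
    """Return True if module_name imports with path_entry prepended to sys.path."""
    saved_path = list(sys.path)
    sys.modules.pop(module_name, None)
    sys.path.insert(0, path_entry)
    importlib.invalidate_caches()
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False
    finally:
        # Restore interpreter state so each check starts clean.
        sys.path[:] = saved_path
        sys.modules.pop(module_name, None)
        sys.path_importer_cache.pop(path_entry, None)


workdir = tempfile.mkdtemp()

# A standalone module, like one passed via --py-files (hypothetical name).
py_file = os.path.join(workdir, "demo_helper_mod.py")
with open(py_file, "w") as f:
    f.write("VALUE = 1\n")

# A zip archive holding a different module (hypothetical names).
zip_file = os.path.join(workdir, "demo_deps.zip")
with zipfile.ZipFile(zip_file, "w") as zf:
    zf.writestr("demo_zipped_mod.py", "VALUE = 2\n")

zip_as_is = can_import("demo_zipped_mod", zip_file)   # zip path used as-is
py_as_is = can_import("demo_helper_mod", py_file)     # .py path used as-is
py_parent = can_import("demo_helper_mod", workdir)    # parent dir of the .py

print("zip path as-is:", zip_as_is)
print(".py path as-is:", py_as_is)
print("parent dir of .py:", py_parent)
```

On CPython this should report that the zip path and the parent directory are 
importable while the bare .py path is not, which is why a .py file needs its 
containing directory on the path rather than the file itself.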

> pyspark --py-files doesn't work in yarn client mode
> ---------------------------------------------------
>
>                 Key: SPARK-21945
>                 URL: https://issues.apache.org/jira/browse/SPARK-21945
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.2.0
>            Reporter: Thomas Graves
>            Assignee: Hyukjin Kwon
>            Priority: Major
>             Fix For: 2.3.1, 2.4.0
>
>
> I tried running pyspark with --py-files pythonfiles.zip but it doesn't 
> properly add the zip file to the PYTHONPATH.
> I can work around it by exporting PYTHONPATH.
> Looking in SparkSubmitCommandBuilder.buildPySparkShellCommand, I don't see 
> this supported at all. If that is the case, perhaps it should be moved to 
> improvement.
> Note it works via spark-submit in both client and cluster mode to run a 
> Python script.


