[ https://issues.apache.org/jira/browse/SPARK-21945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16487707#comment-16487707 ]
Marcelo Vanzin edited comment on SPARK-21945 at 5/23/18 5:27 PM:
-----------------------------------------------------------------
{noformat}
$ cat test.py
from pyspark import SparkContext, SparkConf
import funcs

sc = SparkContext()
funcs.test()
sc.stop()

$ cat lib/funcs.py
def test():
    print "This is a test."

$ spark-submit --master yarn --py-files lib/funcs.py test.py
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/SPARK2-2.3.0.cloudera3-1.cdh5.13.3.p0.378171/lib/spark2/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.15.1-1.cdh5.15.1.p0.10/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Traceback (most recent call last):
  File "/home/systest/test.py", line 2, in <module>
    import funcs
ImportError: No module named funcs

$ pyspark --master yarn --py-files lib/funcs.py
[blah blah blah]
Using Python version 2.7.5 (default, Sep 15 2016 22:37:39)
SparkSession available as 'spark'.
>>> import funcs
>>>
{noformat}

was (Author: vanzin):
{noformat}
$ cat test.py
from pyspark import SparkContext, SparkConf
import funcs

sc = SparkContext()
funcs.test()
sc.stop()

$ cat lib/funcs.py
def test():
    print "This is a test."

[systest@vanzin-c5-1 ~]$ spark-submit --master yarn --py-files lib/funcs.py test.py
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/SPARK2-2.3.0.cloudera3-1.cdh5.13.3.p0.378171/lib/spark2/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.15.1-1.cdh5.15.1.p0.10/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Traceback (most recent call last):
  File "/home/systest/test.py", line 2, in <module>
    import funcs
ImportError: No module named funcs

$ pyspark --master yarn --py-files lib/funcs.py
[blah blah blah]
Using Python version 2.7.5 (default, Sep 15 2016 22:37:39)
SparkSession available as 'spark'.
>>> import funcs
>>>
{noformat}

> pyspark --py-files doesn't work in yarn client mode
> ---------------------------------------------------
>
>                 Key: SPARK-21945
>                 URL: https://issues.apache.org/jira/browse/SPARK-21945
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.2.0
>            Reporter: Thomas Graves
>            Assignee: Hyukjin Kwon
>            Priority: Major
>             Fix For: 2.3.1, 2.4.0
>
> I tried running pyspark with --py-files pythonfiles.zip but it doesn't
> properly add the zip file to the PYTHONPATH.
> I can work around by exporting PYTHONPATH.
> Looking in SparkSubmitCommandBuilder.buildPySparkShellCommand I don't see
> this supported at all. If that is the case perhaps it should be moved to
> improvement.
> Note it works via spark-submit in both client and cluster mode to run python
> script.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
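The workaround mentioned in the description (exporting PYTHONPATH yourself) can be sketched roughly as below. The directory and module names are made up for illustration, and a plain local python3 interpreter stands in for the driver-side pyspark shell, since the bug is precisely that the shell's driver process never gets the --py-files entries on its PYTHONPATH:

```shell
# Hypothetical setup: a module directory like the lib/ used in the transcript above.
mkdir -p /tmp/pyfiles-demo/lib
cat > /tmp/pyfiles-demo/lib/funcs.py <<'EOF'
def test():
    print("This is a test.")
EOF

# Putting the directory on PYTHONPATH makes the module importable by the
# interpreter, which is what --py-files fails to arrange for the pyspark
# shell in yarn client mode on affected versions.
export PYTHONPATH="/tmp/pyfiles-demo/lib:$PYTHONPATH"
python3 -c "import funcs; funcs.test()"
```

With a real cluster one would export PYTHONPATH pointing at the zip or directory passed to --py-files before launching `pyspark --master yarn`.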