[jira] [Commented] (SPARK-8646) PySpark does not run on YARN

Juliet Hougland (JIRA) Mon, 06 Jul 2015 15:51:04 -0700

    [ https://issues.apache.org/jira/browse/SPARK-8646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615809#comment-14615809 ]


Juliet Hougland commented on SPARK-8646:
----------------------------------------

[~davies] Please look at the logs I have attached. The pandas.algo import error 
only appears in the pi-test.log file. I ran pi-test to help debug this problem 
at the request of [~vanzin]. The three other log files (with the environment 
differences in their file names) are from running my out-of-stock job. That job 
does have quite a few dependencies, but I make sure those are available to the 
driver and workers. 

The real (first) issue this ticket is about is that pyspark isn't available on 
worker nodes. The same command that runs my job on Spark 1.3 does not work 
with Spark 1.4.
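
To tell the two failure modes apart (a missing job dependency such as pandas 
versus pyspark itself being missing on the workers), one option is a small 
import probe. The following is only a sketch: probe_module is a hypothetical 
helper name, not part of Spark, and the worker-side usage in the trailing 
comment assumes an existing SparkContext named sc.

```python
import importlib
import socket
import sys


def probe_module(name):
    """Report whether `name` can be imported in the current Python process.

    Hypothetical diagnostic helper; it only inspects the interpreter it
    runs in and does not depend on Spark.
    """
    try:
        importlib.import_module(name)
        importable = True
        error = None
    except ImportError as exc:
        importable = False
        error = str(exc)
    return {
        "host": socket.gethostname(),   # which machine answered
        "module": name,
        "importable": importable,
        "error": error,                 # import error text, or None
        "sys_path": list(sys.path),     # search path seen by this process
    }


# Driver-side check (runs in the local Python process only):
#   print(probe_module("pandas"))
#
# Worker-side check (assumes an existing SparkContext `sc`):
#   sc.parallelize(range(4), 4).map(lambda _: probe_module("pandas")).collect()
```

If pyspark itself cannot be imported on the workers, the worker-side job would 
be expected to fail before the probe runs, which points to the pyspark problem 
rather than to a job dependency.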

> PySpark does not run on YARN
> ----------------------------
>
>                 Key: SPARK-8646
>                 URL: https://issues.apache.org/jira/browse/SPARK-8646
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark, YARN
>    Affects Versions: 1.4.0
>         Environment: SPARK_HOME=local/path/to/spark1.4install/dir
> also with
> SPARK_HOME=local/path/to/spark1.4install/dir
> PYTHONPATH=$SPARK_HOME/python/lib
> Spark apps are submitted with the command:
> $SPARK_HOME/bin/spark-submit outofstock/data_transform.py 
> hdfs://foe-dev/DEMO_DATA/FACT_POS hdfs:/user/juliet/ex/ yarn-client
> data_transform contains a main method, and the rest of the args are parsed in 
> my own code.
>            Reporter: Juliet Hougland
>         Attachments: pi-test.log, spark1.4-SPARK_HOME-set-PYTHONPATH-set.log, 
> spark1.4-SPARK_HOME-set-inline-HADOOP_CONF_DIR.log, 
> spark1.4-SPARK_HOME-set.log
>
>
> Running pyspark jobs results in a "no module named pyspark" error when run 
> in yarn-client mode in Spark 1.4.
> [I believe this JIRA represents the change that introduced this error.| 
> https://issues.apache.org/jira/browse/SPARK-6869 ]
> This does not represent a binary compatible change to Spark. Scripts that 
> worked on previous Spark versions (i.e. commands that use spark-submit) 
> should continue to work without modification between minor versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)