Yesha Vora created SPARK-19095:
----------------------------------
Summary: virtualenv example does not work in yarn cluster mode
Key: SPARK-19095
URL: https://issues.apache.org/jira/browse/SPARK-19095
Project: Spark
Issue Type: Bug
Reporter: Yesha Vora
Priority: Critical
Steps:
* Install virtualenv on all nodes.
* Create requirement1.txt containing the single line "numpy".
* Run kmeans.py application in yarn-cluster mode.
{code}
spark-submit --master yarn --deploy-mode cluster \
  --conf "spark.pyspark.virtualenv.enabled=true" \
  --conf "spark.pyspark.virtualenv.type=native" \
  --conf "spark.pyspark.virtualenv.requirements=/tmp/requirements1.txt" \
  --conf "spark.pyspark.virtualenv.bin.path=/usr/bin/virtualenv" \
  --jars /usr/hdp/current/hadoop-client/lib/hadoop-lzo.jar \
  kmeans.py /tmp/in/kmeans_data.txt 3
{code}
The application fails to find numpy.
{code}
LogType:stdout
Log Upload Time:Thu Jan 05 20:05:49 +0000 2017
LogLength:134
Log Contents:
Traceback (most recent call last):
  File "kmeans.py", line 27, in <module>
    import numpy as np
ImportError: No module named numpy
End of LogType:stdout
{code}
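To narrow down whether the driver is running inside the virtualenv at all, a small diagnostic could be run at the top of kmeans.py before the numpy import. The sketch below is only an illustration: describe_environment is a hypothetical helper, not part of Spark or of the original kmeans.py, and it only reports which Python interpreter is in use and whether the module can be imported from it.

```python
import importlib
import sys


def describe_environment(module_name="numpy"):
    """Hypothetical helper: report the running interpreter and whether
    module_name can be imported from it."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        module = None
    return {
        "executable": sys.executable,  # interpreter actually running the script
        "prefix": sys.prefix,          # points into the virtualenv if one is active
        "module": module_name,
        "found": module is not None,
        "location": getattr(module, "__file__", None),
    }


if __name__ == "__main__":
    # Printed output would land in the same stdout log as the traceback above.
    for key, value in sorted(describe_environment().items()):
        print("%s: %s" % (key, value))
```

If the reported executable and prefix point at the system Python rather than a virtualenv directory, that would suggest the virtualenv is not being applied to the driver in yarn-cluster mode.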