haniar created TOREE-344:
----------------------------
Summary: No module named pyspark
Key: TOREE-344
URL: https://issues.apache.org/jira/browse/TOREE-344
Project: TOREE
Issue Type: Bug
Reporter: haniar
I have installed Toree in my Jupyter environment
(https://github.com/apache/incubator-toree) and written a piece of code that
uses PySpark. YARN starts properly and I can see the containers running
in the queue.
When I run the code, I get the following error:
Error from python worker:
/usr/local/bin/python2.7: No module named pyspark
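One way to narrow this down is to check, on a given node, whether the interpreter named in the error can import pyspark at all. The following is a minimal sketch and not part of Toree or Spark; the helper name can_import is hypothetical, and it assumes a Python 3 interpreter is available to run the check itself.

```python
import os
import subprocess
import sys


def can_import(python_exe, module, pythonpath=None):
    """Return True if the interpreter `python_exe` can import `module`.

    `pythonpath`, if given, replaces PYTHONPATH for the child process only.
    """
    env = dict(os.environ)
    if pythonpath is not None:
        env["PYTHONPATH"] = pythonpath
    # Run the import in a separate process so that the target interpreter
    # (not the one running this script) is the one being tested.
    result = subprocess.run(
        [python_exe, "-c", "import " + module],
        env=env,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

For example, can_import("/usr/local/bin/python2.7", "pyspark") could be run on the notebook host and on a YARN worker node to see whether the two results differ.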
The kernel is set up as follows:
{
  "language": "python",
  "display_name": "Apache Toree - PySpark",
  "env": {
    "__TOREE_SPARK_OPTS__": " --master yarn",
    "SPARK_HOME": "/usr/hdp/2.4.2.0-258/spark",
    "__TOREE_OPTS__": "",
    "DEFAULT_INTERPRETER": "PySpark",
    "PYTHONPATH": "/usr/hdp/2.4.2.0-258/spark/python:/usr/hdp/2.4.2.0-258/spark/python/lib/py4j-0.9-src.zip",
    "PYTHON_EXEC": "python",
    "PYTHONSTARTUP": "/usr/hdp/2.4.2.0-258/spark/python/pyspark/shell.py",
    "PYSPARK_PYTHON": "/usr/local/bin/python2.7",
    "PYSPARK_DRIVER_PYTHON": "/usr/local/bin/python2.7"
  },
  "argv": [
    "/usr/local/share/jupyter/kernels/apache_toree_pyspark/bin/run.sh",
    "--profile",
    "{connection_file}"
  ]
}
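Since the PYTHONPATH in this kernel spec points at paths under SPARK_HOME, it may help to confirm that each entry actually exists on the machine being checked. The sketch below is illustrative only: the function name missing_pythonpath_entries is hypothetical, and it takes the kernel spec as an already-parsed dict (for example, loaded with json.load from a kernel.json assumed to sit next to the bin/ directory shown in "argv").

```python
import os


def missing_pythonpath_entries(kernel_spec):
    """Return the PYTHONPATH entries of a kernel spec dict that do not exist locally."""
    pythonpath = kernel_spec.get("env", {}).get("PYTHONPATH", "")
    # Split on the platform path separator (":" on Linux) and skip empty parts.
    entries = [p for p in pythonpath.split(os.pathsep) if p]
    return [p for p in entries if not os.path.exists(p)]
```

An empty result on the notebook host but a non-empty one on a worker node would suggest that the Spark Python libraries are not present at the same location everywhere; this is only one possible explanation for the error.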