I have installed Toree in my Jupyter environment
(https://github.com/apache/incubator-toree) and written a piece of code that
works with PySpark. YARN starts properly and I can see the containers running
in the queue.

When I run the code, I get the following error:

Error from python worker:  /usr/local/bin/python2.7: No module named pyspark

The kernel is set up as follows:
{
  "language": "python",
  "display_name": "Apache Toree - PySpark",
  "env": {
    "__TOREE_SPARK_OPTS__": " --master yarn",
    "SPARK_HOME": "/usr/hdp/2.4.2.0-258/spark",
    "__TOREE_OPTS__": "",
    "DEFAULT_INTERPRETER": "PySpark",
    "PYTHONPATH": "/usr/hdp/2.4.2.0-258/spark/python:/usr/hdp/2.4.2.0-258/spark/python/lib/py4j-0.9-src.zip",
    "PYTHON_EXEC": "python",
    "PYTHONSTARTUP": "/usr/hdp/2.4.2.0-258/spark/python/pyspark/shell.py",
    "PYSPARK_PYTHON": "/usr/local/bin/python2.7",
    "PYSPARK_DRIVER_PYTHON": "/usr/local/bin/python2.7"
  },
  "argv": [
    "/usr/local/share/jupyter/kernels/apache_toree_pyspark/bin/run.sh",
    "--profile",
    "{connection_file}"
  ]
}
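
For reference, here is a minimal sketch of one way to sanity-check the PYTHONPATH entries in a kernel spec like the one above. The helper names (missing_pythonpath_entries, load_kernel_spec) and the exists parameter are hypothetical, chosen only for illustration. It checks only the host where it is run, so it says nothing definite about what the Python workers in the YARN containers can see; running it on a worker node against the same paths might still help narrow things down.

```python
import json
import os


def missing_pythonpath_entries(kernel_spec, exists=os.path.exists):
    """Return PYTHONPATH entries from a kernel spec that do not exist on this host.

    kernel_spec is the parsed kernel.json (a dict). The exists argument is a
    hypothetical hook so the check can be exercised without touching the disk.
    """
    env = kernel_spec.get("env", {})
    # The spec above uses ':' as the separator (Linux paths).
    entries = [p for p in env.get("PYTHONPATH", "").split(":") if p]
    return [p for p in entries if not exists(p)]


def load_kernel_spec(path):
    """Parse a kernel.json file into a dict."""
    with open(path) as handle:
        return json.load(handle)
```

Possible usage, assuming the spec file sits next to the run.sh shown in argv (an assumption, since the post does not give the kernel.json path): call missing_pythonpath_entries(load_kernel_spec(path_to_kernel_json)) and look at any paths it returns.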
