Can you check the Zeppelin log to confirm whether it is running in yarn-client mode? I suspect it is still in local mode. Spark requires the Python version of the driver and the executors to be the same; in your case it should fail if the driver is Python 2.7 while the executors are Python 2.6.
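To illustrate the point above: at task time PySpark compares the worker's Python version against the driver's and aborts on a major.minor mismatch. The helper below is hypothetical (not Spark's actual code), just a sketch of that guard; in a Zeppelin paragraph you can check the real versions with `sys.version` on the driver and `sc.parallelize(range(2), 2).map(lambda _: __import__('sys').version).collect()` on the executors.

```python
import sys

# Hypothetical helper mirroring the check PySpark performs: the worker
# (executor) Python must match the driver Python in major.minor version.
def check_worker_python(driver_version, worker_version):
    """Raise if driver and worker differ in major.minor Python version."""
    if driver_version[:2] != worker_version[:2]:
        raise RuntimeError(
            "Python in worker has different version %d.%d than that in "
            "driver %d.%d"
            % (worker_version[0], worker_version[1],
               driver_version[0], driver_version[1]))

# A 2.7 driver against a 2.6 executor fails:
try:
    check_worker_python((2, 7), (2, 6))
    mismatch_detected = False
except RuntimeError:
    mismatch_detected = True

# Matching versions pass silently:
check_worker_python((2, 7), (2, 7))
```

If the job runs fine with mismatched versions on the nodes, that strongly suggests tasks are executing locally on the edge node rather than on the YARN executors.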
On Wed, Feb 17, 2016 at 9:03 AM, Abhi Basu <9000r...@gmail.com> wrote:
> I have a 6 node cluster and 1 edge node to access. The edge node has
> Python 2.7 + NLTK + other libraries + hadoop client and Zeppelin installed.
> All hadoop nodes have Python 2.6 and no other additional libraries.
>
> Running Zeppelin and my python code (with NLTK) is running fine under the
> pyspark interpreter. It must be running locally, as I have not distributed
> the python libraries to the other nodes yet. I don't see any errors in my
> Yarn logs either.
>
> This is my interpreter setup. Can you please tell me how this is working?
>
> Also, if it is working locally, how do I distribute it over multiple nodes?
>
> Thanks,
>
> Abhi
>
> spark %spark (default), %pyspark, %sql, %dep edit restart remove
> Properties
> name                                      value
> args
> master                                    yarn-client
> spark.app.name                            Zeppelin-App
> spark.cores.max                           4
> spark.executor.memory                     1024m
> zeppelin.dep.additionalRemoteRepository   spark-packages,http://dl.bintray.com/spark-packages/maven,false;
> zeppelin.dep.localrepo                    local-repo
> zeppelin.pyspark.python                   /usr/local/bin/python2.7
> zeppelin.spark.concurrentSQL              true
> zeppelin.spark.maxResult                  1000
> zeppelin.spark.useHiveContext             true
>
> --
> Abhi Basu

--
Best Regards

Jeff Zhang
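Regarding the question in the quoted message about distributing the code over multiple nodes: the usual approach in yarn-client mode is to install the same Python (and libraries such as NLTK) at the same path on every YARN node, then point both the driver and the executors at that interpreter. A minimal sketch of the relevant settings, assuming Python 2.7 lives at the path already used in the interpreter config (paths are illustrative, not prescriptive):

```shell
# Illustrative config only: driver and executors must use the same
# interpreter, and it must exist at this path on every YARN node.
export PYSPARK_PYTHON=/usr/local/bin/python2.7
export PYSPARK_DRIVER_PYTHON=/usr/local/bin/python2.7

# The corresponding Zeppelin interpreter property (already set above):
#   zeppelin.pyspark.python = /usr/local/bin/python2.7
```

With the interpreter and libraries present on all nodes and these settings in place, the same notebook code should then run on the executors rather than only on the edge node.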