Might be, but there should be some clues in the log if it doesn't work.

On Thu, Feb 18, 2016 at 1:06 PM, Abhi Basu <9000r...@gmail.com> wrote:
> I believe the issue may be using binaries with CDH. What I need to do is
> build from source with the Hadoop, Spark, and YARN switches, right?
>
> Thanks,
>
> Abhi
>
> On Wed, Feb 17, 2016 at 9:02 PM, Jeff Zhang <zjf...@gmail.com> wrote:
>
>> According to the log, the Spark interpreter is started in yarn-client
>> mode as application_1455038611898_0015. Could you check the YARN
>> application for this app?
>>
>> On Thu, Feb 18, 2016 at 1:26 AM, Abhi Basu <9000r...@gmail.com> wrote:
>>
>>> Additional info: installed Zeppelin 0.5.6 from binaries on CDH 5.1 /
>>> Spark 1.5.0.
>>>
>>> Any help is appreciated.
>>>
>>> Thanks,
>>>
>>> Abhi
>>>
>>> On Wed, Feb 17, 2016 at 9:07 AM, Abhi Basu <9000r...@gmail.com> wrote:
>>>
>>>> Logs attached. Am I supposed to edit the Spark location in the Zeppelin
>>>> config file? All I have changed is the Hadoop conf folder.
>>>>
>>>> Thanks,
>>>>
>>>> Abhi
>>>>
>>>> On Tue, Feb 16, 2016 at 5:29 PM, Jeff Zhang <zjf...@gmail.com> wrote:
>>>>
>>>>> Can you check the Zeppelin log to confirm whether it is running in
>>>>> yarn-client mode? I suspect it is still in local mode. Spark requires
>>>>> the Python version of the driver and the executors to be the same. In
>>>>> your case it should fail if the driver is Python 2.7 while the
>>>>> executors are Python 2.6.
>>>>>
>>>>> On Wed, Feb 17, 2016 at 9:03 AM, Abhi Basu <9000r...@gmail.com> wrote:
>>>>>
>>>>>> I have a 6-node cluster and 1 edge node for access. The edge node has
>>>>>> Python 2.7 + NLTK + other libraries + the Hadoop client and Zeppelin
>>>>>> installed. All Hadoop nodes have Python 2.6 and no additional
>>>>>> libraries.
>>>>>>
>>>>>> Zeppelin is running, and my Python code (with NLTK) runs fine under
>>>>>> the pyspark interpreter. It must be running locally, as I have not
>>>>>> distributed the Python libraries to the other nodes yet. I don't see
>>>>>> any errors in my YARN logs either.
>>>>>>
>>>>>> This is my interpreter setup. Can you please tell me how this is
>>>>>> working?
>>>>>>
>>>>>> Also, if it is working locally, how do I distribute over multiple
>>>>>> nodes?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Abhi
>>>>>>
>>>>>> spark  %spark (default), %pyspark, %sql, %dep
>>>>>>
>>>>>> Properties
>>>>>> name                                     value
>>>>>> args
>>>>>> master                                   yarn-client
>>>>>> spark.app.name                           Zeppelin-App
>>>>>> spark.cores.max                          4
>>>>>> spark.executor.memory                    1024m
>>>>>> zeppelin.dep.additionalRemoteRepository  spark-packages,http://dl.bintray.com/spark-packages/maven,false;
>>>>>> zeppelin.dep.localrepo                   local-repo
>>>>>> zeppelin.pyspark.python                  /usr/local/bin/python2.7
>>>>>> zeppelin.spark.concurrentSQL             true
>>>>>> zeppelin.spark.maxResult                 1000
>>>>>> zeppelin.spark.useHiveContext            true
>>>>>>
>>>>>> --
>>>>>> Abhi Basu
>>>>>
>>>>> --
>>>>> Best Regards
>>>>>
>>>>> Jeff Zhang

--
Best Regards

Jeff Zhang
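[Editor's note] Jeff's two checks above (is the interpreter really in yarn-client mode, and do the driver's and executors' Python versions agree?) can be sketched from a %pyspark paragraph. This is a minimal, hedged sketch, not code from the thread: `sc` is the SparkContext Zeppelin injects, and `version_mismatch` is a hypothetical helper that just encodes the major.minor comparison Jeff describes.

```python
import sys

def version_mismatch(driver_ver, executor_vers):
    """True if any executor's (major, minor) differs from the driver's.

    Arguments are version tuples like sys.version_info[:3]. PySpark
    misbehaves when driver and executors disagree on major.minor --
    e.g. a 2.7 driver with 2.6 executors, as in this thread.
    """
    return any(tuple(v[:2]) != tuple(driver_ver[:2]) for v in executor_vers)

# Inside a Zeppelin %pyspark paragraph (sketch -- run there, not here):
#
# print(sc.master)   # "local[*]" would mean the yarn-client setting is ignored
# executor_vers = (sc.parallelize(range(sc.defaultParallelism))
#                    .map(lambda _: __import__("sys").version_info[:3])
#                    .distinct()
#                    .collect())
# print(version_mismatch(sys.version_info[:3], executor_vers))

print(version_mismatch((2, 7, 11), [(2, 6, 6)]))   # True: the setup in this thread
print(version_mismatch((2, 7, 11), [(2, 7, 10)]))  # False: micro versions may differ
```

If `sc.master` prints `local[*]` rather than `yarn-client`, the master setting is not taking effect, which would also explain why NLTK (installed only on the edge node) works without errors.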
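[Editor's note] On the "how to distribute over multiple nodes" question, one common approach is to install the same Python 2.7 + libraries at an identical path on every NodeManager host and point both driver and executors at it. The sketch below is an assumption about the layout (the /usr/local/bin/python2.7 path is taken from the interpreter settings above; the install commands are illustrative, not from the thread):

```shell
# On EACH cluster node (same path everywhere -- matching the edge node):
#   install Python 2.7 plus the libraries the job needs, e.g.:
#   /usr/local/bin/python2.7 -m pip install nltk

# Then, in conf/zeppelin-env.sh on the node running Zeppelin, make the
# executors use that interpreter as well as the driver:
export PYSPARK_PYTHON=/usr/local/bin/python2.7          # Python for executors
export PYSPARK_DRIVER_PYTHON=/usr/local/bin/python2.7   # Python for the driver
```

Note that `zeppelin.pyspark.python` in the interpreter settings only controls the driver-side Python; the executors on the YARN nodes still need a matching 2.7 available locally.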