Along the same lines, you can also check the Spark Job UI at http://localhost:4040/environment/ to see if spark.home is set correctly.
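The environment page is plain HTML, so that check can be scripted. A rough sketch (Python 3; the `extract_spark_home` helper and its row-matching pattern are my own guess at the page's table layout, so adjust them for your Spark version):

```python
import re
import urllib.request

# Sketch only: the regular expression guesses at the environment page's
# table layout (<td>spark.home</td><td>VALUE</td>) and may need tweaking
# for your Spark version.
def extract_spark_home(html):
    """Pull the spark.home value out of the environment page's HTML."""
    m = re.search(r"spark\.home</td>\s*<td[^>]*>([^<]+)", html)
    return m.group(1).strip() if m else None

def check_spark_home(url="http://localhost:4040/environment/"):
    """Fetch the page from a running Spark application and report spark.home."""
    html = urllib.request.urlopen(url).read().decode("utf-8")
    return extract_spark_home(html)
```

If `extract_spark_home` returns None, the property was never set, which matches the symptom in this thread.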
HTH,
Ram

On Mar 31, 2015, at 4:44 PM, Felix C <[email protected]> wrote:

Could you check the interpreter settings to see if spark.home is correctly propagated? Also check whether Zeppelin is built with the same version of Spark as the one you have in SPARK_HOME; I had run into problems with that earlier.

--- Original Message ---
From: "Kelly, Jonathan" <[email protected]>
Sent: March 31, 2015 3:57 PM
To: [email protected]
Subject: Re: running pyspark notes

I am running into the same issue as in Ram's original post, though I do correctly have SPARK_HOME set. I see no obvious errors or warnings in any of the Zeppelin logs. Is there anything specific I should look for?

Thanks,
Jonathan Kelly

From: Felix C <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Tuesday, March 31, 2015 at 3:46 PM
To: "[email protected]" <[email protected]>
Subject: Re: running pyspark notes

It is very easy to run Spark locally. You can download the binary distribution and unpack it on your dev box.

--- Original Message ---
From: "moon soo Lee" <[email protected]>
Sent: March 31, 2015 12:38 AM
To: [email protected]
Subject: Re: running pyspark notes

For pyspark, you need to download a Spark distribution, and spark.home and SPARK_HOME need to point to that directory. The reason you need the Spark distribution is that, to use pyspark, Zeppelin needs some Python modules that the distribution ships under its python directory.

Thanks,
moon

On Tue, Mar 31, 2015 at 4:01 PM Ram Venkatesh <[email protected]> wrote:

Hello,

Thank you for your reply.
I have built zeppelin with the following command line:

mvn install -DskipTests -Pspark-1.2 -Phadoop-2.4

I don't have a separate Spark distribution package; I am running zeppelin on my dev box in local mode for testing. What should my spark.home (or SPARK_HOME) be set to? Currently they are blank.

Thanks!
Ram

On Mar 30, 2015, at 4:56 PM, moon soo Lee <[email protected]> wrote:

Hi,

It can happen when the spark.home property and the SPARK_HOME environment variable are misconfigured, or when the version of the Spark distribution package that spark.home (and SPARK_HOME) points to is different from the version Zeppelin was built with.

Thanks,
moon

On Tue, Mar 31, 2015 at 4:29 AM Ram Venkatesh <[email protected]> wrote:

Hello,

I am having trouble running a pyspark note (zeppelin newbie, could well be pilot error). The note is:

%pyspark
print 'hello world'

The note transitions to "PENDING" and then "RUNNING" but never finishes after that. From the zeppelin server logs:

INFO [2015-03-30 12:24:10,512] ({pool-2-thread-2} RemoteInterpreterProcess.java[reference]:74) - Run interpreter process /Users/rvenkatesh/dev/asf/zeppelin/bin/interpreter.sh -d /Users/rvenkatesh/dev/asf/zeppelin/interpreter/spark -p 59569
INFO [2015-03-30 12:24:11,570] ({pool-2-thread-2} RemoteInterpreter.java[init]:114) - Create remote interpreter com.nflabs.zeppelin.spark.SparkInterpreter
INFO [2015-03-30 12:24:11,623] ({pool-2-thread-2} RemoteInterpreter.java[init]:114) - Create remote interpreter com.nflabs.zeppelin.spark.PySparkInterpreter
INFO [2015-03-30 12:24:11,628] ({pool-2-thread-2} RemoteInterpreter.java[init]:114) - Create remote interpreter com.nflabs.zeppelin.spark.SparkSqlInterpreter
INFO [2015-03-30 12:24:11,631] ({pool-2-thread-2} RemoteInterpreter.java[init]:114) - Create remote interpreter com.nflabs.zeppelin.spark.DepInterpreter
INFO [2015-03-30 12:24:11,635] ({pool-2-thread-2} RemoteInterpreter.java[open]:143) - open remote interpreter com.nflabs.zeppelin.spark.PySparkInterpreter
INFO [2015-03-30 12:24:11,682] ({pool-2-thread-2} Paragraph.java[jobRun]:182) - RUN : print 'hello world'
INFO [2015-03-30 12:24:19,444] ({Thread-24} RemoteScheduler.java[getStatus]:185) - getStatus from remote RUNNING
INFO [2015-03-30 12:24:19,444] ({Thread-24} NotebookServer.java[broadcast]:205) - SEND >> NOTE
INFO [2015-03-30 12:24:19,446] ({Thread-25} NotebookServer.java[broadcast]:205) - SEND >> PROGRESS
INFO [2015-03-30 12:24:19,955] ({Thread-25} NotebookServer.java[broadcast]:205) - SEND >> PROGRESS
… ad infinitum

Nothing interesting in the spark interpreter logs. Any help appreciated.

Thanks!
Ram
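Moon's point earlier in the thread — that pyspark only works when SPARK_HOME points at a full Spark distribution, because the Python modules pyspark needs live under its python directory — can be turned into a quick sanity check. A minimal sketch, assuming that layout (the function name and error message are illustrative, not Zeppelin's actual code):

```python
import os

def pyspark_module_paths(spark_home):
    """Return the path entries pyspark needs, or raise if they are missing.

    Sketch only: checks for the python/ directory and the bundled py4j
    zip that a Spark binary distribution ships with.
    """
    python_dir = os.path.join(spark_home, "python")
    if not os.path.isdir(python_dir):
        raise RuntimeError(
            "%s has no python/ directory -- a bare 'mvn install' build "
            "tree is not enough; download a Spark binary distribution "
            "and point SPARK_HOME at it." % spark_home)
    paths = [python_dir]
    lib_dir = os.path.join(python_dir, "lib")
    if os.path.isdir(lib_dir):
        for name in sorted(os.listdir(lib_dir)):
            # The distribution bundles py4j as a zip, e.g. py4j-0.8.2.1-src.zip.
            if name.startswith("py4j") and name.endswith(".zip"):
                paths.append(os.path.join(lib_dir, name))
    return paths
```

Running this against the blank spark.home from Ram's setup would fail immediately, which is consistent with the note hanging in RUNNING rather than producing output.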
