For PySpark, you need to download a Spark distribution, and both the spark.home property and the SPARK_HOME environment variable need to point to that directory. The reason you need a Spark distribution is that, to use PySpark, Zeppelin needs some Python modules that the distribution ships under its python directory.
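As a quick sanity check before restarting Zeppelin, something like the following could confirm that SPARK_HOME points at a directory that actually contains the Python modules. This is only a minimal sketch: the helper name check_spark_home is made up for illustration, and it assumes the distribution keeps the pyspark package under python/pyspark, which is the usual layout of a Spark download. It does not check that the Spark version matches the one Zeppelin was built against.

```python
import os


def check_spark_home(spark_home):
    """Return a list of problems with a SPARK_HOME candidate (empty if none found).

    Hypothetical helper, not part of Zeppelin or Spark.
    """
    if not spark_home:
        return ["SPARK_HOME is not set"]
    if not os.path.isdir(spark_home):
        return ["%s is not a directory" % spark_home]

    problems = []
    # Assumption: the pyspark package lives under <SPARK_HOME>/python/pyspark.
    pyspark_dir = os.path.join(spark_home, "python", "pyspark")
    if not os.path.isdir(pyspark_dir):
        problems.append("missing %s" % pyspark_dir)
    return problems


if __name__ == "__main__":
    found = check_spark_home(os.environ.get("SPARK_HOME"))
    if found:
        for problem in found:
            print(problem)
    else:
        print("SPARK_HOME looks usable for PySpark")
```

Run it in the same shell environment that starts Zeppelin, so that it sees the same SPARK_HOME value.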
Thanks,
moon

On Tue, Mar 31, 2015 at 4:01 PM Ram Venkatesh <[email protected]> wrote:

> Hello,
>
> Thank you for your reply.
>
> I have built zeppelin with the following command line: mvn install
> -DskipTests -Pspark-1.2 -Phadoop-2.4
>
> I don’t have a separate Spark distribution package, I am running
> zeppelin on my dev box in local mode for testing.
>
> What should my spark.home (or SPARK_HOME) be set to, currently they are
> blank.
>
> Thanks!
> Ram
>
> On Mar 30, 2015, at 4:56 PM, moon soo Lee <[email protected]> wrote:
>
> Hi,
>
> It can be happen when spark.home property and SPARK_HOME environment
> variable is misconfigured. Or when version of spark distribution package
> that spark.home (and SPARK_HOME) points is different from the version build
> with Zeppelin.
>
> Thanks,
> moon
>
>
> On Tue, Mar 31, 2015 at 4:29 AM Ram Venkatesh <[email protected]>
> wrote:
>
>> Hello,
>>
>> I am having trouble running a pyspark note (zeppelin newbie, could well
>> be pilot error).
>>
>> The note is
>>
>> %pyspark
>>
>> print ‘hello world’
>>
>> The note transitions to “PENDING” and then “RUNNING” but never finishes
>> after that.
>>
>> From the zeppelin server logs:
>> INFO [2015-03-30 12:24:10,512] ({pool-2-thread-2}
>> RemoteInterpreterProcess.java[reference]:74) - Run interpreter process
>> /Users/rvenkatesh/dev/asf/zeppelin/bin/interpreter.sh -d
>> /Users/rvenkatesh/dev/asf/zeppelin/interpreter/spark -p 59569
>> INFO [2015-03-30 12:24:11,570] ({pool-2-thread-2}
>> RemoteInterpreter.java[init]:114) - Create remote interpreter
>> com.nflabs.zeppelin.spark.SparkInterpreter
>> INFO [2015-03-30 12:24:11,623] ({pool-2-thread-2}
>> RemoteInterpreter.java[init]:114) - Create remote interpreter
>> com.nflabs.zeppelin.spark.PySparkInterpreter
>> INFO [2015-03-30 12:24:11,628] ({pool-2-thread-2}
>> RemoteInterpreter.java[init]:114) - Create remote interpreter
>> com.nflabs.zeppelin.spark.SparkSqlInterpreter
>> INFO [2015-03-30 12:24:11,631] ({pool-2-thread-2}
>> RemoteInterpreter.java[init]:114) - Create remote interpreter
>> com.nflabs.zeppelin.spark.DepInterpreter
>> INFO [2015-03-30 12:24:11,635] ({pool-2-thread-2}
>> RemoteInterpreter.java[open]:143) - open remote interpreter
>> com.nflabs.zeppelin.spark.PySparkInterpreter
>> INFO [2015-03-30 12:24:11,682] ({pool-2-thread-2}
>> Paragraph.java[jobRun]:182) - RUN :
>> print 'hello world'
>>
>> INFO [2015-03-30 12:24:19,444] ({Thread-24}
>> RemoteScheduler.java[getStatus]:185)
>> - getStatus from remote RUNNING
>> INFO [2015-03-30 12:24:19,444] ({Thread-24}
>> NotebookServer.java[broadcast]:205) - SEND
>> NOTE
>> INFO [2015-03-30 12:24:19,446] ({Thread-25}
>> NotebookServer.java[broadcast]:205) - SEND
>> PROGRESS
>> INFO [2015-03-30 12:24:19,955] ({Thread-25}
>> NotebookServer.java[broadcast]:205) - SEND
>> PROGRESS
>> … ad infinetum
>>
>> Nothing interesting in the spark interpreter logs.
>>
>> Any help appreciated.
>>
>> Thanks!
>> Ram
>
