In my case, the environment I am using could be the reason behind this strange behavior.
My /home/ directory lives on a different machine (call it X) and is shared across several nodes (the Y nodes). I run Zeppelin on one of the Y nodes and installed it in that node's home directory, which is actually on X. So I had to copy the assembly jar into Zeppelin's own interpreter directory. X uses the "modules" environment to load tools, and a loaded module is only active for a given user session, so even though X has the Spark binaries there is no way to load them for the interpreter unless I make some changes in zeppelin-daemon.sh (a rough sketch of both workarounds follows the quoted messages below).

From: moon soo Lee [mailto:[email protected]]
Sent: Tuesday, May 05, 2015 9:30 PM
To: [email protected]
Subject: Re: Scheduler already terminated error

Cool. But it's a little bit strange; usually it works without copying the assembly.

On Tue, May 5, 2015 at 3:43 PM Sambit Tripathy (RBEI/EDS1) <[email protected]> wrote:

Finally I can make use of Zeppelin to run Spark commands. I copied the spark-assembly jar into the interpreter/spark directory.

From: Sambit Tripathy (RBEI/EDS1) [mailto:[email protected]]
Sent: Monday, May 04, 2015 9:51 PM
To: [email protected]
Subject: RE: Scheduler already terminated error

This is the error:

ERROR [2015-05-04 21:19:17,292] ({pool-1-thread-4} ProcessFunction.java[process]:41) - Internal error processing getProgress
java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.deploy.SparkHadoopUtil$
        at org.apache.spark.util.Utils$.getSparkOrYarnConfig(Utils.scala:1873)
        at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:105)
        at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:180)
        at org.apache.spark.SparkEnv$.create(SparkEnv.scala:308)
        at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:159)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:240)
        at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:272)
        at org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:145)
        at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:394)
        at org.apache.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:73)
        at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:68)
        at org.apache.zeppelin.interpreter.LazyOpenInterpreter.getProgress(LazyOpenInterpreter.java:109)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.getProgress(RemoteInterpreterServer.java:299)
        at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$getProgress.getResult(RemoteInterpreterService.java:938)
        at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$getProgress.getResult(RemoteInterpreterService.java:923)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)

From: moon soo Lee [mailto:[email protected]]
Sent: Monday, May 04, 2015 9:19 PM
To: [email protected]
Subject: Re: Scheduler already terminated error

Do you have any error message in your ZEPPELIN_HOME/logs/zeppelin-interpreter-spark*.log file?
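A minimal sketch of the two workarounds described at the top of this thread: loading the Spark module before Zeppelin starts, and copying the assembly jar next to the Spark interpreter. The module name and the path to the modules init script are assumptions; the jar path and interpreter directory are taken from the configuration quoted later in the thread.

    # In conf/zeppelin-env.sh (or near the top of bin/zeppelin-daemon.sh):
    # make the environment-modules command available and load Spark for the
    # Zeppelin process. Init-script path and module name are assumptions.
    source /etc/profile.d/modules.sh
    module load spark/1.2.1

    # Alternatively (or additionally), copy the Spark assembly jar into the
    # Spark interpreter directory so the interpreter JVM can find the Spark
    # and Hadoop classes even when no module is loaded:
    cp /usr/lib/spark/lib/spark-assembly-1.2.0-cdh5.3.0-hadoop2.5.0-cdh5.3.0.jar \
       /home/sambit/incubator-zeppelin/interpreter/spark/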
On Mon, May 4, 2015 at 9:58 PM Sambit Tripathy (RBEI/EDS1) <[email protected]> wrote:

Thanks Moon. I do not get this error anymore. However, when I run a command using the %spark interpreter, I do not get any response back. When I checked the log files, I saw the following exception. Does this mean the interpreter is not working correctly?

%spark
val count = ctx.sql("select v1.value from versionOne v1").count()

INFO [2015-05-04 13:50:19,511] ({Thread-63} NotebookServer.java[broadcast]:251) - SEND >> NOTE
ERROR [2015-05-04 13:50:21,775] ({pool-1-thread-7} Job.java[run]:183) - Job failed
org.apache.zeppelin.interpreter.InterpreterException: org.apache.thrift.TApplicationException: Internal error processing interpret
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.interpret(RemoteInterpreter.java:221)
        at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
        at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:212)
        at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
        at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:296)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)
Caused by: org.apache.thrift.TApplicationException: Internal error processing interpret
        at org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
        at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_interpret(RemoteInterpreterService.java:190)
        at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.interpret(RemoteInterpreterService.java:175)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.interpret(RemoteInterpreter.java:204)
        ... 12 more
INFO [2015-05-04 13:50:21,776] ({Thread-63} NotebookServer.java[afterStatusChange]:571) - Job 20150504-132152_2065400849 is finished
INFO [2015-05-04 13:50:21,784] ({Thread-63} NotebookServer.java[broadcast]:251) - SEND >> NOTE
INFO [2015-05-04 13:50:21,785] ({pool-1-thread-7} SchedulerFactory.java[jobFinished]:138) - Job paragraph_1430770912781_-924929327 finished by scheduler remoteinterpreter_1530616708
ERROR [2015-05-04 13:50:27,004] ({Thread-64} JobProgressPoller.java[run]:57) - Can not get or update progress
org.apache.zeppelin.interpreter.InterpreterException: org.apache.thrift.TApplicationException: Internal error processing getProgress
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getProgress(RemoteInterpreter.java:286)
        at org.apache.zeppelin.interpreter.LazyOpenInterpreter.getProgress(LazyOpenInterpreter.java:110)
        at org.apache.zeppelin.notebook.Paragraph.progress(Paragraph.java:179)
        at org.apache.zeppelin.scheduler.JobProgressPoller.run(JobProgressPoller.java:54)
Caused by: org.apache.thrift.TApplicationException: Internal error processing getProgress
        at org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
        at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_getProgress(RemoteInterpreterService.java:235)
        at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.getProgress(RemoteInterpreterService.java:221)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getProgress(RemoteInterpreter.java:284)
        ... 3 more

From: moon soo Lee [mailto:[email protected]]
Sent: Friday, May 01, 2015 3:32 PM
To: [email protected]
Subject: Re: Scheduler already terminated error

I think you need the spark-1.2 profile and the hadoop-2.4 profile. Please try:

mvn install -DskipTests -Pspark-1.2 -Dspark.version=1.2.1 -Phadoop-2.4 -Dhadoop.version=2.5.0

Thanks,
moon

On Fri, May 1, 2015 at 10:22 AM Sambit Tripathy (RBEI/EDS1) <[email protected]> wrote:

Moon,

This is what I have in my configuration:

export ZEPPELIN_INTERPRETERS=org.apache.zeppelin.spark.SparkInterpreter,org.apache.zeppelin.spark.PySparkInterpreter,org.apache.zeppelin.spark.SparkSqlInterpreter,org.apache.zeppelin.spark.DepInterpreter,org.apache.zeppelin.markdown.Markdown,org.apache.zeppelin.shell.ShellInterpreter,org.apache.zeppelin.hive.HiveInterpreter
export ZEPPELIN_INTERPRETER_DIR=/home/sambit/incubator-zeppelin/interpreter
export ZEPPELIN_PORT=8901
export HADOOP_CONF_DIR=/usr/lib/hadoop/etc/hadoop
export SPARK_YARN_JAR=/usr/lib/spark/lib/spark-assembly-1.2.0-cdh5.3.0-hadoop2.5.0-cdh5.3.0.jar
export ZEPPELIN_NOTEBOOK_DIR=/home/sambit/zep-notebook-dir   # Where notebooks are saved

I used this command to build Zeppelin, as provided on the website:

mvn install -DskipTests -Dspark.version=1.2.1 -Dhadoop.version=2.5.0

That's all. Should -Dhadoop.version change to 2.5.0-cdh5.3.0?

Regards,
Sambit.

From: moon soo Lee [mailto:[email protected]]
Sent: Thursday, April 30, 2015 5:25 PM
To: [email protected]
Subject: Re: Scheduler already terminated error

Hi,

That error message can appear when Zeppelin fails to create the SparkContext. Could you check the Zeppelin configuration for your YARN cluster? How did you set up Zeppelin for your YARN cluster, i.e. the Zeppelin build command against your Spark / Hadoop version, the Zeppelin interpreter settings, and the hadoop/yarn configuration files?
Thanks,
moon

On Fri, May 1, 2015 at 8:02 AM Sambit Tripathy (RBEI/EDS1) <[email protected]> wrote:

Hi,

After installation, I tried to run this simple Spark command and got this error. Any idea what it could be?

Command:
%spark
val ctx = new org.apache.spark.sql.SqlContext(sc)

Error:
Scheduler already terminated
org.apache.zeppelin.scheduler.RemoteScheduler.submit(RemoteScheduler.java:122)
org.apache.zeppelin.notebook.Note.run(Note.java:271)
org.apache.zeppelin.socket.NotebookServer.runParagraph(NotebookServer.java:531)
org.apache.zeppelin.socket.NotebookServer.onMessage(NotebookServer.java:119)
org.java_websocket.server.WebSocketServer.onWebsocketMessage(WebSocketServer.java:469)
org.java_websocket.WebSocketImpl.decodeFrames(WebSocketImpl.java:368)
org.java_websocket.WebSocketImpl.decode(WebSocketImpl.java:157)
org.java_websocket.server.WebSocketServer$WebSocketWorker.run(WebSocketServer.java:657)
ERROR

What is the best way to verify that the Spark interpreter is working correctly? Is this a YARN error?

PS: I am using YARN.

Regards,
Sambit.
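On the verification question above, a rough sketch (not from the thread itself): run a trivial %spark paragraph such as sc.parallelize(1 to 100).count() and watch the interpreter log file mentioned earlier in the thread; if the interpreter is healthy, the SparkContext is created without exceptions and, on YARN, a new application appears.

    # Watch the Spark interpreter log while running a trivial %spark paragraph
    tail -f $ZEPPELIN_HOME/logs/zeppelin-interpreter-spark*.log

    # On YARN, the notebook's SparkContext should show up as a running application
    yarn application -list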
