Ruslan Dautkhanov created ZEPPELIN-3327:
-------------------------------------------
Summary: NPE when Spark interpreter couldn't start
Key: ZEPPELIN-3327
URL: https://issues.apache.org/jira/browse/ZEPPELIN-3327
Project: Zeppelin
Issue Type: Bug
Affects Versions: 0.8.0, 0.9.0
Reporter: Ruslan Dautkhanov
Attachments: image-2018-03-13-19-16-46-353.png, image-2018-03-13-19-19-59-364.png
When Spark fails to start on the backend, Zeppelin only shows an NPE:
!image-2018-03-13-19-16-46-353.png!
What it should print instead is the true root cause, i.e. the exception as reported by spark-submit.
To reproduce, add an invalid Spark interpreter setting, for example
!image-2018-03-13-19-19-59-364.png!
then restart the Spark interpreter to hit the NPE.
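For illustration only (the actual property used is shown in the attached screenshot), any Spark size setting with an unrecognized suffix triggers this class of startup failure, e.g.:
{noformat}
spark.driver.memory = 4petabytes
{noformat}
This is a hypothetical value chosen to match the Invalid suffix: "petabytes" error quoted below; the exact property in the screenshot may differ.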
This is confusing: users never see the true error because it is obscured by the NPE. Zeppelin should transparently surface the exception exactly as produced by Spark, like in this example:
{noformat}
Caused by: java.lang.NumberFormatException: Size must be specified as bytes (b), kibibytes (k), mebibytes (m), gibibytes (g), tebibytes (t), or pebibytes(p). E.g. 50b, 100k, or 250m.
Invalid suffix: "petabytes"
    at org.apache.spark.network.util.JavaUtils.byteStringAs(JavaUtils.java:291)
    at org.apache.spark.network.util.JavaUtils.byteStringAsBytes(JavaUtils.java:302)
    at org.apache.spark.util.Utils$.byteStringAsBytes(Utils.scala:1087)
    at org.apache.spark.SparkConf.getSizeAsBytes(SparkConf.scala:302)
    at org.apache.spark.memory.UnifiedMemoryManager$.getMaxMemory(UnifiedMemoryManager.scala:223)
    at org.apache.spark.memory.UnifiedMemoryManager$.apply(UnifiedMemoryManager.scala:199)
    at org.apache.spark.SparkEnv$.create(SparkEnv.scala:332)
    at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:175)
    at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:257)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:432)
{noformat}
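A minimal sketch of how the interpreter could surface that root cause instead of an NPE, assuming the SparkContext is created via reflection (class and method names below are hypothetical, not Zeppelin's actual API):
{noformat}
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical sketch, not Zeppelin's actual code: unwrap the reflection
// wrapper so the user-visible error is Spark's own exception (here the
// NumberFormatException above), not a downstream NullPointerException.
public class RootCausePropagation {

  public static Object invokeOrFail(Object target, String methodName) throws Exception {
    try {
      Method m = target.getClass().getMethod(methodName);
      return m.invoke(target);
    } catch (InvocationTargetException e) {
      // e.getCause() carries the real spark-submit error; keep it attached.
      // In Zeppelin this would presumably be rethrown as an InterpreterException.
      Throwable cause = e.getCause();
      throw new RuntimeException(
          "Spark interpreter failed to start: " + cause.getMessage(), cause);
    }
  }
}
{noformat}
Keeping the cause chain intact means the notebook error and the interpreter log both show Spark's own message.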
Note that I had to dig deep into the logs to find the root cause, and not every user can do that.
Full exception from the interpreter log:
{noformat}
ERROR [2018-03-13 19:15:26,476] ({pool-2-thread-2} PySparkInterpreter.java[open]:203) - Error
java.lang.NullPointerException
    at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:44)
    at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:39)
    at org.apache.zeppelin.spark.OldSparkInterpreter.createSparkContext_2(OldSparkInterpreter.java:375)
    at org.apache.zeppelin.spark.OldSparkInterpreter.createSparkContext(OldSparkInterpreter.java:364)
    at org.apache.zeppelin.spark.OldSparkInterpreter.getSparkContext(OldSparkInterpreter.java:172)
    at org.apache.zeppelin.spark.OldSparkInterpreter.open(OldSparkInterpreter.java:740)
    at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:61)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
    at org.apache.zeppelin.spark.PySparkInterpreter.getSparkInterpreter(PySparkInterpreter.java:665)
    at org.apache.zeppelin.spark.PySparkInterpreter.createGatewayServerAndStartScript(PySparkInterpreter.java:273)
    at org.apache.zeppelin.spark.PySparkInterpreter.open(PySparkInterpreter.java:201)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:618)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:186)
    at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
ERROR [2018-03-13 19:15:26,476] ({pool-2-thread-2} Job.java[run]:188) - Job failed
org.apache.zeppelin.interpreter.InterpreterException: java.lang.NullPointerException
    at org.apache.zeppelin.spark.PySparkInterpreter.open(PySparkInterpreter.java:204)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:618)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:186)
    at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
    at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:44)
    at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:39)
    at org.apache.zeppelin.spark.OldSparkInterpreter.createSparkContext_2(OldSparkInterpreter.java:375)
    at org.apache.zeppelin.spark.OldSparkInterpreter.createSparkContext(OldSparkInterpreter.java:364)
    at org.apache.zeppelin.spark.OldSparkInterpreter.getSparkContext(OldSparkInterpreter.java:172)
    at org.apache.zeppelin.spark.OldSparkInterpreter.open(OldSparkInterpreter.java:740)
    at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:61)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
    at org.apache.zeppelin.spark.PySparkInterpreter.getSparkInterpreter(PySparkInterpreter.java:665)
    at org.apache.zeppelin.spark.PySparkInterpreter.createGatewayServerAndStartScript(PySparkInterpreter.java:273)
    at org.apache.zeppelin.spark.PySparkInterpreter.open(PySparkInterpreter.java:201)
    ... 11 more
INFO [2018-03-13 19:15:26,487] ({pool-2-thread-2} SchedulerFactory.java[jobFinished]:115) - Job 20180313-115214_1579158632 finished by scheduler interpreter_860134591
{noformat}
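The frame at Utils.invokeMethod(Utils.java:44) is consistent with a reflection helper that catches the underlying exception, logs it, and returns null, so the first use of the "created" SparkContext throws a bare NPE far away from the real error. A hypothetical illustration of that pattern (not the actual Zeppelin source):
{noformat}
import java.lang.reflect.Method;

// Hypothetical illustration of how the root cause gets lost when a helper
// swallows the exception and hands back null.
public class SwallowedCause {

  static Object invokeMethod(Object target, String name) {
    try {
      Method m = target.getClass().getMethod(name);
      return m.invoke(target);
    } catch (Exception e) {
      e.printStackTrace();  // root cause only ends up in the log ...
      return null;          // ... and the caller gets null back.
    }
  }

  public static void main(String[] args) {
    Object sparkContext = invokeMethod(new Object(), "someFactoryMethod");
    // NullPointerException here: the original error was already swallowed above.
    System.out.println(sparkContext.toString());
  }
}
{noformat}
Rethrowing or wrapping the caught exception instead of returning null, as in the earlier sketch, would let the notebook error match the spark-submit output.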