Hi guys, and great work! Our company has decided to rely on Zeppelin as our tool for accessing and visualizing our data. I'm working with Spark 1.3.1 on top of Hadoop YARN 2.4.
Lately we have started to see many NullPointerExceptions in the Spark interpreter log (see below). Sometimes this does not affect Zeppelin, but sometimes we start to see a "Can't get status information" error caused by a broken pipe, this time in the *zeppelin* log file (see below). Once the latter error starts, the Spark interpreter gets stuck and we are forced to restart Zeppelin. I couldn't find out when or why it happens; it just happens once in a while... Any help will be more than appreciated! Thanks!

*The NPE in the zeppelin-interpreter-spark log file:*

WARN [2015-06-22 09:22:37,265] ({qtp622396559-74} ServletHandler.java[doHandle]:561) - /jobs/
java.lang.NullPointerException
    at org.spark-project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1467)
    at org.spark-project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:499)
    at org.spark-project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
    at org.spark-project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:428)
    at org.spark-project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
    at org.spark-project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
    at org.spark-project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
    at org.spark-project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
    at org.spark-project.jetty.server.Server.handle(Server.java:370)
    at org.spark-project.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
    at org.spark-project.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:971)
    at org.spark-project.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1033)
    at org.spark-project.jetty.http.HttpParser.parseNext(HttpParser.java:644)
    at org.spark-project.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
    at org.spark-project.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
    at org.spark-project.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:667)
    at org.spark-project.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
    at org.spark-project.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
    at org.spark-project.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
    at java.lang.Thread.run(Thread.java:745)

*The broken pipe in the zeppelin log file:*

ERROR [2015-06-21 13:10:31,368] ({Thread-120} RemoteScheduler.java[getStatus]:245) - Can't get status information
org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe
    at org.apache.thrift.transport.TIOStreamTransport.flush(TIOStreamTransport.java:161)
    at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:65)
    at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.send_getStatus(RemoteInterpreterService.java:319)
    at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.getStatus(RemoteInterpreterService.java:311)
    at org.apache.zeppelin.scheduler.RemoteScheduler$JobStatusPoller.getStatus(RemoteScheduler.java:232)
    at org.apache.zeppelin.scheduler.RemoteScheduler$JobStatusPoller.run(RemoteScheduler.java:183)
Caused by: java.net.SocketException: Broken pipe
    at java.net.SocketOutputStream.socketWrite0(Native Method)
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:159)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
    at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
    at org.apache.thrift.transport.TIOStreamTransport.flush(TIOStreamTransport.java:159)
    ... 5 more