Clemens Valiente created ZEPPELIN-3698:
------------------------------------------
Summary: Zeppelin stops working after a few days of uptime with
RemoteEndpoint unavailable, outgoing connection not open
Key: ZEPPELIN-3698
URL: https://issues.apache.org/jira/browse/ZEPPELIN-3698
Project: Zeppelin
Issue Type: Bug
Reporter: Clemens Valiente
I am running a Zeppelin 0.9.0-SNAPSHOT instance. It works fine most of the
time, but after it has been running for a while we eventually encounter issues:
the notebooks suddenly show up empty (they do not load), and the log file
(apache-zeppelin-dev.log) shows the following error:

INFO [2018-07-13 09:57:38,484] ({qtp2091156596-404} NotebookServer.java[sendNote]:865) - New operation from 172.21.95.21 : 62462 : johndoe : GET_NOTE : 2DJK3XG1R
ERROR [2018-07-13 09:57:38,484] ({qtp2091156596-404} NotebookServer.java[onMessage]:395) - Can't handle message: {"op":"GET_NOTE","data":{"id":"2DJK3XG1R"},"principal":"johndoe","ticket":"52ceedde-5654-49b7-88c1-bcd95ec9ae8c","roles":"[]"}
org.eclipse.jetty.websocket.api.WebSocketException: RemoteEndpoint unavailable, outgoing connection not open
    at org.eclipse.jetty.websocket.common.WebSocketSession.getRemote(WebSocketSession.java:252)
    at org.apache.zeppelin.socket.NotebookSocket.send(NotebookSocket.java:70)
    at org.apache.zeppelin.socket.NotebookServer.broadcast(NotebookServer.java:553)
    at org.apache.zeppelin.socket.NotebookServer.checkCollaborativeStatus(NotebookServer.java:496)
    at org.apache.zeppelin.socket.NotebookServer.removeConnectionFromNote(NotebookServer.java:457)
    at org.apache.zeppelin.socket.NotebookServer.removeConnectionFromAllNote(NotebookServer.java:471)
    at org.apache.zeppelin.socket.NotebookServer.addConnectionToNote(NotebookServer.java:437)
    at org.apache.zeppelin.socket.NotebookServer.sendNote(NotebookServer.java:884)
    at org.apache.zeppelin.socket.NotebookServer.onMessage(NotebookServer.java:249)
    at org.apache.zeppelin.socket.NotebookSocket.onWebSocketText(NotebookSocket.java:58)
    at org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onTextMessage(JettyListenerEventDriver.java:128)
    at org.eclipse.jetty.websocket.common.message.SimpleTextMessage.messageComplete(SimpleTextMessage.java:69)
    at org.eclipse.jetty.websocket.common.events.AbstractEventDriver.appendMessage(AbstractEventDriver.java:65)
    at org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onTextFrame(JettyListenerEventDriver.java:122)
    at org.eclipse.jetty.websocket.common.events.AbstractEventDriver.incomingFrame(AbstractEventDriver.java:161)
    at org.eclipse.jetty.websocket.common.WebSocketSession.incomingFrame(WebSocketSession.java:309)
    at org.eclipse.jetty.websocket.common.extensions.ExtensionStack.incomingFrame(ExtensionStack.java:214)
    at org.eclipse.jetty.websocket.common.Parser.notifyFrame(Parser.java:220)
    at org.eclipse.jetty.websocket.common.Parser.parse(Parser.java:258)
    at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.readParse(AbstractWebSocketConnection.java:632)
    at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onFillable(AbstractWebSocketConnection.java:480)
    at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:544)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
    at java.lang.Thread.run(Thread.java:748)

After restarting the server with
zeppelin-daemon.sh restart, everything works again (for a while). This happens
for all users on different systems, including after logging out and back in, so
it doesn't look like a client-related problem. The little websocket indicator
next to the username in the web UI shows up green, so I don't know which
websocket this error is referring to. I didn't find this error in any existing
task or reported issue. Did anyone else encounter this problem? Any hints on
how I can debug this further?
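
One possible reading of the stack trace above, offered as a guess rather than a confirmed diagnosis: the GET_NOTE request fails not on the requesting client's own socket but while the server broadcasts to the other connections registered for the note (sendNote -> addConnectionToNote -> removeConnectionFromAllNote -> removeConnectionFromNote -> checkCollaborativeStatus -> broadcast -> NotebookSocket.send). If a stale, already-closed session is still in that list, a single failing send would abort the whole request. The sketch below reproduces that failure mode and shows a defensive broadcast. Conn, FakeConn, BroadcastSketch, naiveBroadcast and safeBroadcast are hypothetical stand-ins invented for illustration; this is not Zeppelin's or Jetty's actual code.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a websocket connection; NOT Zeppelin's NotebookSocket.
interface Conn {
    boolean isOpen();
    void send(String msg);
}

// Fake connection that records messages and throws once closed,
// mimicking "RemoteEndpoint unavailable, outgoing connection not open".
class FakeConn implements Conn {
    final List<String> received = new ArrayList<>();
    boolean open = true;

    public boolean isOpen() {
        return open;
    }

    public void send(String msg) {
        if (!open) {
            throw new IllegalStateException(
                "RemoteEndpoint unavailable, outgoing connection not open");
        }
        received.add(msg);
    }
}

class BroadcastSketch {
    // Naive broadcast: the first closed connection aborts the whole loop.
    static void naiveBroadcast(List<Conn> conns, String msg) {
        for (Conn c : conns) {
            c.send(msg);
        }
    }

    // Defensive broadcast: drops closed connections and keeps going when a
    // send fails. Returns the number of connections removed from the list.
    static int safeBroadcast(List<Conn> conns, String msg) {
        int removed = 0;
        Iterator<Conn> it = conns.iterator();
        while (it.hasNext()) {
            Conn c = it.next();
            if (!c.isOpen()) {
                it.remove();
                removed++;
                continue;
            }
            try {
                c.send(msg);
            } catch (RuntimeException e) {
                // The connection closed between the isOpen() check and the send.
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        FakeConn stale = new FakeConn();
        stale.open = false;
        FakeConn live = new FakeConn();
        List<Conn> conns = new ArrayList<>();
        conns.add(stale);
        conns.add(live);

        try {
            naiveBroadcast(conns, "hello");
        } catch (IllegalStateException e) {
            System.out.println("naive broadcast failed: " + e.getMessage());
        }

        int removed = safeBroadcast(conns, "hello");
        System.out.println("removed=" + removed + " delivered=" + live.received.size());
    }
}
```

In this toy setup the naive loop throws on the stale connection before the live one receives anything, while the defensive loop removes the stale entry and still delivers to the live one. Whether Zeppelin's connection list can actually hold closed sessions after a few days of uptime would still need to be confirmed against the server code and logs.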
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)