Vladimir Prus created ZEPPELIN-5292:
---------------------------------------

             Summary: Deadlock in ConnectionManager
                 Key: ZEPPELIN-5292
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-5292
             Project: Zeppelin
          Issue Type: Bug
    Affects Versions: 0.9.0
            Reporter: Vladimir Prus
         Attachments: stacktrace-2021-03-18.txt

Our 0.9.0 install fairly regularly becomes unresponsive. Specifically, if I 
open the home page, I see the navigation bar, but nothing else shows up. The 
problem does not resolve itself, and there's no CPU usage whatsoever.

I have attached a stack trace from one such incident, in which nearly all 
threads are blocked inside ConnectionManager, like so:
{code:java}
"qtp733672688-15179" #15179 prio=5 os_prio=0 tid=0x00007fc1f0002000 nid=0x14103 waiting for monitor entry [0x00007fc1d48c7000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.zeppelin.socket.ConnectionManager.removeConnectionFromAllNote(ConnectionManager.java:175)
        - waiting to lock <0x00007fc5dbb0c5d8> (a java.util.HashMap)
{code}
 

and 
{code:java}
"qtp733672688-15068" #15068 prio=5 os_prio=0 tid=0x00007fc358001000 nid=0x14069 waiting for monitor entry [0x00007fc15aae9000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.zeppelin.socket.ConnectionManager.addNoteConnection(ConnectionManager.java:108)
        - waiting to lock <0x00007fc5dbb0c5d8> (a java.util.HashMap)
{code}
The lock is held here: 
{code:java}
"qtp733672688-10896" #10896 prio=5 os_prio=0 tid=0x00007fc2f4007800 nid=0x12661 waiting for monitor entry [0x00007fc395267000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.zeppelin.socket.NotebookSocket.send(NotebookSocket.java:70)
        - waiting to lock <0x00007fc5dbe1b050> (a org.apache.zeppelin.socket.NotebookSocket)
        at org.apache.zeppelin.socket.ConnectionManager.broadcast(ConnectionManager.java:247)
        at org.apache.zeppelin.socket.ConnectionManager.checkCollaborativeStatus(ConnectionManager.java:214)
        at org.apache.zeppelin.socket.ConnectionManager.removeConnectionFromNote(ConnectionManager.java:190)
        - locked <0x00007fc5dbb0c5d8> (a java.util.HashMap)
        at org.apache.zeppelin.socket.ConnectionManager.removeConnectionFromAllNote(ConnectionManager.java:178)
        - locked <0x00007fc5dbb0c5d8> (a java.util.HashMap)
{code}
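The thread above holds the HashMap lock and is itself blocked in 
NotebookSocket.send, waiting for the NotebookSocket monitor. Probably a send 
on that socket is taking a long time in some other thread, while the HashMap 
lock blocks basically all other connections?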
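
To illustrate the suspected pattern, here is a minimal sketch. It is not 
Zeppelin's actual code: Sender, ConnectionRegistrySketch and all method names 
below are hypothetical stand-ins. It only assumes what the trace suggests, 
namely that a shared HashMap monitor is held while send() is called on each 
socket. The second broadcast method shows one possible way to avoid that, by 
copying the socket list under the lock and sending after the lock is released.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a websocket connection (not Zeppelin's NotebookSocket).
interface Sender {
  void send(String msg);
}

// Hypothetical, simplified connection registry; names are illustrative only.
class ConnectionRegistrySketch {
  private final Map<String, List<Sender>> noteSockets = new HashMap<>();

  void add(String noteId, Sender s) {
    synchronized (noteSockets) {
      noteSockets.computeIfAbsent(noteId, k -> new ArrayList<>()).add(s);
    }
  }

  // Pattern suggested by the trace: the map's monitor is held during send(),
  // so one slow or stuck send blocks every other thread that needs the map.
  void broadcastHoldingLock(String noteId, String msg) {
    synchronized (noteSockets) {
      for (Sender s : noteSockets.getOrDefault(noteId, Collections.emptyList())) {
        s.send(msg);
      }
    }
  }

  // Possible alternative: snapshot the list under the lock, then send
  // without holding it, so a slow send no longer blocks add/remove calls.
  void broadcastOutsideLock(String noteId, String msg) {
    List<Sender> snapshot;
    synchronized (noteSockets) {
      snapshot = new ArrayList<>(
          noteSockets.getOrDefault(noteId, Collections.emptyList()));
    }
    for (Sender s : snapshot) {
      s.send(msg);
    }
  }

  // Helper for experiments: does the calling thread hold the map's monitor?
  boolean lockHeldByCurrentThread() {
    return Thread.holdsLock(noteSockets);
  }
}
```

A Sender that records lockHeldByCurrentThread() inside send() observes the 
monitor held in the first variant and released in the second. The trade-off 
of the snapshot approach is that a message may still be sent to a socket that 
was removed concurrently, so whether it is acceptable here would need checking 
against the real ConnectionManager.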

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
