Fuseki/TDB stuck: ~150 threads BLOCKED waiting to lock the same object

Enrico Daga (enridaga) Mon, 08 May 2017 03:23:59 -0700

Good morning,

I have an instance of Fuseki 2.3.1 running in production that sometimes hangs,
leaving a large number of connections to Apache (its proxy) in CLOSE_WAIT state.
The connections appear to stay open regardless of the timeout set in the
configuration, so I have to shut down the VM, delete the lock files manually,
and restart.
This has been happening quite often recently (probably because of an increase
in activity).
Looking into it with jstack, I observed ~150 threads in BLOCKED state, all
waiting to lock *the same* object.
Here is one of them:


"qtp2123914473-3049" #3049 prio=5 os_prio=0 tid=0x00007f1ae9909000 nid=0x5016 waiting for monitor entry [0x00007f15d7108000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.jena.tdb.store.nodetable.NodeTableCache._idForNode(NodeTableCache.java:144)
    - waiting to lock <0x00000004c007ffa8> (a java.lang.Object)
    at org.apache.jena.tdb.store.nodetable.NodeTableCache.getNodeIdForNode(NodeTableCache.java:87)
    at org.apache.jena.tdb.store.nodetable.NodeTableWrapper.getNodeIdForNode(NodeTableWrapper.java:48)
    at org.apache.jena.tdb.store.nodetable.NodeTableInline.getNodeIdForNode(NodeTableInline.java:59)
    at org.apache.jena.tdb.transaction.NodeTableTrans.getNodeIdForNode(NodeTableTrans.java:98)
    at org.apache.jena.tdb.store.nodetable.NodeTableWrapper.getNodeIdForNode(NodeTableWrapper.java:48)
    at org.apache.jena.tdb.store.nodetable.NodeTableInline.getNodeIdForNode(NodeTableInline.java:59)
    at org.apache.jena.tdb.transaction.NodeTableTrans.getNodeIdForNode(NodeTableTrans.java:98)
    at org.apache.jena.tdb.store.nodetable.NodeTableWrapper.getNodeIdForNode(NodeTableWrapper.java:48)
    at org.apache.jena.tdb.store.nodetable.NodeTableInline.getNodeIdForNode(NodeTableInline.java:59)
    at org.apache.jena.tdb.transaction.NodeTableTrans.getNodeIdForNode(NodeTableTrans.java:98)
    at org.apache.jena.tdb.store.nodetable.NodeTableWrapper.getNodeIdForNode(NodeTableWrapper.java:48)
    at org.apache.jena.tdb.store.nodetable.NodeTableInline.getNodeIdForNode(NodeTableInline.java:59)
    at org.apache.jena.tdb.transaction.NodeTableTrans.getNodeIdForNode(NodeTableTrans.java:98)
    at org.apache.jena.tdb.store.nodetable.NodeTableWrapper.getNodeIdForNode(NodeTableWrapper.java:48)
    at org.apache.jena.tdb.store.nodetable.NodeTableInline.getNodeIdForNode(NodeTableInline.java:59)
    at org.apache.jena.tdb.transaction.NodeTableTrans.getNodeIdForNode(NodeTableTrans.java:98)
    at org.apache.jena.tdb.store.nodetable.NodeTableWrapper.getNodeIdForNode(NodeTableWrapper.java:48)
    at org.apache.jena.tdb.store.nodetable.NodeTableInline.getNodeIdForNode(NodeTableInline.java:59)
    at org.apache.jena.tdb.transaction.NodeTableTrans.getNodeIdForNode(NodeTableTrans.java:98)
    at org.apache.jena.tdb.store.nodetable.NodeTableWrapper.getNodeIdForNode(NodeTableWrapper.java:48)
    at org.apache.jena.tdb.store.nodetable.NodeTableInline.getNodeIdForNode(NodeTableInline.java:59)
[…]

These three classes (NodeTableWrapper, NodeTableInline and NodeTableTrans)
keep calling each other for many more frames further down the stack.

Looking at the issue tracker, this might be similar to what was reported in [1],
although version 2.3.1 did not seem to be affected (BTW, the issue is marked
FIXED, but from the comments it looks like the problem was never reproduced).
It looks like a deadlock. Has anybody seen this before?
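
One way to tell a real deadlock from heavy contention is to check which thread,
if any, holds the monitor that everyone else is waiting for. Below is a minimal
sketch that groups a saved jstack dump by monitor address. It assumes the dump
has been written to a file with unwrapped lines, and it relies only on the
"- waiting to lock <addr>" and "- locked <addr>" lines that jstack prints. The
class name LockWaiters and its method names are made up for illustration; they
are not part of Jena or the JDK.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: summarises monitor contention in a jstack dump.
public class LockWaiters {
    // A thread header in a jstack dump starts with the quoted thread name.
    private static final Pattern THREAD = Pattern.compile("^\"([^\"]+)\"");
    private static final Pattern WAITING =
            Pattern.compile("- waiting to lock <(0x[0-9a-fA-F]+)>");
    private static final Pattern LOCKED =
            Pattern.compile("- locked <(0x[0-9a-fA-F]+)>");

    // Groups thread names by the monitor address captured by the given pattern.
    private static Map<String, Set<String>> group(List<String> lines, Pattern p) {
        Map<String, Set<String>> byMonitor = new LinkedHashMap<>();
        String current = null; // name of the thread whose stack is being read
        for (String line : lines) {
            Matcher header = THREAD.matcher(line);
            if (header.find()) {
                current = header.group(1);
                continue;
            }
            Matcher m = p.matcher(line);
            if (current != null && m.find()) {
                byMonitor.computeIfAbsent(m.group(1), k -> new LinkedHashSet<>())
                         .add(current);
            }
        }
        return byMonitor;
    }

    // Monitor address -> threads blocked trying to enter it.
    public static Map<String, Set<String>> waiters(List<String> lines) {
        return group(lines, WAITING);
    }

    // Monitor address -> threads that currently hold it.
    public static Map<String, Set<String>> holders(List<String> lines) {
        return group(lines, LOCKED);
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the saved dump (an assumption of this sketch).
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        Map<String, Set<String>> held = holders(lines);
        waiters(lines).forEach((monitor, threads) ->
                System.out.println(monitor + ": " + threads.size()
                        + " waiting, held by "
                        + held.getOrDefault(monitor, Collections.<String>emptySet())));
    }
}
```

If the monitor has a holder, the stack of that thread shows what it is doing
while the others wait. If no thread in the dump reports holding it, that would
need a closer look, since BLOCKED threads normally imply an owner.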

Thank you for any feedback,

Enrico


[1] https://issues.apache.org/jira/browse/JENA-1296


—
Enrico Daga (enridaga)
http://www.enridaga.net
Il budda e’ nel parco.
