Ricardo Merizalde created SOLR-5215:
---------------------------------------
Summary: Deadlock in Solr Cloud ConnectionManager
Key: SOLR-5215
URL: https://issues.apache.org/jira/browse/SOLR-5215
Project: Solr
Issue Type: Bug
Components: clients - java
Affects Versions: 4.2.1
Environment: Linux 2.6.18-164.el5 #1 SMP Tue Aug 18 15:51:48 EDT 2009
x86_64 x86_64 x86_64 GNU/Linux
java version "1.6.0_18"
Java(TM) SE Runtime Environment (build 1.6.0_18-b07)
Java HotSpot(TM) 64-Bit Server VM (build 16.0-b13, mixed mode)
Reporter: Ricardo Merizalde
We are constantly seeing a deadlock in our production application servers.
The problem seems to be the following. Thread A:
- tries to process an event and acquires the ConnectionManager lock
- the update callback acquires connectionUpdateLock and invokes waitForConnected
- waitForConnected acquires the ConnectionManager lock (which thread A already
holds, so this is a reentrant acquisition)
- waitForConnected calls wait, releasing the ConnectionManager lock (but thread A
still holds connectionUpdateLock)
Thread B:
- tries to process an event and acquires the ConnectionManager lock
- the update callback tries to acquire connectionUpdateLock but blocks while
still holding the ConnectionManager lock, which prevents thread A from
reacquiring that lock and leaving the wait state.
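The locking pattern described above can be reproduced in isolation. The following is a minimal sketch, not the actual ConnectionManager code: the class NestedMonitorLockoutDemo and its methods processExpired, processConnected and lockoutReproduced are hypothetical stand-ins, and only the lock structure (an outer synchronized method, an inner connectionUpdateLock, and a wait on the outer monitor) is taken from the description. A third thread C is an added assumption: it plays the role of the event that would set the connected flag and wake the waiter.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for the locking structure described in this report.
class NestedMonitorLockoutDemo {
    private final Object connectionUpdateLock = new Object();
    private final CountDownLatch enteredProcess = new CountDownLatch(2);
    private boolean connected = false;

    // Stand-in for process() on an expiry event: holds the monitor of 'this'.
    synchronized void processExpired() throws InterruptedException {
        enteredProcess.countDown();
        synchronized (connectionUpdateLock) {   // the update callback
            waitForConnected();
        }
    }

    // wait() releases only the monitor of 'this'; connectionUpdateLock stays held.
    synchronized void waitForConnected() throws InterruptedException {
        while (!connected) {
            wait(50);   // must reacquire the monitor of 'this' before returning
        }
    }

    // Stand-in for process() on a connected event, which would wake the waiter.
    synchronized void processConnected() {
        connected = true;
        notifyAll();
    }

    // Returns true if all three threads are still stuck after timeoutMillis.
    static boolean lockoutReproduced(long timeoutMillis) {
        final NestedMonitorLockoutDemo manager = new NestedMonitorLockoutDemo();
        Runnable expired = () -> {
            try {
                manager.processExpired();
            } catch (InterruptedException ignored) {
                // not expected in this sketch
            }
        };
        Thread a = new Thread(expired, "A");
        Thread b = new Thread(expired, "B");
        Thread c = new Thread(manager::processConnected, "C");
        for (Thread t : new Thread[] {a, b, c}) {
            t.setDaemon(true);   // let the JVM exit although the threads never finish
        }
        try {
            a.start();
            b.start();
            // Once both have entered processExpired, one thread is waiting while
            // holding connectionUpdateLock and the other holds the monitor of
            // 'this' while blocked on connectionUpdateLock.
            manager.enteredProcess.await();
            c.start();
            c.join(timeoutMillis);
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return a.isAlive() && b.isAlive() && c.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("lockout reproduced: " + lockoutReproduced(300));
    }
}
```

In this sketch the thread that enters first ends up like thread A, the second like thread B, and C corresponds to an event thread that cannot enter process at all. The timed wait does not help the waiter, because returning from wait requires reacquiring the monitor that the second thread never releases.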
Here is part of the thread dump:
"http-0.0.0.0-8080-82-EventThread" daemon prio=10 tid=0x0000000059965800 nid=0x3e81 waiting for monitor entry [0x0000000057169000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:71)
    - waiting to lock <0x00002aab1b0e0ce0> (a org.apache.solr.common.cloud.ConnectionManager)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)

"http-0.0.0.0-8080-82-EventThread" daemon prio=10 tid=0x000000005ad40000 nid=0x3e67 waiting for monitor entry [0x000000004dbd4000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.solr.common.cloud.ConnectionManager$1.update(ConnectionManager.java:98)
    - waiting to lock <0x00002aab1b0e0f78> (a java.lang.Object)
    at org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:46)
    at org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:91)
    - locked <0x00002aab1b0e0ce0> (a org.apache.solr.common.cloud.ConnectionManager)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)

"http-0.0.0.0-8080-82-EventThread" daemon prio=10 tid=0x00002aac4c2f7000 nid=0x3d9a waiting for monitor entry [0x0000000042821000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00002aab1b0e0ce0> (a org.apache.solr.common.cloud.ConnectionManager)
    at org.apache.solr.common.cloud.ConnectionManager.waitForConnected(ConnectionManager.java:165)
    - locked <0x00002aab1b0e0ce0> (a org.apache.solr.common.cloud.ConnectionManager)
    at org.apache.solr.common.cloud.ConnectionManager$1.update(ConnectionManager.java:98)
    - locked <0x00002aab1b0e0f78> (a java.lang.Object)
    at org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:46)
    at org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:91)
    - locked <0x00002aab1b0e0ce0> (a org.apache.solr.common.cloud.ConnectionManager)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
Found one Java-level deadlock:
=============================
"http-0.0.0.0-8080-82-EventThread":
  waiting to lock monitor 0x000000005c7694b0 (object 0x00002aab1b0e0ce0, a org.apache.solr.common.cloud.ConnectionManager),
  which is held by "http-0.0.0.0-8080-82-EventThread"
"http-0.0.0.0-8080-82-EventThread":
  waiting to lock monitor 0x00002aac4c314978 (object 0x00002aab1b0e0f78, a java.lang.Object),
  which is held by "http-0.0.0.0-8080-82-EventThread"
"http-0.0.0.0-8080-82-EventThread":
  waiting to lock monitor 0x000000005c7694b0 (object 0x00002aab1b0e0ce0, a org.apache.solr.common.cloud.ConnectionManager),
  which is held by "http-0.0.0.0-8080-82-EventThread"
Java stack information for the threads listed above:
===================================================
"http-0.0.0.0-8080-82-EventThread":
    at org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:71)
    - waiting to lock <0x00002aab1b0e0ce0> (a org.apache.solr.common.cloud.ConnectionManager)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
"http-0.0.0.0-8080-82-EventThread":
    at org.apache.solr.common.cloud.ConnectionManager$1.update(ConnectionManager.java:98)
    - waiting to lock <0x00002aab1b0e0f78> (a java.lang.Object)
    at org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:46)
    at org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:91)
    - locked <0x00002aab1b0e0ce0> (a org.apache.solr.common.cloud.ConnectionManager)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
"http-0.0.0.0-8080-82-EventThread":
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00002aab1b0e0ce0> (a org.apache.solr.common.cloud.ConnectionManager)
    at org.apache.solr.common.cloud.ConnectionManager.waitForConnected(ConnectionManager.java:165)
    - locked <0x00002aab1b0e0ce0> (a org.apache.solr.common.cloud.ConnectionManager)
    at org.apache.solr.common.cloud.ConnectionManager$1.update(ConnectionManager.java:98)
    - locked <0x00002aab1b0e0f78> (a java.lang.Object)
    at org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:46)
    at org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:91)
    - locked <0x00002aab1b0e0ce0> (a org.apache.solr.common.cloud.ConnectionManager)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)