Christopher L. Shannon created AMQ-5712:
-------------------------------------------
Summary: Broker can deadlock for queues while producers wait on
disk space
Key: AMQ-5712
URL: https://issues.apache.org/jira/browse/AMQ-5712
Project: ActiveMQ
Issue Type: Bug
Components: Broker
Affects Versions: 5.11.1
Reporter: Christopher L. Shannon
I am experiencing a deadlock when using a Queue with non-persistent messages.
The queue has a cursor memory high water mark set (currently 70%). When a
producer sends messages to the queue quickly and that limit is hit, the broker
can deadlock. I have tried setting producerWindowSize and alwaysSyncSend,
neither of which seemed to help. Once the broker hits that limit, I am unable
to do things like purge the queue. Consumers can deadlock as well.
Note that this appears to be the same issue as described in AMQ-2475. The
difference is that I am using a Queue and not a Topic, and the fix for that
ticket appears to have been applied only to Topics.
The problem appears to be in the Queue class on line 1852 inside the
{{cursorAdd}} method. The method being called is {{return
messages.addMessageLast(msg);}}, which will block indefinitely if there is no
space available. While it blocks, {{messagesLock}} stays held and no other
thread can acquire it. We have seen a deadlock where consumers can't consume
because they are waiting on this lock. It looks like part of the fix in
AMQ-2475 was to replace {{messages.addMessageLast(msg)}} with
{{messages.tryAddMessageLast(msg, 10)}}. I also noticed that not all of the
message cursors support {{tryAddMessageLast}}, which could be a problem.
{{FilePendingMessageCursor}} implements it, but the rest of the cursors (notably
{{StoreQueueCursor}}) simply delegate back to {{addMessageLast}} in the parent
class. So part of this fix may require implementing {{tryAddMessageLast}}
across more cursors.
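To illustrate the difference between the two calls, here is a minimal, self-contained sketch. It is a toy model, not ActiveMQ code: {{BoundedCursor}}, {{ToyQueue}} and {{purgeWhileProducerWaits}} are hypothetical names, a one-slot capacity stands in for the memory limit, and the fair lock mode is an assumption made so the demo is deterministic. The blocking add keeps the lock held while it waits, so a purge should time out; the timed add releases the lock between retries, so the purge should get through.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Toy model only: none of these classes are ActiveMQ classes.
final class CursorLockSketch {
    /** Hypothetical cursor; capacity stands in for the memory limit. */
    static final class BoundedCursor {
        private final ReentrantLock lock = new ReentrantLock();
        private final Condition spaceAvailable = lock.newCondition();
        private final Deque<String> pending = new ArrayDeque<>();
        private final int capacity;

        BoundedCursor(int capacity) { this.capacity = capacity; }

        /** Blocks with no deadline until space is available. */
        void addMessageLast(String msg) throws InterruptedException {
            lock.lock();
            try {
                while (pending.size() >= capacity) spaceAvailable.await();
                pending.addLast(msg);
            } finally { lock.unlock(); }
        }

        /** Waits at most maxWaitMillis; returns false if the cursor is still full. */
        boolean tryAddMessageLast(String msg, long maxWaitMillis) throws InterruptedException {
            long nanos = TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
            lock.lock();
            try {
                while (pending.size() >= capacity) {
                    if (nanos <= 0L) return false;
                    nanos = spaceAvailable.awaitNanos(nanos);
                }
                pending.addLast(msg);
                return true;
            } finally { lock.unlock(); }
        }

        void clear() {
            lock.lock();
            try { pending.clear(); spaceAvailable.signalAll(); }
            finally { lock.unlock(); }
        }
    }

    /** Hypothetical queue that guards its cursor with a messagesLock. */
    static final class ToyQueue {
        // Fair mode is an assumption of this sketch so the purge thread is not barged.
        private final ReentrantReadWriteLock messagesLock = new ReentrantReadWriteLock(true);
        private final BoundedCursor messages;

        ToyQueue(int capacity) { messages = new BoundedCursor(capacity); }

        void cursorAdd(String msg, boolean timedAdd) throws InterruptedException {
            boolean added = false;
            while (!added) {
                messagesLock.writeLock().lock();
                try {
                    if (timedAdd) {
                        added = messages.tryAddMessageLast(msg, 10);
                    } else {
                        messages.addMessageLast(msg); // may wait forever while holding the lock
                        added = true;
                    }
                } finally { messagesLock.writeLock().unlock(); } // released between retries
            }
        }

        boolean purge(long timeoutMillis) throws InterruptedException {
            if (!messagesLock.writeLock().tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) return false;
            try { messages.clear(); return true; }
            finally { messagesLock.writeLock().unlock(); }
        }
    }

    /** Fills a one-slot queue, parks a producer on it, then reports whether purge got the lock. */
    static boolean purgeWhileProducerWaits(boolean timedAdd) {
        try {
            ToyQueue queue = new ToyQueue(1);
            queue.cursorAdd("first", timedAdd);
            Thread producer = new Thread(() -> {
                try { queue.cursorAdd("second", timedAdd); }
                catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            });
            producer.setDaemon(true);
            producer.start();
            Thread.sleep(200); // give the producer time to reach the full cursor
            boolean purged = queue.purge(500);
            producer.interrupt(); // unblock the producer if it is still waiting
            producer.join(1000);
            return purged;
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("blocking add, purge succeeded: " + purgeWhileProducerWaits(false));
        System.out.println("timed add, purge succeeded: " + purgeWhileProducerWaits(true));
    }
}
```

The retry loop in {{cursorAdd}} is the important part of the sketch: a timed wait on its own does not help if the caller simply retries without ever releasing the outer lock.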
Here is part of the thread dump showing the stuck producer:
{code}
"ActiveMQ Transport: ssl:///192.168.3.142:38589" daemon prio=10 tid=0x00007fb46c006000 nid=0x3b1a runnable [0x00007fb4b8a0d000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x00000000cfb13cd0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2176)
    at org.apache.activemq.usage.Usage.waitForSpace(Usage.java:103)
    at org.apache.activemq.usage.Usage.waitForSpace(Usage.java:90)
    at org.apache.activemq.usage.Usage.waitForSpace(Usage.java:80)
    at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor.tryAddMessageLast(FilePendingMessageCursor.java:235)
    - locked <0x00000000d2015ee0> (a org.apache.activemq.broker.region.cursors.FilePendingMessageCursor)
    at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor.addMessageLast(FilePendingMessageCursor.java:207)
    - locked <0x00000000d2015ee0> (a org.apache.activemq.broker.region.cursors.FilePendingMessageCursor)
    at org.apache.activemq.broker.region.cursors.StoreQueueCursor.addMessageLast(StoreQueueCursor.java:97)
    - locked <0x00000000d1f20908> (a org.apache.activemq.broker.region.cursors.StoreQueueCursor)
    at mw.activemq.plugins.broker.adapter.PendingMessageCursorSupport.addMessageLast(PendingMessageCursorSupport.java:66)
    at org.apache.activemq.broker.region.Queue.cursorAdd(Queue.java:1852)
    at org.apache.activemq.broker.region.Queue.orderedCursorAdd(Queue.java:926)
    at org.apache.activemq.broker.region.Queue.doMessageSend(Queue.java:902)
    at org.apache.activemq.broker.region.Queue.send(Queue.java:781)
{code}