David Bennion created ARTEMIS-3809:
--------------------------------------
Summary: LargeMessageControllerImpl hangs the message consumer
Key: ARTEMIS-3809
URL: https://issues.apache.org/jira/browse/ARTEMIS-3809
Project: ActiveMQ Artemis
Issue Type: Bug
Components: Broker
Affects Versions: 2.21.0
Environment: OS: Windows Server 2019
JVM: OpenJDK 64-Bit Server VM Temurin-17.0.1+12
Max Memory (-Xmx): 6GB
Allocated to JVM: 4.168GB
Currently in use: 3.398GB (heap 3.391GB, non-heap 0.123GB)
Reporter: David Bennion
I wondered if this might be a recurrence of ARTEMIS-2293, but this happens
on 2.21.0 and I can see the code change in LargeMessageControllerImpl.
Using the default min-large-message-size of 100K.
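For reference, here is roughly how that threshold sits on the core client (the URL and the explicit setter below are illustrative; we just rely on the default):
{code:java}
import org.apache.activemq.artemis.api.core.client.ActiveMQClient;
import org.apache.activemq.artemis.api.core.client.ServerLocator;

public class ThresholdSketch {
   public static void main(String[] args) throws Exception {
      // "vm://0" is a placeholder matching our InVM deployment.
      ServerLocator locator = ActiveMQClient.createServerLocator("vm://0");
      // 102400 bytes (100K) is the documented default; any message body
      // larger than this is streamed to the consumer as a "large message".
      locator.setMinLargeMessageSize(102400);
   }
}
{code}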
Many messages are passing through the broker when this happens. I would
anticipate that most of the messages are smaller than 100K, but clearly some of
them must exceed it. After some number of messages, a particular consumer ceases
to consume messages.
After the system became "hung", I was able to get a stack trace and identify
that the system is stuck in an Object.wait() for a notify that appears never
to come.
Here is the trace I was able to capture:
{code:java}
Thread-2 (ActiveMQ-client-global-threads) id=78 state=TIMED_WAITING
    - waiting on <0x43523a75> (a org.apache.activemq.artemis.core.client.impl.LargeMessageControllerImpl)
    - locked <0x43523a75> (a org.apache.activemq.artemis.core.client.impl.LargeMessageControllerImpl)
    at java.base@17.0.1/java.lang.Object.wait(Native Method)
    at org.apache.activemq.artemis.core.client.impl.LargeMessageControllerImpl.waitCompletion(LargeMessageControllerImpl.java:294)
    at org.apache.activemq.artemis.core.client.impl.LargeMessageControllerImpl.saveBuffer(LargeMessageControllerImpl.java:268)
    at org.apache.activemq.artemis.core.client.impl.ClientLargeMessageImpl.checkBuffer(ClientLargeMessageImpl.java:157)
    at org.apache.activemq.artemis.core.client.impl.ClientLargeMessageImpl.getBodyBuffer(ClientLargeMessageImpl.java:89)
    at mypackage.MessageListener.handleMessage(MessageListener.java:46)
{code}
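For context, here is roughly what our handler does at the bottom frame (the class body below is my reconstruction; only the getBodyBuffer() call is taken from the trace):
{code:java}
import org.apache.activemq.artemis.api.core.ActiveMQBuffer;
import org.apache.activemq.artemis.api.core.client.ClientMessage;

public class MessageListener {
   // Assumed shape of the handler at the bottom of the trace above.
   public void handleMessage(ClientMessage message) {
      // For a large message this call descends into checkBuffer() ->
      // saveBuffer() -> waitCompletion() and blocks until every chunk has
      // arrived -- or, in the failure mode reported here, forever.
      ActiveMQBuffer body = message.getBodyBuffer();
      byte[] bytes = new byte[body.readableBytes()];
      body.readBytes(bytes);
      // ... application-specific processing of bytes ...
   }
}
{code}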
The app can run either as a single node using the InVM transport or as a
cluster using TCP. To my knowledge, I have only seen this issue occur with
InVM.
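For completeness, the two ways the app connects (the URLs, host, and port below are placeholders):
{code:java}
import org.apache.activemq.artemis.jms.client.ActiveMQConnectionFactory;

public class TransportSketch {
   public static void main(String[] args) {
      // Single-node mode: client and broker share a JVM.
      ActiveMQConnectionFactory inVm = new ActiveMQConnectionFactory("vm://0");
      // Clustered mode: hostname and port vary by deployment.
      ActiveMQConnectionFactory tcp = new ActiveMQConnectionFactory("tcp://localhost:61616");
   }
}
{code}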
I am not an expert in this code, but I can tell from the call stack that 0 must
be the value of timeWait passed into waitCompletion(). From what I can discern
of the code changes in 2.21.0, though, it should be adjusting the readTimeout to
the timeout of the message (I think?) so that the read eventually gives up
rather than remaining blocked forever.
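To make my reading concrete, here is a sketch of the deadline-based pattern I would expect (my own code, not the Artemis source): with timeWait == 0 the loop has to fall back to a readTimeout-derived deadline, and if that fallback is missing or never armed, wait() can park the consumer thread forever.
{code:java}
public class WaitCompletionSketch {
   private boolean allChunksReceived;             // flipped by the packet-arrival path
   private final long readTimeoutMillis = 30_000; // placeholder value

   // Sketch of the pattern, not the actual Artemis code. With timeWait == 0
   // the only way out is the readTimeout-derived deadline.
   public synchronized boolean waitCompletion(long timeWait) throws InterruptedException {
      long deadline = System.currentTimeMillis()
            + (timeWait == 0 ? readTimeoutMillis : timeWait);
      while (!allChunksReceived) {
         long remaining = deadline - System.currentTimeMillis();
         if (remaining <= 0) {
            return false; // give up instead of hanging the consumer
         }
         wait(remaining); // expects a notifyAll() when the next chunk lands
      }
      return true;
   }

   // Packet-arrival path: marks completion and wakes the waiter.
   public synchronized void onLastChunk() {
      allChunksReceived = true;
      notifyAll();
   }
}
{code}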
We have persistenceEnabled = false, which leads me to believe that the only
disk activity for messages should be related to large messages(?).
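For reference, our broker setup boils down to something like this (the embedded bootstrap and acceptor name are assumptions; the persistence setting is exact):
{code:java}
import org.apache.activemq.artemis.core.config.Configuration;
import org.apache.activemq.artemis.core.config.impl.ConfigurationImpl;
import org.apache.activemq.artemis.core.server.embedded.EmbeddedActiveMQ;

public class BrokerSketch {
   public static void main(String[] args) throws Exception {
      Configuration config = new ConfigurationImpl()
            .setPersistenceEnabled(false)                  // our actual setting
            .setSecurityEnabled(false)                     // assumption, for brevity
            .addAcceptorConfiguration("in-vm", "vm://0");  // InVM transport

      EmbeddedActiveMQ broker = new EmbeddedActiveMQ();
      broker.setConfiguration(config);
      broker.start();
   }
}
{code}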
On a machine and in a context where this was consistently happening, I adjusted
min-large-message-size upwards and the problem went away. That makes sense for
my application, but ultimately, if a message crosses the threshold and becomes
a large message, it appears to hang the consumer indefinitely.
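Concretely, the workaround was along these lines (the exact value is illustrative; it just has to sit above our biggest payloads so nothing takes the large-message path):
{code:java}
import org.apache.activemq.artemis.jms.client.ActiveMQConnectionFactory;

public class WorkaroundSketch {
   public static void main(String[] args) {
      ActiveMQConnectionFactory cf = new ActiveMQConnectionFactory("vm://0");
      // Raise the large-message threshold well above our largest messages.
      cf.setMinLargeMessageSize(512_000);
   }
}
{code}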