[jira] [Updated] (ARTEMIS-3809) LargeMessageControllerImpl hangs the message consume

David Bennion (Jira) Fri, 29 Apr 2022 18:59:14 -0700


     [ 
https://issues.apache.org/jira/browse/ARTEMIS-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


David Bennion updated ARTEMIS-3809:
-----------------------------------
    Summary: LargeMessageControllerImpl hangs the message consume  (was: 
LargeMessageConsumerImpl hangs the message consume)

> LargeMessageControllerImpl hangs the message consume
> ----------------------------------------------------
>
>                 Key: ARTEMIS-3809
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-3809
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 2.21.0
>         Environment: OS: Windows Server 2019
> JVM: OpenJDK 64-Bit Server VM Temurin-17.0.1+12
> Max Memory (-Xmx): 6GB
> Allocated to JVM: 4.168GB
> Currently in use: 3.398GB  (heap 3.391GB, non-heap 0.123GB)
>            Reporter: David Bennion
>            Priority: Major
>              Labels: test-stability
>
> I wondered if this might be a recurrence of ARTEMIS-2293, but this happens 
> on 2.21.0, and I can see that code change in LargeMessageControllerImpl.
> We are using the default min-large-message-size of 100K.
> Many messages are passing through the broker when this happens.  I would 
> anticipate that most of the messages are smaller than 100K, but clearly some 
> of them must exceed it.  After some number of messages, a particular 
> consumer stops consuming messages.
> After the system became "hung" I was able to get a stack trace and identify 
> that the system is stuck in an Object.wait() for a notify that appears to 
> never come.
> Here is the trace I was able to capture:
> {code:java}
> Thread-2 (ActiveMQ-client-global-threads) id=78 state=TIMED_WAITING
>     - waiting on <0x43523a75> (a 
> org.apache.activemq.artemis.core.client.impl.LargeMessageControllerImpl)
>     - locked <0x43523a75> (a 
> org.apache.activemq.artemis.core.client.impl.LargeMessageControllerImpl)
>     at java.base@17.0.1/java.lang.Object.wait(Native Method)
>     at 
> org.apache.activemq.artemis.core.client.impl.LargeMessageControllerImpl.waitCompletion(LargeMessageControllerImpl.java:294)
>     at 
> org.apache.activemq.artemis.core.client.impl.LargeMessageControllerImpl.saveBuffer(LargeMessageControllerImpl.java:268)
>     at 
> org.apache.activemq.artemis.core.client.impl.ClientLargeMessageImpl.checkBuffer(ClientLargeMessageImpl.java:157)
>     at 
> org.apache.activemq.artemis.core.client.impl.ClientLargeMessageImpl.getBodyBuffer(ClientLargeMessageImpl.java:89)
>     at mypackage.MessageListener.handleMessage(MessageListener.java:46)
> {code}
>  
> The app can run either as a single node using the InVM transport or as a 
> cluster using TCP.  To my knowledge, I have only seen this issue occur with 
> InVM.
> I am not an expert in this code, but I can tell from the call stack that 0 
> must be the value of timeWait passed into waitCompletion().  From what I can 
> discern of the code changes in 2.21.0, it should be adjusting the 
> readTimeout to the timeout of the message (I think?) so that the read 
> eventually gives up rather than remaining blocked forever.
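The expectation described in the paragraph above, a wait that eventually gives up instead of blocking forever, can be illustrated with a generic deadline-bounded wait loop. This is a minimal sketch, not the Artemis implementation: the class `BoundedWaitSketch` and its methods are hypothetical and only reuse the name `waitCompletion` for readability. It relies on one documented JDK behavior: `Object.wait(0)` waits until notified, with no timeout at all.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; this is NOT LargeMessageControllerImpl.
public class BoundedWaitSketch {
    private boolean complete; // guarded by the monitor of "this"

    /** Marks the work as complete and wakes any waiting thread. */
    public synchronized void markComplete() {
        complete = true;
        notifyAll();
    }

    /**
     * Waits until markComplete() is called or timeoutMillis elapses.
     * Returns true on completion, false on timeout or interruption.
     */
    public synchronized boolean waitCompletion(long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (!complete) { // loop guards against spurious wakeups
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false; // deadline passed without a notify
            }
            // Add 1 ms so we never call wait(0), which would block until notified.
            long remainingMillis = TimeUnit.NANOSECONDS.toMillis(remainingNanos) + 1;
            try {
                wait(remainingMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Case 1: the notify never arrives, so the wait gives up at the deadline.
        BoundedWaitSketch lost = new BoundedWaitSketch();
        System.out.println("no notify: " + lost.waitCompletion(200));

        // Case 2: another thread signals completion before the deadline.
        BoundedWaitSketch ok = new BoundedWaitSketch();
        new Thread(ok::markComplete).start();
        System.out.println("with notify: " + ok.waitCompletion(5_000));
    }
}
```

With a loop of this shape, a lost notification costs at most the configured timeout, whereas a call that ends up as `wait(0)` would leave the consumer thread parked indefinitely.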
> We have persistenceEnabled = false, which leads me to believe that the only 
> disk activity for messages should be related to large messages(?).
> On a machine and context where this was consistently happening, I adjusted 
> the min-large-message-size upwards and the problem went away.  This 
> workaround makes sense for my application, but ultimately, if a message 
> crosses the threshold to become large, it appears to hang the consumer 
> indefinitely.
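For readers who want to try the workaround described above, one way to raise the threshold on the client side is the `minLargeMessageSize` parameter on the connection URL. The sketch below only builds such a URL string, so it runs without the Artemis client on the classpath. The helper name `withMinLargeMessageSize` and the 1 MiB value are assumptions chosen for illustration, not values taken from the report.

```java
// Hypothetical helper; only the URL parameter name comes from Artemis itself.
public class MinLargeMessageSizeUrlSketch {
    // 100 KiB, the default threshold mentioned in the report.
    static final int DEFAULT_MIN_LARGE_MESSAGE_SIZE = 100 * 1024;

    /** Appends minLargeMessageSize=<bytes> to a connection URL. */
    static String withMinLargeMessageSize(String baseUrl, int bytes) {
        if (bytes <= 0) {
            throw new IllegalArgumentException("threshold must be positive: " + bytes);
        }
        // Use '?' for the first query parameter and '&' for any later ones.
        String separator = baseUrl.contains("?") ? "&" : "?";
        return baseUrl + separator + "minLargeMessageSize=" + bytes;
    }

    public static void main(String[] args) {
        // Raise the threshold from the 100 KiB default to 1 MiB (arbitrary example value).
        String url = withMinLargeMessageSize("vm://0", 1024 * 1024);
        System.out.println(url); // vm://0?minLargeMessageSize=1048576

        // With the Artemis JMS client available, this string could be passed to
        // new ActiveMQConnectionFactory(url); that call is left out here so the
        // sketch stays free of external dependencies.
    }
}
```

Raising the threshold only avoids the large-message code path for messages below the new limit; any message that still exceeds it would go through the same path shown in the stack trace.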



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
