[ 
https://issues.apache.org/jira/browse/ARTEMIS-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17530988#comment-17530988
 ] 

David Bennion commented on ARTEMIS-3809:
----------------------------------------

I have logs and was able to identify the last time the consumer's handler was 
called:


{code:java}
05/02/2022 15:19:10.440 TRACE (global-threads)) [ClientConsumerImpl  ]  
org.apache.activemq.artemis.core.client.impl.ClientConsumerImpl@4e3abde6{consumerContext=ActiveMQConsumerContext{id=0},
 queueName=analyze}::Calling handler.onMessage
{code}

The blocked consumer is on a queue called 'analyze'.  We have another queue 
called 'Stats' that I believe is still pushing messages.
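
For reference, this is roughly how that consumer is wired up (class and 
connection details are simplified here, not our exact code): a core-API 
MessageHandler on the 'analyze' queue that reads the message body and then 
individually acknowledges, which matches the SessionIndividualAcknowledgeMessage 
packets in the trace below.

{code:java}
import org.apache.activemq.artemis.api.core.ActiveMQBuffer;
import org.apache.activemq.artemis.api.core.client.ActiveMQClient;
import org.apache.activemq.artemis.api.core.client.ClientConsumer;
import org.apache.activemq.artemis.api.core.client.ClientSession;
import org.apache.activemq.artemis.api.core.client.ClientSessionFactory;
import org.apache.activemq.artemis.api.core.client.ServerLocator;

// Simplified/hypothetical sketch of our consumer setup, not the real classes.
public class AnalyzeConsumer {
   public static void main(String[] args) throws Exception {
      ServerLocator locator = ActiveMQClient.createServerLocator("vm://0");
      ClientSessionFactory factory = locator.createSessionFactory();
      ClientSession session = factory.createSession(true, true);
      ClientConsumer consumer = session.createConsumer("analyze");

      consumer.setMessageHandler(message -> {
         try {
            // getBodyBuffer() is where the hang shows up for large messages
            // (see the LargeMessageControllerImpl stack trace in the issue).
            ActiveMQBuffer body = message.getBodyBuffer();
            // ... decode and process the body ...
            message.individualAcknowledge();
         } catch (Exception e) {
            e.printStackTrace();
         }
      });

      session.start();
   }
}
{code}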

{noformat}
05/02/2022 15:19:10.440 TRACE (0.0-2081182161)) [InVMConnection      ]  
InVMConnection [serverID=0, id=1e5a49e7-ca54-11ec-99a3-0050568ed7da]::Sending 
inVM packet
05/02/2022 15:19:10.440 TRACE (0.0-2081182161)) [motingConnectionImpl]  
RemotingConnectionID=1e5a49e7-ca54-11ec-99a3-0050568ed7da handling packet 
SessionIndividualAcknowledgeMessage[type=81, channelID=23, responseAsync=false, 
requiresResponse=false, correlationID=-1, consumerID=0, messageID=34369871410, 
requiresResponse=false]
05/02/2022 15:19:10.440 TRACE (0.0-2081182161)) [InVMConnection      ]  
InVMConnection [serverID=0, id=1e5a49e7-ca54-11ec-99a3-0050568ed7da]::packet 
sent done
05/02/2022 15:19:10.440 TRACE (0.0-2081182161)) [InVMConnection      ]  
InVMConnection [serverID=0, id=1e5a49e7-ca54-11ec-99a3-0050568ed7da]::Sending 
inVM packet
05/02/2022 15:19:10.440 TRACE (0.0-2081182161)) [motingConnectionImpl]  
RemotingConnectionID=1e5a49e7-ca54-11ec-99a3-0050568ed7da handling packet 
SessionSendMessage_V2[type=71, channelID=14, responseAsync=true, 
requiresResponse=false, correlationID=0, 
message=CoreMessage[messageID=0,durable=false,userID=null,priority=4, 
timestamp=Mon May 02 15:19:10 CDT 2022,expiration=Mon May 02 15:19:12 CDT 2022, 
durable=false, 
address=Stats,size=1552,properties=TypedProperties[...SNIPPED...]]@1327958296, 
requiresResponse=false, correlationID=0, requiresResponse=false]
05/02/2022 15:19:10.440 TRACE (0.0-2081182161)) [InVMConnection      ]  
InVMConnection [serverID=0, id=1e5a49e7-ca54-11ec-99a3-0050568ed7da]::packet 
sent done
05/02/2022 15:19:10.440 TRACE (Impl$6@f15a0d8)) [SessionPacketHandler]  
ServerSessionPacketHandler::handlePacket,SessionIndividualAcknowledgeMessage[type=81,
 channelID=23, responseAsync=false, requiresResponse=false, correlationID=-1, 
consumerID=0, messageID=34369871410, requiresResponse=false]
05/02/2022 15:19:10.440 TRACE (Impl$6@f15a0d8)) [ServerConsumerImpl  ]  
individualACK messageID=34369871410
05/02/2022 15:19:10.440 TRACE (Impl$6@f15a0d8)) [ServerConsumerImpl  ]  
individualACK starting new TX
05/02/2022 15:19:10.440 TRACE (Impl$6@f15a0d8)) [ServerConsumerImpl  ]  ACKing 
ref 
Reference[34369871410]:NON-RELIABLE:CoreMessage[messageID=34369871410,durable=false,userID=null,priority=5,
 timestamp=Mon May 02 15:19:09 CDT 2022,expiration=Mon May 02 15:20:09 CDT 
2022, durable=false, 
address=analyze,size=9014,properties=TypedProperties[...SNIPPED...]]@1054124345 
on tx= TransactionImpl [xid=null, txID=34369871912, xid=null, state=ACTIVE, 
createTime=1651522750440(Mon May 02 15:19:10 CDT 2022), timeoutSeconds=-1, nr 
operations = 0]@1df2cdae, consumer=ServerConsumerImpl [id=0, filter=null, 
binding=LocalQueueBinding [address=analyze, queue=QueueImpl[name=analyze, 
postOffice=PostOfficeImpl [server=ActiveMQServerImpl::name=0.0.0.0], 
temp=false]@1199309c, filter=null, name=analyze, 
clusterName=analyze5aa9778c-98de-11ec-b671-0050568ed7da]]
05/02/2022 15:19:10.440 TRACE (Impl$6@f15a0d8)) [QueueImpl           ]  
QueueImpl[name=analyze, postOffice=PostOfficeImpl 
[server=ActiveMQServerImpl::name=0.0.0.0], temp=false]@1199309c acknowledge 
tx=true 
ref=Reference[34369871410]:NON-RELIABLE:CoreMessage[messageID=34369871410,durable=false,userID=null,priority=5,
 timestamp=Mon May 02 15:19:09 CDT 2022,expiration=Mon May 02 15:20:09 CDT 
2022, durable=false, 
address=analyze,size=9014,properties=TypedProperties[...SNIPPED...]]@1054124345,
 reason=NORMAL, consumer=ServerConsumerImpl [id=0, filter=null, 
binding=LocalQueueBinding [address=analyze, queue=QueueImpl[name=analyze, 
postOffice=PostOfficeImpl [server=ActiveMQServerImpl::name=0.0.0.0], 
temp=false]@1199309c, filter=null, name=analyze, 
clusterName=analyze5aa9778c-98de-11ec-b671-0050568ed7da]]
05/02/2022 15:19:10.440 INFO  (Impl$6@f15a0d8)) [message             ]  
AMQ601502: User guest(guest)@invm:0 is acknowledging a message from analyze: 
CoreMessage[messageID=34369871410,durable=false,userID=null,priority=5, 
timestamp=Mon May 02 15:19:09 CDT 2022,expiration=Mon May 02 15:20:09 CDT 2022, 
durable=false, 
address=analyze,size=9014,properties=TypedProperties[...SNIPPED...]]@1054124345
05/02/2022 15:19:10.440 DEBUG (Impl$6@f15a0d8)) [impl                ]  
AMQ843014: acknowledged message: 
Reference[34369871410]:NON-RELIABLE:CoreMessage[messageID=34369871410,durable=false,userID=null,priority=5,
 timestamp=Mon May 02 15:19:09 CDT 2022,expiration=Mon May 02 15:20:09 CDT 
2022, durable=false, 
address=analyze,size=9014,properties=TypedProperties[...SNIPPED...]]@1054124345,
 with ackReason: NORMAL
05/02/2022 15:19:10.440 INFO  (Impl$6@f15a0d8)) [impl                ]  
AMQ841014: acknowledged message ID: 34369871410, messageRef sessionID: 
2767a430-ca54-11ec-99a3-0050568ed7da, with messageRef consumerID: 0, messageRef 
QueueName: analyze,  with ackReason: NORMAL
05/02/2022 15:19:10.440 TRACE (Impl$6@f15a0d8)) [TransactionImpl     ]  
TransactionImpl::commit::TransactionImpl [xid=null, txID=34369871912, xid=null, 
state=ACTIVE, createTime=1651522750440(Mon May 02 15:19:10 CDT 2022), 
timeoutSeconds=-1, nr operations = 1]@1df2cdae
05/02/2022 15:19:10.440 DEBUG (Impl$6@f15a0d8)) [QueueMessageMetrics ]  
QueuePendingMessageMetrics[queue=analyze, name=delivering] decrement 
messageCount to 57: 
Reference[34369871410]:NON-RELIABLE:CoreMessage[messageID=34369871410,durable=false,userID=null,priority=5,
 timestamp=Mon May 02 15:19:09 CDT 2022,expiration=Mon May 02 15:20:09 CDT 
2022, durable=false, 
address=analyze,size=9014,properties=TypedProperties[...SNIPPED...]]@1054124345
{noformat}
This is one of several rotated logs, but it captures the moment of the last 
message delivery and several deliveries prior. My log file is 50MB, and I am not 
certain what is relevant.

> LargeMessageControllerImpl hangs the message consume
> ----------------------------------------------------
>
>                 Key: ARTEMIS-3809
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-3809
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 2.21.0
>         Environment: OS: Windows Server 2019
> JVM: OpenJDK 64-Bit Server VM Temurin-17.0.1+12
> Max Memory (-Xmx): 6GB
> Allocated to JVM: 4.168GB
> Currently in use: 3.398GB  (heap 3.391GB, non-heap 0.123GB)
>            Reporter: David Bennion
>            Priority: Major
>              Labels: test-stability
>
> I wondered if this might be a recurrence of ARTEMIS-2293, but this happens on 
> 2.21.0 and I can see that issue's code change in LargeMessageControllerImpl.
> We are using the default min-large-message-size of 100K.
> Many messages are passing through the broker when this happens. I would 
> anticipate that most of the messages are smaller than 100K, but clearly some 
> of them must exceed it. After some number of messages, a particular consumer 
> ceases to consume messages.
> After the system became "hung", I captured a stack trace and identified that 
> the consumer thread is stuck in an Object.wait() for a notify that never 
> appears to come.
> Here is the trace I was able to capture:
> {code:java}
> Thread-2 (ActiveMQ-client-global-threads) id=78 state=TIMED_WAITING
>     - waiting on <0x43523a75> (a 
> org.apache.activemq.artemis.core.client.impl.LargeMessageControllerImpl)
>     - locked <0x43523a75> (a 
> org.apache.activemq.artemis.core.client.impl.LargeMessageControllerImpl)
>     at  java.base@17.0.1/java.lang.Object.wait(Native Method)
>     at 
> org.apache.activemq.artemis.core.client.impl.LargeMessageControllerImpl.waitCompletion(LargeMessageControllerImpl.java:294)
>     at 
> org.apache.activemq.artemis.core.client.impl.LargeMessageControllerImpl.saveBuffer(LargeMessageControllerImpl.java:268)
>     at 
> org.apache.activemq.artemis.core.client.impl.ClientLargeMessageImpl.checkBuffer(ClientLargeMessageImpl.java:157)
>     at 
> org.apache.activemq.artemis.core.client.impl.ClientLargeMessageImpl.getBodyBuffer(ClientLargeMessageImpl.java:89)
>     at mypackage.MessageListener.handleMessage(MessageListener.java:46)
> {code}
>  
> The app can run either as a single node using the InVM transport or as a 
> cluster using TCP. To my knowledge, I have only seen this issue occur with 
> InVM.
> I am not an expert in this code, but I can tell from the call stack that 0 must 
> be the value of timeWait passed into waitCompletion(). But from what I can 
> discern of the code changes in 2.21.0, it should be adjusting the readTimeout 
> to the timeout of the message (I think?) so that the read eventually gives up 
> rather than remaining blocked forever.
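> To illustrate my reading, here is a simplified sketch of what I understand the 
> wait in waitCompletion() to amount to -- this is NOT the actual Artemis code, 
> and the field names are hypothetical:
> {code:java}
> // Simplified illustration only; not the real LargeMessageControllerImpl.
> class LargeMessageWaitSketch {
>    private boolean allPacketsReceived;   // set once the last chunk arrives
>    private final long readTimeoutMillis; // the readTimeout-derived fallback
>
>    LargeMessageWaitSketch(long readTimeoutMillis) {
>       this.readTimeoutMillis = readTimeoutMillis;
>    }
>
>    synchronized boolean waitCompletion(long timeWait) throws InterruptedException {
>       // Per the call stack, saveBuffer() appears to pass timeWait == 0 here,
>       // so the readTimeout fallback should be what bounds the wait.
>       long deadline = System.currentTimeMillis()
>             + (timeWait == 0 ? readTimeoutMillis : timeWait);
>       while (!allPacketsReceived) {
>          long toWait = deadline - System.currentTimeMillis();
>          if (toWait <= 0) {
>             return false; // should eventually give up instead of blocking forever
>          }
>          wait(toWait);    // <-- our consumer thread appears to sit here
>       }
>       return true;
>    }
>
>    synchronized void packetArrived() {
>       allPacketsReceived = true;
>       notifyAll(); // the notify that never seems to arrive in our hang
>    }
> }
> {code}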
> We have persistenceEnabled = false, which leads me to believe that the only 
> disk activity  for messages should be related to large messages(?).  
> On a machine and in a context where this was consistently happening, I adjusted 
> the min-large-message-size upwards and the problem went away. This makes sense 
> for my application, but ultimately, if a message crosses the threshold and 
> becomes large, it appears to hang the consumer indefinitely.
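> For reference, the workaround looks roughly like this on our client (the 1 MiB 
> value is just an example, and the setter/URL parameter below is my 
> understanding of the core client API):
> {code:java}
> import org.apache.activemq.artemis.api.core.client.ActiveMQClient;
> import org.apache.activemq.artemis.api.core.client.ServerLocator;
>
> public class LargeMessageWorkaround {
>    public static void main(String[] args) throws Exception {
>       // Raise the large-message threshold above our biggest payloads so nothing
>       // takes the large-message path. The default is 100 KiB (102400 bytes).
>       ServerLocator locator = ActiveMQClient.createServerLocator("vm://0")
>             .setMinLargeMessageSize(1024 * 1024);
>       // We believe the equivalent URL form is "vm://0?minLargeMessageSize=1048576".
>    }
> }
> {code}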



--
This message was sent by Atlassian Jira
(v8.20.7#820007)