[ 
https://issues.apache.org/jira/browse/ARTEMIS-4928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17873013#comment-17873013
 ] 

Justin Bertram edited comment on ARTEMIS-4928 at 2/6/25 5:35 PM:
-----------------------------------------------------------------

Interestingly, it looks like this stuff is held up in the journal 
{{OperationContextImpl}}, which has 3 things in it.  A transaction of some 
sort at the head, and then 2 confirmations 
({{ServerSessionPacketHandler$1}}), the first of which is the one I'm stuck 
waiting for, and the one after it seems to be an even earlier correlation ID 
(don't know if that is expected?).  

 

!image-2024-08-12-18-36-35-508.png|width=574,height=359!

!image-2024-08-12-18-37-10-723.png|width=576,height=360!

!image-2024-08-12-18-37-44-279.png|width=575,height=344!


was (Author: parkri):
Interestingly, it looks like this stuff is held up in the journal 
{{OperationContextImpl}}, which has 3 things in it.  A transaction of some 
sort at the head, and then 2 confirmations 
({{ServerSessionPacketHandler$1}}), the first of which is the one I'm stuck 
waiting for, and the one after it seems to be an even earlier correlation ID 
(don't know if that is expected?).  

 

!image-2024-08-12-18-36-35-508.png!

!image-2024-08-12-18-37-10-723.png!

!image-2024-08-12-18-37-44-279.png!

> SendAcknowledgementHandler not getting called
> ---------------------------------------------
>
>                 Key: ARTEMIS-4928
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-4928
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 2.32.0, 2.35.0, 2.36.0
>         Environment: The environment is Linux based, with Azul Java 17.  I 
> can update with more precise details if needed.
> Artemis version is 2.32.0.  However, Artemis broker and the application (and 
> thus client producer) are in the same JVM with socket transports.
> We do not see any exceptions in our logs.
>  
>            Reporter: Rick Parker
>            Priority: Critical
>         Attachments: image-2024-07-17-13-41-55-900.png, 
> image-2024-07-17-13-49-53-962.png, image-2024-08-12-17-44-03-698.png, 
> image-2024-08-12-17-45-55-812.png, image-2024-08-12-18-36-35-508.png, 
> image-2024-08-12-18-37-10-723.png, image-2024-08-12-18-37-44-279.png, 
> image-2024-09-04-15-09-31-595.png, image-2024-09-04-15-10-04-610.png, 
> image-2024-10-16-16-08-29-866.png
>
>
> We have been using ArtemisMQ since 2016, and recently upgraded from 2.19.1 
> on JDK8 to 2.32.0 on JDK17.  We occasionally experience what looks like a 
> failure to acknowledge the sending of a message by a (CORE) producer, since 
> doing that upgrade, and it brings our application to a halt.
> When I say occasionally, we have a nightly performance test of our 
> application that sends about 20-30 million messages from the one producer.  
> This failure to acknowledge the send so far has happened twice in the space 
> of about a month, which means it is happening approximately every 250-400 
> million messages or perhaps more.  This also means we don't currently have a 
> self-contained reproduction of the problem.  We are starting to think about 
> how we might reproduce it more frequently, if possible, since we have now 
> seen it twice and have gained a tiny bit more understanding.
> The symptom is a failure to be called back from the send, and inspecting a 
> heap dump I _think_ confirms that the producer is sitting on a send - but I 
> am not an expert on the internal workings of Artemis and many apologies in 
> advance if I either mislead or point fingers inappropriately.  
> We will try upgrading to the latest 2.35.0 (as at time of writing) to see if 
> it goes away - the fixed issues don't immediately shout out that it might be 
> solved however.
> The API from which we do not get called back is:
> {{org.apache.activemq.artemis.api.core.client.ClientProducer.send(SimpleString address, Message message, SendAcknowledgementHandler handler)}}
> Can a misbehaving handler/callback somehow cause this?  e.g. what happens if 
> it throws an exception? (which we are not seeing bubble up anywhere, but 
> have not ruled it out)
> I have a screenshot of what looks like an interesting part of the heap dump - 
> the {{ChannelImpl}}.  To my eyes the {{firstStoredCommandID}} value looks 
> out of sync with the content ({{correlationID}} of message) of the 
> {{resendCache}} which is lagging behind for some reason.  8,815,497 is the 
> message that has not had the handler called.  But like I say, I'm looking at 
> all this for the first time with little understanding.
> !image-2024-07-17-13-41-55-900.png!
> It also looks like the same message is still present in the broker data 
> structures / heap dump, along with 8,815,495
> !image-2024-07-17-13-49-53-962.png!
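
The description asks whether a misbehaving handler could be involved and notes that no exception has been seen bubbling up. One way to narrow that down from the client side might be to wrap the callback so that any exception it throws is counted, and to record each send so that a watchdog can report which messages never got their acknowledgement. The following is only a minimal sketch under those assumptions: {{AckCallback}}, {{PendingSendTracker}} and {{AckWatchdogDemo}} are hypothetical names, not Artemis API, and messages are identified by a plain {{long}}. In a real client the wrapper would implement {{SendAcknowledgementHandler}} and be passed to {{ClientProducer.send}}.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for the client's acknowledgement callback (not Artemis API). */
interface AckCallback {
    void acknowledged(long messageId);
}

/** Hypothetical helper that remembers sends which have not been acknowledged yet. */
final class PendingSendTracker {
    private final Map<Long, Long> sentAtNanos = new ConcurrentHashMap<>();
    private final AtomicLong handlerFailures = new AtomicLong();

    /** Record a send just before the message is handed to the producer. */
    void onSend(long messageId, long nowNanos) {
        sentAtNanos.put(messageId, nowNanos);
    }

    /** Wrap the application's callback so that an exception in it is counted, not lost. */
    AckCallback guard(AckCallback delegate) {
        return messageId -> {
            // The acknowledgement arrived, whatever the delegate does next.
            sentAtNanos.remove(messageId);
            try {
                delegate.acknowledged(messageId);
            } catch (RuntimeException e) {
                // Assumption: counting is enough for this sketch; real code would also log.
                handlerFailures.incrementAndGet();
            }
        };
    }

    /** Ids of sends that have waited longer than the timeout, in ascending order. */
    List<Long> overdue(long nowNanos, long timeoutNanos) {
        List<Long> ids = new ArrayList<>();
        for (Map.Entry<Long, Long> e : sentAtNanos.entrySet()) {
            if (nowNanos - e.getValue() > timeoutNanos) {
                ids.add(e.getKey());
            }
        }
        Collections.sort(ids);
        return ids;
    }

    long handlerFailures() {
        return handlerFailures.get();
    }
}

final class AckWatchdogDemo {
    public static void main(String[] args) {
        PendingSendTracker tracker = new PendingSendTracker();
        AckCallback guarded = tracker.guard(id -> {
            throw new IllegalStateException("simulated handler bug");
        });

        long start = System.nanoTime();
        tracker.onSend(1L, start);
        tracker.onSend(2L, start);
        guarded.acknowledged(1L); // message 2 is never acknowledged

        long later = start + 2_000_000_000L; // pretend two seconds have passed
        System.out.println("handler failures: " + tracker.handlerFailures());
        System.out.println("overdue ids: " + tracker.overdue(later, 1_000_000_000L));
    }
}
```

If the failure counter stays at zero while an id shows up as overdue, that would point away from the handler and towards the acknowledgement never being delivered to the client, which is what the heap dump seems to suggest.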



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

