[jira] [Work logged] (ARTEMIS-4476) Connection Failure Race Conditions in AMQP and Core

ASF GitHub Bot (Jira) Wed, 29 Nov 2023 17:15:05 -0800


     [ 
https://issues.apache.org/jira/browse/ARTEMIS-4476?focusedWorklogId=893034&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-893034
 ]


ASF GitHub Bot logged work on ARTEMIS-4476:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 30/Nov/23 01:14
            Start Date: 30/Nov/23 01:14
    Worklog Time Spent: 10m 
      Work Description: clebertsuconic commented on code in PR #4694:
URL: https://github.com/apache/activemq-artemis/pull/4694#discussion_r1410044174


##########
artemis-protocols/artemis-openwire-protocol/src/main/java/org/apache/activemq/artemis/core/protocol/openwire/OpenWireConnection.java:
##########
@@ -761,7 +761,11 @@ public void fail(ActiveMQException me, String message) {
 
       final ThresholdActor<Command> localVisibleActor = openWireActor;
       if (localVisibleActor != null) {
-         localVisibleActor.shutdown(() -> doFail(me, message));
+         localVisibleActor.requestShutdown();
+      }
+
+      if (executor != null) {
+         executor.execute(() -> doFail(me, message));

Review Comment:
   @gtully note that the Actor uses this same executor, so the task will 
always execute after shutdown is called, on the same thread it would have 
used with the actor.
   
   This is just a safer option.
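
   The ordering argument above can be illustrated with a minimal sketch. It is 
not Artemis code: the class name OrderedFailSketch, the event labels, and the 
use of a plain single-thread executor as a stand-in for the connection's 
ordered executor are all assumptions made for illustration. It only shows that 
a task submitted to the same FIFO executor runs after the work already queued 
there.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; none of these names come from Artemis.
public class OrderedFailSketch {

    // Returns the order in which the tasks ran on the shared executor.
    static List<String> run() {
        List<String> events = new CopyOnWriteArrayList<>();
        // Assumption: a single-thread FIFO executor stands in for the
        // ordered executor shared by the actor and the failure path.
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            // Work the (hypothetical) actor queued before the failure.
            executor.execute(() -> events.add("command-1"));
            executor.execute(() -> events.add("command-2"));
            // Failure path: same executor, so it queues behind the work above.
            executor.execute(() -> events.add("doFail"));
        } finally {
            // Stop accepting new tasks; already-queued tasks still run.
            executor.shutdown();
        }
        try {
            if (!executor.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("executor did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [command-1, command-2, doFail]
    }
}
```

   In this sketch the failure task never overtakes queued commands because 
both go through one queue and one thread; whether the real executor gives the 
same guarantee depends on how it is configured.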





Issue Time Tracking
-------------------

    Worklog Id:     (was: 893034)
    Time Spent: 5h 10m  (was: 5h)

> Connection Failure Race Conditions in AMQP and Core
> ---------------------------------------------------
>
>                 Key: ARTEMIS-4476
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-4476
>             Project: ActiveMQ Artemis
>          Issue Type: Task
>            Reporter: Clebert Suconic
>            Assignee: Clebert Suconic
>            Priority: Major
>          Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> Failure detection can race with the processing of client packets (or 
> frames in the case of AMQP).
> This is because Netty detects the failure and removes the connection objects 
> while packets are still being processed. 
> I was not able to reproduce this particular issue, but I have seen a memory 
> dump in which a consumer was created after the connection had already been 
> dropped, leaving the consumer isolated without any communication with 
> clients.
> These races could plausibly explain that particular case.
> I am adding tests to exercise connection failure under stress, and I was 
> able to reproduce other issues.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
