[ 
https://issues.apache.org/jira/browse/ARTEMIS-4527?focusedWorklogId=900508&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-900508
 ]

ASF GitHub Bot logged work on ARTEMIS-4527:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 18/Jan/24 20:45
            Start Date: 18/Jan/24 20:45
    Worklog Time Spent: 10m 
      Work Description: AntonRoskvist commented on PR #4705:
URL: 
https://github.com/apache/activemq-artemis/pull/4705#issuecomment-1899178681

   Hello @jbertram 
   
   I had some time to look further into this and came up with another fix and 
reproducer/test which seem to work better.
   
   The main issue is that in some very rare race conditions the broker will send out 
its notification for an added consumer before sending the binding_added 
notification for the queue the consumer is bound to.
   
   From my testing this seems to happen when `postOfficeImpl#addBinding()` has 
_just_ added the actual binding to addressManager, but has not yet called 
`managementService.sendNotification()`. The unsynchronized 
`postOfficeImpl#getBinding()` can then return the binding to 
`ServerSessionImpl#createConsumer()`, enabling it to lock the 
`managementService` before postOffice is able to.
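
   As a rough illustration of that window, here is a standalone sketch (not Artemis code). The class `NotificationOrderSketch`, the map, the lock object and the string notification names are all hypothetical stand-ins, and the two latches force the interleaving that the broker only hits occasionally:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

// Hypothetical model of the race; none of these names come from Artemis itself.
public class NotificationOrderSketch {
    // Stand-in for the address manager: lookups are not coordinated with notifications.
    static final Map<String, String> bindings = new ConcurrentHashMap<>();
    // Stand-in for the management service lock and the order in which notifications go out.
    static final Object managementLock = new Object();
    static final List<String> sent = new CopyOnWriteArrayList<>();

    static void sendNotification(String type) {
        synchronized (managementLock) {
            sent.add(type);
        }
    }

    static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    static List<String> runRace() {
        bindings.clear();
        sent.clear();
        CountDownLatch bindingVisible = new CountDownLatch(1);
        CountDownLatch consumerNotified = new CountDownLatch(1);

        // Models addBinding(): the binding becomes visible first, the notification follows later.
        Thread addBinding = new Thread(() -> {
            bindings.put("queueA", "binding");
            bindingVisible.countDown();
            await(consumerNotified); // forces the gap between the two steps
            sendNotification("BINDING_ADDED");
        });

        // Models createConsumer(): an unsynchronized lookup already sees the binding.
        Thread createConsumer = new Thread(() -> {
            await(bindingVisible);
            if (bindings.get("queueA") != null) {
                sendNotification("CONSUMER_CREATED");
            }
            consumerNotified.countDown();
        });

        addBinding.start();
        createConsumer.start();
        try {
            addBinding.join();
            createConsumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return List.copyOf(sent);
    }

    public static void main(String[] args) {
        System.out.println(runRace()); // prints [CONSUMER_CREATED, BINDING_ADDED]
    }
}
```

   In this model the consumer notification always goes out first because the latches pin the threads inside the window; in the broker the same ordering would only show up when the threads happen to interleave that way.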
   
   I have not been able to pin down _exactly_ what conditions have to be met 
for this to occur, but the new test included in this change reliably 
triggers the first issue in the chain leading up to the "redistributor race", i.e. 
getting the cluster's remoteBindings out of sync with regard to their consumer 
count.
   
   I'm removing the "Draft" status as of now but please let me know if anything 
looks off about these changes.




Issue Time Tracking
-------------------

    Worklog Id:     (was: 900508)
    Time Spent: 1h  (was: 50m)

> Redistributor race when consumerCount reaches 0 in cluster
> ----------------------------------------------------------
>
>                 Key: ARTEMIS-4527
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-4527
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>            Reporter: Anton Roskvist
>            Priority: Major
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> This is a very rare bug caused by cluster notifications arriving in the wrong 
> order in some very specific circumstances



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
