[ 
https://issues.apache.org/jira/browse/AMQ-9107?focusedWorklogId=823714&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-823714
 ]

ASF GitHub Bot logged work on AMQ-9107:
---------------------------------------

                Author: ASF GitHub Bot
            Created on: 06/Nov/22 08:17
            Start Date: 06/Nov/22 08:17
    Worklog Time Spent: 10m 
      Work Description: lucastetreault commented on PR #928:
URL: https://github.com/apache/activemq/pull/928#issuecomment-1304744663

   Hey @cshannon, I had a chance to look at this today. I ran the example app 
against a build including this change and the performance seems good 👍  
   
   I was trying to understand what I missed in my original changes. I ran the 
tests you added without your changes to ManagedRegionBroker (see 
https://github.com/apache/activemq/compare/main...lucastetreault:activemq:AMQ-9107-test)
 expecting failures, but all the tests pass, even with an added check that 
`consumerSubscriptionMap` is empty. Can you help me understand what I'm 
missing? 
   
   As for the memory leak, it seems I missed one spot where I should have been 
removing from `consumerSubscriptionMap`, in `public void 
unregisterSubscription(Subscription sub)` around [line 
300](https://github.com/apache/activemq/compare/main...lucastetreault:activemq:AMQ-9107-test#diff-17798bd5f1fa8141e349c916704b7f1d7ff660f02bd3582ef696344a5bb3c723R300).
 I added a quick and dirty test 
[here](https://github.com/apache/activemq/compare/main...lucastetreault:activemq:AMQ-9107-test#diff-aa152194070293f1fb21ac1c45654d8041aa45cc3a6770aae3b3fd85961db08fR175)
 that passes now.  
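
   For readers following along, here is a minimal sketch of the kind of leak 
described above: an entry is added to a secondary map on registration, but one 
unregister path forgets to remove it. This is not the ManagedRegionBroker code. 
The class `SubscriptionRegistry`, its methods, and the `String` keys are all 
hypothetical stand-ins; only the map name `consumerSubscriptionMap` is borrowed 
from the discussion.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a broker-side registry (assumption, not ActiveMQ code).
final class SubscriptionRegistry {
    // Primary bookkeeping: subscription name -> consumer id.
    private final Map<String, String> subscriptions = new ConcurrentHashMap<>();
    // Secondary lookup: consumer id -> subscription name. Named after the map
    // mentioned in the comment above; its real type and usage may differ.
    private final Map<String, String> consumerSubscriptionMap = new ConcurrentHashMap<>();

    void register(String consumerId, String subscriptionName) {
        subscriptions.put(subscriptionName, consumerId);
        consumerSubscriptionMap.put(consumerId, subscriptionName);
    }

    // Leaky variant: cleans up the primary map only, so the secondary map
    // keeps growing as consumers come and go.
    void unregisterLeaky(String consumerId, String subscriptionName) {
        subscriptions.remove(subscriptionName);
    }

    // Fixed variant: removes the entry from both maps.
    void unregister(String consumerId, String subscriptionName) {
        subscriptions.remove(subscriptionName);
        consumerSubscriptionMap.remove(consumerId);
    }

    int trackedConsumers() {
        return consumerSubscriptionMap.size();
    }
}

final class LeakSketch {
    public static void main(String[] args) {
        SubscriptionRegistry leaky = new SubscriptionRegistry();
        SubscriptionRegistry fixed = new SubscriptionRegistry();
        for (int i = 0; i < 1000; i++) {
            String consumerId = "consumer-" + i;
            String subscriptionName = "sub-" + i;
            leaky.register(consumerId, subscriptionName);
            leaky.unregisterLeaky(consumerId, subscriptionName);
            fixed.register(consumerId, subscriptionName);
            fixed.unregister(consumerId, subscriptionName);
        }
        System.out.println("leaky entries left: " + leaky.trackedConsumers());
        System.out.println("fixed entries left: " + fixed.trackedConsumers());
    }
}
```

   A test asserting that the secondary map is empty after every consumer has 
been unregistered would catch the leaky variant, provided the test actually 
goes through the unregister path that skips the removal.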




Issue Time Tracking
-------------------

    Worklog Id:     (was: 823714)
    Time Spent: 3.5h  (was: 3h 20m)

> Closing many consumers causes CPU to spike to 100%
> --------------------------------------------------
>
>                 Key: AMQ-9107
>                 URL: https://issues.apache.org/jira/browse/AMQ-9107
>             Project: ActiveMQ
>          Issue Type: Bug
>    Affects Versions: 5.17.1, 5.16.5
>            Reporter: Lucas Tétreault
>            Assignee: Jean-Baptiste Onofré
>            Priority: Major
>             Fix For: 5.18.0, 5.16.6, 5.17.3
>
>         Attachments: example.zip, image-2022-10-07-00-12-39-657.png, 
> image-2022-10-07-00-17-30-657.png
>
>          Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> When there are many consumers (~188k) on a queue, closing them is incredibly 
> expensive and causes the CPU to spike to 100% while the consumers are closed. 
> Tested on an Amazon MQ mq.m5.large instance (2 vCPU, 8 GB memory).
> I have attached a minimal recreation of the issue where the following 
> happens: 
> 1/ Open 100 connections.
> 2/ Create consumers as fast as we can on all of those connections until we 
> hit at least 188k consumers.
> 3/ Sleep for 5 minutes so we can observe the CPU come back down after opening 
> all those connections.
> 4/ Start closing consumers as fast as we can.
> 5/ After all consumers are closed, sleep for 5 minutes to observe the CPU 
> come back down after closing all the connections.
>  
> In this example it seems 5 minutes wasn't actually sufficient time for the 
> CPU to come back down and the consumer and connection counts seem to hit 0 at 
> the same time: 
> !image-2022-10-07-00-12-39-657.png|width=757,height=353!
>  
> In a previous test with more time sleeping after closing all the consumers we 
> can see the CPU come back down before we close the connections. 
> !image-2022-10-07-00-17-30-657.png|width=764,height=348!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
