[ https://issues.apache.org/jira/browse/AMQ-9107?focusedWorklogId=823714&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-823714 ]
ASF GitHub Bot logged work on AMQ-9107:
---------------------------------------
Author: ASF GitHub Bot
Created on: 06/Nov/22 08:17
Start Date: 06/Nov/22 08:17
Worklog Time Spent: 10m
Work Description: lucastetreault commented on PR #928:
URL: https://github.com/apache/activemq/pull/928#issuecomment-1304744663
Hey @cshannon, I had a chance to look at this today. I ran the example app
against a build that includes this change, and the performance seems good 👍
To understand what I missed in my original changes, I ran the tests you added
without your changes to ManagedRegionBroker (see
https://github.com/apache/activemq/compare/main...lucastetreault:activemq:AMQ-9107-test),
expecting failures. However, all the tests pass, even with the added check
that `consumerSubscriptionMap` is empty. Can you help me understand what I'm
missing?
As for the memory leak, it seems I missed one spot where I should have been
removing entries from `consumerSubscriptionMap`: in
`public void unregisterSubscription(Subscription sub)`, around [line
300](https://github.com/apache/activemq/compare/main...lucastetreault:activemq:AMQ-9107-test#diff-17798bd5f1fa8141e349c916704b7f1d7ff660f02bd3582ef696344a5bb3c723R300).
I added a quick and dirty test
[here](https://github.com/apache/activemq/compare/main...lucastetreault:activemq:AMQ-9107-test#diff-aa152194070293f1fb21ac1c45654d8041aa45cc3a6770aae3b3fd85961db08fR175)
that passes now.
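To illustrate the kind of leak described above, here is a minimal,
self-contained sketch. It is not the ManagedRegionBroker code: the class
`SubscriptionRegistrySketch`, its methods, and the string keys are all
hypothetical stand-ins. Only the name `consumerSubscriptionMap` is borrowed
from the PR. The idea is simply that when a registry keeps two indexes, the
unregister path has to clean up both, otherwise the second map grows for every
consumer that is ever closed.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a broker-side registry that tracks each
// subscription in two maps. This is an illustration only, not ActiveMQ code.
final class SubscriptionRegistrySketch {
    // Forward index: subscription key -> consumer id (assumed shape).
    private final Map<String, String> subscriptionKeys = new ConcurrentHashMap<>();
    // Reverse index: consumer id -> subscription key. The name mirrors the
    // map discussed in the PR; the key and value types here are assumptions.
    private final Map<String, String> consumerSubscriptionMap = new ConcurrentHashMap<>();

    void register(String consumerId, String subscriptionKey) {
        subscriptionKeys.put(subscriptionKey, consumerId);
        consumerSubscriptionMap.put(consumerId, subscriptionKey);
    }

    // Leaky variant: removes the forward entry but forgets the reverse one,
    // so consumerSubscriptionMap keeps one entry per closed consumer.
    void unregisterLeaky(String subscriptionKey) {
        subscriptionKeys.remove(subscriptionKey);
    }

    // Fixed variant: removes the entry from both maps.
    void unregister(String subscriptionKey) {
        String consumerId = subscriptionKeys.remove(subscriptionKey);
        if (consumerId != null) {
            consumerSubscriptionMap.remove(consumerId);
        }
    }

    int forwardSize() {
        return subscriptionKeys.size();
    }

    int reverseSize() {
        return consumerSubscriptionMap.size();
    }
}
```

A test in the spirit of the one linked above would register and unregister a
batch of consumers and then check that `reverseSize()` is back to 0. With
`unregisterLeaky` the reverse map retains every entry, while with `unregister`
both maps end up empty.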
Issue Time Tracking
-------------------
Worklog Id: (was: 823714)
Time Spent: 3.5h (was: 3h 20m)
> Closing many consumers causes CPU to spike to 100%
> --------------------------------------------------
>
> Key: AMQ-9107
> URL: https://issues.apache.org/jira/browse/AMQ-9107
> Project: ActiveMQ
> Issue Type: Bug
> Affects Versions: 5.17.1, 5.16.5
> Reporter: Lucas Tétreault
> Assignee: Jean-Baptiste Onofré
> Priority: Major
> Fix For: 5.18.0, 5.16.6, 5.17.3
>
> Attachments: example.zip, image-2022-10-07-00-12-39-657.png,
> image-2022-10-07-00-17-30-657.png
>
> Time Spent: 3.5h
> Remaining Estimate: 0h
>
> When there are many consumers (~188k) on a queue, closing them is incredibly
> expensive and causes the CPU to spike to 100% while the consumers are closed.
> Tested on an Amazon MQ mq.m5.large instance (2 vCPUs, 8 GB memory).
> I have attached a minimal recreation of the issue where the following
> happens:
> 1/ Open 100 connections.
> 2/ Create consumers as fast as we can on all of those connections until we
> hit at least 188k consumers.
> 3/ Sleep for 5 minutes so we can observe the CPU come back down after opening
> all those connections.
> 4/ Start closing consumers as fast as we can.
> 5/ After all consumers are closed, sleep for 5 minutes to observe the CPU
> come back down after closing all the consumers.
>
> In this example, 5 minutes does not seem to have been enough time for the
> CPU to come back down, and the consumer and connection counts seem to hit 0
> at the same time:
> !image-2022-10-07-00-12-39-657.png|width=757,height=353!
>
> In a previous test, with more time sleeping after closing all the consumers,
> we can see the CPU come back down before we close the connections.
> !image-2022-10-07-00-17-30-657.png|width=764,height=348!