[jira] [Work logged] (AMQ-9107) Closing many consumers causes CPU to spike to 100%

ASF GitHub Bot (Jira) Tue, 01 Nov 2022 13:21:08 -0700


     [ 
https://issues.apache.org/jira/browse/AMQ-9107?focusedWorklogId=822456&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-822456
 ]


ASF GitHub Bot logged work on AMQ-9107:
---------------------------------------

                Author: ASF GitHub Bot
            Created on: 01/Nov/22 20:20
            Start Date: 01/Nov/22 20:20
    Worklog Time Spent: 10m 
      Work Description: cshannon commented on PR #928:
URL: https://github.com/apache/activemq/pull/928#issuecomment-1299090316

   @mattrpav, @jbonofre, @lucastetreault - This is ready for review, see what 
you think. Part of this commit reverts the previous change and adds a 
new test.
   
   Technically, a ConsumerInfo can be associated with multiple Subscriptions, 
and ConsumerInfo objects can change if a durable subscription goes offline and 
comes back online. For those reasons I thought it was generally a bad idea to 
store a map associating them: you'd have to keep it up to date on changes with 
durables (the original PR had a memory leak because of that), and it doesn't 
really work for the one-to-many association of composite destinations, even if 
those are rarely used.
   
   The new approach here is to just use the existing Regions, which already 
handle all of that and track consumer ids and subscriptions in a concurrent 
map. All we have to do is look through the region or regions (if composite) to 
grab the subs and then we are good. We shouldn't need it, but I added a 
fail-safe that falls back on an Exception or if no sub could be found using 
the new method; that shouldn't happen unless something went very wrong.
   
   I would expect performance to still be much better than before and similar 
to the previous commit, as we are still just looking up the subscriptions in 
concurrent maps rather than iterating.
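
   As a rough illustration of the lookup idea described above, here is a 
minimal sketch. It is not the code from the PR: the class and method names 
(SubscriptionLookupSketch, Region, Subscription, findSubscriptions, 
findWithFallback) are hypothetical stand-ins rather than the real ActiveMQ 
types, and the fail-safe is assumed to be a linear scan over all subscriptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-ins for illustration only; these are NOT the real
// ActiveMQ Region / Subscription classes or their APIs.
class SubscriptionLookupSketch {

    static final class Subscription {
        final String consumerId;
        final String destination;

        Subscription(String consumerId, String destination) {
            this.consumerId = consumerId;
            this.destination = destination;
        }
    }

    // Each region keeps its own consumer id -> subscription map.
    static final class Region {
        private final Map<String, Subscription> subscriptions = new ConcurrentHashMap<>();

        void addConsumer(Subscription sub) { subscriptions.put(sub.consumerId, sub); }
        void removeConsumer(String consumerId) { subscriptions.remove(consumerId); }
        Subscription lookup(String consumerId) { return subscriptions.get(consumerId); }
    }

    // Fast path: one map lookup per region instead of scanning every subscription.
    static List<Subscription> findSubscriptions(String consumerId, List<Region> regions) {
        List<Subscription> found = new ArrayList<>();
        for (Region region : regions) {
            Subscription sub = region.lookup(consumerId);
            if (sub != null) {
                found.add(sub);
            }
        }
        return found;
    }

    // Fail-safe (assumed here to be a linear scan): used only when the fast
    // path throws or finds nothing.
    static List<Subscription> findWithFallback(String consumerId, List<Region> regions,
                                               Collection<Subscription> allSubscriptions) {
        try {
            List<Subscription> found = findSubscriptions(consumerId, regions);
            if (!found.isEmpty()) {
                return found;
            }
        } catch (RuntimeException e) {
            // Fall through to the slow path below.
        }
        List<Subscription> found = new ArrayList<>();
        for (Subscription sub : allSubscriptions) {
            if (sub.consumerId.equals(consumerId)) {
                found.add(sub);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        Region queues = new Region();
        Region topics = new Region();
        // A consumer on a composite destination may have a subscription in
        // more than one region.
        queues.addConsumer(new Subscription("c1", "queue://A"));
        topics.addConsumer(new Subscription("c1", "topic://B"));
        List<Subscription> subs = findSubscriptions("c1", Arrays.asList(queues, topics));
        System.out.println("subscriptions found for c1: " + subs.size());
    }
}
```

   The point of the sketch is that the cost of removing a consumer depends on 
the number of regions consulted, not on the total number of subscriptions, 
which is what matters when ~188k consumers are being closed.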




Issue Time Tracking
-------------------

    Worklog Id:     (was: 822456)
    Time Spent: 3h 10m  (was: 3h)

> Closing many consumers causes CPU to spike to 100%
> --------------------------------------------------
>
>                 Key: AMQ-9107
>                 URL: https://issues.apache.org/jira/browse/AMQ-9107
>             Project: ActiveMQ
>          Issue Type: Bug
>    Affects Versions: 5.17.1, 5.16.5
>            Reporter: Lucas Tétreault
>            Assignee: Jean-Baptiste Onofré
>            Priority: Major
>             Fix For: 5.18.0, 5.16.6, 5.17.3
>
>         Attachments: example.zip, image-2022-10-07-00-12-39-657.png, 
> image-2022-10-07-00-17-30-657.png
>
>          Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> When there are many consumers (~188k) on a queue, closing them is incredibly 
> expensive and causes the CPU to spike to 100% while the consumers are closed. 
> Tested on an Amazon MQ mq.m5.large instance (2 vCPUs, 8 GB memory).
> I have attached a minimal recreation of the issue where the following 
> happens: 
> 1/ Open 100 connections.
> 2/ Create consumers as fast as we can on all of those connections until we 
> hit at least 188k consumers.
> 3/ Sleep for 5 minutes so we can observe the CPU come back down after opening 
> all those connections.
> 4/ Start closing consumers as fast as we can.
> 5/ After all consumers are closed, sleep for 5 minutes to observe the CPU 
> come back down after closing all the connections.
>  
> In this example it seems 5 minutes wasn't actually sufficient time for the 
> CPU to come back down and the consumer and connection counts seem to hit 0 at 
> the same time: 
> !image-2022-10-07-00-12-39-657.png|width=757,height=353!
>  
> In a previous test with more time sleeping after closing all the consumers we 
> can see the CPU come back down before we close the connections. 
> !image-2022-10-07-00-17-30-657.png|width=764,height=348!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
