Those points you described are the reason why I suggest using a max address size on every destination...
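As a rough illustration of per-destination limits like these, an address-settings block in broker.xml could look as follows (the address match names and values here are only a sketch for this discussion, not configuration taken from the thread):

```xml
<!-- Illustrative only: cap the in-memory size per destination instead of
     relying solely on global-max-size. -->
<address-settings>
   <!-- a busy destination: start paging after ~20MB is held in memory -->
   <address-setting match="TEST">
      <max-size-bytes>20M</max-size-bytes>
      <!-- limit how many paged bytes are read back into memory at once -->
      <max-read-page-bytes>10M</max-read-page-bytes>
   </address-setting>
   <!-- small, rarely used destinations: page early -->
   <address-setting match="small.#">
      <max-size-bytes>100K</max-size-bytes>
   </address-setting>
</address-settings>
```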
have max-size at say 20M for every destination (make it 100K for small destinations if you like, but I think 20M for every destination is mostly okay, unless you have a lot of destinations). Have max-read at 10M... This should then optimize your memory usage.

On Thu, Jan 15, 2026 at 6:53 AM Shiv Kumar Dixit <[email protected]> wrote:

> Hello Arthur and Clebert
>
> When our broker pod starts, it first starts 2 init containers, which
> terminate and release resources after completing the setup. So our pod
> basically runs 2 containers – one for vault and another for the broker. We
> verified the memory and CPU usage of these init containers and main
> containers using top pod and it shows reasonable data.
>
> Yes, we see the Linux OOMKiller is invoked and we are trying to read its
> report to find any meaningful information.
>
> In the meanwhile, we have noticed the below scenario is causing OOMKilling
> of the broker container:
>
> 1. There are a lot of pending messages on a given queue TEST along with
> small numbers of pending messages on various other queues. Since we are
> using global max size, some of the messages are loaded in memory and the
> rest are in the paging folder.
>
> 2. There are 3-4 consumers on the TEST queue but they are very slow, hence
> the pending message backlog is not cleared. We see the below log in the
> broker:
>
> AMQ224127: Message dispatch from paging is blocked. Address TEST/Queue
> TEST will not read any more messages from paging until pending messages are
> acknowledged. There are currently 5150 messages pending (20972400 bytes)
> with max reads at maxPageReadMessages(-1) and maxPageReadBytes(20971520).
> Either increase reading attributes at the address-settings or change your
> consumers to acknowledge more often.
>
> 3. We also see the below log in the broker:
>
> AMQ224108: Stopped paging on address 'TEST'; size=62986496 bytes (96016
> messages); maxSize=-1 bytes (-1 messages); globalSize=430581015 bytes
> (158406 messages); globalMaxSize=4194304000 bytes (-1 messages);
>
> 4. Can such a combination of blocked consumers and pending messages cause
> the broker pod to go into OOM when it is running with 30 GB of heap and 40
> GB of pod memory?
>
> 5. Since consumers were not consuming messages on time and gave consent to
> purge the messages, we tried to purge the messages manually via the broker
> GUI. Sometimes it worked and more messages got loaded from pages into
> broker memory, but many times the broker pod got OOMKilled and restarted.
>
> 6. This cycle of successful purges or broker restarts continued till all
> messages from pages were loaded into memory and purged. Post cleanup there
> were no broker restarts.
>
> 7. Can purging messages via the broker GUI cause OOM even though the
> broker pod is running with 30 GB of heap and 40 GB of pod memory?
>
> 8. What is the best way to optimize the broker configuration in such
> cases, where we will always have slow consumers and possibly a lot of
> pending messages in memory and paging folders?
>
> The impacted broker pod A has a network bridge with another independent
> broker pod B in a hub-and-spoke model; pod B has very few connections and
> almost no pending messages. We also noticed that if broker pod A goes into
> OOM due to slow consumers and pending messages as described above, the
> second broker pod B, which is connected over a network bridge with broker
> pod A, also goes into a restart loop with OOM. Can the restart of source
> pod A and the disconnection-reconnection of a small number of bridges cause
> target broker pod B to restart? We have seen this side effect as well.
>
> We are using Artemis version 2.37.0 in K8s and the Artemis IO operator
> version is 1.2.5.
>
> Best Regards
>
> Shiv
>
> *From:* Arthur Naseef <[email protected]>
> *Sent:* 15 January 2026 06:40 AM
> *To:* [email protected]
> *Subject:* Re: K8s broker pod getting killed with OOM
>
> So 3100 connections is a large number, but that doesn't sound like a good
> reason for the broker pod to go OOM. Also, getting up to 40 GB, I would say
> the 50% rule of thumb may be too conservative (i.e. a higher percentage
> could be reasonable), which is contradicted by your outcome. Are there
> other containers running in the same pod that might be taking up memory?
> Maybe sidecars?
>
> Unfortunately, I don't have a working Kubernetes setup available right
> now. If I did, I could poke around and try to give specific tips on
> checking the memory use of the pod.
>
> Do you know if the Linux OOM killer is getting invoked? That would be
> reported by the kernel of the node on which the pod was executing. If you
> can view that report, it includes a lot of useful information, including
> all of the processes involved and the amount of memory used by each.
>
> Art
>
> On Wed, Jan 14, 2026 at 3:52 PM Shiv Kumar Dixit <[email protected]> wrote:
>
> Thanks Clebert and Arthur for the inputs. I will try your suggestions and
> let you know how it goes.
>
> I have another observation based on an issue happening in live. Based on
> input from Arthur, the current setup is configured with 20 GB heap and a 40
> GB pod. As the pod started, we got 3100 connections to the broker and
> within minutes the pod got OOMKilled. Is there any relation between the
> number of connections on the broker and the pod going OOM?
>
> Best Regards
> Shiv
>
> -----Original Message-----
> From: Clebert Suconic <[email protected]>
> Sent: 15 January 2026 04:06 AM
> To: [email protected]
> Subject: Re: K8s broker pod getting killed with OOM
>
> So, in summary, what I'm recommending is:
>
> Use max-size-messages for all the queues. For your large queues, use
> something like 10MB and for your small queues 100K.
>
> Also keep max-read-page-bytes in use; keep it at 20M.
>
> If I could change the past, I would have a max-size on every address we
> deploy, and have global-max-size for the utmost emergency case. It's
> something I'm looking to change in Artemis 3.0 or 4.0. (I can't change
> that in a minor version, as it could break certain cases; some users that
> I know use heavy filtering and can't really rely on paging.)
>
> On Wed, Jan 14, 2026 at 5:31 PM Clebert Suconic <[email protected]> wrote:
> >
> > I would recommend against trusting global-max-size, and use max-size
> > for all the addresses.
> >
> > Also, what are your reading attributes? I would recommend using the
> > new prefetch values.
> >
> > And also, what operator are you using? arkmq? Your own?
> >
> > On Wed, Jan 14, 2026 at 7:44 AM Shiv Kumar Dixit
> > <[email protected]> wrote:
> > >
> > > We are hosting the Artemis broker in Kubernetes using an operator-based
> > > solution. We deploy the broker as a statefulset with 2 or 4 replicas.
> > > We assign, for example, 6 GB for heap, 9 GB for the pod, and 1.2 GB
> > > (1/5 of max heap) for global-max-size. All addresses normally use -1
> > > for max-size-bytes, but some less frequently used queues are defined
> > > with 100KB for max-size-bytes to allow early paging.
> > >
> > > We have the following observations:
> > >
> > > 1. As the broker pod starts, the broker container immediately occupies
> > > 6 GB for max heap. This seems expected, as both min and max heap are
> > > the same.
> > >
> > > 2. Pod memory usage starts at 6+ GB, and as pending messages build up,
> > > producers and consumers connect to the broker, invalid SSL attempts
> > > happen, broker GUI access happens, etc. during normal broker
> > > operations, pod memory usage keeps increasing and eventually reaches
> > > 9 GB.
> > >
> > > 3. Once the pod hits the limit of 9 GB, K8s kills the pod with an
> > > OOMKilling event and restarts it. Here we don't see the broker
> > > container getting killed with OOM; rather, the pod is killed and
> > > restarted. This forces the broker to restart.
> > >
> > > 4. We have configured artemis.profile to capture a memory dump in case
> > > of OOM of the broker, but it never happens. So we are assuming the
> > > broker process is not going out of memory, but the pod is going out of
> > > memory due to increased non-heap usage.
> > >
> > > 5. The only way to recover here is to increase the heap and pod memory
> > > limits from 6 GB and 9 GB to higher values and wait for the next
> > > re-occurrence.
> > >
> > > We have the following questions:
> > >
> > > 1. Is there any way to analyse what is going wrong with non-heap
> > > native memory usage?
> > >
> > > 2. Is non-heap native memory expected to increase to such an extent
> > > due to pending messages, SSL errors, etc.?
> > >
> > > 3. Is there any param we can use to restrict the non-heap native
> > > memory usage?
> > >
> > > 4. Can Netty, which handles the connection aspect of the broker,
> > > create such memory consumption and cause OOM of the pod?
> > >
> > > 5. Is there any monitoring param that can hint that the pod is
> > > potentially in danger of getting killed?
> > >
> > > Thanks
> > >
> > > Shiv
> >
> > --
> > Clebert Suconic
>
> --
> Clebert Suconic
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]

--
Clebert Suconic
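For the non-heap native-memory questions earlier in the thread, one common way to investigate is the JVM's Native Memory Tracking together with an explicit cap on direct buffers. A sketch of options that could be added to JAVA_ARGS in artemis.profile follows; the specific values are illustrative assumptions, not recommendations made in this thread:

```shell
# Illustrative JVM options for diagnosing non-heap memory use (values are examples).
# Native Memory Tracking adds a small overhead; enable it temporarily while diagnosing.
JAVA_ARGS="$JAVA_ARGS -XX:NativeMemoryTracking=summary"
# Cap JDK direct (off-heap) buffers, which Netty uses for network I/O.
JAVA_ARGS="$JAVA_ARGS -XX:MaxDirectMemorySize=1g"

# Then, inside the running broker container:
#   jcmd <broker-pid> VM.native_memory summary
# reports heap, metaspace, thread stacks, code cache, and internal allocations,
# which helps attribute the gap between the Java heap size and pod memory usage.
```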
