geniusjoe commented on issue #25021:
URL: https://github.com/apache/pulsar/issues/25021#issuecomment-3605476419

   > There's a blog post by AutoMQ folks about PooledByteBufAllocator memory fragmentation: https://www.automq.com/blog/netty-based-streaming-systems-memory-fragmentation-and-oom-issues . I guess the blog post was written before [AdaptiveByteBufAllocator](https://netty.io/4.1/api/io/netty/buffer/AdaptiveByteBufAllocator.html) became usable. It would be interesting to also see how AdaptiveByteBufAllocator behaves in Pulsar.
   > 
   > In Pulsar, the broker caching adds more of this fragmentation since it could hold on to a much larger buffer when an entry is cached. In Netty, a slice of a buffer will hold the parent buffer in memory until all slices have been released. The mitigation in Pulsar is `managedLedgerCacheCopyEntries=true`, but that adds overhead since entries would get copied each time they are added to the cache.
   
   Lari, thank you so much for your detailed reply.
   
   I also support adjusting the chunk size to 8MB (`-Dio.netty.allocator.maxOrder=10`) as a first step. At the very least, for scenarios involving large message payloads (e.g., where each message is 4MB or larger), this change would prevent Netty from having to call the `allocateHuge()` method and request direct memory from the OS for every single byteBuf allocation. Increasing the `maxOrder` should benefit both message throughput and memory reuse.
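
   For reference, a minimal sketch of how `maxOrder` maps to the chunk size (assuming Netty's default 8 KiB page size; the class below is purely illustrative):

```java
// Sketch only: shows how -Dio.netty.allocator.maxOrder determines the
// PooledByteBufAllocator chunk size (chunkSize = pageSize << maxOrder).
import io.netty.buffer.PooledByteBufAllocator;

public class ChunkSizeSketch {
    public static void main(String[] args) {
        int pageSize = 8192;   // Netty's default page size (8 KiB), assumed here
        int maxOrder = 10;     // the value proposed above
        int chunkSize = pageSize << maxOrder;
        System.out.println("chunkSize = " + chunkSize); // 8388608 bytes = 8 MiB

        // Allocations larger than chunkSize bypass the arenas ("huge" allocations)
        // and go straight to the OS, which is exactly what an 8 MiB chunk avoids
        // for ~4 MiB message payloads.

        // What the running JVM actually resolved from the -Dio.netty.allocator.* flags:
        System.out.println("defaultPageSize = " + PooledByteBufAllocator.defaultPageSize());
        System.out.println("defaultMaxOrder = " + PooledByteBufAllocator.defaultMaxOrder());
    }
}
```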
   
   Your later point about adjusting the client-side JVM startup parameters to improve throughput is also very helpful. I will try enabling this parameter on some of our high-traffic client clusters to verify whether performance improves.
   
   In our scenario, we have also observed the backpressure mentioned in your issue #24926. When a topic has many subscriptions, and each subscription has only one active consumer with a large `receiver_queue_size` (default: 1000 entries), slow network speeds can easily trigger the channel's high watermark (`c.isWritable()` returning `false`). Since no other consumer channels are available for writing within the current subscription, the dispatcher will eventually write all entries from the read batch, one by one, into this single channel, which increases direct memory usage. To mitigate this, we have reduced the client-side receiver queue size to 20 to lower direct memory consumption. However, a long-term solution may require optimizing the `trySendMessagesToConsumers` logic for high-watermark scenarios.
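
   For anyone hitting the same pattern, here is a minimal sketch of the client-side mitigation we applied; the service URL, topic, and subscription name below are placeholders:

```java
import org.apache.pulsar.client.api.Consumer;
import org.apache.pulsar.client.api.PulsarClient;
import org.apache.pulsar.client.api.PulsarClientException;

public class SmallReceiverQueueConsumer {
    public static void main(String[] args) throws PulsarClientException {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://broker-host:6650")   // placeholder service URL
                .build();

        // Shrinking the receiver queue from the default 1000 entries to 20 bounds
        // how many entries the dispatcher can push into one slow consumer channel,
        // which in turn limits the direct memory pinned on the broker side.
        Consumer<byte[]> consumer = client.newConsumer()
                .topic("persistent://tenant/ns/high-traffic-topic") // placeholder topic
                .subscriptionName("sub")                            // placeholder subscription
                .receiverQueueSize(20)
                .subscribe();

        // ... consume as usual via consumer.receive() / consumer.acknowledge(...)
        consumer.close();
        client.close();
    }
}
```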
   
   Due to varying write speeds across different channels, the timing of `byteBuf` releases also differs, which is very likely to cause memory fragmentation within chunks. In one of my clusters, each broker runs on a 16C32G spec, configured with a 12GB heap and 16GB of direct memory (`-Xms12g -Xmx12g -XX:MaxDirectMemorySize=16g`). The current Netty configuration uses `-Dio.netty.allocator.maxOrder=13`, `-Dio.netty.allocator.numDirectArenas=8`, and `-Dio.netty.allocator.maxCachedBufferCapacity=8388608`.
   During production operation, some brokers maintain consistently high direct memory usage, typically ranging from 8–11GB. However, despite this high usage, after observing for a week I have not detected a clear upward trend in direct memory consumption, nor have I observed the memory leak issue described in the AutoMQ blog post. Still, in the long term it would be valuable to run a performance comparison or benchmark between `AdaptiveByteBufAllocator` and the current `PooledByteBufAllocator`.
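
   As a rough starting point for that comparison, an allocation loop along these lines could be used (only a sketch; a real benchmark should use JMH and realistic payload sizes, and `AdaptiveByteBufAllocator` requires a recent Netty release):

```java
import io.netty.buffer.AdaptiveByteBufAllocator;
import io.netty.buffer.ByteBuf;
import io.netty.buffer.ByteBufAllocator;
import io.netty.buffer.PooledByteBufAllocator;

public class AllocatorLoop {
    // Allocate and immediately release direct buffers of the given size,
    // returning the elapsed time in nanoseconds.
    static long loop(ByteBufAllocator alloc, int size, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            ByteBuf buf = alloc.directBuffer(size);
            buf.release();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int payload = 4 * 1024 * 1024;   // ~4 MiB, matching the large-message case above
        System.out.println("pooled   ns: " + loop(new PooledByteBufAllocator(true), payload, 10_000));
        System.out.println("adaptive ns: " + loop(new AdaptiveByteBufAllocator(), payload, 10_000));
    }
}
```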
   
   <img width="3682" height="1468" alt="Image" src="https://github.com/user-attachments/assets/2e86b035-21b8-47ab-ad3d-ff5c67741b8e" />
   
   This Pulsar cluster runs only one Pulsar broker per pod, so the configured 16GB of direct memory is dedicated exclusively to that broker. Therefore, my primary concern isn't the reclamation speed of the chunks holding the byteBufs; instead, I aim to maximize the chunk reuse rate to avoid Netty frequently allocating and deallocating native memory from the OS. For now, I probably won't enable the `managedLedgerCacheCopyEntries=true` configuration.
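
   To keep an eye on chunk reuse, I'm considering periodically sampling the pooled allocator metrics, roughly along these lines (a sketch against Netty's default pooled allocator; the broker wires up its own allocator instances, so this is only illustrative):

```java
import io.netty.buffer.PoolArenaMetric;
import io.netty.buffer.PoolChunkListMetric;
import io.netty.buffer.PoolChunkMetric;
import io.netty.buffer.PooledByteBufAllocator;
import io.netty.buffer.PooledByteBufAllocatorMetric;

public class ChunkUsageProbe {
    public static void main(String[] args) {
        PooledByteBufAllocatorMetric metric = PooledByteBufAllocator.DEFAULT.metric();
        System.out.println("chunkSize        = " + metric.chunkSize());
        System.out.println("usedDirectMemory = " + metric.usedDirectMemory());

        int chunks = 0;
        for (PoolArenaMetric arena : metric.directArenas()) {
            // "huge" allocations bypass the arenas entirely and hit the OS directly
            System.out.println("arena hugeAllocations = " + arena.numHugeAllocations());
            for (PoolChunkListMetric chunkList : arena.chunkLists()) {
                for (PoolChunkMetric chunk : chunkList) {
                    chunks++;
                    System.out.println("  chunk usage = " + chunk.usage() + "%"
                            + ", freeBytes = " + chunk.freeBytes());
                }
            }
        }
        // Sampled over time, a stable chunk count with shifting usage percentages
        // suggests chunks are being reused; a count that keeps fluctuating suggests
        // Netty is repeatedly returning chunks to the OS and re-allocating them.
        System.out.println("active chunks = " + chunks);
    }
}
```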

