This is an automated email from the ASF dual-hosted git repository.

yubiao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/pulsar.git


The following commit(s) were added to refs/heads/master by this push:
     new 90b2aac0fa6 [improve][pip] PIP-434: Expose Netty channel configuration 
WRITE_BUFFER_WATER_MARK to pulsar conf and pause receive requests when channel 
is unwritable (#24510)
90b2aac0fa6 is described below

commit 90b2aac0fa61f90b3ddf3095da1652decf18602a
Author: fengyubiao <yubiao.f...@streamnative.io>
AuthorDate: Tue Sep 9 13:19:33 2025 +0800

    [improve][pip] PIP-434: Expose Netty channel configuration 
WRITE_BUFFER_WATER_MARK to pulsar conf and pause receive requests when channel 
is unwritable (#24510)
---
 pip/pip-434.md | 95 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 95 insertions(+)

diff --git a/pip/pip-434.md b/pip/pip-434.md
new file mode 100644
index 00000000000..6f6ee872984
--- /dev/null
+++ b/pip/pip-434.md
@@ -0,0 +1,95 @@
+# PIP-434: Expose Netty channel configuration WRITE_BUFFER_WATER_MARK to 
pulsar conf and pause receive requests when channel is unwritable
+
+# Background knowledge & Motivation
+
+As discussed in the mailing list thread: https://lists.apache.org/thread/6jfs02ovt13mnhn441txqy5m6knw6rr8
+
+> Problem Statement:
+> We've encountered a critical issue in our Apache Pulsar clusters where 
brokers experience Out-Of-Memory (OOM) errors and continuous restarts under 
specific load patterns. This occurs when Netty channel write buffers become 
full, leading to a buildup of unacknowledged responses in the broker's memory.
+
+> Background:
+> Our clusters are configured with numerous namespaces, each containing 
approximately 8,000 to 10,000 topics. Our consumer applications are quite 
large, with each consumer using a regular expression (regex) pattern to 
subscribe to all topics within a namespace.
+
+> The problem manifests particularly during consumer application restarts. 
When a consumer restarts, it issues a getTopicsOfNamespace request. Due to the 
sheer number of topics, the response size is extremely large. This massive 
response overwhelms the socket output buffer, causing it to fill up rapidly. 
Consequently, the broker's responses get backlogged in memory, eventually 
leading to the broker's OOM and subsequent restart loop.
+
+> The solution we arrived at:
+> - Expose the Netty channel configuration WRITE_BUFFER_WATER_MARK to the Pulsar configuration.
+> - Stop receiving requests once the Netty channel is unwritable; users can use the new config to control the threshold that limits the maximum bytes pending write.
+
+# Goals
+
+## In Scope
+- Expose the Netty channel configuration WRITE_BUFFER_WATER_MARK in the Pulsar configuration.
+- Stop receiving requests once the Netty channel is unwritable; users can use the new config to control the threshold that limits the maximum number of bytes pending write.
+
+## Out of Scope
+
+- This proposal does not add a broker-level memory limitation; it only addresses the OOM caused by a large number of responses accumulating in memory while an individual channel is unwritable.
+
+# Detailed Design
+
+### Configuration
+
+```shell
+# Relates to the "WriteBufferHighWaterMark" setting of the Netty channel config. If the number of bytes queued in the write buffer exceeds this value, the channel's writable state starts returning "false".
+pulsarChannelWriteBufferHighWaterMark=64k
+# Relates to the "WriteBufferLowWaterMark" setting of the Netty channel config. If the number of bytes queued in the write buffer drops below this value, the channel's writable state starts returning "true" again.
+pulsarChannelWriteBufferLowWaterMark=32k
+# Once the write buffer is full, the channel stops handling new requests until it becomes writable again.
+pulsarChannelPauseReceivingRequestsIfUnwritable=false
+# After the connection recovers from the unreadable state, the channel will be rate-limited for a period of time to avoid being overwhelmed by the backlog of requests. This parameter defines how many requests per second are allowed during the rate-limiting period.
+pulsarChannelRateLimitingRateAfterResumeFromUnreadable=1000
+# After the connection recovers from the unreadable state, the channel will be rate-limited for a period of time to avoid being overwhelmed by the backlog of requests. This parameter defines how long the rate limiting lasts, in seconds. Once the bytes waiting to be sent out reach "pulsarChannelWriteBufferHighWaterMark" again, the timer is reset.
+pulsarChannelRateLimitingSecondsAfterResumeFromUnreadable=5
+```
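+
+For illustration, below is a minimal sketch of how the two water-mark values could be applied to a Netty server bootstrap. The `configureBootstrap` helper and the hard-coded constants are hypothetical; they only stand in for the new broker settings above.
+
+```java
+import io.netty.bootstrap.ServerBootstrap;
+import io.netty.channel.ChannelOption;
+import io.netty.channel.WriteBufferWaterMark;
+
+public class WriteBufferWaterMarkExample {
+    // Hypothetical constants standing in for pulsarChannelWriteBufferLowWaterMark
+    // and pulsarChannelWriteBufferHighWaterMark (32k / 64k).
+    static final int LOW_WATER_MARK = 32 * 1024;
+    static final int HIGH_WATER_MARK = 64 * 1024;
+
+    static void configureBootstrap(ServerBootstrap bootstrap) {
+        // Netty flips Channel#isWritable() to false once more than HIGH_WATER_MARK
+        // bytes are queued in the ChannelOutboundBuffer, and back to true once the
+        // queue drains below LOW_WATER_MARK.
+        bootstrap.childOption(ChannelOption.WRITE_BUFFER_WATER_MARK,
+                new WriteBufferWaterMark(LOW_WATER_MARK, HIGH_WATER_MARK));
+    }
+}
+```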
+
+### How it works
+With the setting `pulsarChannelPauseReceivingRequestsIfUnwritable=false`, the behaviour is exactly the same as before.
+
+After setting `pulsarChannelPauseReceivingRequestsIfUnwritable` to `true`, the channel state changes as follows (a minimal handler sketch follows this list).
+- Netty sets `channel.writable` to `false` when there is too much data waiting to be sent out (the size of the data cached in `ChannelOutboundBuffer` is larger than `{pulsarChannelWriteBufferHighWaterMark}`).
+  - Netty triggers a `channelWritabilityChanged` event.
+- The broker stops receiving requests on the channel, which relies on the `autoRead` attribute of the Netty channel.
+- Netty sets `channel.writable` back to `true` once the size of the data waiting to be sent out is less than `{pulsarChannelWriteBufferLowWaterMark}`.
+- The broker starts receiving requests again (sets `channel.autoRead` to `true`).
+  - Note: this relies on `ServerCnxThrottleTracker`, which tracks the "throttle count". When a throttling condition is present, the throttle count is increased, and when it is no longer present, the count is decreased. `autoRead` should be switched to `false` when the counter goes from 0 to 1, and set back to `true` only when it goes back from 1 to 0. The `autoRead` flag is no longer controlled directly by the rate limiters; rate limiters are only responsible for their part, [...]
+  - To avoid handling a huge backlog of requests instantly, Pulsar starts a timed rate-limiter that limits the rate of handling the request backlog (`pulsarChannelRateLimitingRateAfterResumeFromUnreadable` requests per second).
+  - After `{pulsarChannelRateLimitingSecondsAfterResumeFromUnreadable}` seconds, the rate-limiter is removed automatically. Once the bytes waiting to be sent out reach `pulsarChannelWriteBufferHighWaterMark` again, the timer is reset.
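+
+As an illustration of the flow above, here is a minimal, hypothetical Netty handler that pauses reading when the channel becomes unwritable and resumes it once it is writable again. The real implementation goes through `ServerCnxThrottleTracker` and the resume-time rate limiter rather than toggling `autoRead` directly; this sketch only shows the throttle-count pattern described in the note above.
+
+```java
+import io.netty.channel.ChannelHandlerContext;
+import io.netty.channel.ChannelInboundHandlerAdapter;
+import java.util.concurrent.atomic.AtomicInteger;
+
+// Hypothetical sketch: pause/resume reads based on channel writability,
+// using a throttle counter so that several throttling conditions can coexist.
+public class PauseReadsWhenUnwritableHandler extends ChannelInboundHandlerAdapter {
+    private final AtomicInteger throttleCount = new AtomicInteger();
+
+    @Override
+    public void channelWritabilityChanged(ChannelHandlerContext ctx) throws Exception {
+        if (!ctx.channel().isWritable()) {
+            // 0 -> 1: the first throttling condition appeared, stop reading new requests.
+            if (throttleCount.incrementAndGet() == 1) {
+                ctx.channel().config().setAutoRead(false);
+            }
+        } else {
+            // 1 -> 0: the last throttling condition cleared, resume reading.
+            if (throttleCount.decrementAndGet() == 0) {
+                ctx.channel().config().setAutoRead(true);
+            }
+        }
+        ctx.fireChannelWritabilityChanged();
+    }
+}
+```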
+
+### CLI
+
+### Metrics
+| Name | Description | Attributes | Units |
+|------|-------------|------------|-------|
+| `pulsar_server_channel_write_buf_memory_used_bytes` | Counter. The amount of memory occupied by Netty write buffers. | cluster | - |
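+
+A hypothetical sketch of how such a value could be sampled from Netty is shown below. The `channels` collection and the `WriteBufferMemoryUsage` helper are assumptions; the broker's actual exporter may collect this differently. Note that `Channel.unsafe()` is Netty-internal API, used here only to read the pending-write byte count.
+
+```java
+import io.netty.channel.Channel;
+import io.netty.channel.ChannelOutboundBuffer;
+import java.util.Collection;
+
+// Hypothetical helper: sums the bytes currently queued in each channel's
+// outbound (write) buffer. Exporting the sum as a metric is not shown.
+public final class WriteBufferMemoryUsage {
+    public static long totalPendingWriteBytes(Collection<Channel> channels) {
+        long total = 0;
+        for (Channel ch : channels) {
+            ChannelOutboundBuffer buf = ch.unsafe().outboundBuffer();
+            if (buf != null) {
+                total += buf.totalPendingWriteBytes();
+            }
+        }
+        return total;
+    }
+}
+```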
+
+
+# Monitoring
+
+
+# Security Considerations
+Nothing.
+
+# Backward & Forward Compatibility
+
+## Upgrade
+Nothing.
+
+## Downgrade / Rollback
+Nothing.
+
+## Pulsar Geo-Replication Upgrade & Downgrade/Rollback Considerations
+Nothing.
+
+# Alternatives
+Nothing.
+
+# General Notes
+
+# Links
+
+<!--
+Updated afterwards
+-->
+* Mailing List discussion thread: 
https://lists.apache.org/thread/hnbm9q3yvyf2wcbdggxmjzhr9boorqkn
+* Mailing List voting thread: 
https://lists.apache.org/thread/vpvtf4jnbbrhsy9y5fg00mpz9qhb0cp5
