poorbarcode commented on code in PR #24510:
URL: https://github.com/apache/pulsar/pull/24510#discussion_r2265866824


##########
pip/pip-434.md:
##########
@@ -0,0 +1,76 @@
+# PIP-434: Expose Netty channel configuration WRITE_BUFFER_WATER_MARK to pulsar conf and pause receive requests when channel is unwritable
+
+# Background knowledge & Motivation
+
+As we discussed along the discussion: https://lists.apache.org/thread/6jfs02ovt13mnhn441txqy5m6knw6rr8
+
+> Problem Statement:
+> We've encountered a critical issue in our Apache Pulsar clusters where brokers experience Out-Of-Memory (OOM) errors and continuous restarts under specific load patterns. This occurs when Netty channel write buffers become full, leading to a buildup of unacknowledged responses in the broker's memory.
+
+> Background:
+> Our clusters are configured with numerous namespaces, each containing approximately 8,000 to 10,000 topics. Our consumer applications are quite large, with each consumer using a regular expression (regex) pattern to subscribe to all topics within a namespace.
+
+> The problem manifests particularly during consumer application restarts. When a consumer restarts, it issues a getTopicsOfNamespace request. Due to the sheer number of topics, the response size is extremely large. This massive response overwhelms the socket output buffer, causing it to fill up rapidly. Consequently, the broker's responses get backlogged in memory, eventually leading to the broker's OOM and subsequent restart loop.
+
+> Solution we got:
+> - Expose Netty channel configuration WRITE_BUFFER_WATER_MARK to pulsar conf
+> - Stops receive requests continuously once the Netty channel is unwritable, users can use the new config to control the threshold that limits the max bytes that are pending write.
+
+# Goals
+
+## In Scope
+- Expose Netty channel configuration WRITE_BUFFER_WATER_MARK to pulsar conf
+- Stops receive requests continuously once the Netty channel is unwritable, users can use the new config to control the threshold that limits the max bytes that are pending write.

Review Comment:
   > It should be left to ServerCnxThrottleTracker to handle that. You should 
be using incrementThrottleCount when writability changes from true to false and 
decrementThrottleCount when writability changes from false to true. Please add 
this to the design document.
   
   It is not a big change, so I don't think it needs to be written into the proposal.
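
For reference, the quoted suggestion could be sketched roughly as below. This is only an illustration under stated assumptions: `ThrottleCounterSketch` and its fields are hypothetical stand-ins and not Pulsar's actual `ServerCnxThrottleTracker`; only the method names `incrementThrottleCount` and `decrementThrottleCount` come from the quoted comment. The sketch assumes reads are paused while the count is above zero and that `onWritabilityChanged` is called from a single thread (for example from a Netty `channelWritabilityChanged` callback on the channel's event loop).

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a per-connection throttle tracker (not Pulsar's real class). */
final class ThrottleCounterSketch {
    // Number of active reasons to pause reading; reads resume only when it returns to 0.
    private final AtomicInteger throttleCount = new AtomicInteger();
    // Stand-in for the channel's auto-read flag.
    private volatile boolean autoRead = true;
    // Last observed writability; a new channel is assumed to start writable.
    private boolean lastWritable = true;

    void incrementThrottleCount() {
        if (throttleCount.incrementAndGet() == 1) {
            autoRead = false; // first throttle reason: stop reading requests
        }
    }

    void decrementThrottleCount() {
        if (throttleCount.decrementAndGet() == 0) {
            autoRead = true; // last throttle reason cleared: resume reading
        }
    }

    /** Invoke with the channel's current writability whenever it may have changed. */
    void onWritabilityChanged(boolean writable) {
        if (writable == lastWritable) {
            return; // no transition, nothing to count
        }
        lastWritable = writable;
        if (writable) {
            decrementThrottleCount(); // false -> true
        } else {
            incrementThrottleCount(); // true -> false
        }
    }

    boolean isAutoRead() {
        return autoRead;
    }

    int getThrottleCount() {
        return throttleCount.get();
    }
}
```

Counting reasons rather than toggling a flag means an unwritable channel and any other throttle condition can overlap without one of them resuming reads prematurely.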



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
