QiuYucheng2003 opened a new issue, #9983:
URL: https://github.com/apache/rocketmq/issues/9983

   ### Before Creating the Enhancement Request
   
   - [x] I have confirmed that this should be classified as an enhancement 
rather than a bug/feature.
   
   
   ### Summary
   
   Add defensive checks in `NettyRemotingServer#registerProcessor` to detect 
and prohibit/warn against the usage of `CallerRunsPolicy` in custom user 
executors. Using this policy triggers blocking operations on Netty EventLoop 
threads when the thread pool is exhausted, leading to severe performance 
degradation.
   
   ### Motivation
   
   In `NettyRemotingServer.java`, the `registerProcessor` method allows users 
(e.g., Broker developers) to register a custom `ExecutorService` for specific 
request codes.
   
   Currently, there is no validation on the `RejectedExecutionHandler` of the 
provided executor.
   In high-concurrency scenarios (common in Cloud/K8s environments), if a user 
mistakenly configures a `ThreadPoolExecutor` with **`CallerRunsPolicy`** and 
the pool becomes full:
   1. The task execution is rejected.
   2. `CallerRunsPolicy` forces the task to run in the caller's thread.
   3. In this context, the caller thread is the **Netty IO Thread (EventLoop)** 
(invoked from `NettyServerHandler.channelRead0`).
   
   **Consequence:**
   The Netty IO thread gets blocked processing business logic. This stops the 
server from reading/writing other requests or handling heartbeats on that 
channel, potentially causing the node to be marked as "down" by the cluster or 
causing a "Stop-the-World" like pause in network throughput (DPSE - Death by 
Poorly Scheduled Executor).
   
   Adding a check will prevent this silent misconfiguration.
   
   ### Describe the Solution You'd Like
   
   Modify 
`org.apache.rocketmq.remoting.netty.NettyRemotingServer#registerProcessor` to 
include a validation step.
   
   Logic:
   1. Check if the passed `executor` is an instance of `ThreadPoolExecutor`.
   2. If yes, check if `((ThreadPoolExecutor) 
executor).getRejectedExecutionHandler()` is an instance of `CallerRunsPolicy`.
   3. If detected, throw an `IllegalArgumentException` (fast fail) or at least 
log a generic `ERROR/WARN` to alert the user that this configuration is 
dangerous.
   
   Sample Code Logic:
   ```java
   if (executor instanceof ThreadPoolExecutor) {
       if (((ThreadPoolExecutor) executor).getRejectedExecutionHandler() 
instanceof ThreadPoolExecutor.CallerRunsPolicy) {
           throw new IllegalArgumentException("CallerRunsPolicy is strictly 
forbidden in RocketMQ Remoting as it blocks Netty IO threads.");
       }
   }
   
   ### Describe Alternatives You've Considered
   
   ### 6. Describe Alternatives You've Considered 
   ```text
   1. **Documentation warning only:** We could just add a warning in Javadoc, 
but users often miss documentation details when configuring custom thread pools.
   2. **Wrapper Executor:** Wrapping the user executor to intercept the 
rejection. This is overly complex and might introduce overhead.
   3. **Log only:** Just logging a WARN message. This is a softer approach but 
might be ignored during startup, leading to runtime failures later. Fast-fail 
is preferred for such critical configuration errors.
   
   ### Additional Context
   
   Relevant Code Location:
   - Registration: 
`org.apache.rocketmq.remoting.netty.NettyRemotingServer#registerProcessor`
   - Execution path: 
`org.apache.rocketmq.remoting.netty.NettyRemotingServer.NettyServerHandler#channelRead0`
 calling `remotingAbstract.processMessageReceived`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to