QiuYucheng2003 opened a new issue, #9983:
URL: https://github.com/apache/rocketmq/issues/9983
### Before Creating the Enhancement Request
- [x] I have confirmed that this should be classified as an enhancement
rather than a bug/feature.
### Summary
Add defensive checks in `NettyRemotingServer#registerProcessor` to detect
and prohibit/warn against the usage of `CallerRunsPolicy` in custom user
executors. Using this policy triggers blocking operations on Netty EventLoop
threads when the thread pool is exhausted, leading to severe performance
degradation.
### Motivation
In `NettyRemotingServer.java`, the `registerProcessor` method allows users
(e.g., Broker developers) to register a custom `ExecutorService` for specific
request codes.
Currently, there is no validation on the `RejectedExecutionHandler` of the
provided executor.
In high-concurrency scenarios (common in Cloud/K8s environments), if a user
mistakenly configures a `ThreadPoolExecutor` with **`CallerRunsPolicy`** and
the pool becomes full:
1. The task execution is rejected.
2. `CallerRunsPolicy` forces the task to run in the caller's thread.
3. In this context, the caller thread is the **Netty IO Thread (EventLoop)**
(invoked from `NettyServerHandler.channelRead0`).
**Consequence:**
The Netty IO thread gets blocked processing business logic. This stops the
server from reading/writing other requests or handling heartbeats on that
channel, potentially causing the node to be marked as "down" by the cluster or
causing a "Stop-the-World" like pause in network throughput (DPSE - Death by
Poorly Scheduled Executor).
Adding a check will prevent this silent misconfiguration.
### Describe the Solution You'd Like
Modify
`org.apache.rocketmq.remoting.netty.NettyRemotingServer#registerProcessor` to
include a validation step.
Logic:
1. Check if the passed `executor` is an instance of `ThreadPoolExecutor`.
2. If yes, check if `((ThreadPoolExecutor)
executor).getRejectedExecutionHandler()` is an instance of `CallerRunsPolicy`.
3. If detected, throw an `IllegalArgumentException` (fast fail) or at least
log a generic `ERROR/WARN` to alert the user that this configuration is
dangerous.
Sample Code Logic:
```java
if (executor instanceof ThreadPoolExecutor) {
if (((ThreadPoolExecutor) executor).getRejectedExecutionHandler()
instanceof ThreadPoolExecutor.CallerRunsPolicy) {
throw new IllegalArgumentException("CallerRunsPolicy is strictly
forbidden in RocketMQ Remoting as it blocks Netty IO threads.");
}
}
### Describe Alternatives You've Considered
### 6. Describe Alternatives You've Considered
```text
1. **Documentation warning only:** We could just add a warning in Javadoc,
but users often miss documentation details when configuring custom thread pools.
2. **Wrapper Executor:** Wrapping the user executor to intercept the
rejection. This is overly complex and might introduce overhead.
3. **Log only:** Just logging a WARN message. This is a softer approach but
might be ignored during startup, leading to runtime failures later. Fast-fail
is preferred for such critical configuration errors.
### Additional Context
Relevant Code Location:
- Registration:
`org.apache.rocketmq.remoting.netty.NettyRemotingServer#registerProcessor`
- Execution path:
`org.apache.rocketmq.remoting.netty.NettyRemotingServer.NettyServerHandler#channelRead0`
calling `remotingAbstract.processMessageReceived`
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]