platinumhamburg opened a new issue, #2064: URL: https://github.com/apache/fluss/issues/2064
### Search before asking

- [x] I searched in the [issues](https://github.com/apache/fluss/issues) and found nothing similar.

### Fluss version

0.8.0 (latest release)

### Please describe the bug 🐞

The current `RequestChannel` implementation uses a bounded `ArrayBlockingQueue` for request queuing. When the queue is full, `putRequest()` blocks the calling thread, which can be a Netty EventLoop thread. This leads to:

1. **EventLoop Thread Blocking**: When the request queue reaches capacity, `ArrayBlockingQueue.put()` blocks the EventLoop thread, severely degrading server performance and responsiveness.
2. **No Backpressure Control**: The bounded queue provides a hard limit but does not implement proper backpressure at the TCP level. When the queue is full, new requests are blocked at the application level rather than being controlled at the network layer.
3. **Memory Management Issues**: Without proper backpressure, the system cannot gracefully handle traffic spikes, potentially leading to memory exhaustion or degraded performance.

### Solution

This change refactors `RequestChannel` to:

1. **Use Unbounded Queue**: Replace `ArrayBlockingQueue` with an unbounded queue so that `putRequest()` never blocks, which keeps EventLoop threads from being blocked.
2. **Implement TCP-Level Backpressure**:
   - When the queue size exceeds `backpressureThreshold`, pause all associated Netty channels by setting `autoRead(false)`.
   - When the queue size drops below `resumeThreshold` (50% of the backpressure threshold), resume the channels by setting `autoRead(true)`.
   - The gap between the two thresholds provides hysteresis and avoids thrashing between the pause and resume states.
3. **Channel Lifecycle Management**:
   - Register channels when they become active in `NettyServerHandler.channelActive()`.
   - Unregister channels when they become inactive in `NettyServerHandler.channelInactive()`.
   - Each `RequestChannel` manages its own set of associated channels independently.
4. **Thread Safety and Performance**:
   - Use a `ReentrantLock` so that backpressure state transitions happen atomically.
   - Use a `volatile boolean isBackpressureActive` for fast-path checks to avoid unnecessary lock contention.
   - Submit all channel operations to the channel's EventLoop thread for thread safety.

### Are you willing to submit a PR?

- [x] I'm willing to submit a PR!
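
The pause/resume scheme proposed under Solution can be illustrated with a minimal Java sketch. It is not the actual Fluss implementation: `PausableChannel`, `RequestChannelSketch`, `registerChannel`, `unregisterChannel`, and `pollRequest` are hypothetical names chosen for illustration, and `PausableChannel` stands in for a Netty channel so the example runs without Netty on the classpath. In a real Netty server the toggle would presumably be `channel.config().setAutoRead(...)`, submitted to the channel's EventLoop.

```java
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a Netty channel (assumption, not a Fluss or Netty type).
interface PausableChannel {
    void setAutoRead(boolean autoRead);
}

// Sketch of an unbounded request queue with threshold-based backpressure.
final class RequestChannelSketch<T> {
    private final BlockingQueue<T> queue = new LinkedBlockingQueue<>(); // unbounded
    private final Set<PausableChannel> channels = ConcurrentHashMap.newKeySet();
    private final ReentrantLock lock = new ReentrantLock();
    private final int backpressureThreshold;
    private final int resumeThreshold;
    private volatile boolean isBackpressureActive = false;

    RequestChannelSketch(int backpressureThreshold) {
        this.backpressureThreshold = backpressureThreshold;
        this.resumeThreshold = backpressureThreshold / 2; // 50% of the pause threshold
    }

    // Would be called from channelActive(); a late joiner inherits the paused state.
    void registerChannel(PausableChannel channel) {
        lock.lock();
        try {
            channels.add(channel);
            if (isBackpressureActive) {
                channel.setAutoRead(false);
            }
        } finally {
            lock.unlock();
        }
    }

    // Would be called from channelInactive().
    void unregisterChannel(PausableChannel channel) {
        channels.remove(channel);
    }

    // Never blocks: add() on an unbounded LinkedBlockingQueue always succeeds.
    void putRequest(T request) {
        queue.add(request);
        // Fast path: volatile read only; the lock is taken only on a likely transition.
        if (!isBackpressureActive && queue.size() > backpressureThreshold) {
            transition(true);
        }
    }

    // Returns the next request, or null if the queue is empty.
    T pollRequest() {
        T request = queue.poll();
        if (isBackpressureActive && queue.size() < resumeThreshold) {
            transition(false);
        }
        return request;
    }

    private void transition(boolean activate) {
        lock.lock();
        try {
            if (isBackpressureActive == activate) {
                return; // another thread already made this transition
            }
            int size = queue.size();
            boolean stillNeeded = activate ? size > backpressureThreshold : size < resumeThreshold;
            if (!stillNeeded) {
                return; // queue size changed while waiting for the lock
            }
            isBackpressureActive = activate;
            for (PausableChannel channel : channels) {
                channel.setAutoRead(!activate); // pause on activate, resume otherwise
            }
        } finally {
            lock.unlock();
        }
    }
}
```

With a threshold of 4, channels are paused once the queue holds 5 requests and resumed only after it drains to 1, so a queue hovering around a single boundary does not flip `autoRead` on every request.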
