lizhimins opened a new issue, #10520: URL: https://github.com/apache/rocketmq/issues/10520
### Before Creating the Bug Report

- [x] I found a bug, not just asking a question, which should be created in [GitHub Discussions](https://github.com/apache/rocketmq/discussions).
- [x] I have searched the [GitHub Issues](https://github.com/apache/rocketmq/issues) and [GitHub Discussions](https://github.com/apache/rocketmq/discussions) of this repository and believe that this is not a duplicate.
- [x] I have confirmed that this bug belongs to the current repository, not other repositories of RocketMQ.

### Runtime platform environment

OS: Linux (NVMe SSD, cloud disks such as Alibaba Cloud ESSD)

### RocketMQ version

- branch: develop
- version: 5.3.x

### Describe the Bug

`ConsumeQueue.correctMinOffset` performs binary search on mmap files (random access pattern). The Linux kernel default `read_ahead_kb` on NVMe devices is aggressively large, so each page fault during binary search pulls in far more data than actually needed, producing periodic disk read pulses.

On cloud disks where read/write bandwidth share a single quota, these read pulses squeeze CommitLog writes and cause periodic send-RT spikes. In our production case, send p99 jumped from ~4ms to ~26ms every 8-9 minutes, with ~244x read amplification.

### Steps to Reproduce

1. Run a Broker with a large number of ConsumeQueue instances (e.g. 10000+) on NVMe storage
2. Let disk usage approach the cleanup threshold so `correctMinOffset` runs frequently
3. Observe periodic disk read pulses and send-RT spikes via `dstat` / `pidstat`

### What Did You Expect to See?

`correctMinOffset` binary search should not cause excessive disk I/O or impact send latency.

### What Did You See Instead?

Periodic disk read pulses (~975MB per cycle) and send p99 spikes (4ms -> 26ms) every 8-9 minutes, correlated with `StoreCleanQueueScheduledThread` running `correctMinOffset`.

### Additional Context

Root cause: `madvise(MADV_RANDOM)` is not applied before binary search, so the kernel read-ahead remains active for random access.
`posix_fadvise(POSIX_FADV_RANDOM)` does NOT work here because the mmap readahead path (`do_sync_mmap_readahead`) only checks `VM_RAND_READ` (set by `madvise`), not `FMODE_RANDOM` (set by `posix_fadvise`) on Linux 2.6.35+.

Fix: call `madvise(MADV_RANDOM)` on the mapping before the binary search and `madvise(MADV_NORMAL)` after it, gated by a config switch `correctMinOffsetMadviseEnable` (default: off).
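
To illustrate the wrap-the-search pattern, here is a minimal sketch in Python rather than the Broker's Java. It is not the RocketMQ implementation: the 8-byte entry layout and the names `first_at_least` and `demo` are hypothetical, chosen only to keep the example self-contained. It assumes Python 3.8+ on Linux, where `mmap.madvise` and the `MADV_RANDOM` / `MADV_NORMAL` constants are available, and skips the hint where they are not.

```python
import mmap
import struct
import tempfile

# Hypothetical fixed-size record: one big-endian signed 64-bit value.
# This is NOT the real ConsumeQueue entry layout.
ENTRY = struct.Struct(">q")


def first_at_least(mm, count, target, use_madvise=True):
    """Binary-search a sorted mmap region; return the index of the first
    entry >= target, or count if every entry is smaller."""
    hint = (
        use_madvise
        and hasattr(mm, "madvise")
        and hasattr(mmap, "MADV_RANDOM")
        and hasattr(mmap, "MADV_NORMAL")
    )
    if hint:
        # Tell the kernel the upcoming accesses are random, so a page
        # fault should not trigger read-ahead around the faulting page.
        mm.madvise(mmap.MADV_RANDOM)
    try:
        lo, hi = 0, count
        while lo < hi:
            mid = (lo + hi) // 2
            (value,) = ENTRY.unpack_from(mm, mid * ENTRY.size)
            if value < target:
                lo = mid + 1
            else:
                hi = mid
        return lo
    finally:
        if hint:
            # Restore default read-ahead for later sequential readers.
            mm.madvise(mmap.MADV_NORMAL)


def demo():
    # Build a sorted file of 100,000 entries: 0, 10, 20, ...
    values = [i * 10 for i in range(100_000)]
    with tempfile.TemporaryFile() as f:
        f.write(b"".join(ENTRY.pack(v) for v in values))
        f.flush()
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return first_at_least(mm, len(values), 123_456)


if __name__ == "__main__":
    print(demo())
```

Restoring `MADV_NORMAL` in a `finally` block keeps the hint scoped to the search itself, so sequential reads of the same mapping afterwards still benefit from read-ahead. The `use_madvise` flag plays the role that the proposed `correctMinOffsetMadviseEnable` switch would play in the Broker.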
