lizhimins opened a new issue, #10462:
URL: https://github.com/apache/rocketmq/issues/10462

   ### Before Creating the Enhancement Request
   
   - [x] I have confirmed that this should be classified as an enhancement 
rather than a bug/feature.
   
   ### Summary
   
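   Systematically optimize the tieredstore module, including a memory backpressure mechanism, concurrency-safety fixes, resource-leak fixes, and log format standardization.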
   
   ### Motivation
   
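   The tieredstore module has several potential data-safety and stability risks:
   - No backpressure mechanism: when dispatch falls behind, memory grows without bound and can cause an OOM
   - Concurrent read/write race on the FileChannel in PosixFileSegment
   - In IndexStoreFile, a hash value of Integer.MIN_VALUE overflows and causes an out-of-bounds access
   - FileSegment close does not wait for in-flight commits, which loses data
   - The MessageStoreExecutor thread pools are too large (256+ threads on a 32-core machine)
   - Inconsistent log formats make troubleshooting difficult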
   
   ### Describe the Solution You'd Like
   
   **Memory backpressure**: add ratio + cap backpressure (default 10% of heap, capped at 1GB) and check available memory before dispatch
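   
   A minimal sketch of how a ratio + cap budget could be computed and enforced. `MemoryBackpressure` and its methods are hypothetical names, not classes in tieredstore; tracking reserved bytes with a counter and reading "1GB" as `1L << 30` bytes are both assumptions.

   ```java
   import java.util.concurrent.atomic.AtomicLong;

   // Hypothetical helper for illustration; not part of the tieredstore module.
   final class MemoryBackpressure {
       private final long limitBytes;
       private final AtomicLong inFlightBytes = new AtomicLong();

       MemoryBackpressure(long maxHeapBytes, double ratio, long capBytes) {
           // Budget is ratio * heap, but never more than the absolute cap.
           this.limitBytes = Math.min((long) (maxHeapBytes * ratio), capBytes);
       }

       // Defaults from the proposal: 10% of heap, capped at 1GB (assumed 1L << 30 bytes).
       static MemoryBackpressure withDefaults() {
           return new MemoryBackpressure(Runtime.getRuntime().maxMemory(), 0.10, 1L << 30);
       }

       long limitBytes() {
           return limitBytes;
       }

       // Reserve bytes before dispatch; false means the caller should back off and retry later.
       boolean tryAcquire(long bytes) {
           while (true) {
               long current = inFlightBytes.get();
               if (current + bytes > limitBytes) {
                   return false;
               }
               if (inFlightBytes.compareAndSet(current, current + bytes)) {
                   return true;
               }
           }
       }

       // Return bytes to the budget once the data has been committed.
       void release(long bytes) {
           inFlightBytes.addAndGet(-bytes);
       }
   }
   ```

   In this sketch a dispatcher would call `tryAcquire` with the message size before buffering and `release` after the commit completes, so a slow commit path throttles dispatch instead of growing the heap.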
   
   **Concurrency fixes**:
   - PosixFileSegment uses `read(ByteBuffer, long)` for atomic reads
   - IndexStoreFile hashes with `& 0x7FFFFFFF` to prevent overflow
   - FlatCommitLogFile snapshots firstOffset into a local variable to prevent TOCTOU
   - FlatMessageFile metadata field is made volatile
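   
   The first two fixes can be illustrated with a small sketch. The class and method names below are hypothetical, and the `Math.abs`-based variant is an assumption about how the overflow might arise, not a quote of the current IndexStoreFile code.

   ```java
   import java.io.IOException;
   import java.nio.ByteBuffer;
   import java.nio.channels.FileChannel;

   // Hypothetical illustration; not the actual tieredstore code.
   final class ConcurrencyFixSketch {
       // Math.abs(Integer.MIN_VALUE) is still Integer.MIN_VALUE, so the slot can be negative.
       static int slotWithAbs(int hash, int slotCount) {
           return Math.abs(hash) % slotCount;
       }

       // Clearing the sign bit always yields a value in [0, Integer.MAX_VALUE].
       static int slotWithMask(int hash, int slotCount) {
           return (hash & 0x7FFFFFFF) % slotCount;
       }

       // Positional read: never changes the channel's shared position,
       // so concurrent readers and an appending writer do not interfere.
       static ByteBuffer readAt(FileChannel channel, long position, int length) throws IOException {
           ByteBuffer buffer = ByteBuffer.allocate(length);
           long offset = position;
           while (buffer.hasRemaining()) {
               int n = channel.read(buffer, offset);
               if (n < 0) {
                   break; // end of file reached before the buffer was filled
               }
               offset += n;
           }
           buffer.flip();
           return buffer;
       }
   }
   ```

   With a slot count of 5,000,000, `slotWithAbs(Integer.MIN_VALUE, 5_000_000)` is negative, while `slotWithMask` returns 0 for the same input.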
   
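   **Resource management**:
   - FileSegment close waits for the in-flight commit (30s timeout)
   - PosixFileSegment stores the RAF reference to prevent file handle leaks
   - MessageStoreExecutor shutdown adds awaitTermination
   - Remove the MessageStoreExecutor singleton pattern and fileRecyclingExecutor
   - FlatFileFactory removes the test constructor that leaks a thread pool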
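   
   The two waiting behaviors in this list could look roughly like the sketch below. `GracefulCloseSketch` and its methods are hypothetical; representing the in-flight commit as a `CompletableFuture<Boolean>` is an assumption made for illustration.

   ```java
   import java.util.concurrent.CancellationException;
   import java.util.concurrent.CompletableFuture;
   import java.util.concurrent.ExecutionException;
   import java.util.concurrent.ExecutorService;
   import java.util.concurrent.TimeUnit;
   import java.util.concurrent.TimeoutException;

   // Hypothetical illustration; not the actual FileSegment or MessageStoreExecutor code.
   final class GracefulCloseSketch {
       // Returns true once the commit is no longer in flight, false if the wait timed out.
       static boolean awaitInFlightCommit(CompletableFuture<Boolean> inFlight, long timeout, TimeUnit unit) {
           if (inFlight == null) {
               return true; // nothing pending
           }
           try {
               inFlight.get(timeout, unit);
               return true;
           } catch (TimeoutException e) {
               return false;
           } catch (InterruptedException e) {
               Thread.currentThread().interrupt(); // preserve the interrupt for the caller
               return false;
           } catch (ExecutionException | CancellationException e) {
               return true; // the commit ended, although unsuccessfully
           }
       }

       // Orderly shutdown: stop accepting tasks, wait, then force-cancel if the wait expires.
       static boolean shutdownAndAwait(ExecutorService executor, long timeout, TimeUnit unit) {
           executor.shutdown();
           try {
               if (!executor.awaitTermination(timeout, unit)) {
                   executor.shutdownNow();
                   return false;
               }
               return true;
           } catch (InterruptedException e) {
               executor.shutdownNow();
               Thread.currentThread().interrupt();
               return false;
           }
       }
   }
   ```

   Under these assumptions a close path would call `awaitInFlightCommit(future, 30, TimeUnit.SECONDS)` before releasing the underlying file, and log or surface the timeout instead of dropping the pending data silently.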
   
   **Thread pool tuning**: set core to processors and max to processors*2; remove fileRecyclingExecutor
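   
   As a rough sketch of the proposed sizing, the pool below uses `processors` core threads and `processors * 2` maximum threads. The class name, the 60-second keep-alive, and the bounded queue are assumptions, not values taken from MessageStoreExecutor.

   ```java
   import java.util.concurrent.LinkedBlockingQueue;
   import java.util.concurrent.ThreadPoolExecutor;
   import java.util.concurrent.TimeUnit;

   // Hypothetical illustration of the proposed sizing rule.
   final class ExecutorSizingSketch {
       static ThreadPoolExecutor newPool(int processors, int queueCapacity) {
           return new ThreadPoolExecutor(
               processors,                 // core pool size
               processors * 2,             // maximum pool size
               60L, TimeUnit.SECONDS,      // keep-alive for threads above core (assumed)
               new LinkedBlockingQueue<>(queueCapacity)); // bounded queue (assumed)
       }
   }
   ```

   Note that `ThreadPoolExecutor` only creates threads beyond the core size once the queue is full, so with this rule a 32-core machine runs 32 threads per pool in the common case and at most 64 under load.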
   
   **Error handling**:
   - TieredMessageStore unwraps CompletionException and does not swallow Error
   - PosixFileSegment throws RuntimeException on IOException instead of silently returning an empty buffer
   - FlatAppendFile getFileCorrectSize uses a bounded retry of 3 attempts
   - IndexStoreFile clamps timeDiff to [0, Integer.MAX_VALUE]
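   
   Three of these items can be sketched as small helpers. All names are hypothetical, and reading "3 times" as three attempts in total (rather than three retries after the first try) is an assumption.

   ```java
   import java.util.concurrent.Callable;
   import java.util.concurrent.CompletionException;

   // Hypothetical illustration; not the actual tieredstore code.
   final class ErrorHandlingSketch {
       // Strip CompletionException wrappers; rethrow Errors instead of swallowing them.
       static Throwable unwrapOrThrowError(Throwable t) {
           Throwable current = t;
           while (current instanceof CompletionException && current.getCause() != null) {
               current = current.getCause();
           }
           if (current instanceof Error) {
               throw (Error) current;
           }
           return current;
       }

       // Bounded retry: give up and rethrow the last failure after maxAttempts tries.
       static <T> T retry(int maxAttempts, Callable<T> action) throws Exception {
           if (maxAttempts < 1) {
               throw new IllegalArgumentException("maxAttempts must be >= 1");
           }
           Exception last = null;
           for (int attempt = 1; attempt <= maxAttempts; attempt++) {
               try {
                   return action.call();
               } catch (Exception e) {
                   last = e;
               }
           }
           throw last;
       }

       // Clamp a long time difference into the non-negative int range.
       static int clampTimeDiff(long diff) {
           return (int) Math.max(0L, Math.min(diff, (long) Integer.MAX_VALUE));
       }
   }
   ```

   A real retry loop would probably also pause between attempts; the sketch leaves that out to keep the bound itself visible.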
   
   **Log standardization**: unify on the `ClassName#methodName, description, key={}` format
   
   ### Describe Alternatives You've Considered
   
   None
   
   ### Additional Context
   
   None


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
