lizhimins opened a new issue, #7363: URL: https://github.com/apache/rocketmq/issues/7363
### Before Creating the Bug Report

- [X] I found a bug, not just asking a question, which should be created in [GitHub Discussions](https://github.com/apache/rocketmq/discussions).
- [X] I have searched the [GitHub Issues](https://github.com/apache/rocketmq/issues) and [GitHub Discussions](https://github.com/apache/rocketmq/discussions) of this repository and believe that this is not a duplicate.
- [X] I have confirmed that this bug belongs to the current repository, not other repositories of RocketMQ.

### Runtime platform environment

Linux 4.19

### RocketMQ version

RocketMQ develop branch, 5.1.3

### JDK Version

JDK11

### Describe the Bug

When the pull and pop threads read data from tiered storage, they call org.apache.rocketmq.tieredstore.TieredMessageStore#getMessageAsync. Because tiered storage buffers data and uploads it in batches, the current implementation returns incorrect results: the next begin offset of the get message result cycles between the local cq max offset and the tiered storage cq commit offset, which causes a large number of duplicate messages.

For example, if the local cq holds offsets 100-200, tiered storage may hold 50-190 at that moment, with the messages in 190-200 still waiting to be uploaded. The data in 190-200 cannot be read from tiered storage yet, so the max offset of the pull result should also be 190 instead of 200.

I will submit a pull request to fix this issue.

### Steps to Reproduce

Set the deep storage level to force, so that reads are forced to go to tiered storage. Pop consumption then produces a large number of duplicate messages.

### What Did You Expect to See?

No duplicate messages.

### What Did You See Instead?
Pop consumption produces a large number of duplicate messages.

### Additional Context

_No response_
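To make the offset arithmetic in the report concrete, here is a minimal Java sketch of the expected behavior: the readable upper bound is capped at the tiered storage commit offset, and the next begin offset never moves past it. The class `TieredOffsetSketch` and both methods are hypothetical names introduced for illustration; this is not the actual `TieredMessageStore` code or the submitted fix. Leaving the offset unchanged when nothing is readable yet is also an assumption of this sketch.

```java
// Hypothetical illustration only; names and logic are assumptions, not RocketMQ code.
public final class TieredOffsetSketch {

    private TieredOffsetSketch() {
    }

    /**
     * Upper bound of offsets that can actually be served from tiered storage.
     * Messages between the tiered commit offset and the local max offset
     * are still waiting to be uploaded.
     */
    public static long readableMaxOffset(long localMaxOffset, long tieredCommitOffset) {
        return Math.min(localMaxOffset, tieredCommitOffset);
    }

    /**
     * Next begin offset after trying to read up to batchSize messages.
     * Assumption: if nothing is readable yet, the offset stays where it is,
     * so the consumer neither skips ahead nor jumps backwards.
     */
    public static long nextBeginOffset(long requestOffset, int batchSize,
                                       long localMaxOffset, long tieredCommitOffset) {
        long readableMax = readableMaxOffset(localMaxOffset, tieredCommitOffset);
        if (requestOffset >= readableMax) {
            return requestOffset;
        }
        return Math.min(requestOffset + batchSize, readableMax);
    }

    public static void main(String[] args) {
        // Example from the report: local cq 100-200, tiered storage 50-190.
        System.out.println(readableMaxOffset(200L, 190L));          // 190
        System.out.println(nextBeginOffset(180L, 32, 200L, 190L));  // 190
    }
}
```

With the numbers from the report, a read starting at offset 180 stops at 190 rather than 200, so the next request does not land in the range that has not been uploaded.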
