lizhimins opened a new issue, #7363:
URL: https://github.com/apache/rocketmq/issues/7363

   ### Before Creating the Bug Report
   
   - [X] I found a bug, not just asking a question, which should be created in 
[GitHub Discussions](https://github.com/apache/rocketmq/discussions).
   
   - [X] I have searched the [GitHub 
Issues](https://github.com/apache/rocketmq/issues) and [GitHub 
Discussions](https://github.com/apache/rocketmq/discussions) of this 
repository and believe that this is not a duplicate.
   
   - [X] I have confirmed that this bug belongs to the current repository, not 
other repositories of RocketMQ.
   
   
   ### Runtime platform environment
   
   Linux 4.19
   
   ### RocketMQ version
   
   RocketMQ develop branch, 5.1.3 
   
   ### JDK Version
   
   JDK11
   
   ### Describe the Bug
   
   When pull and pop threads read data from tiered storage, they call 
org.apache.rocketmq.tieredstore.TieredMessageStore#getMessageAsync. Because 
tiered storage buffers data and uploads it in batches, the current 
implementation returns incorrect results: the next begin offset of the get 
message result cycles between the local cq max offset and the tiered storage 
cq commit offset, causing a large number of duplicate messages.
   
   For example, if the local cq covers offsets 100-200, tiered storage may 
cover only 50-190 at that moment, with the messages from 190-200 still waiting 
to be uploaded. The data from 190-200 cannot be read from tiered storage yet, 
so the max offset of the pull result should be 190 rather than 200.
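   
   The expected behavior in the example above can be sketched roughly as 
follows. This is only an illustration of the offset arithmetic, not the actual 
fix: the class `TieredOffsetSketch`, its methods, and all parameter names are 
hypothetical and do not exist in RocketMQ.

```java
// Hypothetical sketch; none of these names are part of the RocketMQ code base.
final class TieredOffsetSketch {

    private TieredOffsetSketch() {
    }

    /**
     * Max offset to report when a read is served from tiered storage.
     * Entries above the tiered commit offset are not uploaded yet, so the
     * result is capped at the commit offset even if the local cq is ahead.
     */
    public static long readableMaxOffset(long localCqMaxOffset, long tieredCommitOffset) {
        return Math.min(localCqMaxOffset, tieredCommitOffset);
    }

    /**
     * Next begin offset after trying to read up to batchSize entries
     * from tiered storage, starting at requestOffset.
     */
    public static long nextBeginOffset(long requestOffset, int batchSize,
                                       long tieredMinOffset, long tieredCommitOffset) {
        if (requestOffset < tieredMinOffset) {
            // Request is below the retained range: skip forward to the first entry.
            return tieredMinOffset;
        }
        if (requestOffset >= tieredCommitOffset) {
            // Nothing uploaded at or beyond this offset yet: stay put, never move backwards.
            return requestOffset;
        }
        // Normal case: advance, but never past what has been committed to tiered storage.
        return Math.min(requestOffset + batchSize, tieredCommitOffset);
    }

    public static void main(String[] args) {
        // Local cq covers 100-200, tiered storage covers 50-190.
        System.out.println(readableMaxOffset(200L, 190L));
        System.out.println(nextBeginOffset(185L, 32, 50L, 190L));
    }
}
```

   With the numbers from the example, both calls in `main` yield 190, so a 
consumer reading from tiered storage is never told that offsets 190-200 are 
available before they have been uploaded.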
   
   I will submit a pull request to fix this issue.
   
   
   
   ### Steps to Reproduce
   
   Set the deep storage level to force so that data is always read from tiered 
storage. Pop consumption then produces a large number of duplicate messages.
   
   ### What Did You Expect to See?
   
   No duplicate messages.
   
   ### What Did You See Instead?
   
   Pop consumption produces a large number of duplicate messages.
   
   ### Additional Context
   
   _No response_

