ASF GitHub Bot commented on ROCKETMQ-332:

Jason918 opened a new pull request #227: [ROCKETMQ-332] fix concurrent bug in 
MappedFileQueue#findMappedFileByOffset, which m…
URL: https://github.com/apache/rocketmq/pull/227
   ## What is the purpose of the change
   Fix a concurrency bug in MappedFileQueue#findMappedFileByOffset, which may 
cause message loss.
   ## Brief changelog
   The original bug only occurs while the mappedFileQueue is deleting mappedFiles 
from the head of the queue. The main idea of this fix is therefore to check whether 
the firstMappedFile in the queue has changed during the lookup. If it has, we may 
have picked the wrong mappedFile, so we retry. 
   Finally, if the lookup still fails after 3 attempts, we fall back to 
iterating through all the mappedFiles in the queue to ensure that the right 
result is returned (solution from zhouxinyu).
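
The retry-then-scan idea described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in the pull request: `SimpleFile`, `LookupSketch`, and `MAX_RETRIES` are hypothetical names, files are modelled as plain offset ranges, and the queue is assumed to be a `CopyOnWriteArrayList` that only loses files from the head.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical stand-in for a mapped file covering [fromOffset, fromOffset + size). */
final class SimpleFile {
    final long fromOffset;
    final int size;

    SimpleFile(long fromOffset, int size) {
        this.fromOffset = fromOffset;
        this.size = size;
    }

    boolean contains(long offset) {
        return offset >= fromOffset && offset < fromOffset + size;
    }
}

/** Hypothetical, simplified queue; only the lookup logic is of interest here. */
final class LookupSketch {
    static final int MAX_RETRIES = 3; // the "3 times" mentioned in the changelog

    final List<SimpleFile> files = new CopyOnWriteArrayList<>();
    final int fileSize;

    LookupSketch(int fileSize) {
        this.fileSize = fileSize;
    }

    /** Returns the current head, or null if the queue is empty. */
    SimpleFile first() {
        try {
            return files.get(0);
        } catch (IndexOutOfBoundsException e) {
            return null; // queue was emptied concurrently
        }
    }

    SimpleFile findByOffset(long offset) {
        for (int attempt = 0; attempt < MAX_RETRIES; attempt++) {
            SimpleFile head = first();
            if (head == null) {
                return null;
            }
            // Index relative to the head we just observed.
            int index = (int) (offset / fileSize - head.fromOffset / fileSize);
            SimpleFile candidate = null;
            try {
                if (index >= 0) {
                    candidate = files.get(index);
                }
            } catch (IndexOutOfBoundsException ignored) {
                // The list shrank, or the offset lies beyond the last file.
            }
            // Trust the result only if the head did not change in the meantime.
            if (head == first()) {
                return (candidate != null && candidate.contains(offset)) ? candidate : null;
            }
        }
        // Fallback after repeated interference: scan every file in the queue.
        for (SimpleFile f : files) {
            if (f.contains(offset)) {
                return f;
            }
        }
        return null;
    }
}
```

The head comparison works as a cheap validity check because, in this model, files are only ever removed from the head: if the same head object is seen before and after the indexed read, the computed index referred to the intended file.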
   ## Verifying this change
   Follow this checklist to help us incorporate your contribution quickly and 
   - [x] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/ROCKETMQ/issues/) filed for the 
change (usually before you start working on it). Trivial changes like typos do 
not require a JIRA issue. Your pull request should address just this issue, 
without pulling in other changes - one PR resolves one issue. 
   - [x] Format the pull request title like `[ROCKETMQ-XXX] Fix 
UnknownException when host config not exist`. Each commit in the pull request 
should have a meaningful subject line and body.
   - [x] Write a pull request description that is detailed enough to understand 
what the pull request does, how, and why.
   - [x] Write the necessary unit tests to verify your logic correction; prefer 
mocking when cross-module dependencies exist. If a new feature or 
significant change is committed, please remember to add integration tests in the 
[test module](https://github.com/apache/rocketmq/tree/master/test).
   - [x] Run `mvn -B clean apache-rat:check findbugs:findbugs 
checkstyle:checkstyle` to make sure basic checks pass. Run `mvn clean install 
-DskipITs` to make sure unit tests pass. Run `mvn clean test-compile 
failsafe:integration-test` to make sure integration tests pass.
   - [x] If this contribution is large, please file an [Apache Individual 
Contributor License Agreement](http://www.apache.org/licenses/#clas).

> MappedFileQueue is not thread safe, which will cause message loss.
> ------------------------------------------------------------------
>                 Key: ROCKETMQ-332
>                 URL: https://issues.apache.org/jira/browse/ROCKETMQ-332
>             Project: Apache RocketMQ
>          Issue Type: Bug
>          Components: rocketmq-store
>    Affects Versions: 4.0.0-incubating, 4.1.0-incubating
>            Reporter: Jas0n918
>            Assignee: yukon
>            Priority: Major
>         Attachments: rocketmq.log
> In RocketMQ V3.5.8, there is a readWriteLock in 
> com.alibaba.rocketmq.store.MapedFileQueue, which guarantees thread safety. But 
> in the new org.apache.rocketmq.store.MappedFileQueue, there is no 
> concurrency control mechanism. 
> When a consumer is fetching messages (with no large lag), the broker calls
> org.apache.rocketmq.broker.processor.PullMessageProcessor#processRequest ==>
> org.apache.rocketmq.store.DefaultMessageStore#getMessage ==>
> org.apache.rocketmq.store.ConsumeQueue#getIndexBuffer ==>
> org.apache.rocketmq.store.MappedFileQueue#findMappedFileByOffset
> But findMappedFileByOffset is not thread safe, because
> org.apache.rocketmq.store.MappedFileQueue#deleteExpiredFile may be running 
> concurrently (so the size of mappedFiles may change). As a result, 
> ConsumeQueue#getIndexBuffer returns null, which triggers 
> nextBeginOffset = nextOffsetCorrection(offset, 
> consumeQueue.rollNextFile(offset));
> This skips the whole consumeQueue file, so any messages left in this 
> ConsumeQueue will not be consumed by the client.
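
The interleaving described in the issue can be reproduced deterministically with a small model. This is only a sketch: `RaceSketch` and `staleLookup` are hypothetical names, files are represented by their start offsets, and the lookup is assumed to compute a list index relative to the first file's start offset. The `deleteHeadInBetween` flag stands in for a concurrent deleteExpiredFile call landing between the two steps.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RaceSketch {
    /**
     * Returns the start offset of the file the lookup picks, or -1 when the
     * computed index falls outside the list (the "returns null" case).
     */
    static long staleLookup(List<Long> fileStarts, long fileSize, long offset,
                            boolean deleteHeadInBetween) {
        long firstStart = fileStarts.get(0);        // step 1: read the head
        if (deleteHeadInBetween) {
            fileStarts.remove(0);                   // simulated concurrent head deletion
        }
        // step 2: index computed from the (possibly stale) head
        int index = (int) (offset / fileSize - firstStart / fileSize);
        if (index < 0 || index >= fileStarts.size()) {
            return -1;
        }
        return fileStarts.get(index);
    }

    public static void main(String[] args) {
        List<Long> a = new ArrayList<>(Arrays.asList(0L, 100L, 200L));
        System.out.println(staleLookup(a, 100, 250, false)); // no interference

        List<Long> b = new ArrayList<>(Arrays.asList(0L, 100L, 200L));
        System.out.println(staleLookup(b, 100, 250, true));  // index now out of range

        List<Long> c = new ArrayList<>(Arrays.asList(0L, 100L, 200L));
        System.out.println(staleLookup(c, 100, 150, true));  // silently picks the wrong file
    }
}
```

In this model the stale head either pushes the index past the end of the list, which corresponds to the null result that makes the caller roll to the next file, or selects a neighbouring file without any error.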
