guyinyou opened a new issue, #10017:
URL: https://github.com/apache/rocketmq/issues/10017

   ### Before Creating the Enhancement Request
   
   - [x] I have confirmed that this should be classified as an enhancement 
rather than a bug/feature.
   
   
   ### Summary
   
   Add commitlog offset validation check in the `recoverAbnormally` method to 
detect and truncate old file data that may pass CRC and magic code checks but 
contains invalid commitlog offsets. This enhancement improves the robustness of 
abnormal recovery by preventing the processing of stale data from reused 
commitlog files.
   
   ### Motivation
   
   When commitlog files are reused or overwritten, there's a small probability 
that old data from previous writes can pass CRC and magic code validation 
checks. However, these old messages contain commitlog offsets that were valid 
when originally written but are now outside the valid range of the current 
mapped file. 
   
   Without offset validation, the recovery process may incorrectly process 
these old messages, leading to:
   - Potential data inconsistency
   - Incorrect message dispatching
   - Unpredictable behavior during recovery
   
   This enhancement adds an additional safety check by validating that the 
commitlog offset in each message falls within the valid range of the current 
mapped file (`[fileFromOffset, fileFromOffset + fileSize]`). If the offset is 
outside this range, the recovery process correctly identifies it as old file 
data and truncates at that point, enhancing the overall robustness of the 
abnormal recovery mechanism.
   
   **Benefits:**
   - Prevents processing of stale data during recovery
   - Improves recovery reliability and correctness
   - Adds an extra layer of validation beyond CRC checks
   - Better handles edge cases when commitlog files are reused
   
   ### Describe the Solution You'd Like
   
   The enhancement involves adding a commitlog offset validation check in the 
`recoverAbnormally` method before processing messages that pass CRC and magic 
code checks.
   
   **Implementation Details:**
   
   1. **Add offset validation check**: Before checking if 
`dispatchRequest.isSuccess()`, validate that 
`dispatchRequest.getCommitLogOffset()` falls within the valid range of the 
current `mappedFile`:
      - Check: `commitLogOffset >= mappedFile.getFileFromOffset() && 
commitLogOffset <= mappedFile.getFileFromOffset() + mappedFile.getFileSize()`
   
   2. **Handle invalid offsets**: If the offset is outside the valid range:
      - Log a warning with detailed information (offset, filename, current 
position)
      - Log recovery end information
      - Break the recovery loop to truncate at that point
   
   3. **Improve logging**: Use proper SLF4J placeholders in log messages for 
better formatting and include relevant context information.
   
   **Code Location:**
   - File: `store/src/main/java/org/apache/rocketmq/store/CommitLog.java`
   - Method: `recoverAbnormally(long maxPhyOffsetOfConsumeQueue)`
   - Location: In the message recovery loop, before the 
`dispatchRequest.isSuccess()` check
   
   **Example Implementation:**
   if (dispatchRequest.getCommitLogOffset() < mappedFile.getFileFromOffset()
       || dispatchRequest.getCommitLogOffset() > mappedFile.getFileFromOffset() 
+ mappedFile.getFileSize()) {
       log.warn("found illegal commitlog offset {} in {}, current pos={}, it 
will be truncated.", 
           dispatchRequest.getCommitLogOffset(), mappedFile.getFileName(), 
processOffset + mappedFileOffset);
       log.info("recover physics file end, {} pos={}", 
mappedFile.getFileName(), byteBuffer.position());
       break;
   } else if (dispatchRequest.isSuccess()) {
       // Continue with normal processing...
   }
   
   
   ### Describe Alternatives You've Considered
   
   none
   
   ### Additional Context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to