peterylh opened a new pull request, #54494:
URL: https://github.com/apache/doris/pull/54494

   # Add version checking support for replace partition operations
   
   ### What problem does this PR solve?
   
   Issue Number: #52031
   
   
   Problem Summary:
   
   Currently, the `REPLACE PARTITION` operation in Apache Doris has no version 
checking mechanism, which can lead to data loss or inconsistency during 
partition replacement. When users replace partitions (for example, to adjust 
the bucketing strategy or merge small partitions), the original partitions may 
be modified concurrently between the time the temporary partitions are created 
and the time the replacement actually happens. Because the temporary partitions 
do not contain those later writes, the replacement can silently discard them.
   
   
   
   ### What changes were proposed in this PR?
   
   This PR introduces a version checking mechanism for `REPLACE PARTITION` 
operations to ensure data consistency and prevent data loss during partition 
replacement.
   
   **Key Changes:**
   
   1. **Enhanced ReplacePartitionClause**: Added an `expectedVersions` field 
and related methods to support version checking in the legacy analysis 
framework.
   
   2. **Enhanced ReplacePartitionOp**: Added the corresponding support in the 
Nereids optimizer framework, including parsing and validation of the expected 
versions.
   
   3. **Version Validation Logic**: Implemented a `validatePartitionVersions()` 
method in `Env.java` that checks partition versions before the replacement is 
performed.
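
To make the parsing step concrete, here is a minimal standalone sketch of how a `versions` property string could be turned into a partition-to-version map. It is not the code in this PR: the class name `VersionsPropertySketch`, the method `parseVersions`, and the error handling are hypothetical, and the grammar (`name:version` pairs separated by commas) is inferred only from the usage example in this description.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not the PR's implementation.
class VersionsPropertySketch {

    /**
     * Parses a string such as "p1:2, p2:3" into an ordered map {p1=2, p2=3}.
     * The "name:version" grammar is an assumption based on the usage example.
     */
    static Map<String, Long> parseVersions(String property) {
        Map<String, Long> expected = new LinkedHashMap<>();
        if (property == null || property.trim().isEmpty()) {
            // Property omitted: no version check is requested.
            return expected;
        }
        for (String entry : property.split(",")) {
            String[] parts = entry.trim().split(":");
            if (parts.length != 2 || parts[0].trim().isEmpty()) {
                throw new IllegalArgumentException(
                        "Invalid versions entry: '" + entry.trim() + "'");
            }
            String partition = parts[0].trim();
            long version;
            try {
                version = Long.parseLong(parts[1].trim());
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException(
                        "Invalid version for partition " + partition, e);
            }
            if (expected.put(partition, version) != null) {
                throw new IllegalArgumentException(
                        "Duplicate partition in versions: " + partition);
            }
        }
        return expected;
    }

    public static void main(String[] args) {
        // Prints {p20240601=2, p20240602=3}
        System.out.println(parseVersions("p20240601:2, p20240602:3"));
    }
}
```

Returning an empty map when the property is absent mirrors the stated goal that the feature is optional. Whether the real parser treats an empty string the same way is an assumption.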
   
   
   
   **Usage Example:**
   ```sql
   ALTER TABLE page_views
   REPLACE PARTITION (p20240601, p20240602)
   WITH TEMPORARY PARTITION (tp20240601, tp20240602)
   PROPERTIES (
       "versions" = "p20240601:2, p20240602:3",
       "strict_range" = "true"
   );
   ```
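
The check that this statement requests can be pictured as an optimistic-concurrency comparison. The sketch below is illustrative only and assumes that each expected version must equal the partition's current visible version. It reuses the name `validatePartitionVersions` from this description, but the signature, the use of plain maps, and the exception type are hypothetical (the real method lives in `Env.java` and works on Doris catalog objects). `Map.of` requires Java 9 or later.

```java
import java.util.Map;

// Hypothetical sketch, not the PR's implementation.
class ReplaceVersionCheckSketch {

    /**
     * Throws if any partition is missing or its current version differs from
     * the version the caller recorded when building the temporary partition.
     */
    static void validatePartitionVersions(Map<String, Long> expected,
                                          Map<String, Long> current) {
        for (Map.Entry<String, Long> e : expected.entrySet()) {
            Long actual = current.get(e.getKey());
            if (actual == null) {
                throw new IllegalStateException(
                        "Partition " + e.getKey() + " does not exist");
            }
            if (!actual.equals(e.getValue())) {
                throw new IllegalStateException(
                        "Partition " + e.getKey() + " version mismatch: expected "
                                + e.getValue() + ", actual " + actual);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Long> expected = Map.of("p20240601", 2L, "p20240602", 3L);

        // No concurrent write: the check passes and replacement may proceed.
        validatePartitionVersions(expected, Map.of("p20240601", 2L, "p20240602", 3L));

        // A concurrent load bumped p20240602 to version 4: the check fails.
        try {
            validatePartitionVersions(expected, Map.of("p20240601", 2L, "p20240602", 4L));
        } catch (IllegalStateException ex) {
            System.out.println("Replacement rejected: " + ex.getMessage());
        }
    }
}
```

In this model the caller records each partition's version before populating the temporary partitions, then passes those versions in `PROPERTIES`. Any write that lands in between changes the version and causes the replacement to be refused rather than silently dropping data.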
   
   **Benefits:**
   - **Data Safety**: Prevents data loss by ensuring partitions haven't been 
modified during the replacement process
   - **Consistency**: Maintains data consistency across concurrent operations
   - **Atomicity**: Provides atomic partition replacement with version 
validation
   - **Backward Compatibility**: The feature is optional and doesn't affect 
existing workflows
   
   ### Release note
   
   **Feature**
   - Add version checking support for `REPLACE PARTITION` operations to prevent 
data loss and ensure consistency during partition replacement. Users can now 
specify expected partition versions using the `versions` property to validate 
that partitions haven't been modified during the replacement process.
   
   ### Check List (For Author)
   
   - Test <!-- At least one of them must be included. -->
       - [X] Regression test
       - [ ] Unit Test
       - [ ] Manual test (add detailed scripts or steps below)
       - [ ] No need to test or manual test. Explain why:
           - [ ] This is a refactor/code format and no logic has been changed.
           - [ ] Previous test can cover this change.
           - [ ] No code files have been changed.
           - [ ] Other reason <!-- Add your reason?  -->
   
   - Behavior changed:
       - [ ] No.
       - [ ] Yes. <!-- Explain the behavior change -->
   
   - Does this need documentation?
       - [ ] No.
       - [ ] Yes. <!-- Add document PR link here. eg: 
https://github.com/apache/doris-website/pull/1214 -->
   
   ### Check List (For Reviewer who merge this PR)
   
   - [ ] Confirm the release note
   - [ ] Confirm test cases
   - [ ] Confirm document
   - [ ] Add branch pick label <!-- Add branch pick label that this PR should 
merge into -->
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

