JunRuiLee opened a new issue, #7863:
URL: https://github.com/apache/paimon/issues/7863

   ### Search before asking
   
   - [x] I searched in the [issues](https://github.com/apache/paimon/issues) 
and found nothing similar.
   
   
   ### Motivation
   
   Paimon's branch management currently supports create, delete, and 
fast-forward operations, but it cannot incrementally merge data from one 
branch into another. Users need a non-destructive way to bring new data from 
a source branch into a target branch while keeping both branches' data intact.
   
   ### Solution
   
   Add a `merge_branch` operation for append-only tables that incrementally 
merges data files from a source branch into a target branch, skipping files 
that already exist in the target.
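
The "skip files that already exist in the target" step can be pictured as a set difference over data files. The sketch below is illustrative only: `IncrementalMergeSketch` and `filesToMerge` are hypothetical names, and it assumes a data file is identified by its file name, which this issue does not specify.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; these are not Paimon's actual classes or APIs.
final class IncrementalMergeSketch {

    /**
     * Returns the source files that are absent from the target, in source order.
     * Assumption: a data file is identified by its file name.
     */
    static List<String> filesToMerge(List<String> sourceFiles, List<String> targetFiles) {
        Set<String> seen = new HashSet<>(targetFiles);
        List<String> missing = new ArrayList<>();
        for (String file : sourceFiles) {
            // add() returns false for names already seen, so files present in the
            // target (and duplicates within the source) are skipped.
            if (seen.add(file)) {
                missing.add(file);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<String> target = Arrays.asList("data-1.parquet", "data-2.parquet");
        List<String> source = Arrays.asList("data-1.parquet", "data-2.parquet", "data-3.parquet");
        System.out.println(filesToMerge(source, target)); // prints [data-3.parquet]
    }
}
```

Because files already in the target are skipped, running the merge a second time with no new source files would add nothing under this model.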
   
     - **Core**: Add `mergeBranch` to the Table, Catalog, and BranchManager 
APIs. Introduce a `BranchMergeHandler` interface to encapsulate file reading
     and committing, with `FileSystemBranchManager` handling validation and 
orchestration (schema compatibility, append-only check, no-compaction check).
     - **REST**: Add a `/branches/merge` endpoint with `RESTCatalog` 
integration.
     - **Engines**: Add `merge_branch` procedure for both Flink and Spark, plus 
a Flink Action for CLI usage.
   
     **Limitations**: `merge_branch` is supported only for append-only tables 
where both branches have identical schemas and **no compaction** has been 
performed.
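
The three preconditions above (append-only, identical schemas, no compaction) could be checked before any files are read. The following is a minimal sketch under stated assumptions: `MergePreconditionsSketch`, `BranchState`, and `rejectionReason` are hypothetical names, a schema is modelled as an ordered list of field names, and the proposed `FileSystemBranchManager` validation may look quite different.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical sketch; these are not Paimon's actual classes or APIs.
final class MergePreconditionsSketch {

    /** Hypothetical summary of the branch state that the checks need. */
    static final class BranchState {
        final List<String> schemaFields; // assumption: schema modelled as ordered field names
        final boolean appendOnly;
        final boolean compacted;

        BranchState(List<String> schemaFields, boolean appendOnly, boolean compacted) {
            this.schemaFields = schemaFields;
            this.appendOnly = appendOnly;
            this.compacted = compacted;
        }
    }

    /** Returns an empty Optional when the merge may proceed, otherwise the reason it is rejected. */
    static Optional<String> rejectionReason(BranchState source, BranchState target) {
        if (!source.appendOnly || !target.appendOnly) {
            return Optional.of("merge is only supported for append-only tables");
        }
        if (!source.schemaFields.equals(target.schemaFields)) {
            return Optional.of("source and target branches have different schemas");
        }
        if (source.compacted || target.compacted) {
            return Optional.of("compaction has been performed on a branch");
        }
        return Optional.empty();
    }
}
```

Returning a reason rather than a boolean lets a caller surface the failed limitation directly in an error message.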
   
   ### PR Plan
   
     This feature is split into two PRs:
   
     - **PR1 (Core + REST)**: API interfaces, core merge implementation, REST 
endpoint, and unit tests.
     - **PR2 (Flink + Spark)**: Engine-side procedures, Flink Action, 
integration tests, and documentation.
   
   
   ### Anything else?
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [x] I'm willing to submit a PR!

