JunRuiLee opened a new issue, #7863: URL: https://github.com/apache/paimon/issues/7863
### Search before asking

- [x] I searched in the [issues](https://github.com/apache/paimon/issues) and found nothing similar.

### Motivation

Paimon's branch management currently supports create, delete, and fast-forward operations, but lacks the ability to incrementally merge data from one branch into another. Users need a non-destructive way to incorporate new data from one branch into another while keeping both branches' data intact.

### Solution

Add a `merge_branch` operation for append-only tables that incrementally merges data files from a source branch into a target branch, skipping files that already exist in the target.

- **Core**: Add `mergeBranch` to the Table, Catalog, and BranchManager APIs. Introduce a `BranchMergeHandler` interface to encapsulate file reading and committing, with `FileSystemBranchManager` handling validation and orchestration (schema compatibility, append-only check, no-compaction check).
- **REST**: Add a `/branches/merge` endpoint with `RESTCatalog` integration.
- **Engines**: Add a `merge_branch` procedure for both Flink and Spark, plus a Flink Action for CLI usage.

**Limitations**: only supported for append-only tables where both branches have identical schemas and **no compaction** has been performed.

### PR Plan

This feature is split into two PRs:

- **PR1 (Core + REST)**: API interfaces, core merge implementation, REST endpoint, and unit tests.
- **PR2 (Flink + Spark)**: Engine-side procedures, Flink Action, integration tests, and documentation.

### Anything else?

_No response_

### Are you willing to submit a PR?

- [x] I'm willing to submit a PR!
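
To illustrate the incremental merge described under Solution, here is a minimal in-memory sketch in Java. It is not Paimon code: `BranchMergeSketch`, `BranchState`, `filesToMerge`, and the use of a single `schemaId` to stand in for "identical schemas" are all assumptions made for illustration, and real data files are modelled as plain file-name strings. It only shows the three validation checks named above and the "skip files already in the target" step.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class BranchMergeSketch {

    /** Hypothetical, simplified view of a branch; not a Paimon class. */
    static final class BranchState {
        final long schemaId;          // assumption: equal ids mean identical schemas
        final boolean appendOnly;     // true if the table has no primary key
        final boolean compacted;      // true if any compaction has run on the branch
        final Set<String> dataFiles;  // data file names, in commit order

        BranchState(long schemaId, boolean appendOnly, boolean compacted, List<String> dataFiles) {
            this.schemaId = schemaId;
            this.appendOnly = appendOnly;
            this.compacted = compacted;
            this.dataFiles = new LinkedHashSet<>(dataFiles);
        }
    }

    /** Validates both branches, then returns the source files missing from the target. */
    static List<String> filesToMerge(BranchState source, BranchState target) {
        if (!source.appendOnly || !target.appendOnly) {
            throw new IllegalArgumentException("merge is only supported for append-only tables");
        }
        if (source.schemaId != target.schemaId) {
            throw new IllegalArgumentException("source and target schemas must be identical");
        }
        if (source.compacted || target.compacted) {
            throw new IllegalArgumentException("merge is not supported after compaction");
        }
        List<String> missing = new ArrayList<>();
        for (String file : source.dataFiles) {
            if (!target.dataFiles.contains(file)) {
                missing.add(file); // only files the target does not already have
            }
        }
        return missing;
    }

    /** Adds the missing files to the target; the source is left untouched. */
    static void mergeBranch(BranchState source, BranchState target) {
        target.dataFiles.addAll(filesToMerge(source, target));
    }

    public static void main(String[] args) {
        BranchState source = new BranchState(1L, true, false, List.of("a", "b", "c"));
        BranchState target = new BranchState(1L, true, false, List.of("a", "x"));
        mergeBranch(source, target);
        System.out.println(target.dataFiles); // prints [a, x, b, c]
    }
}
```

In this sketch, running the merge a second time adds nothing, because every source file is then already present in the target. That is the sense in which the operation is incremental and non-destructive.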
