viirya opened a new pull request, #56102:
URL: https://github.com/apache/spark/pull/56102
### What changes were proposed in this pull request?
Add a standard DSv2 mix-in interface `SupportsBranching`, plus the SQL
surface to manage branches:
```java
public interface SupportsBranching extends Table {
  TableBranch createBranch(String name, OptionalLong sourceSnapshotId);
  default TableBranch replaceBranch(String name, OptionalLong sourceSnapshotId);
  boolean dropBranch(String name);
  TableBranch fastForward(String branch, String targetBranch);
  default TableBranch[] listBranches();
}
```
The interface comes with a companion value type
`TableBranch(name, snapshotId, creationTimeMs)` and
`SupportsBranching.BranchAlreadyExistsException` for the duplicate-create
case.
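To make the contract above concrete, here is a minimal in-memory sketch of the create / replace / drop / list operations. It is not the code in this PR: `BranchSketch`, `Branch`, and `BranchExists` are hypothetical stand-ins for the table, `TableBranch`, and `BranchAlreadyExistsException`, and it assumes that an empty `sourceSnapshotId` means "use the table's current snapshot". `fastForward` is left out because it depends on snapshot ancestry, which this description does not specify.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical stand-in for a branchable table; not the Spark classes.
final class BranchSketch {

  // Mirrors the shape of TableBranch(name, snapshotId, creationTimeMs).
  static final class Branch {
    final String name;
    final long snapshotId;
    final long creationTimeMs;

    Branch(String name, long snapshotId, long creationTimeMs) {
      this.name = name;
      this.snapshotId = snapshotId;
      this.creationTimeMs = creationTimeMs;
    }
  }

  // Stand-in for the duplicate-create exception.
  static final class BranchExists extends RuntimeException {
    BranchExists(String name) {
      super("Branch already exists: " + name);
    }
  }

  private final Map<String, Branch> branches = new LinkedHashMap<>();
  private final long currentSnapshotId;

  BranchSketch(long currentSnapshotId) {
    this.currentSnapshotId = currentSnapshotId;
  }

  Branch createBranch(String name, OptionalLong sourceSnapshotId) {
    if (branches.containsKey(name)) {
      throw new BranchExists(name);
    }
    // Assumption: no explicit snapshot means "branch from the current snapshot".
    long snapshot = sourceSnapshotId.orElse(currentSnapshotId);
    Branch branch = new Branch(name, snapshot, System.currentTimeMillis());
    branches.put(name, branch);
    return branch;
  }

  // Replace is modeled as drop-if-present followed by create.
  Branch replaceBranch(String name, OptionalLong sourceSnapshotId) {
    branches.remove(name);
    return createBranch(name, sourceSnapshotId);
  }

  // Returns true only if a branch with that name existed.
  boolean dropBranch(String name) {
    return branches.remove(name) != null;
  }

  Branch[] listBranches() {
    return branches.values().toArray(new Branch[0]);
  }
}
```

In this sketch, a second `createBranch("dev", ...)` throws, while `replaceBranch("dev", ...)` succeeds and repoints the branch. That is the distinction the `OR REPLACE` clause relies on.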
New DDL:
```sql
ALTER TABLE t CREATE [OR REPLACE] BRANCH [IF NOT EXISTS] name [AS OF VERSION <integer>]
ALTER TABLE t DROP BRANCH [IF EXISTS] name
ALTER TABLE t FAST FORWARD branch TO target
SHOW BRANCHES (FROM | IN) t
```
Implementation:
- Define the interface and `TableBranch` value type under
`sql/catalyst/.../connector/catalog/`.
- Extend the ANTLR grammar with the four DDL forms; register `BRANCH`,
`BRANCHES`, `FAST`, `FORWARD` as non-reserved keywords; update
`docs/sql-ref-ansi-compliance.md`.
- Add logical plans (`CreateBranch` / `DropBranch` / `FastForwardBranch` /
`ShowBranches`) and exec nodes; dispatch through `ResolvedTable` in
`DataSourceV2Strategy`.
- Add `DataSourceV2Implicits.asBranchable` and
`QueryCompilationErrors.tableDoesNotSupportBranchingError` so non-branching
tables fail with a clear message.
- Add error condition `CREATE_BRANCH_WITH_IF_NOT_EXISTS_AND_REPLACE` for the
conflicting clauses.
- Implement `SupportsBranching` on `InMemoryTable` for testing.
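
The handling of the two optional `CREATE BRANCH` clauses can be sketched as a small decision function. This is an illustration only: `CreateBranchDispatch` and `Action` are hypothetical names, and `IllegalArgumentException` stands in for whatever exception Spark actually raises with the `CREATE_BRANCH_WITH_IF_NOT_EXISTS_AND_REPLACE` condition.

```java
// Hypothetical sketch of how the OR REPLACE / IF NOT EXISTS flags could be resolved.
final class CreateBranchDispatch {

  enum Action { CREATE, CREATE_IF_ABSENT, REPLACE }

  static Action resolve(boolean orReplace, boolean ifNotExists) {
    // The two clauses contradict each other, so reject the combination up front.
    if (orReplace && ifNotExists) {
      throw new IllegalArgumentException("CREATE_BRANCH_WITH_IF_NOT_EXISTS_AND_REPLACE");
    }
    if (orReplace) {
      return Action.REPLACE;
    }
    return ifNotExists ? Action.CREATE_IF_ABSENT : Action.CREATE;
  }
}
```

Rejecting the conflicting combination before any table call means the connector never sees an ambiguous request.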
Reads and writes against a specific branch (`SELECT ... FOR BRANCH 'x'`,
`INSERT ... FOR BRANCH 'x'`) are out of scope here and are added in the
follow-up SPARK-57057.
### Why are the changes needed?
Apache Iceberg and similar table formats support named branches as a
first-class concept, but Spark today only exposes branching through
connector-specific SQL extensions (e.g. `IcebergSparkSessionExtensions`). A
standard DSv2 interface lets any data source expose branching through built-in
Spark SQL, the same way `SupportsDelete` / `TruncatableTable` /
`SupportsRowLevelOperations` standardize their respective capabilities.
### Does this PR introduce _any_ user-facing change?
Yes. New SQL DDL is recognized:
```sql
ALTER TABLE t CREATE [OR REPLACE] BRANCH [IF NOT EXISTS] name [AS OF VERSION <integer>]
ALTER TABLE t DROP BRANCH [IF EXISTS] name
ALTER TABLE t FAST FORWARD branch TO target
SHOW BRANCHES (FROM | IN) t
```
Data sources that do not implement `SupportsBranching` are otherwise
unaffected; running the new DDL against them raises `AnalysisException` ("does
not support branching"). Four new non-reserved keywords (`BRANCH`, `BRANCHES`,
`FAST`, `FORWARD`) are added; they remain usable as identifiers in non-DDL
contexts.
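
The capability check for non-branching tables follows the usual mix-in pattern, roughly as sketched below. The nested `Table` and `SupportsBranching` interfaces are simplified stand-ins defined only for this example, and `UnsupportedOperationException` stands in for the `AnalysisException` mentioned above. The real check is `DataSourceV2Implicits.asBranchable`, whose exact code is not shown here.

```java
// Hypothetical, simplified model of the mix-in capability check.
final class BranchingCheck {

  interface Table {
    String name();
  }

  interface SupportsBranching extends Table {
    boolean dropBranch(String name);
  }

  // Returns the table as a branchable table, or fails with a clear message.
  static SupportsBranching asBranchable(Table table) {
    if (table instanceof SupportsBranching) {
      return (SupportsBranching) table;
    }
    throw new UnsupportedOperationException(
        "Table " + table.name() + " does not support branching");
  }
}
```

Because the check is a plain type test, tables that do not opt in need no changes and pay no cost unless branch DDL is run against them.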
### How was this patch tested?
- `DDLParserSuite`: 5 new parser tests covering all four DDL forms and the
`IF NOT EXISTS` / `OR REPLACE` conflict error.
- `SupportsBranchingSuite`: 12 new integration tests exercising the new DDL
end-to-end against `InMemoryTable`, including positive cases for each operation
and negative cases for duplicates, missing branches, fast-forward direction,
and non-branching tables.
- All 152 existing tests in `DDLParserSuite` still pass;
`TableIdentifierParserSuite` (7 tests) confirms the new keywords don't regress
identifier handling.
### Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Code (Claude Opus 4.7)