[
https://issues.apache.org/jira/browse/SPARK-57056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
L. C. Hsieh updated SPARK-57056:
--------------------------------
Description:
Apache Iceberg and similar table formats support named branches as a
first-class concept, but today Spark exposes branching only through
connector-specific SQL extensions (e.g. IcebergSparkSessionExtensions).
This ticket adds a standard DSv2 interface, SupportsBranching, so any data
source can expose branching through built-in Spark SQL.
Proposed interface (Java, under sql/catalyst .../connector/catalog/):
    public interface SupportsBranching extends Table {
      TableBranch createBranch(String name, OptionalLong sourceSnapshotId);
      default TableBranch replaceBranch(String name, OptionalLong sourceSnapshotId);
      boolean dropBranch(String name);
      TableBranch fastForward(String branch, String targetBranch);
      default TableBranch[] listBranches();
    }
The interface comes with a companion value type,
TableBranch(name, snapshotId, creationTimeMs), and a
BranchAlreadyExistsException for the duplicate-create case.
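To make the intended contract concrete, here is a minimal in-memory sketch of the five operations. It is not the proposed implementation: InMemoryBranches is a hypothetical stand-in that does not extend Spark's Table, the exception is unchecked only for brevity, and two behaviours are assumptions the ticket does not state: an empty sourceSnapshotId means "the table's current snapshot", and fastForward moves branch to the snapshot that targetBranch points at.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.OptionalLong;

public class BranchingSketch {

    /** Value type from the ticket: (name, snapshotId, creationTimeMs). */
    public record TableBranch(String name, long snapshotId, long creationTimeMs) {}

    /** Duplicate-create failure; unchecked here purely as an assumption. */
    public static class BranchAlreadyExistsException extends RuntimeException {
        public BranchAlreadyExistsException(String name) {
            super("Branch already exists: " + name);
        }
    }

    /** Hypothetical in-memory stand-in; it does not extend Spark's Table. */
    public static class InMemoryBranches {
        private final Map<String, TableBranch> branches = new LinkedHashMap<>();
        private final long currentSnapshotId;

        public InMemoryBranches(long currentSnapshotId) {
            this.currentSnapshotId = currentSnapshotId;
        }

        public TableBranch createBranch(String name, OptionalLong sourceSnapshotId) {
            if (branches.containsKey(name)) {
                throw new BranchAlreadyExistsException(name);
            }
            return store(name, sourceSnapshotId);
        }

        /** Creates the branch, or overwrites it if it already exists. */
        public TableBranch replaceBranch(String name, OptionalLong sourceSnapshotId) {
            return store(name, sourceSnapshotId);
        }

        /** Returns true only if a branch was actually removed. */
        public boolean dropBranch(String name) {
            return branches.remove(name) != null;
        }

        /** Assumption: points `branch` at the snapshot of `targetBranch`. */
        public TableBranch fastForward(String branch, String targetBranch) {
            TableBranch source = lookup(branch);
            TableBranch target = lookup(targetBranch);
            TableBranch moved =
                new TableBranch(branch, target.snapshotId(), source.creationTimeMs());
            branches.put(branch, moved);
            return moved;
        }

        public TableBranch[] listBranches() {
            return branches.values().toArray(new TableBranch[0]);
        }

        // Assumption: an empty source snapshot falls back to the current snapshot.
        private TableBranch store(String name, OptionalLong sourceSnapshotId) {
            TableBranch created = new TableBranch(
                name, sourceSnapshotId.orElse(currentSnapshotId), System.currentTimeMillis());
            branches.put(name, created);
            return created;
        }

        private TableBranch lookup(String name) {
            TableBranch found = branches.get(name);
            if (found == null) {
                throw new IllegalArgumentException("No such branch: " + name);
            }
            return found;
        }
    }
}
```

The record syntax needs Java 16 or later, which fits Spark 4.x since it builds on Java 17.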
Standardised DDL:

    ALTER TABLE t CREATE [OR REPLACE] BRANCH [IF NOT EXISTS] name
        [AS OF VERSION <integer>]
    ALTER TABLE t DROP BRANCH [IF EXISTS] name
    ALTER TABLE t FAST FORWARD branch TO target
    SHOW BRANCHES (FROM | IN) t
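One plausible way the CREATE BRANCH modifiers could map onto the interface is sketched below. This is a guess at the exec-node behaviour, not something the ticket specifies: OR REPLACE is assumed to call replaceBranch, and IF NOT EXISTS is assumed to swallow BranchAlreadyExistsException. BranchOps, SetBackedOps and run are hypothetical names, and snapshot ids are left out to keep the sketch short.

```java
import java.util.HashSet;
import java.util.Set;

public class CreateBranchDispatch {

    /** Stand-in for the ticket's exception; unchecked here as an assumption. */
    public static class BranchAlreadyExistsException extends RuntimeException {
        public BranchAlreadyExistsException(String name) {
            super("Branch already exists: " + name);
        }
    }

    /** Hypothetical, reduced view of the operations CREATE BRANCH needs. */
    public interface BranchOps {
        void createBranch(String name);   // throws on duplicates
        void replaceBranch(String name);  // creates or overwrites
    }

    /** Returns which action was taken: "created", "replaced" or "skipped". */
    public static String run(BranchOps ops, String name,
                             boolean orReplace, boolean ifNotExists) {
        if (orReplace) {
            ops.replaceBranch(name);
            return "replaced";
        }
        try {
            ops.createBranch(name);
            return "created";
        } catch (BranchAlreadyExistsException e) {
            if (ifNotExists) {
                return "skipped";  // IF NOT EXISTS turns the duplicate into a no-op
            }
            throw e;
        }
    }

    /** Minimal set-backed implementation used only to exercise run(). */
    public static class SetBackedOps implements BranchOps {
        private final Set<String> names = new HashSet<>();

        @Override
        public void createBranch(String name) {
            if (!names.add(name)) {
                throw new BranchAlreadyExistsException(name);
            }
        }

        @Override
        public void replaceBranch(String name) {
            names.add(name);
        }
    }
}
```

Under these assumptions, a plain CREATE BRANCH on an existing name fails, while the same statement with IF NOT EXISTS leaves the existing branch untouched.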
Scope:
* Define the interface and value types.
* Extend the ANTLR grammar with the four DDL forms; register BRANCH,
  BRANCHES, FAST, FORWARD as non-reserved keywords.
* Add logical plans (CreateBranch / DropBranch / FastForwardBranch /
  ShowBranches), resolve them through ResolvedTable, and dispatch to
  new V2 exec nodes via DataSourceV2Strategy.
* Add an asBranchable helper and tableDoesNotSupportBranchingError so
  non-branching tables fail with a clear message.
* Implement SupportsBranching on InMemoryTable for testing.
* Add a new error condition CREATE_BRANCH_WITH_IF_NOT_EXISTS_AND_REPLACE.
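The asBranchable helper and the new error condition from the list above might look roughly like the following. This is a self-contained illustration with simplified stand-in types, not Spark code: the nested Table and SupportsBranching interfaces are placeholders, the exception classes are plain JDK ones rather than Spark's error framework, the message texts are invented, and reading the error condition's name as "OR REPLACE and IF NOT EXISTS cannot be combined" is an inference.

```java
public class BranchDdlChecks {

    /** Simplified placeholders for the DSv2 types named in the ticket. */
    public interface Table {
        String name();
    }

    public interface SupportsBranching extends Table {}

    /** Narrows a table to SupportsBranching or fails with a clear message. */
    public static SupportsBranching asBranchable(Table table) {
        if (table instanceof SupportsBranching) {
            return (SupportsBranching) table;
        }
        // Plays the role of tableDoesNotSupportBranchingError in this sketch.
        throw new UnsupportedOperationException(
            "Table " + table.name() + " does not support branching");
    }

    /** Rejects CREATE OR REPLACE BRANCH IF NOT EXISTS (assumed to be invalid). */
    public static void validateCreateBranch(boolean orReplace, boolean ifNotExists) {
        if (orReplace && ifNotExists) {
            throw new IllegalArgumentException(
                "CREATE_BRANCH_WITH_IF_NOT_EXISTS_AND_REPLACE: "
                    + "OR REPLACE and IF NOT EXISTS cannot be used together");
        }
    }
}
```

Checking the clause combination up front keeps the two modifiers from silently contradicting each other: one asks to overwrite an existing branch, the other asks to leave it alone.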
Non-goals (deferred to SPARK-57057):
* SELECT / INSERT against a specific branch.
* Any session-level default-branch configuration.
Data sources that do not implement SupportsBranching are unaffected.
The work follows existing DSv2 conventions (compare SupportsDelete,
TruncatableTable, SupportsRowLevelOperations).
> Add SupportsBranching DSv2 interface and branching DDL
> ------------------------------------------------------
>
> Key: SPARK-57056
> URL: https://issues.apache.org/jira/browse/SPARK-57056
> Project: Spark
> Issue Type: New Feature
> Components: SQL
> Affects Versions: 4.3.0
> Reporter: L. C. Hsieh
> Assignee: L. C. Hsieh
> Priority: Major