[ 
https://issues.apache.org/jira/browse/SPARK-57056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

L. C. Hsieh updated SPARK-57056:
--------------------------------
    Description: 
Apache Iceberg and similar table formats support named branches as a 
first-class concept, but today Spark exposes branching only through 
connector-specific SQL extensions (e.g. IcebergSparkSessionExtensions).

This ticket adds a standard DSv2 interface, SupportsBranching, so any data 
source can expose branching through built-in Spark SQL.

Proposed interface (Java, under sql/catalyst .../connector/catalog/):

    public interface SupportsBranching extends Table {
        TableBranch createBranch(String name, OptionalLong sourceSnapshotId);
        default TableBranch replaceBranch(String name, OptionalLong sourceSnapshotId);
        boolean dropBranch(String name);
        TableBranch fastForward(String branch, String targetBranch);
        default TableBranch[] listBranches();
    }

The interface comes with a companion value type,
TableBranch(name, snapshotId, creationTimeMs), and a
BranchAlreadyExistsException for the duplicate-create case.
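
To make the contract above concrete, here is a minimal in-memory sketch. It is not the proposed Spark code: every type is a hypothetical stand-in that only mirrors the names in this ticket, the exception is unchecked for brevity, and the semantics of an empty sourceSnapshotId, of replaceBranch, and of fastForward are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical sketch: these nested types only mirror the shapes named in the
// ticket; they are not the real Spark classes.
class BranchingSketch {

    // Stand-in for TableBranch(name, snapshotId, creationTimeMs).
    static final class TableBranch {
        final String name;
        final long snapshotId;
        final long creationTimeMs;

        TableBranch(String name, long snapshotId, long creationTimeMs) {
            this.name = name;
            this.snapshotId = snapshotId;
            this.creationTimeMs = creationTimeMs;
        }
    }

    // Stand-in for the proposed exception; unchecked here for brevity (assumption).
    static final class BranchAlreadyExistsException extends RuntimeException {
        BranchAlreadyExistsException(String name) {
            super("Branch already exists: " + name);
        }
    }

    // Toy table: one current snapshot id plus a name -> branch map.
    static final class InMemoryBranchTable {
        private final Map<String, TableBranch> branches = new LinkedHashMap<>();
        private final long currentSnapshotId;

        InMemoryBranchTable(long currentSnapshotId) {
            this.currentSnapshotId = currentSnapshotId;
        }

        TableBranch createBranch(String name, OptionalLong sourceSnapshotId) {
            if (branches.containsKey(name)) {
                throw new BranchAlreadyExistsException(name);
            }
            return put(name, sourceSnapshotId);
        }

        // Assumed semantics: create the branch, overwriting any existing one.
        TableBranch replaceBranch(String name, OptionalLong sourceSnapshotId) {
            return put(name, sourceSnapshotId);
        }

        // Returns true only if a branch was actually removed.
        boolean dropBranch(String name) {
            return branches.remove(name) != null;
        }

        // Assumed semantics: point `branch` at the snapshot of `targetBranch`.
        // A real implementation would likely also check ancestry; this toy does not.
        TableBranch fastForward(String branch, String targetBranch) {
            TableBranch source = require(branch);
            TableBranch target = require(targetBranch);
            TableBranch moved =
                new TableBranch(branch, target.snapshotId, source.creationTimeMs);
            branches.put(branch, moved);
            return moved;
        }

        TableBranch[] listBranches() {
            return branches.values().toArray(new TableBranch[0]);
        }

        private TableBranch put(String name, OptionalLong sourceSnapshotId) {
            // An empty snapshot id is assumed to mean "the table's current snapshot".
            long id = sourceSnapshotId.orElse(currentSnapshotId);
            TableBranch b = new TableBranch(name, id, System.currentTimeMillis());
            branches.put(name, b);
            return b;
        }

        private TableBranch require(String name) {
            TableBranch b = branches.get(name);
            if (b == null) {
                throw new IllegalArgumentException("No such branch: " + name);
            }
            return b;
        }
    }

    public static void main(String[] args) {
        InMemoryBranchTable t = new InMemoryBranchTable(7L);
        t.createBranch("dev", OptionalLong.empty());
        t.createBranch("audit", OptionalLong.of(3L));
        t.fastForward("audit", "dev");
        for (TableBranch b : t.listBranches()) {
            System.out.println(b.name + " -> snapshot " + b.snapshotId);
        }
    }
}
```

The boolean returned by dropBranch lets a caller distinguish "dropped" from "was not there", which is the information a DROP BRANCH IF EXISTS path would need.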

Standardised DDL:

    ALTER TABLE t CREATE [OR REPLACE] BRANCH [IF NOT EXISTS] name
        [AS OF VERSION <integer>]
    ALTER TABLE t DROP BRANCH [IF EXISTS] name
    ALTER TABLE t FAST FORWARD branch TO target
    SHOW BRANCHES (FROM | IN) t
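
One plausible way the optional clauses in the CREATE and DROP forms could map onto branch operations is sketched below. This is an illustration only: ToyTable, execCreate and execDrop are hypothetical names, the exception types are placeholders, and the ticket does not say whether the OR REPLACE / IF NOT EXISTS conflict is rejected at parse time or later, nor how DROP without IF EXISTS reports a missing branch.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of how the DDL flags might drive branch operations.
class BranchDdlSketch {

    // Toy stand-in for a table that supports branching; tracks names only.
    static final class ToyTable {
        final Set<String> branches = new HashSet<>();

        void createBranch(String name) {
            if (!branches.add(name)) {
                throw new IllegalStateException("Branch already exists: " + name);
            }
        }

        void replaceBranch(String name) {
            branches.add(name);
        }

        boolean dropBranch(String name) {
            return branches.remove(name);
        }
    }

    // ALTER TABLE t CREATE [OR REPLACE] BRANCH [IF NOT EXISTS] name
    static void execCreate(ToyTable t, String name, boolean orReplace, boolean ifNotExists) {
        if (orReplace && ifNotExists) {
            // The two clauses contradict each other; the ticket names an error
            // condition for this case. The exception type here is a placeholder.
            throw new IllegalArgumentException("CREATE_BRANCH_WITH_IF_NOT_EXISTS_AND_REPLACE");
        }
        if (orReplace) {
            t.replaceBranch(name);
        } else if (ifNotExists && t.branches.contains(name)) {
            // Assumed semantics: IF NOT EXISTS turns a duplicate create into a no-op.
        } else {
            t.createBranch(name);
        }
    }

    // ALTER TABLE t DROP BRANCH [IF EXISTS] name
    static void execDrop(ToyTable t, String name, boolean ifExists) {
        boolean dropped = t.dropBranch(name);
        if (!dropped && !ifExists) {
            // Assumed semantics: without IF EXISTS a missing branch is an error.
            throw new IllegalStateException("Branch not found: " + name);
        }
    }

    public static void main(String[] args) {
        ToyTable t = new ToyTable();
        execCreate(t, "dev", false, false);
        execCreate(t, "dev", false, true);  // no-op, branch already present
        execDrop(t, "dev", false);
        execDrop(t, "dev", true);           // no-op, branch already gone
        System.out.println("branches left: " + t.branches.size());
    }
}
```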

Scope:
  * Define the interface and value types.
  * Extend the ANTLR grammar with the four DDL forms; register BRANCH,
    BRANCHES, FAST, FORWARD as non-reserved keywords.
  * Add logical plans (CreateBranch / DropBranch / FastForwardBranch /
    ShowBranches), resolve them through ResolvedTable, and dispatch to
    new V2 exec nodes via DataSourceV2Strategy.
  * Add an asBranchable helper and tableDoesNotSupportBranchingError so
    non-branching tables fail with a clear message.
  * Implement SupportsBranching on InMemoryTable for testing.
  * Add a new error condition CREATE_BRANCH_WITH_IF_NOT_EXISTS_AND_REPLACE.
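
The asBranchable item above amounts to a checked downcast. A minimal sketch follows, with hypothetical stand-in interfaces and a placeholder exception in place of whatever tableDoesNotSupportBranchingError would actually produce; the message wording is also an assumption.

```java
// Hypothetical sketch of an asBranchable-style helper; none of these types
// are the real Spark classes.
class BranchableSketch {

    // Stand-in for the DSv2 Table interface, reduced to a name.
    interface Table {
        String name();
    }

    // Stand-in for the proposed capability interface (methods omitted).
    interface SupportsBranching extends Table {}

    static final class PlainTable implements Table {
        public String name() { return "plain_t"; }
    }

    static final class BranchTable implements SupportsBranching {
        public String name() { return "branch_t"; }
    }

    // Returns the table viewed as SupportsBranching, or fails with a message
    // that names the table. The exception type and text are placeholders.
    static SupportsBranching asBranchable(Table table) {
        if (table instanceof SupportsBranching) {
            return (SupportsBranching) table;
        }
        throw new UnsupportedOperationException(
            "Table " + table.name() + " does not support branching");
    }

    public static void main(String[] args) {
        System.out.println(asBranchable(new BranchTable()).name());
        try {
            asBranchable(new PlainTable());
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Centralising the check in one helper means every branching exec node fails the same way for a non-branching table, which is the "clear message" goal stated in the scope.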

Non-goals (deferred to SPARK-57057):
  * SELECT / INSERT against a specific branch.
  * Any session-level default-branch configuration.

Data sources that do not implement SupportsBranching are unaffected.
The work follows existing DSv2 conventions (compare SupportsDelete,
TruncatableTable, SupportsRowLevelOperations).



> Add SupportsBranching DSv2 interface and branching DDL
> ------------------------------------------------------
>
>                 Key: SPARK-57056
>                 URL: https://issues.apache.org/jira/browse/SPARK-57056
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 4.3.0
>            Reporter: L. C. Hsieh
>            Assignee: L. C. Hsieh
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
