[ 
https://issues.apache.org/jira/browse/SPARK-57057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-57057:
-----------------------------------
    Labels: pull-request-available  (was: )

> Allow SELECT and INSERT to target a specific branch on SupportsBranching 
> tables
> -------------------------------------------------------------------------------
>
>                 Key: SPARK-57057
>                 URL: https://issues.apache.org/jira/browse/SPARK-57057
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 4.3.0
>            Reporter: L. C. Hsieh
>            Assignee: L. C. Hsieh
>            Priority: Major
>              Labels: pull-request-available
>
> Builds on SPARK-57056 (SupportsBranching DSv2 interface and branching
> DDL). Once a data source can expose branches, users need a way to read
> and write data against a specific branch.
> This ticket adds two complementary mechanisms.
> 1. Per-query temporal clause
>    Extend the existing temporalClause so it accepts a branch:
>        SELECT * FROM t FOR BRANCH 'dev'
>        SELECT * FROM t VERSION AS OF BRANCH 'dev'
>        SELECT * FROM t SYSTEM_VERSION AS OF BRANCH 'dev'
>        INSERT INTO t FOR BRANCH 'dev' SELECT ...
>        INSERT OVERWRITE t FOR BRANCH 'dev' SELECT ...
>        INSERT INTO t FOR BRANCH 'dev' REPLACE WHERE / REPLACE USING ...
>    BRANCH is the only temporal variant allowed on writes. VERSION AS OF
>    <int> and TIMESTAMP AS OF <ts> on writes remain rejected (existing
>    Spark constraint, surfaced at parse time with a clearer error).
> 2. Session default branch
>    New config:
>        spark.sql.defaultBranch
>    When non-empty, every read and write against a SupportsBranching
>    table is routed to the named branch. Tables that do not implement
>    SupportsBranching silently ignore the config. An explicit FOR BRANCH
>    clause always overrides the config.
> Precedence:
>   1. Explicit FOR BRANCH / VERSION AS OF BRANCH in the query.
>   2. spark.sql.defaultBranch.
>   3. Today's behaviour (no branch targeting).
> Implementation notes:
>   * SupportsBranching gains loadBranch(name): Table.
>   * TimeTravelSpec gains AsOfBranch(branch, isExplicit). RelationTimeTravel
>     carries an optional branch field.
>   * UnresolvedRelation carries the branch on writes via a reserved
>     internal option (mirrors REQUIRED_WRITE_PRIVILEGES), so the
>     NamedRelation slot in InsertIntoStatement / OverwriteByExpression
>     is preserved.
>   * CatalogV2Util.getTable composes loadTable + loadBranch, lifting the
>     "no time travel on writes" assertion only for the branch case.
>   * The default branch is applied only on the persistent relation
>     resolution path; temp views are unaffected.
>   * InMemoryTable.loadBranch returns an independent InMemoryTable per
>     branch so reads and writes are isolated end-to-end in tests.
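
The precedence rules in the description can be modelled as a small pure function. This is only an illustrative sketch, not Spark code: `resolve_branch` and its parameters are hypothetical names, and how an explicit branch clause behaves on a table that does not implement SupportsBranching is not stated in the ticket, so the sketch simply returns the explicit name and leaves that case to the caller.

```python
from typing import Optional


def resolve_branch(explicit_branch: Optional[str],
                   session_default: str,
                   supports_branching: bool) -> Optional[str]:
    """Return the branch a query should target, or None for no branch targeting.

    Hypothetical model of the precedence list in the ticket; not a Spark API.
    """
    # 1. An explicit FOR BRANCH / VERSION AS OF BRANCH clause always wins.
    if explicit_branch is not None:
        return explicit_branch
    # 2. Otherwise use spark.sql.defaultBranch, but only when it is non-empty
    #    and the table implements SupportsBranching (other tables ignore it).
    if session_default and supports_branching:
        return session_default
    # 3. Today's behaviour: no branch targeting.
    return None


if __name__ == "__main__":
    print(resolve_branch("dev", "main", True))
    print(resolve_branch(None, "main", True))
    print(resolve_branch(None, "main", False))
```

Treating an empty config value as "unset" matches the "When non-empty" wording for spark.sql.defaultBranch in the description.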



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
