[
https://issues.apache.org/jira/browse/SPARK-57057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated SPARK-57057:
-----------------------------------
Labels: pull-request-available (was: )
> Allow SELECT and INSERT to target a specific branch on SupportsBranching
> tables
> -------------------------------------------------------------------------------
>
> Key: SPARK-57057
> URL: https://issues.apache.org/jira/browse/SPARK-57057
> Project: Spark
> Issue Type: New Feature
> Components: SQL
> Affects Versions: 4.3.0
> Reporter: L. C. Hsieh
> Assignee: L. C. Hsieh
> Priority: Major
> Labels: pull-request-available
>
> Builds on SPARK-57056 (SupportsBranching DSv2 interface and branching
> DDL). Once a data source can expose branches, users need a way to read
> and write data against a specific branch.
> This ticket adds two complementary mechanisms.
> 1. Per-query temporal clause
> Extend the existing temporalClause so it accepts a branch:
>     SELECT * FROM t FOR BRANCH 'dev'
>     SELECT * FROM t VERSION AS OF BRANCH 'dev'
>     SELECT * FROM t SYSTEM_VERSION AS OF BRANCH 'dev'
>     INSERT INTO t FOR BRANCH 'dev' SELECT ...
>     INSERT OVERWRITE t FOR BRANCH 'dev' SELECT ...
>     INSERT INTO t FOR BRANCH 'dev' REPLACE WHERE / REPLACE USING ...
> BRANCH is the only temporal variant allowed on writes. VERSION AS OF
> <int> and TIMESTAMP AS OF <ts> remain rejected on writes; this is an
> existing Spark constraint, now surfaced at parse time with a clearer
> error.
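The write-side rule above can be illustrated with a small model. This is only a sketch in plain Python, not Spark's parser: `TemporalClause`, `ParseError`, and `check_write_clause` are hypothetical names invented for illustration, and the error text is made up.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class TemporalClause:
    """Hypothetical stand-in for a parsed temporal clause."""
    kind: str                 # "branch", "version", or "timestamp"
    value: Union[str, int]    # branch name, version number, or timestamp text


class ParseError(ValueError):
    """Hypothetical error type; Spark's real parser raises its own exception."""


def check_write_clause(clause: Optional[TemporalClause]) -> Optional[str]:
    """Return the branch a write targets, or None when no clause is present.

    Every temporal variant other than BRANCH is rejected, mirroring the
    constraint described in the ticket.
    """
    if clause is None:
        return None
    if clause.kind == "branch":
        return str(clause.value)
    raise ParseError(
        f"{clause.kind.upper()} AS OF is not allowed on writes; "
        "only a BRANCH clause is accepted"
    )
```

In this model a write with `TemporalClause("branch", "dev")` resolves to `"dev"`, while a `"version"` or `"timestamp"` clause fails before any table is touched, which corresponds to failing at parse time.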
> 2. Session default branch
> New config:
>     spark.sql.defaultBranch
> When non-empty, every read and write against a SupportsBranching
> table is routed to the named branch. Tables that do not implement
> SupportsBranching silently ignore the config. An explicit branch
> clause in the query always overrides the config.
> Precedence:
> 1. Explicit FOR BRANCH / VERSION AS OF BRANCH in the query.
> 2. spark.sql.defaultBranch.
> 3. Today's behaviour (no branch targeting).
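The precedence order can be written as a single function. This is a minimal sketch under stated assumptions: `effective_branch` and its parameters are hypothetical, and the ticket does not say what happens when an explicit branch clause targets a table without SupportsBranching, so the sketch simply passes the explicit branch through.

```python
from typing import Optional


def effective_branch(
    explicit_branch: Optional[str],
    default_branch: str,
    supports_branching: bool,
) -> Optional[str]:
    """Pick the branch a query should target, or None for today's behaviour.

    explicit_branch:    branch named in the query, if any
    default_branch:     value of spark.sql.defaultBranch ("" when unset)
    supports_branching: whether the table implements SupportsBranching
    """
    # 1. An explicit clause in the query always wins.
    #    (Assumption: validation against non-branching tables happens elsewhere.)
    if explicit_branch is not None:
        return explicit_branch
    # 2. The session default applies only to branching-capable tables;
    #    other tables silently ignore it.
    if default_branch and supports_branching:
        return default_branch
    # 3. No branch targeting.
    return None
```

For example, with the default set to `"staging"`, a query that names `"dev"` resolves to `"dev"`, a query with no clause resolves to `"staging"` on a branching table, and the same query resolves to `None` on a table that does not support branching.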
> Implementation notes:
> * SupportsBranching gains loadBranch(name): Table.
> * TimeTravelSpec gains AsOfBranch(branch, isExplicit). RelationTimeTravel
> carries an optional branch field.
> * UnresolvedRelation carries the branch on writes via a reserved
> internal option (mirrors REQUIRED_WRITE_PRIVILEGES), so the
> NamedRelation slot in InsertIntoStatement / OverwriteByExpression
> is preserved.
> * CatalogV2Util.getTable composes loadTable + loadBranch, lifting the
> "no time travel on writes" assertion only for the branch case.
> * The default branch is applied only on the persistent relation
> resolution path; temp views are unaffected.
> * InMemoryTable.loadBranch returns an independent InMemoryTable per
> branch so reads and writes are isolated end-to-end in tests.
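The loadTable + loadBranch composition and the per-branch isolation used in tests can be modelled roughly as follows. This is a plain-Python sketch, not the Spark classes: `InMemoryTable`, `create_branch`, `load_branch`, and `get_table` here are simplified stand-ins, and seeding a new branch with a copy of the current rows is an assumption the ticket does not state.

```python
from typing import Dict, List, Optional, Tuple


class InMemoryTable:
    """Simplified stand-in for a branching-capable test table."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.rows: List[Tuple] = []
        self._branches: Dict[str, "InMemoryTable"] = {}

    def create_branch(self, branch: str) -> None:
        # Assumption: a new branch starts from a copy of the current rows.
        table = InMemoryTable(f"{self.name}@{branch}")
        table.rows = list(self.rows)
        self._branches[branch] = table

    def load_branch(self, branch: str) -> "InMemoryTable":
        # Each branch is its own table object, so writes stay isolated.
        # Raises KeyError for an unknown branch.
        return self._branches[branch]


def get_table(
    catalog: Dict[str, InMemoryTable],
    name: str,
    branch: Optional[str] = None,
) -> InMemoryTable:
    """Load the table, then switch to the branch when one is requested."""
    table = catalog[name]
    if branch is None:
        return table
    return table.load_branch(branch)
```

A write through `get_table(catalog, "t", "dev")` appends to the branch's own row list, so a later read of `get_table(catalog, "t")` does not see it.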