Vinoth Chandar created HUDI-6709:
------------------------------------
Summary: Multi Table Transactions
Key: HUDI-6709
URL: https://issues.apache.org/jira/browse/HUDI-6709
Project: Apache Hudi
Issue Type: Epic
Reporter: Vinoth Chandar
Assignee: Vinoth Chandar
Fix For: 1.0.0
h3. Strawman idea:
* Introduce a notion of a "database" into Hudi's core (think of it as analogous
to a database server), with its own timeline.
* We introduce a parent-child relationship between the table's timeline and
the database timeline, i.e., an action is complete only if it's completed in both
timelines (similar to data <=> metadata table sync today, although we can't
reuse that).
* A multi-table transaction will first create the action on the database
timeline, then perform actions on individual tables, and finally complete it
on the database timeline.
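The three steps above can be sketched as a toy in-memory model. This is only an illustration of the strawman, not Hudi code: the `Timeline`, `Database`, `run_transaction` and `is_visible` names are hypothetical, instants are plain strings, and the sketch assumes every action is also registered on the database timeline.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    INFLIGHT = "inflight"
    COMPLETED = "completed"


@dataclass
class Timeline:
    """Hypothetical in-memory timeline: maps an instant to its state."""
    name: str
    instants: dict = field(default_factory=dict)

    def start(self, instant):
        self.instants[instant] = State.INFLIGHT

    def complete(self, instant):
        self.instants[instant] = State.COMPLETED

    def is_completed(self, instant):
        return self.instants.get(instant) == State.COMPLETED


@dataclass
class Database:
    """Hypothetical 'database' owning a parent timeline and its tables' timelines."""
    timeline: Timeline
    tables: dict = field(default_factory=dict)  # table name -> Timeline

    def add_table(self, name):
        self.tables[name] = Timeline(name)

    def is_visible(self, table, instant):
        # Parent-child rule: an action counts as complete only if it is
        # completed on BOTH the table timeline and the database timeline.
        return (self.tables[table].is_completed(instant)
                and self.timeline.is_completed(instant))

    def run_transaction(self, instant, writes):
        """writes: dict of table name -> zero-argument callable doing the write."""
        # 1. Create the action on the database timeline.
        self.timeline.start(instant)
        # 2. Perform the action on each individual table.
        for table, write in writes.items():
            table_timeline = self.tables[table]
            table_timeline.start(instant)
            write()  # if this raises, the database instant stays inflight
            table_timeline.complete(instant)
        # 3. Finally complete the action on the database timeline.
        self.timeline.complete(instant)
```

In this sketch, a failure partway through step 2 leaves the database instant inflight, so writes already completed on individual table timelines stay invisible to readers that apply `is_visible`. Rollback, isolation levels and concurrent writers are deliberately left out, since those are the open items.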
h3. Open items:
* Need to formalize the design with considerations around isolation levels,
nested queries, self joins, and avoiding phantom reads.
* Need to lay out how we can deliver this via API and SQL (Spark for now).
* How this interplays with multi-writer scenarios and async table services.