codope commented on code in PR #9537: URL: https://github.com/apache/hudi/pull/9537#discussion_r1310492751
########## rfc/rfc-73/rfc-73.md: ##########

@@ -0,0 +1,416 @@

<!--
  Licensed to the Apache Software Foundation (ASF) under one or more
  contributor license agreements. See the NOTICE file distributed with
  this work for additional information regarding copyright ownership.
  The ASF licenses this file to You under the Apache License, Version 2.0
  (the "License"); you may not use this file except in compliance with
  the License. You may obtain a copy of the License at

      http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
-->

# RFC-73: Multi-Table Transactions

## Proposers

- @codope

## Approvers

- @vinothchandar

## Status

JIRA: [HUDI-6709](https://issues.apache.org/jira/browse/HUDI-6709)

## Abstract

Modern data lake architectures often comprise numerous interconnected tables. Operations such as data backfills,
upserts, deletes, or complex transformations may span multiple tables. In these scenarios, it is crucial that
these operations are atomic: they either succeed across all tables or fail without partial writes. This ensures
data consistency across the entire dataset. Users can design data workflows with the assurance that operations spanning
multiple tables are treated as a single atomic unit.

## Background

Hudi has always emphasized transactional guarantees, ensuring data integrity and consistency for a specific table.
Central to Hudi's approach to transactions is its [timeline](https://hudi.apache.org/docs/timeline), which logs all
actions (commits, deltacommits, rollbacks, etc.) on the table.
With the timeline as the source of truth for all changes on the table, Hudi employs
tunable [concurrency control](https://hudi.apache.org/docs/concurrency_control) to allow concurrent reads and writes
with [snapshot isolation](https://cwiki.apache.org/confluence/display/HUDI/RFC+-+22+%3A+Snapshot+Isolation+using+Optimistic+Concurrency+Control+for+multi-writers).
This is achieved by leveraging the timeline to determine the latest consistent snapshot of the table. Additionally,
users can bring their own custom conflict resolution strategy by implementing the `ConflictResolutionStrategy`
interface.

However, the current implementation cannot be extended as-is to support atomic operations across multiple tables.
First, we need a notion of a "database" and its tables in order to associate a transaction with multiple tables.
Second, Hudi's timeline would need to evolve to account for changes across multiple tables, which introduces
complexity in tracking and managing the order of operations across tables. With multiple tables involved, the points
of potential failure increase: a failure in one table can cascade and affect the entire transaction. In case of
failures, rolling back changes becomes more intricate in multi-table scenarios. Ensuring data consistency across tables
during rollbacks, especially when there are inter-table dependencies, introduces additional challenges. Finally, the
current concurrency control implementation is not designed to handle multi-table transactions. Hence, this RFC proposes
a new design to support multi-table transactions.

## Design

First, let us discuss the goals and non-goals, which will help in understanding the design considerations.

### Goals

1. **Atomicity Across Tables:** Ensure that a set of changes spanning multiple tables either fully completes or fully
   rolls back. No partial commits should occur.
2.
**Consistency:** Ensure that, after every transaction, all involved tables are in a consistent state: avoid dirty
   reads and non-repeatable reads, and handle read-write and write-write conflicts.
3. **Isolation:** Multiple concurrent transactions should not interfere with each other. One transaction should not see
   the intermediate states of another ongoing transaction.
4. **Durability:** Once a multi-table transaction is committed, the changes should be permanent and recoverable, even
   after system failures.
5. **Performance:** The multi-table transaction mechanism should introduce minimal overhead, and its performance impact
   on typical Hudi operations should be kept low. By corollary, locking (in the case of OCC) cannot be coarser than the
   file level.
6. **Monitoring:** Integration with tools such as hudi-cli to track the status and health of transactions, separate
   transaction metrics, and comprehensive logs for debugging transaction failures or issues.
7. **Integration with Current Hudi Features:** Multi-table transactions should integrate seamlessly with existing
   Hudi features and should not break existing APIs or workflows. In particular, the `TransactionManager`,
   `LockManager`, and `ConflictResolutionStrategy` APIs should work as they do today with a single table.
8. **Configurability:** Allow users to configure the behavior of multi-table transactions, e.g., the concurrency
   control mechanism, timeout durations, etc.
9. **Recovery:** Rollbacks, savepoints, and restores should be supported.

### Non-goals

1. **Cross-Database Transactions:** Transactions spanning multiple databases might be too complex as an initial target.
2. **Granular Record Locking:** Instead of locking at the granularity of individual records, coarse-grained locking
   (such as at the file level) might be more practical to start with. However, we should avoid table- or
   partition-level locking, as noted in the goals.
3.
**Distributed Transactions:** We are not going to consider transactions in the sense of distributed databases, and

Review Comment:
   Yes, that's a design assumption. Each storage node is trusted to durably and reliably handle writes for an
   individual table; there is an implicit expectation of fault-tolerance at the storage level, and none of the storage
   nodes hold replicas. This is unlike distributed databases such as Amazon Aurora or CockroachDB, which use a
   distributed consensus algorithm or quorum membership to guarantee consistency.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
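The all-or-nothing semantics discussed above can be sketched as a minimal prepare/commit/abort protocol across several tables. This is a hypothetical toy: the names `MultiTableTxnSketch`, `Table`, and `transactWrite` are invented for illustration, and this is not Hudi's actual transaction API; it only demonstrates the atomicity goal, i.e. that a failure on any one table rolls back the staged writes on every table.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical sketch (NOT Hudi's API) of the atomicity goal: a set of writes
 * spanning multiple tables either fully commits or fully rolls back.
 */
public class MultiTableTxnSketch {

    /** A toy "table" with committed state and a staging area. */
    static class Table {
        final Map<String, String> committed = new HashMap<>();
        final Map<String, String> staged = new HashMap<>();
        final boolean failOnPrepare; // simulates a prepare-phase failure (e.g. a write conflict)

        Table(boolean failOnPrepare) { this.failOnPrepare = failOnPrepare; }

        boolean prepare(String key, String value) {
            if (failOnPrepare) return false;
            staged.put(key, value);
            return true;
        }

        void commit() { committed.putAll(staged); staged.clear(); }
        void abort() { staged.clear(); }
    }

    /** Applies the same write to every table atomically: all or nothing. */
    static boolean transactWrite(List<Table> tables, String key, String value) {
        for (Table t : tables) {
            if (!t.prepare(key, value)) {
                tables.forEach(Table::abort); // roll back staged changes everywhere
                return false;
            }
        }
        tables.forEach(Table::commit); // every table prepared: publish the changes
        return true;
    }

    public static void main(String[] args) {
        // Happy path: both tables commit the write.
        Table a = new Table(false), b = new Table(false);
        System.out.println(transactWrite(List.of(a, b), "k", "v1")); // true

        // One table fails during prepare: neither table sees a partial write.
        Table c = new Table(false), d = new Table(true);
        System.out.println(transactWrite(List.of(c, d), "k", "v2")); // false
        System.out.println(c.committed.containsKey("k")); // false
    }
}
```

The sketch deliberately stages writes before publishing them, mirroring the RFC's requirement that no partial commit is ever observable.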
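The goal that OCC conflict detection be no coarser than the file level can likewise be sketched: two concurrent writers conflict only if the sets of file groups they touched overlap. This is a hedged simplification under assumed names (`FileLevelConflictSketch`, `hasWriteConflict`, file-group IDs like `"fg-1"`), not the real signature of Hudi's `ConflictResolutionStrategy`.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical file-level write-write conflict check (not Hudi's real API). */
public class FileLevelConflictSketch {

    /** Two concurrent writers conflict iff they modified a common file group. */
    static boolean hasWriteConflict(Set<String> filesTouchedByTxn1, Set<String> filesTouchedByTxn2) {
        Set<String> overlap = new HashSet<>(filesTouchedByTxn1);
        overlap.retainAll(filesTouchedByTxn2); // intersection of touched file groups
        return !overlap.isEmpty();
    }

    public static void main(String[] args) {
        // Writers touching disjoint file groups can both commit.
        System.out.println(hasWriteConflict(Set.of("fg-1", "fg-2"), Set.of("fg-3"))); // false
        // Writers touching the same file group conflict; one must abort or retry.
        System.out.println(hasWriteConflict(Set.of("fg-1", "fg-2"), Set.of("fg-2"))); // true
    }
}
```

Checking overlap at file-group granularity is what keeps locking finer than table or partition level, per the goals above.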
