Hi all, Today, the Iceberg spec has table properties defining the transaction isolation levels: write.delete/update/merge.isolation-level. These properties can be set to either `snapshot` or `serializable`. With a properly designed writer and Iceberg multi version snapshots, we can achieve single table snapshot isolation or even serializable isolation.
But for queries involving multiple tables, the spec does not provide a mechanism to achieve a global snapshot consistency. The Iceberg REST Catalog (IRC) API provides only single-table load operation: LoadTable, and clients would need to call this API multiple times to resolve table metadata in a single query statement - each could represent a different snapshot view of the catalog. This creates problem especially for engines that already support global SI. For example, the transaction semantics for AWS Redshift when query its native tables is different than querying against Iceberg tables, which surprises customers at times. There were proposals in the past in the context of multi-statement transaction discussion (https://docs.google.com/document/d/1jr4Ah8oceOmo6fwxG_0II4vKDUHUKScb/edit#heading=h.qb9z621zr507). But I feel these proposals are too complicated and require significant changes to the catalog/IRC protocol. Here I propose a simpler approach: add a batch LoadTables API, and rely on the catalog's underlying system-of-record to provide snapshot isolation for that batch read. When a client calls LoadTables({table_a, table_b, table_c}), the catalog reads the current metadata for all requested tables in a single consistent operation (e.g., a TransactGetItems in DynamoDB, or a single SI read in a relational DB). The client receives a consistent cross-table snapshot — the latest committed state of all requested tables as of a single point in time. This would give us the statement level global snapshot consistency. It doesn’t provide full transaction level SI consistency for multi statement transactions, but I believe it’s a reasonable trade off. I capture the details of this proposal in this doc - https://docs.google.com/document/d/1u11b4pzeFUKD0XX--nHPj-DoYcNeCgOe94WKCaX2XMI/edit?usp=sharing I also created a prototype that implements the LoadTables API for Apache Polaris, levering the underlying Postgres for the snapshot isolation - https://github.com/xndai/polaris/commit/f4eb514a2920effe67ecfb8c64e2e3fa418baf11 Feedbacks and comments are welcomed!
