Hi Xiening,

The LoadTables proposal above seems to address the problem of atomically reading metadata.json across multiple tables "as of" a consistent time. The CSN proposal provides a detailed explanation of how to achieve this <https://docs.google.com/document/d/1KVgUJc1WgftHfLz118vMbEE7HV8_pUDk4s-GJFDyAOE/edit?tab=t.0#bookmark=id.ue33k3ujfi7s>. It does not require reading metadata.json N times for a single table or pinning the catalog state (I have added comments and provided links to the relevant sections). Also, there is no need to rewrite the artifacts (manifests and manifest lists) stored in cloud storage, because the CSN lives only in the TableMetadata, which for REST catalogs is written only by the catalog.
The rest of the proposal aligns closely with the CSN proposal described here <https://docs.google.com/document/d/1KVgUJc1WgftHfLz118vMbEE7HV8_pUDk4s-GJFDyAOE/edit?tab=t.0#heading=h.nwyigim62nez>.

Thanks,
Maninder

On Wed, Jun 3, 2026 at 8:59 AM Xiening Dai <[email protected]> wrote:

> Hi all,
>
> Today, the Iceberg spec has table properties defining the transaction
> isolation levels: write.delete/update/merge.isolation-level. These
> properties can be set to either `snapshot` or `serializable`. With a
> properly designed writer and Iceberg multi version snapshots, we can
> achieve single table snapshot isolation or even serializable isolation.
>
> But for queries involving multiple tables, the spec does not provide a
> mechanism to achieve a global snapshot consistency. The Iceberg REST
> Catalog (IRC) API provides only single-table load operation: LoadTable, and
> clients would need to call this API multiple times to resolve table
> metadata in a single query statement - each could represent a different
> snapshot view of the catalog.
>
> This creates problem especially for engines that already support global
> SI. For example, the transaction semantics for AWS Redshift when query its
> native tables is different than querying against Iceberg tables, which
> surprises customers at times.
>
> There were proposals in the past in the context of multi-statement
> transaction discussion (
> https://docs.google.com/document/d/1jr4Ah8oceOmo6fwxG_0II4vKDUHUKScb/edit#heading=h.qb9z621zr507).
> But I feel these proposals are too complicated and require significant
> changes to the catalog/IRC protocol.
>
> Here I propose a simpler approach: add a batch LoadTables API, and rely on
> the catalog's underlying system-of-record to provide snapshot isolation for
> that batch read.
>
> When a client calls LoadTables({table_a, table_b, table_c}), the catalog
> reads the current metadata for all requested tables in a single consistent
> operation (e.g., a TransactGetItems in DynamoDB, or a single SI read in a
> relational DB). The client receives a consistent cross-table snapshot — the
> latest committed state of all requested tables as of a single point in time.
>
> This would give us the statement level global snapshot consistency. It
> doesn’t provide full transaction level SI consistency for multi statement
> transactions, but I believe it’s a reasonable trade off.
>
> I capture the details of this proposal in this doc -
> https://docs.google.com/document/d/1u11b4pzeFUKD0XX--nHPj-DoYcNeCgOe94WKCaX2XMI/edit?usp=sharing
>
> I also created a prototype that implements the LoadTables API for Apache
> Polaris, levering the underlying Postgres for the snapshot isolation -
> https://github.com/xndai/polaris/commit/f4eb514a2920effe67ecfb8c64e2e3fa418baf11
>
> Feedbacks and comments are welcomed!
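
For illustration only, the batch read described in the quoted proposal might be sketched roughly as below. This is not the Polaris prototype and not an IRC API: it assumes an in-memory SQLite database as a stand-in for the catalog's system of record, and the table name `iceberg_tables`, the functions `make_catalog` and `load_tables`, and the `s3://bucket/...` paths are all hypothetical placeholders. The point is only that every requested pointer comes from one SELECT, so the result reflects a single point in time rather than N separate LoadTable calls.

```python
import sqlite3


def make_catalog():
    """Create an in-memory stand-in for the catalog's system of record."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE iceberg_tables ("
        "name TEXT PRIMARY KEY, metadata_location TEXT NOT NULL)"
    )
    # Placeholder pointers; real catalogs store whatever location they manage.
    conn.executemany(
        "INSERT INTO iceberg_tables VALUES (?, ?)",
        [
            ("table_a", "s3://bucket/a/metadata/v3.metadata.json"),
            ("table_b", "s3://bucket/b/metadata/v7.metadata.json"),
            ("table_c", "s3://bucket/c/metadata/v1.metadata.json"),
        ],
    )
    conn.commit()
    return conn


def load_tables(conn, names):
    """Return {name: metadata_location} for all names using one SELECT.

    A single statement sees one consistent database state, so the pointers
    returned together belong to the same point in time.
    """
    names = list(names)
    if not names:
        return {}
    placeholders = ",".join("?" for _ in names)
    rows = conn.execute(
        "SELECT name, metadata_location FROM iceberg_tables "
        f"WHERE name IN ({placeholders})",
        names,
    ).fetchall()
    found = dict(rows)
    missing = [n for n in names if n not in found]
    if missing:
        # Fail the whole batch rather than return a partial snapshot.
        raise KeyError(f"tables not found: {missing}")
    return found


if __name__ == "__main__":
    catalog = make_catalog()
    print(load_tables(catalog, ["table_a", "table_b", "table_c"]))
```

Failing the whole batch when any table is missing is a design choice of this sketch, made so that a caller never mistakes a partial result for a consistent cross-table snapshot.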
