Always welcome, please do

On Wed, Jun 17, 2026 at 12:55 PM Xiening Dai <[email protected]> wrote:

> Is it OK if I add this topic to the community sync so we can get some
> traction on this issue?
>
> On 2026/06/05 23:16:04 Xiening Dai wrote:
> > And I replied to your comments in the doc. Thank you.
> >
> > On 2026/06/04 23:35:04 Maninder Parmar wrote:
> > > Hi Xiening,
> > > The LoadTables proposal above seems to address the problem of atomically
> > > reading the metadata.json across multiple tables "as of" a consistent time;
> > > the CSN proposal provides a detailed explanation
> > > <https://docs.google.com/document/d/1KVgUJc1WgftHfLz118vMbEE7HV8_pUDk4s-GJFDyAOE/edit?tab=t.0#bookmark=id.ue33k3ujfi7s>
> > > of how to achieve it.
> > > It does not require reading metadata.json N times for a single table or
> > > pinning the catalog state (I have added comments and provided links to
> > > relevant sections). Also, there is no need to rewrite the artifacts
> > > (manifests/manifest lists) stored in cloud storage, as the CSN lives only in
> > > the TableMetadata, which is written only by the catalog for REST
> > > catalogs.
> > >
> > > The rest of the proposal aligns closely with the CSN proposal described here
> > > <https://docs.google.com/document/d/1KVgUJc1WgftHfLz118vMbEE7HV8_pUDk4s-GJFDyAOE/edit?tab=t.0#heading=h.nwyigim62nez>.
> > >
> > > Thanks,
> > > Maninder
> > >
> > >
> > >
> > >
> > >
> > > On Wed, Jun 3, 2026 at 8:59 AM Xiening Dai <[email protected]> wrote:
> > >
> > > > Hi all,
> > > >
> > > > Today, the Iceberg spec has table properties defining the transaction
> > > > isolation levels: write.delete/update/merge.isolation-level. These
> > > > properties can be set to either `snapshot` or `serializable`. With a
> > > > properly designed writer and Iceberg multi-version snapshots, we can
> > > > achieve single-table snapshot isolation or even serializable isolation.
> > > >
> > > > But for queries involving multiple tables, the spec does not provide a
> > > > mechanism to achieve global snapshot consistency. The Iceberg REST
> > > > Catalog (IRC) API provides only a single-table load operation, LoadTable,
> > > > and clients need to call this API multiple times to resolve table
> > > > metadata in a single query statement. Each call could represent a
> > > > different snapshot view of the catalog.
> > > >
> > > > This creates a problem, especially for engines that already support global
> > > > SI. For example, the transaction semantics of AWS Redshift when querying
> > > > its native tables differ from those when querying Iceberg tables, which
> > > > surprises customers at times.
> > > >
> > > > There were proposals in the past in the context of the multi-statement
> > > > transaction discussion (
> > > > https://docs.google.com/document/d/1jr4Ah8oceOmo6fwxG_0II4vKDUHUKScb/edit#heading=h.qb9z621zr507
> > > > ). But I feel these proposals are too complicated and require significant
> > > > changes to the catalog/IRC protocol.
> > > >
> > > > Here I propose a simpler approach: add a batch LoadTables API, and rely on
> > > > the catalog's underlying system of record to provide snapshot isolation for
> > > > that batch read.
> > > >
> > > > When a client calls LoadTables({table_a, table_b, table_c}), the catalog
> > > > reads the current metadata for all requested tables in a single consistent
> > > > operation (e.g., a TransactGetItems in DynamoDB, or a single SI read in a
> > > > relational DB). The client receives a consistent cross-table snapshot: the
> > > > latest committed state of all requested tables as of a single point in time.
> > > >
> > > > This would give us statement-level global snapshot consistency. It
> > > > doesn’t provide full transaction-level SI consistency for multi-statement
> > > > transactions, but I believe it’s a reasonable trade-off.
> > > >
> > > > I captured the details of this proposal in this doc:
> > > > https://docs.google.com/document/d/1u11b4pzeFUKD0XX--nHPj-DoYcNeCgOe94WKCaX2XMI/edit?usp=sharing
> > > >
> > > > I also created a prototype that implements the LoadTables API for Apache
> > > > Polaris, leveraging the underlying Postgres for snapshot isolation:
> > > > https://github.com/xndai/polaris/commit/f4eb514a2920effe67ecfb8c64e2e3fa418baf11
> > > >
> > > > Feedback and comments are welcome!
> > > >
> > >
> >
>
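To make the batch read in the quoted proposal concrete, here is a minimal sketch of the idea, not the Polaris prototype and not the IRC API. It assumes a catalog whose system of record is a relational store (Python's built-in sqlite3 stands in for it), and the `tables` schema, the function names `load_table`, `load_tables`, and `commit_both`, and the `s3://bucket/...` paths are all hypothetical. It shows how N separate single-table loads can straddle a commit, while one batch read resolves every table in a single statement.

```python
import sqlite3


def make_catalog():
    """Create an in-memory stand-in for the catalog's system of record (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tables (name TEXT PRIMARY KEY, metadata_location TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO tables VALUES (?, ?)",
        [
            ("table_a", "s3://bucket/table_a/v1.metadata.json"),
            ("table_b", "s3://bucket/table_b/v1.metadata.json"),
        ],
    )
    conn.commit()
    return conn


def load_table(conn, name):
    """Single-table load: one read per call, so calls can see different catalog states."""
    row = conn.execute(
        "SELECT metadata_location FROM tables WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    return row[0]


def load_tables(conn, names):
    """Hypothetical batch load: resolve all tables in one SELECT statement.

    A single statement reads one consistent state of the store, which is the
    property the proposal relies on.
    """
    names = list(names)
    if not names:
        return {}
    placeholders = ", ".join("?" for _ in names)
    rows = conn.execute(
        f"SELECT name, metadata_location FROM tables WHERE name IN ({placeholders})",
        names,
    ).fetchall()
    found = dict(rows)
    missing = [n for n in names if n not in found]
    if missing:
        raise KeyError(missing)
    return {n: found[n] for n in names}


def commit_both(conn, version):
    """Hypothetical writer that advances table_a and table_b together in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "UPDATE tables SET metadata_location = ? WHERE name = ?",
            [
                (f"s3://bucket/table_a/v{version}.metadata.json", "table_a"),
                (f"s3://bucket/table_b/v{version}.metadata.json", "table_b"),
            ],
        )


if __name__ == "__main__":
    conn = make_catalog()
    a = load_table(conn, "table_a")   # read before the writer commits
    commit_both(conn, 2)              # a commit lands between the two loads
    b = load_table(conn, "table_b")   # read after the writer commits
    print("separate loads:", a, b)    # mixed versions
    print("batch load:", load_tables(conn, ["table_a", "table_b"]))
```

In a real catalog the same shape would map to whatever consistent multi-item read the backing store offers; the sketch only illustrates why a single read avoids the mixed-version result.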
