Re: Discuss proposal - IRC APIs for Multi-Statement Multi-Table Transactions

Jagdeep Sidhu Mon, 12 Jan 2026 13:39:11 -0800

Hi everyone,

We had a community discussion on the Multi Table Transactions proposal on
December 18, 2025. Link to the recording -
https://www.youtube.com/watch?v=HCAuJEWhZXE


Here is a list of discussion points and action items. I will update the
document based on this:


   1. Agreed that CSN is the preferred approach compared to using the
   existing timestamp field.
   2. Discussed that filtering based on CSN should happen on the Catalog
   side. There are two options: either the Catalog returns the filtered
   snapshot, or it returns the list as it does today and provides a hint via
   the config bundle on the right snapshot to use.
   3. Instead of providing monotonic numbers as CSN, we want CSN to be a
   string. This way a Catalog can assess whether a provided CSN was generated
   by the same Catalog or by another Catalog (so the Catalog can safely fail
   the LoadTable call).
   4. CSN should not be persisted in table metadata, since tables can be
   deregistered/registered with different Catalogs and CSN is local to a
   Catalog.
   5. Write recommendations on how Catalogs should handle registration of
   new tables. Discussed that the best approach is that newly registered
   tables are only read by transactions that started after their registration.
   6. Document different isolation levels and recommended behaviour on how
   tables are loaded/updated for each isolation level.


Dov, regarding your comment above about *"un-coordinated writers using LSN
to provide the correct timeline"*, I don't think this will be that simple,
since readers may start reading tables at a higher LSN while an
un-coordinated writer may still do a write at a lower LSN. Would it be easier
to depend on the CSN approach than on PostgreSQL LSN for your use case?
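
A tiny Python sketch of the concern, with made-up numbers and a
hypothetical helper (visible_at): a reader pins its view at position 120,
then a lagging un-coordinated writer lands a commit at 100, so re-reading
at the same pin returns a different set. The contrast at the end assumes
the Catalog assigns the next position at commit time, which is how I am
thinking about CSN here.

```python
def visible_at(commits, read_point):
    """Return the commit positions visible to a reader pinned at read_point."""
    return sorted(p for p in commits if p <= read_point)


# Writer-supplied positions (LSN-like): writers are not coordinated.
lsn_commits = [90, 120]
first_read = visible_at(lsn_commits, 120)
lsn_commits.append(100)  # lagging writer commits below the reader's pin
second_read = visible_at(lsn_commits, 120)

# Catalog-assigned positions (CSN-like, assumed): the next commit always
# gets a position above everything handed out so far.
csn_commits = [90, 120]
csn_first = visible_at(csn_commits, 120)
csn_commits.append(max(csn_commits) + 1)
csn_second = visible_at(csn_commits, 120)
```

Here first_read and second_read differ (the read is not repeatable), while
csn_first and csn_second are the same.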

-Jagdeep

On Thu, Dec 11, 2025 at 6:14 PM Dov Alperin via dev <[email protected]>
wrote:

> Sorry for my late reply and thanks for this Jagdeep! Unfortunately I have
> a conflict at that time. If you end up scheduling another one, keep us
> posted.
>
> I still feel fairly strongly that adding a new field vs overriding is the
> right thing to do.
> I think generally CSN is better vs opaque IDs in the general case, however
> there is potentially a benefit to allowing the client to set a (unique)
> opaque id identifying a version rather than delegating the responsibility
> to the catalog. In CDC-like settings (imagine capturing postgres->iceberg)
> if you have different syncers for different tables, transactions are
> already globally ordered in postgres through LSN. If writers could use the
> LSN to globally identify a "transaction" you could have uncoordinated
> writers still mapping to the same logical timeline. This is appealing for
> my use case (in an inherently streaming setting) but I am not sure it's a
> compelling enough option to adopt writ large.
>
> I recently gained access to slack so I am happy to additionally
> coordinate/chat there.
>
> Best
> Dov
>
> On Wed, Dec 10, 2025 at 4:08 PM Jagdeep Sidhu <[email protected]>
> wrote:
>
>> Hi,
>>
>> I have responded to comments in the doc and made fixes/clarifications:
>> https://docs.google.com/document/d/1jr4Ah8oceOmo6fwxG_0II4vKDUHUKScb
>>
>> Two main changes:
>>
>> 1. Updated the sequence diagram for CSN approach to show that client is
>> filtering the snapshots using CSN and remembering CSN in its Transaction
>> context. That way the catalog is not resolving tables to right versions per
>> transaction.
>> 2. Updated the preferred way to get CSN: piggyback on the LoadTable API
>> call.
>>
>> I have set up a follow up meeting to discuss this topic at 9AM PST on
>> December 18, 2025. Hoping that folks can join in to provide feedback. Happy
>> to schedule more in different timezones.
>>
>> Two high level contention points that we need to discuss:
>> a) Using sequence numbers and doing filtering on client side versus using
>> Opaque IDs that Catalog resolves. In the last meeting there were opinions
>> on each side.
>> b) Adding a new field for CSN or repurposing existing timestamp field (by
>> overriding it on catalog side). CSN versus CAT approaches.
>>
>>
>> -Jagdeep
>>
>> On Thu, Nov 20, 2025 at 4:29 PM Jagdeep Sidhu <[email protected]>
>> wrote:
>>
>>> Thanks for the feedback, Ryan and Dov. Agreed that overriding and reusing
>>> timestamps is not good: it is a backwards-incompatible change. I can rework
>>> the CSN proposal and send an updated version on this email thread for
>>> further discussion.
>>>
>>> Dov (and others), do you have feedback on the CSN proposal, described in
>>> Option 1 in:
>>> https://docs.google.com/document/d/1jr4Ah8oceOmo6fwxG_0II4vKDUHUKScb
>>>
>>> We can also collaborate on updating the CSN proposal via Slack as well
>>> and then organize a meeting to get feedback/discuss further.
>>>
>>> Thank you!
>>> -Jagdeep
>>>
>>> On Sun, Nov 9, 2025 at 2:14 PM Ryan Blue <[email protected]> wrote:
>>>
>>>> v4 is a revision of the file spec, not the catalog spec so it's
>>>> unrelated. I would recommend just proposing changes to the REST catalog
>>>> spec and building consensus around how it would work. We typically want to
>>>> have an implementation in Java to demonstrate the feature before finalizing
>>>> and voting to adopt the changes.
>>>>
>>>> On Sun, Nov 9, 2025 at 1:53 PM Dov Alperin
>>>> <[email protected]> wrote:
>>>>
>>>>> That generally aligns with my sensibilities as well (avoiding
>>>>> overriding existing fields' meaning). The fact that adding a CSN requires
>>>>> changes to the spec is notable. What's the process that would be required
>>>>> to get that landed in v4?
>>>>>
>>>>> On Sun, Nov 9, 2025 at 2:40 PM Ryan Blue <[email protected]> wrote:
>>>>>
>>>>>> I am fairly strongly opposed to repurposing the timestamp field for
>>>>>> this. To move forward, I'd recommend working on catalog sequence numbers.
>>>>>>
>>>>>> On Sat, Nov 8, 2025 at 6:54 PM Dov Alperin
>>>>>> <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Iceberg community!
>>>>>>> (I initially opened this message as its own thread in error, sorry
>>>>>>> about that)
>>>>>>> I’m curious where this proposal landed? I work at Materialize
>>>>>>> <http://materialize.com/> and we are keenly interested both in
>>>>>>> seeing this
>>>>>>> proposal come to fruition but possibly also helping to implement it.
>>>>>>>
>>>>>>> I see there was a call in May, but I’m not sure what the conclusion
>>>>>>> was. As
>>>>>>> spec v4 draws closer, I am curious which of the two proposals the
>>>>>>> community
>>>>>>> favors here?
>>>>>>>
>>>>>>> Best,
>>>>>>> Dov
>>>>>>>
>>>>>>> On Tue, May 27, 2025 at 01:09:05AM -0700, Maninderjit Singh wrote:
>>>>>>> > Forgot to attach a link to the update proposal
>>>>>>> > <
>>>>>>> https://docs.google.com/document/d/1KVgUJc1WgftHfLz118vMbEE7HV8_pUDk4s-GJFDyAOE/edit?pli=1&tab=t.0#heading=h.ypbwvr181qn4
>>>>>>> >
>>>>>>> > .
>>>>>>> >
>>>>>>> > On Tue, May 27, 2025 at 1:06 AM Maninderjit Singh <
>>>>>>> > [email protected]> wrote:
>>>>>>> >
>>>>>>> > > Hi community,
>>>>>>> > >
>>>>>>> > >  I have updated the proposal with both the options (overwriting
>>>>>>> existing
>>>>>>> > > timestamps-ms vs introducing a new sequence/timestamp field) as
>>>>>>> we have
>>>>>>> > > initial consensus on using catalog authored sequence/timestamp.
>>>>>>> Jagdeep,
>>>>>>> > > please review to ensure that the options are correctly captured.
>>>>>>> I have
>>>>>>> > > also added additional arguments on why we can't assume timestamp
>>>>>>> to be
>>>>>>> > > "informational" since it's being used in critical paths and
>>>>>>> > > incorrect values can take the table offline.
>>>>>>> > >
>>>>>>> > > Also, I'm moving the meeting to Thursday to better accommodate
>>>>>>> conflicts.
>>>>>>> > > I will also record the meeting in case anyone misses it and is
>>>>>>> interested in
>>>>>>> > > the discussion.
>>>>>>> > >
>>>>>>> > > Sync for iceberg multi-table transactions
>>>>>>> > > Thursday, May 29 · 9:00 – 10:00am
>>>>>>> > > Time zone: America/Los_Angeles
>>>>>>> > > Google Meet joining info
>>>>>>> > > Video call link: https://meet.google.com/ffc-ttjs-vti
>>>>>>> > >
>>>>>>> > > Thanks,
>>>>>>> > > Maninder
>>>>>>> > >
>>>>>>> > >
>>>>>>> > >
>>>>>>> > > On Mon, May 26, 2025 at 12:47 AM Péter Váry <
>>>>>>> [email protected]>
>>>>>>> > > wrote:
>>>>>>> > >
>>>>>>> > >> I'm interested, but can't be there, but please record the
>>>>>>> meeting.
>>>>>>> > >> Thanks,
>>>>>>> > >> Peter
>>>>>>> > >>
>>>>>>> > >> Maninderjit Singh <[email protected]> wrote
>>>>>>> (on
>>>>>>> > >> Sat, May 24, 2025, 2:30):
>>>>>>> > >>
>>>>>>> > >>> Hi dev community,
>>>>>>> > >>> I was wondering if we could join a call next week for
>>>>>>> discussing the
>>>>>>> > >>> multi-table transactions so we can make progress. I have
>>>>>>> shared a meeting
>>>>>>> > >>> invite where anyone who's interested in the discussion can
>>>>>>> join. Please let
>>>>>>> > >>> me know if this works.
>>>>>>> > >>>
>>>>>>> > >>> Thanks,
>>>>>>> > >>> Maninder
>>>>>>> > >>>
>>>>>>> > >>> Sync for iceberg multi-table transactions
>>>>>>> > >>> Friday, May 30 · 9:00 – 10:00am
>>>>>>> > >>> Time zone: America/Los_Angeles
>>>>>>> > >>> Google Meet joining info
>>>>>>> > >>> Video call link: https://meet.google.com/ffc-ttjs-vti
>>>>>>> > >>>
>>>>>>> > >>>
>>>>>>> > >>> On Wed, May 21, 2025 at 10:25 AM Maninderjit Singh <
>>>>>>> > >>> [email protected]> wrote:
>>>>>>> > >>>
>>>>>>> > >>>> Hi dev community,
>>>>>>> > >>>> Following up on the thread here to continue the discussion
>>>>>>> and get
>>>>>>> > >>>> feedback since we couldn't get to it in sync. I think we have
>>>>>>> made some
>>>>>>> > >>>> progress in the discussion that I want to capture while
>>>>>>> highlighting the
>>>>>>> > >>>> items where we need to create consensus along with pros and
>>>>>>> cons. I would
>>>>>>> > >>>> need help to add clarity and to make sure the arguments are
>>>>>>> captured
>>>>>>> > >>>> correctly.
>>>>>>> > >>>>
>>>>>>> > >>>> *Things we agree on*
>>>>>>> > >>>>
>>>>>>> > >>>>    1. Don't maintain server side state for tracking the
>>>>>>> transactions.
>>>>>>> > >>>>    2. Need global (catalog-wide) ordering of snapshots via
>>>>>>> some
>>>>>>> > >>>>    (hybrid/logical) clock/CSN
>>>>>>> > >>>>    3. Optionally expose the catalog's clock/CSN information
>>>>>>> without
>>>>>>> > >>>>    changing how tables load
>>>>>>> > >>>>    4. Loading consistent snapshot across multiple tables and
>>>>>>> > >>>>    repeatable reads based on the reference clock/CSN
>>>>>>> > >>>>
>>>>>>> > >>>>
>>>>>>> > >>>> *Things we disagree on*
>>>>>>> > >>>>
>>>>>>> > >>>>    1. Reuse existing timestamp field vs introduce a new field
>>>>>>> CSN
>>>>>>> > >>>>
>>>>>>> > >>>>
>>>>>>> > >>>> *Reusing timestamp field approach*
>>>>>>> > >>>>
>>>>>>> > >>>>    - Pros:
>>>>>>> > >>>>
>>>>>>> > >>>>
>>>>>>> > >>>>    1. Backwards compatibility, no change to table metadata
>>>>>>> spec so
>>>>>>> > >>>>    could be used by existing v2 tables.
>>>>>>> > >>>>    2. Fixes existing time travel and ordering issues
>>>>>>> > >>>>    3. Simplifies and clarifies the spec (no new id for
>>>>>>> snapshots)
>>>>>>> > >>>>    4. Common notion of timestamp that could be used to
>>>>>>> evaluate causal
>>>>>>> > >>>>    relationships in other proposals like events or commit
>>>>>>> reports.
>>>>>>> > >>>>
>>>>>>> > >>>>
>>>>>>> > >>>>    - Cons
>>>>>>> > >>>>
>>>>>>> > >>>>
>>>>>>> > >>>>    1. Unique timestamp generation in milliseconds. Potential
>>>>>>> > >>>>    mitigations:
>>>>>>> > >>>>
>>>>>>> https://docs.google.com/document/d/1KVgUJc1WgftHfLz118vMbEE7HV8_pUDk4s-GJFDyAOE/edit?pli=1&disco=AAABjwaxXeg
>>>>>>> > >>>>    2. Concerns about client side timestamp being overridden.
>>>>>>> > >>>>
>>>>>>> > >>>> *Adding new CSN field*
>>>>>>> > >>>>
>>>>>>> > >>>>    - Pros:
>>>>>>> > >>>>
>>>>>>> > >>>>
>>>>>>> > >>>>    1. Flexibility to use logical or hybrid clocks. Not sure
>>>>>>> how
>>>>>>> > >>>>    clients can generate a hybrid clock timestamp here without
>>>>>>> suffering from
>>>>>>> > >>>>    clock skew (Would be good to clarify this)?
>>>>>>> > >>>>    2. No client side overriding concerns.
>>>>>>> > >>>>
>>>>>>> > >>>>
>>>>>>> > >>>>    - Cons:
>>>>>>> > >>>>
>>>>>>> > >>>>
>>>>>>> > >>>>    1. Not backwards compatible, requires new field in table
>>>>>>> metadata
>>>>>>> > >>>>    so need to wait for v4
>>>>>>> > >>>>    2. Does not fix time travel and snapshot-log ordering
>>>>>>> issues
>>>>>>> > >>>>    3. Adds another id for snapshots that clients need to
>>>>>>> generate and
>>>>>>> > >>>>    reason about.
>>>>>>> > >>>>    4. Could not be extended to use in other proposals for
>>>>>>> causal
>>>>>>> > >>>>    reasoning.
>>>>>>> > >>>>
>>>>>>> > >>>>
>>>>>>> > >>>> Thanks,
>>>>>>> > >>>> Maninder
>>>>>>> > >>>>
>>>>>>> > >>>> On Tue, May 20, 2025 at 8:16 PM Maninderjit Singh <
>>>>>>> > >>>> [email protected]> wrote:
>>>>>>> > >>>>
>>>>>>> > >>>>> Appreciate the feedback on the "catalog-authored timestamp"
>>>>>>> document
>>>>>>> > >>>>> <
>>>>>>> https://docs.google.com/document/d/1KVgUJc1WgftHfLz118vMbEE7HV8_pUDk4s-GJFDyAOE/edit?pli=1&tab=t.0
>>>>>>> >
>>>>>>> > >>>>> !
>>>>>>> > >>>>>
>>>>>>> > >>>>> Ryan, I don't think we can get consistent time travel
>>>>>>> queries in
>>>>>>> > >>>>> iceberg without fixing the timestamp field since it's what
>>>>>>> the spec
>>>>>>> > >>>>> <
>>>>>>> https://iceberg.apache.org/spec/#point-in-time-reads-time-travel>
>>>>>>> > >>>>> prescribes for time travel. Hence I took the liberty to
>>>>>>> re-use it for the
>>>>>>> > >>>>> catalog timestamp which ensures that snapshot-log is
>>>>>>> correctly ordered for
>>>>>>> > >>>>> time travel.  Additionally, the timestamp field needs to be
>>>>>>> fixed to avoid
>>>>>>> > >>>>> breaking commits to the table due to accidental large skews
>>>>>>> as per current
>>>>>>> > >>>>> spec, the scenario is described in detail here
>>>>>>> > >>>>> <
>>>>>>> https://docs.google.com/document/d/1KVgUJc1WgftHfLz118vMbEE7HV8_pUDk4s-GJFDyAOE/edit?pli=1&tab=t.0#bookmark=id.6avx66vzo168
>>>>>>> >
>>>>>>> > >>>>> .
>>>>>>> > >>>>> The other benefit of reusing the timestamp field is spec
>>>>>>> simplicity
>>>>>>> > >>>>> and clarity on timestamp generation responsibilities without
>>>>>>> requiring the
>>>>>>> > >>>>> need to manage yet another identifier (in addition to
>>>>>>> sequence number,
>>>>>>> > >>>>> snapshot id and timestamp) for snapshots.
>>>>>>> > >>>>>
>>>>>>> > >>>>> Jagdeep, your concerns about overriding the timestamp field
>>>>>>> are valid
>>>>>>> > >>>>> but the reason I'm not too worried about it is because
>>>>>>> client can't assume
>>>>>>> > >>>>> a commit is successful without their response being
>>>>>>> acknowledged by the
>>>>>>> > >>>>> catalog which returns the CommitTableResponse
>>>>>>> > >>>>> <
>>>>>>> https://github.com/apache/iceberg/blob/c2478968e65368c61799d8ca4b89506a61ca3e7c/open-api/rest-catalog-open-api.yaml#L3997>
>>>>>>> with
>>>>>>> > >>>>> new metadata (that has catalog authored timestamps in the
>>>>>>> proposal). I'm
>>>>>>> > >>>>> happy to work with you to put something common together and
>>>>>>> get the best
>>>>>>> > >>>>> out of the proposals.
>>>>>>> > >>>>>
>>>>>>> > >>>>> Thanks,
>>>>>>> > >>>>> Maninder
>>>>>>> > >>>>>
>>>>>>> > >>>>>
>>>>>>> > >>>>>
>>>>>>> > >>>>>
>>>>>>> > >>>>> On Tue, May 20, 2025 at 5:48 PM Jagdeep Sidhu <
>>>>>>> [email protected]>
>>>>>>> > >>>>> wrote:
>>>>>>> > >>>>>
>>>>>>> > >>>>>> Thank you Ryan, Maninder and the rest of the community for
>>>>>>> feedback
>>>>>>> > >>>>>> and ideas!
>>>>>>> > >>>>>> Drew and I will take another pass and remove the catalog
>>>>>>> > >>>>>> co-ordination requirement for LoadTable API, and bring the
>>>>>>> proposal closer
>>>>>>> > >>>>>> to "catalog-authored timestamp" in the sense that clients
>>>>>>> can use CSN to
>>>>>>> > >>>>>> find the right snapshot, but still leave it up to the Catalog
>>>>>>> what it wants to
>>>>>>> > >>>>>> use for CSN (Hybrid clock timestamp or another
>>>>>>> monotonically increasing
>>>>>>> > >>>>>> number).
>>>>>>> > >>>>>>
>>>>>>> > >>>>>> If more folks have feedback, please leave it in the doc or
>>>>>>> email
>>>>>>> > >>>>>> list, so we can address it as well in the document update.
>>>>>>> > >>>>>>
>>>>>>> > >>>>>> Maninder, one reason we proposed a new field for
>>>>>>> CommitSequenceNumber
>>>>>>> > >>>>>> instead of using an existing field is for backwards
>>>>>>> compatibility. Catalogs
>>>>>>> > >>>>>> can start optionally exposing the new field, and interested
>>>>>>> clients can use
>>>>>>> > >>>>>> the new field, but existing clients keep working as is.
>>>>>>> Existing and new
>>>>>>> > >>>>>> clients can also keep working as is against the same tables
>>>>>>> in the
>>>>>>> > >>>>>> same Catalog. My one worry is that having Catalog override
>>>>>>> the timestamp
>>>>>>> > >>>>>> field for commits may break some existing clients? Today
>>>>>>> all Iceberg
>>>>>>> > >>>>>> engines/clients do not expect the timestamp field in
>>>>>>> metadata/snapshot-log
>>>>>>> > >>>>>> to be overwritten by the Catalog.
>>>>>>> > >>>>>>
>>>>>>> > >>>>>> How do you feel about taking the best from each proposal,
>>>>>>> i.e.
>>>>>>> > >>>>>> monotonically increasing commit sequence numbers (some
>>>>>>> catalogs can use
>>>>>>> > >>>>>> timestamps, some can use a logical clock, but we don't have to
>>>>>>> enforce it;
>>>>>>> > >>>>>> leave it up to the Catalog), while keeping the client side logic for
>>>>>>> resolving the right
>>>>>>> > >>>>>> snapshot using sequence numbers instead of adding that
>>>>>>> functionality to
>>>>>>> > >>>>>> the Catalog? Let me know!
>>>>>>> > >>>>>>
>>>>>>> > >>>>>> Thank you!
>>>>>>> > >>>>>> -Jagdeep
>>>>>>> > >>>>>>
>>>>>>> > >>>>>> On Tue, May 20, 2025 at 2:45 PM Ryan Blue <[email protected]>
>>>>>>> wrote:
>>>>>>> > >>>>>>
>>>>>>> > >>>>>>> Thanks for the proposals! There are things that I think
>>>>>>> are good
>>>>>>> > >>>>>>> about both of them. I think that the catalog-authored
>>>>>>> timestamps proposal
>>>>>>> > >>>>>>> misunderstands the purpose of the timestamp field, but
>>>>>>> does get right that
>>>>>>> > >>>>>>> a monotonically increasing "time" field (really a sequence
>>>>>>> number) across
>>>>>>> > >>>>>>> tables enables the coordination needed for snapshot
>>>>>>> isolated reads. I like
>>>>>>> > >>>>>>> that the sequence number proposal leaves the meaning of
>>>>>>> the field to the
>>>>>>> > >>>>>>> catalog for coordination, but it still proposes catalog
>>>>>>> coordination by
>>>>>>> > >>>>>>> loading tables "at" some sequence number. Ideally, we
>>>>>>> would be able to
>>>>>>> > >>>>>>> (optionally) expose this extra catalog information to
>>>>>>> clients and not need
>>>>>>> > >>>>>>> to change how loading works.
>>>>>>> > >>>>>>>
>>>>>>> > >>>>>>> Ryan
>>>>>>> > >>>>>>>
>>>>>>> > >>>>>>> On Tue, May 20, 2025 at 9:45 AM Ryan Blue <
>>>>>>> [email protected]> wrote:
>>>>>>> > >>>>>>>
>>>>>>> > >>>>>>>> Hi everyone,
>>>>>>> > >>>>>>>>
>>>>>>> > >>>>>>>> To avoid passing copies of a file around for comments, I
>>>>>>> put the
>>>>>>> > >>>>>>>> doc for commit sequence numbers into Google so we can
>>>>>>> comment on a central
>>>>>>> > >>>>>>>> copy:
>>>>>>> > >>>>>>>>
>>>>>>> https://docs.google.com/document/d/1jr4Ah8oceOmo6fwxG_0II4vKDUHUKScb/edit?usp=sharing&ouid=100239850723655533404&rtpof=true&sd=true
>>>>>>> > >>>>>>>>
>>>>>>> > >>>>>>>> Ryan
>>>>>>> > >>>>>>>>
>>>>>>> > >>>>>>>> On Fri, May 16, 2025 at 2:51 AM Maninderjit Singh <
>>>>>>> > >>>>>>>> [email protected]> wrote:
>>>>>>> > >>>>>>>>
>>>>>>> > >>>>>>>>> Thanks for the updated proposal Drew!
>>>>>>> > >>>>>>>>> My preference for using the catalog authored timestamp
>>>>>>> is to
>>>>>>> > >>>>>>>>> minimize changes to the REST spec so we can have good
>>>>>>> backwards
>>>>>>> > >>>>>>>>> compatibility. I have quickly put together a draft
>>>>>>> proposal on how this
>>>>>>> > >>>>>>>>> should work. Looking forward to feedback and discussion.
>>>>>>> > >>>>>>>>>
>>>>>>> > >>>>>>>>>  Draft Proposal: Catalog‑Authored Timestamps for
>>>>>>> Apache Iceberg
>>>>>>> > >>>>>>>>> REST Catalog
>>>>>>> > >>>>>>>>> <
>>>>>>> https://drive.google.com/open?id=1KVgUJc1WgftHfLz118vMbEE7HV8_pUDk4s-GJFDyAOE
>>>>>>> >
>>>>>>> > >>>>>>>>>
>>>>>>> > >>>>>>>>> Thanks,
>>>>>>> > >>>>>>>>> Maninder
>>>>>>> > >>>>>>>>>
>>>>>>> > >>>>>>>>> On Wed, May 14, 2025 at 6:12 PM Drew <[email protected]>
>>>>>>> wrote:
>>>>>>> > >>>>>>>>>
>>>>>>> > >>>>>>>>>> Hi everyone,
>>>>>>> > >>>>>>>>>>
>>>>>>> > >>>>>>>>>> Thank you for feedback on the MTT proposal and during
>>>>>>> community
>>>>>>> > >>>>>>>>>> sync. Based on it, Jagdeep and I have iterated on the
>>>>>>> document and added a
>>>>>>> > >>>>>>>>>> second option to use *Catalog CommitSequenceNumbers*.
>>>>>>> Looking
>>>>>>> > >>>>>>>>>> forward to getting more feedback on the proposal, where
>>>>>>> to add more details
>>>>>>> > >>>>>>>>>> or approach/changes to consider. We appreciate
>>>>>>> everyone's time on this!
>>>>>>> > >>>>>>>>>>
>>>>>>> > >>>>>>>>>> The option introduces *Catalog
>>>>>>> CommitSequenceNumbers(CSNs)*,
>>>>>>> > >>>>>>>>>> which allow clients/engines to read a consistent view
>>>>>>> of multiple tables
>>>>>>> > >>>>>>>>>> without needing to register a transaction context with
>>>>>>> the catalog. This
>>>>>>> > >>>>>>>>>> removes the need of registering a transaction context
>>>>>>> with Catalog, thus
>>>>>>> > >>>>>>>>>> removing the need of transaction bookkeeping on the
>>>>>>> catalog side. For
>>>>>>> > >>>>>>>>>> aborting transactions early, clients can use LoadTable
>>>>>>> with and without CSN
>>>>>>> > >>>>>>>>>> to figure out if there is already a conflicting write
>>>>>>> on any of the tables
>>>>>>> > >>>>>>>>>> being modified. Also removed the section where
>>>>>>> transactions were staging
>>>>>>> > >>>>>>>>>> commits on Catalog, and changed the proposal to align
>>>>>>> with Eduard's PR
>>>>>>> > >>>>>>>>>> around staging changes locally before commit (
>>>>>>> > >>>>>>>>>> https://github.com/apache/iceberg/pull/6948).
>>>>>>> > >>>>>>>>>>
>>>>>>> > >>>>>>>>>> Jagdeep also clarified in an example in a previous
>>>>>>> email where a
>>>>>>> > >>>>>>>>>> workload may require multi table snapshot isolation,
>>>>>>> even if the tables are
>>>>>>> > >>>>>>>>>> being updated without Multi-Table commit API. Though
>>>>>>> most MTT transactions
>>>>>>> > >>>>>>>>>> will commit using the multi table commit API.
>>>>>>> > >>>>>>>>>>
>>>>>>> > >>>>>>>>>> Maninder, for the approach of "common notion of time
>>>>>>> between
>>>>>>> > >>>>>>>>>> clients and catalog" - I spent some time thinking about
>>>>>>> it, but cannot find
>>>>>>> > >>>>>>>>>> a feasible way to do this. Yes, the catalogs can use a
>>>>>>> high precision
>>>>>>> > >>>>>>>>>> clock, but clients cannot use Catalog Timestamp from
>>>>>>> API calls to set local
>>>>>>> > >>>>>>>>>> clock due to network latency for request/response. For
>>>>>>> example, different
>>>>>>> > >>>>>>>>>> requests to the same Catalog servers can return
>>>>>>> different timestamps based
>>>>>>> > >>>>>>>>>> on network latency. Also, what if a client works with
>>>>>>> more than one Catalog?
>>>>>>> > >>>>>>>>>> If you want to do a rough write-up or share a reference
>>>>>>> implementation that
>>>>>>> > >>>>>>>>>> uses such an approach, I will be happy to brainstorm it
>>>>>>> more. Let us know!
>>>>>>> > >>>>>>>>>>
>>>>>>> > >>>>>>>>>> Here is the link to updated proposal
>>>>>>> > >>>>>>>>>>
>>>>>>> > >>>>>>>>>>
>>>>>>> > >>>>>>>>>> <
>>>>>>> https://docs.google.com/document/d/1jr4Ah8oceOmo6fwxG_0II4vKDUHUKScb/edit?usp=sharing&ouid=100384647237395649950&rtpof=true&sd=true
>>>>>>> >
>>>>>>> > >>>>>>>>>> Thanks Again!
>>>>>>> > >>>>>>>>>> - Drew
>>>>>>> > >>>>>>>>>>
>>>>>>> > >>>>>>>>>
>>>>>>>
>>>>>>
