Hi Fokko,

Thank you for the detailed feedback! I need to think through your comments and update/reply accordingly.
But I want to discuss a key goal for writing this proposal (MTMS = "Multi-Table Multi-Statement"):

For MTMS txns, many uncoordinated engines could be operating on the same set of tables in a Catalog. In this scenario, let's say an engine starts an MTMS txn that needs to read several different tables during the transaction's lifetime. It needs to know exactly which version of each table it should load to provide SNAPSHOT ISOLATION consistent reads across tables (we can even ignore commits and assume this txn runs read-only queries). That is the key problem this proposal aims to solve: provide SNAPSHOT ISOLATION consistent reads across tables when multiple engines are operating on the same tables.

As you suggest, this kind of bookkeeping can be implemented in engines using some sort of algorithm, but that does not help if multiple different engines are operating on the same set of tables. The Catalog is already in the commit path when engines use IRC. By adding this sort of functionality to the Catalog, we can make it possible for MTMS transactions to simply ask the Catalog which version of a table they need to load to provide cross-table SNAPSHOT ISOLATION consistent reads.

While the proposal has additional changes to further simplify how engines can implement MTMS transactions, what I described above is the key reason for this proposal.

Looking forward to hearing what you think!
-Jagdeep

On Tue, Apr 22, 2025 at 2:22 AM Fokko Driesprong <fo...@apache.org> wrote:

> Hey Jagdeep,
>
> Thanks for proposing this. In general, it depends on what entity of the
> triangle (Table-format, Catalog, Engine) we want to make responsible for
> what operation. Where this proposal delegates more to the catalog, and what
> is done today in the engine itself. I went over it and added some context
> and comments. Let me know what you think!
>
> Kind regards,
> Fokko
>
>
> On Tue, Apr 22, 2025 at 04:42, Jagdeep Sidhu <sidhujagde...@gmail.com> wrote:
>
>> Hi Iceberg dev community,
>>
>> cc: Dru - I have been collaborating with him.
>>
>> I want to start this email thread to discuss an IRC API proposal to
>> enable engines to implement Multi-Statement Multi-Table Transactions. More
>> details in the document:
>>
>> https://docs.google.com/document/d/1jr4Ah8oceOmo6fwxG_0II4vKDUHUKScb/edit?usp=sharing&ouid=117626409673211358817&rtpof=true&sd=true
>> Proposal issue: https://github.com/apache/iceberg/issues/12865
>>
>> This document discusses challenges for engines to implement such
>> support, and proposes APIs that Catalog can implement to enable such
>> functionality. Looking forward to feedback from Iceberg dev community.
>>
>> Thank you in advance!
>> -Jagdeep
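
For illustration only, here is a rough sketch of the key goal described at the top of this message: a transaction asks the Catalog which version of each table to load, so that reads across tables stay consistent even while other engines keep committing. Everything here is an assumption made for the example: `ToyCatalog`, `begin_transaction`, `snapshot_as_of`, and the global commit sequence number are hypothetical names and mechanics, not the APIs from the proposal or from the IRC spec.

```python
from typing import Dict, List, Optional, Tuple


class ToyCatalog:
    """Hypothetical in-memory catalog (not the IRC API, not the proposal's API)."""

    def __init__(self) -> None:
        # Global, monotonically increasing commit sequence shared by all tables.
        self._last_seq = 0
        # table name -> [(commit_seq, snapshot_id), ...] in commit order
        self._history: Dict[str, List[Tuple[int, int]]] = {}

    def commit(self, table: str, snapshot_id: int) -> int:
        """Record a commit from any engine and return its sequence number."""
        self._last_seq += 1
        self._history.setdefault(table, []).append((self._last_seq, snapshot_id))
        return self._last_seq

    def begin_transaction(self) -> int:
        """Return a read point: the sequence number of the latest commit so far."""
        return self._last_seq

    def snapshot_as_of(self, table: str, read_point: int) -> Optional[int]:
        """Return the newest snapshot of `table` committed at or before `read_point`."""
        visible = None
        for seq, snapshot_id in self._history.get(table, []):
            if seq > read_point:
                break  # committed after the transaction started; not visible
            visible = snapshot_id
        return visible


catalog = ToyCatalog()
catalog.commit("orders", 100)
catalog.commit("customers", 200)

txn = catalog.begin_transaction()   # engine A starts its MTMS txn here
catalog.commit("orders", 101)       # engine B commits afterwards

# Engine A still resolves both tables as of its own read point.
assert catalog.snapshot_as_of("orders", txn) == 100
assert catalog.snapshot_as_of("customers", txn) == 200
```

Because the read point lives in the Catalog rather than in any one engine, every engine that asks with the same read point gets the same answer, which is the property that engine-side bookkeeping cannot give across uncoordinated engines.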