Hey Amogh, Thanks for the write-up. Unfortunately, I won’t be able to attend. Will it be recorded? Thanks!
Kind regards,
Fokko

On Tue, Oct 7, 2025 at 20:36, Amogh Jahagirdar <[email protected]> wrote:

> Hey all,
>
> I've set up time this Friday at 9am PST for another sync on single file
> commits. In terms of what would be great to focus on for the discussion:
>
> 1. Whether it makes sense or not to eliminate the tuple, and instead
> represent the tuple via lower/upper boundaries. As a reminder, one of
> the goals is to avoid tying a partition spec to a manifest; in the root we
> can have a mix of files spanning different partition specs, and even in
> leaf manifests avoiding this coupling can enable more desirable clustering
> of metadata.
> In the vast majority of cases, we could leverage the property that a file
> is effectively partitioned if the lower and upper bounds for a given field
> are equal.
> The nuance here is with the particular case of identity partitioned
> string/binary columns which can be truncated in stats. One approach is to
> require that writers must not produce truncated stats for identity
> partitioned columns. It's also important to keep in mind that all of this
> is just for the purpose of reconstructing the partition tuple, which is
> only required during equality delete matching. Another area we need to
> cover as part of this is on exact bounds on stats. There are other options
> here as well such as making all new equality deletes in V4 be global and
> instead match based on bounds, or keeping the tuple but each tuple is
> effectively based off a union schema of all partition specs. I am adding a
> separate appendix section outlining the span of options here and the
> different tradeoffs.
> Once we get this more to a conclusive state, I'll move a summarized
> version to the main doc.
>
> 2.
@[email protected] <[email protected]> has updated the doc with
> a section
> <https://docs.google.com/document/d/1k4x8utgh41Sn1tr98eynDKCWq035SV_f75rtNHcerVw/edit?tab=t.rrpksmp8zkb#heading=h.qau0y5xkh9mn>
> on
> how we can do change detection from the root in a variety of write
> scenarios. I've done a review on it, and it covers the cases I would
> expect. It'd be good for folks to take a look and please give feedback
> before we discuss. Thank you Steven for adding that section and all the
> diagrams.
>
> Thanks,
> Amogh Jahagirdar
>
> On Thu, Sep 18, 2025 at 3:19 PM Amogh Jahagirdar <[email protected]> wrote:
>
>> Hey folks, just following up from the discussion last Friday with a
>> summary and some next steps:
>>
>> 1.) For the various change detection cases, we concluded it's best just
>> to go through those in an offline manner on the doc since it's hard to
>> verify all that correctness in a large meeting setting.
>> 2.) We mostly discussed eliminating the partition tuple. On the original
>> proposal, I was mostly aiming for the ability to reconstruct the tuple
>> from the stats for the purpose of equality delete matching (a file is
>> partitioned if the lower and upper bounds are equal). There's some nuance
>> in how we need to handle identity partition values since for string/binary
>> they cannot be truncated. Another potential option is to treat all equality
>> deletes as effectively global and narrow their application based on the
>> stats values. This may require defining tight bounds. I'm still collecting
>> my thoughts on this one.
>>
>> Thanks folks! Please also let me know if any of the following links are
>> inaccessible for any reason.
>>
>> Meeting recording link:
>> https://drive.google.com/file/d/1gv8TrR5xzqqNxek7_sTZkpbwQx1M3dhK/view
>> Meeting summary:
>> https://docs.google.com/document/d/131N0CDpzZczURxitN0HGS7dTqRxQT_YS9jMECkGGvQU
>>
>> On Mon, Sep 8, 2025 at 3:40 PM Amogh Jahagirdar <[email protected]> wrote:
>>
>>> Update: I moved the discussion time to this Friday at 9 am PST since I
>>> found out that quite a few folks involved in the proposals will be out next
>>> week, and I also know some folks will also be out the week after that.
>>>
>>> Thanks,
>>> Amogh J
>>>
>>> On Mon, Sep 8, 2025 at 8:57 AM Amogh Jahagirdar <[email protected]>
>>> wrote:
>>>
>>>> Hey folks sorry for the late follow up here,
>>>>
>>>> Thanks @Kevin Liu <[email protected]> for sharing the recording
>>>> link of the previous discussion! I've set up another sync for next Tuesday
>>>> 09/16 at 9am PST. This time I've set it up from my corporate email so we
>>>> can get recordings and transcriptions (and I've made sure to keep the
>>>> meeting invite open so we don't have to manually let people in).
>>>>
>>>> In terms of next steps of areas which I think would be good to focus on
>>>> for establishing consensus:
>>>>
>>>> 1. How do we model the manifest entry structure so that changes to
>>>> manifest DVs can be obtained easily from the root? There are a few options
>>>> here; the most promising approach is to keep an additional DV which encodes
>>>> the diff in additional positions which have been removed from a leaf
>>>> manifest.
>>>>
>>>> 2. Modeling partition transforms via expressions and establishing a
>>>> unified table ID space so that we can simplify how partition tuples may be
>>>> represented via stats and also have a way in the future to store stats on
>>>> any derived column.
I have a short proposal
>>>> <https://docs.google.com/document/d/1oV8dapKVzB4pZy5pKHUCj5j9i2_1p37BJSeT7hyKPpg/edit?tab=t.0>
>>>> for
>>>> this that probably still needs some tightening up on the expression
>>>> modeling itself (and some prototyping) but the general idea for
>>>> establishing a unified table ID space is covered. All feedback welcome!
>>>>
>>>> Thanks,
>>>>
>>>> Amogh Jahagirdar
>>>>
>>>> On Mon, Aug 25, 2025 at 1:34 PM Kevin Liu <[email protected]>
>>>> wrote:
>>>>
>>>>> Thanks Amogh. Looks like the recording for last week's sync is
>>>>> available on YouTube. Here's the link,
>>>>> https://www.youtube.com/watch?v=uWm-p--8oVQ
>>>>>
>>>>> Best,
>>>>> Kevin Liu
>>>>>
>>>>> On Tue, Aug 12, 2025 at 9:10 PM Amogh Jahagirdar <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hey folks,
>>>>>>
>>>>>> Just following up on this to give the community an update as to where
>>>>>> we're at and my proposed next steps.
>>>>>>
>>>>>> I've been editing and merging the contents from our proposal into the
>>>>>> proposal
>>>>>> <https://docs.google.com/document/d/1k4x8utgh41Sn1tr98eynDKCWq035SV_f75rtNHcerVw/edit?tab=t.0#heading=h.unn922df0zzw>
>>>>>> from
>>>>>> Russell and others. For any future comments on docs, please comment on
>>>>>> the
>>>>>> linked proposal. I've also marked it on our doc in red text so it's clear
>>>>>> to redirect to the other proposal as a source of truth for comments.
>>>>>>
>>>>>> In terms of next steps,
>>>>>>
>>>>>> 1. An important design decision point is around inline manifest DVs,
>>>>>> external manifest DVs or enabling both. I'm working on measuring
>>>>>> different
>>>>>> approaches for representing the compressed DV representation since that
>>>>>> will inform how many entries can reasonably fit in a small root manifest;
>>>>>> from that we can derive implications on different write patterns and
>>>>>> determine the right approach for storing these manifest DVs.
>>>>>>
>>>>>> 2.
Another key point is around determining if/how we can reasonably
>>>>>> enable V4 to represent changes in the root manifest so that readers can
>>>>>> effectively just infer file level changes from the root.
>>>>>>
>>>>>> 3. One of the aspects of the proposal is getting away from partition
>>>>>> tuple requirement in the root which currently holds us to have
>>>>>> associativity between a partition spec and a manifest. These aspects can
>>>>>> be
>>>>>> modeled as essentially column stats which gives a lot of flexibility into
>>>>>> the organization of the manifest. There are important details around
>>>>>> field
>>>>>> ID spaces here which tie into how the stats are structured. What we're
>>>>>> proposing here is to have a unified expression ID space that could also
>>>>>> benefit us for storing things like virtual columns down the line. I go
>>>>>> into
>>>>>> this in the proposal but I'm working on separating the appropriate parts
>>>>>> so
>>>>>> that the original proposal can mostly just focus on the organization of
>>>>>> the
>>>>>> content metadata tree and not how we want to solve this particular ID
>>>>>> space
>>>>>> problem.
>>>>>>
>>>>>> 4. I'm planning on scheduling a recurring community sync starting
>>>>>> next Tuesday at 9am PST, every 2 weeks. If I get feedback from folks that
>>>>>> this time will never work, I can certainly adjust. For some reason, I
>>>>>> don't
>>>>>> have the ability to add to the Iceberg Dev calendar, so I'll figure that
>>>>>> out and update the thread when the event is scheduled.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Amogh Jahagirdar
>>>>>>
>>>>>> On Tue, Jul 22, 2025 at 11:47 AM Russell Spitzer <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> I think this is a great way forward, starting out with this much
>>>>>>> parallel development shows that we have a lot of consensus already :)
>>>>>>>
>>>>>>> On Tue, Jul 22, 2025 at 12:42 PM Amogh Jahagirdar <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hey folks, just following up on this. It looks like our proposal
>>>>>>>> and the proposal that @Russell Spitzer <[email protected]>
>>>>>>>> shared
>>>>>>>> are pretty aligned. I was just chatting with Russell about this, and we
>>>>>>>> think it'd be best to combine both proposals and have a singular large
>>>>>>>> effort on this. I can also set up a focused community discussion
>>>>>>>> (similar
>>>>>>>> to what we're doing on the other V4 proposals) on this starting
>>>>>>>> sometime
>>>>>>>> next week just to get things moving, if that works for people.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Amogh Jahagirdar
>>>>>>>>
>>>>>>>> On Mon, Jul 14, 2025 at 9:48 PM Amogh Jahagirdar <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hey Russell,
>>>>>>>>>
>>>>>>>>> Thanks for sharing the proposal! A few of us (Ryan, Dan, Anoop and
>>>>>>>>> I) have also been working on a proposal for an adaptive metadata tree
>>>>>>>>> structure as part of enabling more efficient one file commits. From a
>>>>>>>>> read
>>>>>>>>> of the summary, it's great to see that we're thinking along the same
>>>>>>>>> lines
>>>>>>>>> about how to tackle this fundamental area!
>>>>>>>>>
>>>>>>>>> Here is our proposal:
>>>>>>>>> https://docs.google.com/document/d/1q2asTpq471pltOTC6AsTLQIQcgEsh0AvEhRWnCcvZn0
>>>>>>>>> <https://docs.google.com/document/d/1q2asTpq471pltOTC6AsTLQIQcgEsh0AvEhRWnCcvZn0>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Amogh Jahagirdar
>>>>>>>>>
>>>>>>>>> On Mon, Jul 14, 2025 at 8:08 PM Russell Spitzer <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hey y'all!
>>>>>>>>>>
>>>>>>>>>> We (Yi Fang, Steven Wu and Myself) wanted to share some
>>>>>>>>>> of the thoughts we had on how one-file commits could work in
>>>>>>>>>> Iceberg. This is pretty
>>>>>>>>>> much just a high level overview of the concepts we think we need
>>>>>>>>>> and how Iceberg would behave.
>>>>>>>>>> We haven't gone very far into the actual implementation and
>>>>>>>>>> changes that would need to occur in the
>>>>>>>>>> SDK to make this happen.
>>>>>>>>>>
>>>>>>>>>> The high level summary is:
>>>>>>>>>>
>>>>>>>>>> Manifest Lists are out
>>>>>>>>>> Root Manifests take their place
>>>>>>>>>> A Root manifest can have data manifests, delete manifests,
>>>>>>>>>> manifest delete vectors, data delete vectors and data files
>>>>>>>>>> Manifest delete vectors allow for modifying a manifest without
>>>>>>>>>> deleting it entirely
>>>>>>>>>> Data files let you append without writing an intermediary
>>>>>>>>>> manifest
>>>>>>>>>> Having child data and delete manifests lets you still scale
>>>>>>>>>>
>>>>>>>>>> Please take a look if you like,
>>>>>>>>>>
>>>>>>>>>> https://docs.google.com/document/d/1k4x8utgh41Sn1tr98eynDKCWq035SV_f75rtNHcerVw/edit?tab=t.0
>>>>>>>>>>
>>>>>>>>>> I'm excited to see what other proposals and Ideas are floating
>>>>>>>>>> around the community,
>>>>>>>>>> Russ
>>>>>>>>>>
>>>>>>>>>> On Wed, Jul 2, 2025 at 6:29 PM John Zhuge <[email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Very excited about the idea!
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Jul 2, 2025 at 1:17 PM Anoop Johnson <
>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I'm very interested in this initiative. Micah Kornfield and I
>>>>>>>>>>>> presented
>>>>>>>>>>>> <https://youtu.be/4d4nqKkANdM?si=9TXgaUIXbq-l8idi&t=1405> on
>>>>>>>>>>>> high-throughput ingestion for Iceberg tables at the 2024 Iceberg
>>>>>>>>>>>> Summit,
>>>>>>>>>>>> which leveraged Google infrastructure like Colossus for efficient
>>>>>>>>>>>> appends.
>>>>>>>>>>>>
>>>>>>>>>>>> This new proposal is particularly exciting because it offers
>>>>>>>>>>>> significant advancements in commit latency and metadata storage
>>>>>>>>>>>> footprint.
>>>>>>>>>>>> Furthermore, a consistent manifest structure promises to simplify
>>>>>>>>>>>> the
>>>>>>>>>>>> design and codebase, which is a major benefit.
>>>>>>>>>>>>
>>>>>>>>>>>> A related idea I've been exploring is having a loose affinity
>>>>>>>>>>>> between data and delete manifests. While the current separation of
>>>>>>>>>>>> data and
>>>>>>>>>>>> delete manifests in Iceberg is valuable for avoiding data file
>>>>>>>>>>>> rewrites
>>>>>>>>>>>> (and stats updates) when deletes change, it does necessitate a join
>>>>>>>>>>>> operation during reads. I'd be keen to discuss approaches that
>>>>>>>>>>>> could
>>>>>>>>>>>> potentially reduce this read-side cost while retaining the
>>>>>>>>>>>> benefits of
>>>>>>>>>>>> separate manifests.
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Anoop
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Jun 13, 2025 at 11:06 AM Jagdeep Sidhu <
>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am new to the Iceberg community but would love to
>>>>>>>>>>>>> participate in these discussions to reduce the number of file
>>>>>>>>>>>>> writes,
>>>>>>>>>>>>> especially for small writes/commits.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thank you!
>>>>>>>>>>>>> -Jagdeep
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Jun 5, 2025 at 4:02 PM Anurag Mantripragada
>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> We have been hitting all the metadata problems you mentioned,
>>>>>>>>>>>>>> Ryan. I’m on-board to help however I can to improve this area.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ~ Anurag Mantripragada
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Jun 3, 2025, at 2:22 AM, Huang-Hsiang Cheng
>>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I am interested in this idea and looking forward to
>>>>>>>>>>>>>> collaboration.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Huang-Hsiang
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Jun 2, 2025, at 10:14 AM, namratha mk <[email protected]>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I am interested in contributing to this effort.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Namratha
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, May 29, 2025 at 1:36 PM Amogh Jahagirdar <
>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks for kicking this thread off Ryan, I'm interested in
>>>>>>>>>>>>>>> helping out here! I've been working on a proposal in this area
>>>>>>>>>>>>>>> and it would
>>>>>>>>>>>>>>> be great to collaborate with different folks and exchange ideas
>>>>>>>>>>>>>>> here, since
>>>>>>>>>>>>>>> I think a lot of people are interested in solving this problem.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Amogh Jahagirdar
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Thu, May 29, 2025 at 2:25 PM Ryan Blue <[email protected]>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Like Russell’s recent note, I’m starting a thread to
>>>>>>>>>>>>>>>> connect those of us that are interested in the idea of
>>>>>>>>>>>>>>>> changing Iceberg’s
>>>>>>>>>>>>>>>> metadata in v4 so that in most cases committing a change only
>>>>>>>>>>>>>>>> requires
>>>>>>>>>>>>>>>> writing one additional metadata file.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> *Idea: One-file commits*
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The current Iceberg metadata structure requires writing at
>>>>>>>>>>>>>>>> least one manifest and a new manifest list to produce a new
>>>>>>>>>>>>>>>> snapshot. The
>>>>>>>>>>>>>>>> goal of this work is to allow more flexibility by allowing the
>>>>>>>>>>>>>>>> manifest
>>>>>>>>>>>>>>>> list layer to store data and delete files. As a result, only
>>>>>>>>>>>>>>>> one file write
>>>>>>>>>>>>>>>> would be needed before committing the new snapshot. In
>>>>>>>>>>>>>>>> addition, this work
>>>>>>>>>>>>>>>> will also try to explore:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> - Avoiding small manifests that must be read in
>>>>>>>>>>>>>>>> parallel and later compacted (metadata maintenance changes)
>>>>>>>>>>>>>>>> - Extend metadata skipping to use aggregated column
>>>>>>>>>>>>>>>> ranges that are compatible with geospatial data (manifest
>>>>>>>>>>>>>>>> metadata)
>>>>>>>>>>>>>>>> - Using soft deletes to avoid rewriting existing
>>>>>>>>>>>>>>>> manifests (metadata DVs)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> If you’re interested in these problems, please reply!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Ryan
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> John Zhuge
>>>>>>>>>>>
>>>>>>>>>>
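
For readers catching up on the thread, here is a minimal sketch of the bounds-based idea discussed in the Oct 7 and Sep 18 messages: a file is treated as effectively partitioned on a field when its lower and upper bounds are equal, and truncated string/binary bounds cannot be trusted for that check. Everything here is an assumption made for illustration. `FieldBounds`, the `truncated` flag, and `reconstruct_partition_tuple` are hypothetical names, not Iceberg APIs and not anything taken from the proposal docs.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class FieldBounds:
    """Hypothetical per-field stats entry (illustration only, not an Iceberg type)."""
    lower: Any
    upper: Any
    truncated: bool = False  # assumed flag: a writer shortened a string/binary bound


def reconstruct_partition_tuple(
    bounds: dict[int, FieldBounds], partition_field_ids: list[int]
) -> Optional[tuple]:
    """Return the tuple implied by the stats, or None if it cannot be recovered."""
    values = []
    for field_id in partition_field_ids:
        b = bounds.get(field_id)
        # Missing stats, truncated bounds, or lower != upper all mean the
        # file cannot be shown to hold a single value for this field.
        if b is None or b.truncated or b.lower != b.upper:
            return None
        values.append(b.lower)
    return tuple(values)


stats = {
    1: FieldBounds("us-east", "us-east"),               # equal bounds: one value
    2: FieldBounds(10, 42),                             # a range: not partitioned
    3: FieldBounds("abcdefgh", "abcdefgh", truncated=True),
}
print(reconstruct_partition_tuple(stats, [1]))     # ('us-east',)
print(reconstruct_partition_tuple(stats, [1, 2]))  # None
print(reconstruct_partition_tuple(stats, [3]))     # None
```

The third case is the nuance raised in the thread: two truncated bounds can compare equal even when the underlying values differ, which is why one option mentioned is to require untruncated stats for identity partitioned columns.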
