Thank you, everyone, for the initial review comments. It is exciting to see so much interest in this proposal.
I am currently reviewing and responding to each comment. The general
themes of the feedback so far include:

- Including partial updates (column updates on a subset of rows in a table).
- Adding details on how SQL engines will write the update files.
- Adding details on split planning and row alignment for update files.

I will think through these points and update the design accordingly.

Best,
Anurag

On Tue, Jan 27, 2026 at 6:25 PM Anurag Mantripragada <[email protected]> wrote:

> Hi Xianjin,
>
> Happy to learn from your experience in supporting backfill use cases.
> Please feel free to review the proposal and add your comments. I will
> wait a couple more days to ensure everyone has a chance to review the
> proposal.
>
> ~ Anurag
>
> On Tue, Jan 27, 2026 at 6:42 AM Xianjin Ye <[email protected]> wrote:
>
>> Hi Anurag and Peter,
>>
>> It’s great to see that the partial column update has gained so much
>> interest in the community. I internally built a BackfillColumns action
>> to efficiently backfill columns (by writing only the partial columns
>> and copying the binary data of the other columns into a new DataFile).
>> The speedup could be 10x for wide tables, but the write amplification
>> is still there. I would be happy to collaborate on the work and
>> eliminate the write amplification.
>>
>> On 2026/01/27 10:12:54 Péter Váry wrote:
>> > Hi Anurag,
>> >
>> > It’s great to see how much interest there is in the community around
>> > this potential new feature. Gábor and I have actually submitted an
>> > Iceberg Summit talk proposal on this topic, and we would be very
>> > happy to collaborate on the work. I was mainly waiting for the File
>> > Format API to be finalized, as I believe this feature should build on
>> > top of it.
>> >
>> > For reference, our related work includes:
>> >
>> > - *Dev list thread:*
>> >   https://lists.apache.org/thread/h0941sdq9jwrb6sj0pjfjjxov8tx7ov9
>> > - *Proposal document* (not shared widely yet):
>> >   https://docs.google.com/document/d/1OHuZ6RyzZvCOQ6UQoV84GzwVp3UPiu_cfXClsOi03ww
>> > - *Performance testing PR for readers and writers:*
>> >   https://github.com/apache/iceberg/pull/13306
>> >
>> > During earlier discussions about possible metadata changes, another
>> > option came up that hasn’t been documented yet: separating planner
>> > metadata from reader metadata. Since the planner does not need to
>> > know about the actual files, we could store the file composition in a
>> > separate file (potentially a Puffin file). This file could hold the
>> > column_files metadata, while the manifest would reference the Puffin
>> > file and blob position instead of the data filename.
>> > This approach has the advantage of keeping the existing metadata
>> > largely intact, and it could also give us a natural place later to
>> > add file-level indexes or Bloom filters for use during reads or
>> > secondary filtering. The downsides are the additional files and the
>> > increased complexity of identifying files that are no longer
>> > referenced by the table, so this may not be an ideal solution.
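>> > To make that concrete, here is a minimal sketch of the shapes
>> > involved; all names below are illustrative, not part of any spec:
>> >
>> >   import java.util.List;
>> >
>> >   // What a manifest entry would carry instead of column-file paths:
>> >   // a pointer to a blob inside a Puffin file.
>> >   record ColumnFilesRef(String puffinPath, long blobOffset, long blobLength) {}
>> >
>> >   // What the blob would hold for the reader: the file composition.
>> >   // Each column file covers a set of field ids and is row-aligned
>> >   // with the base data file (same row count and order).
>> >   record ColumnFile(String path, List<Integer> fieldIds, long rowCount) {}
>> >   record ColumnFilesBlob(String baseDataFile, List<ColumnFile> columnFiles) {}
>> >
>> > The planner would only ever see the ColumnFilesRef; the reader would
>> > fetch and decode the blob to learn which files to stitch together.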
>> > I do have some concerns about the MoR metadata proposal described in
>> > the document. At first glance, it seems to complicate distributed
>> > planning, as all entries for a given file would need to be collected
>> > and merged to provide the information required by both the planner
>> > and the reader.
>> > Additionally, when a new column is added or updated, we would still
>> > need to add a new metadata entry for every existing data file. If we
>> > immediately write out the merged metadata, the total number of
>> > entries remains the same. The main benefit is avoiding rewriting
>> > statistics, which can be significant, but this comes at the cost of
>> > increased planning complexity. If we choose to store the merged
>> > statistics in the column_families entry, I don’t see much benefit in
>> > excluding the rest of the metadata, especially since including it
>> > would simplify the planning process.
>> >
>> > As Anton already pointed out, we should also discuss how this change
>> > would affect split handling, particularly how to avoid double reads
>> > when row groups are not aligned between the original data files and
>> > the new column files.
>> >
>> > Finally, I’d like to see some discussion of the Java API
>> > implications: in particular, what API changes are required and how
>> > SQL engines would perform updates. Since the new column files must
>> > have the same number of rows as the original data files, with a
>> > strict one-to-one relationship, SQL engines would need access to the
>> > source filename, position, and deletion status in the DataFrame in
>> > order to generate the new files. This is more involved than a simple
>> > update and deserves explicit consideration.
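>> > For illustration only, with Spark one could imagine assembling those
>> > inputs from Iceberg’s existing metadata columns (_file, _pos,
>> > _deleted); the final writer step is hypothetical, since no such API
>> > exists today:
>> >
>> >   import static org.apache.spark.sql.functions.col;
>> >   import org.apache.spark.sql.Dataset;
>> >   import org.apache.spark.sql.Row;
>> >
>> >   // Assumes an active SparkSession `spark` and a DataFrame
>> >   // `newScores` holding the recomputed column values keyed by `id`.
>> >   Dataset<Row> base = spark.read().table("db.wide_table")
>> >       .select(col("_file"), col("_pos"), col("_deleted"), col("id"));
>> >
>> >   Dataset<Row> updated = base.join(newScores, "id")
>> >       .repartition(col("_file"))            // one task per base data file
>> >       .sortWithinPartitions(col("_pos"));   // preserve one-to-one row order
>> >
>> >   // A (hypothetical) column-file writer would then emit one
>> >   // row-aligned column file per distinct _file value.
>> >
>> > Whether engines should rely on these metadata columns or on a new,
>> > dedicated API is exactly the kind of question the proposal should
>> > answer.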
>> > Looking forward to your thoughts.
>> > Best regards,
>> > Peter
>> >
>> > On Tue, Jan 27, 2026, 03:58 Anurag Mantripragada <[email protected]> wrote:
>> >
>> > > Thanks, Anton and others, for providing some initial feedback. I
>> > > will address all your comments soon.
>> > >
>> > > On Mon, Jan 26, 2026 at 11:10 AM Anton Okolnychyi <[email protected]> wrote:
>> > >
>> > >> I had a chance to see the proposal before it landed, and I think
>> > >> it is a cool idea; both presented approaches would likely work. I
>> > >> am looking forward to discussing the tradeoffs and would encourage
>> > >> everyone to push/polish each approach to see which issues can be
>> > >> mitigated and which are fundamental.
>> > >>
>> > >> [1] Iceberg-native approach: better visibility into column files
>> > >> from the metadata, potentially better concurrency for
>> > >> non-overlapping column updates, no dependency on Parquet.
>> > >> [2] Parquet-native approach: almost no changes to the table format
>> > >> metadata beyond tracking of base files.
>> > >>
>> > >> I think [1] sounds a bit better on paper, but I am worried about
>> > >> the complexity in writers and readers (especially around keeping
>> > >> row groups aligned and split planning). It would be great to cover
>> > >> this in detail in the proposal.
>> > >>
>> > >> On Mon, Jan 26, 2026 at 09:00 Anurag Mantripragada <[email protected]> wrote:
>> > >>
>> > >>> Hi all,
>> > >>>
>> > >>> "Wide tables" with thousands of columns present significant
>> > >>> challenges for AI/ML workloads, particularly when only a subset
>> > >>> of columns needs to be added or updated. Current Copy-on-Write
>> > >>> (COW) and Merge-on-Read (MOR) operations in Iceberg apply at the
>> > >>> row level, which leads to substantial write amplification in
>> > >>> scenarios such as:
>> > >>>
>> > >>> - Feature Backfilling & Column Updates: Adding new feature
>> > >>>   columns (e.g., model embeddings) to petabyte-scale tables.
>> > >>> - Model Score Updates: Refreshing prediction scores after
>> > >>>   retraining.
>> > >>> - Embedding Refresh: Updating vector embeddings, which currently
>> > >>>   triggers a rewrite of the entire row.
>> > >>> - Incremental Feature Computation: Daily updates to a small
>> > >>>   fraction of features in wide tables.
>> > >>>
>> > >>> With the Iceberg V4 proposal introducing single-file commits and
>> > >>> column stats improvements, this is an ideal time to address
>> > >>> column-level updates to better support these use cases.
>> > >>>
>> > >>> I have drafted a proposal that explores both table-format
>> > >>> enhancements and file-format (Parquet) changes to enable more
>> > >>> efficient updates.
>> > >>>
>> > >>> Proposal Details:
>> > >>> - GitHub Issue: #15146 <https://github.com/apache/iceberg/issues/15146>
>> > >>> - Design Document: Efficient Column Updates in Iceberg
>> > >>>   <https://docs.google.com/document/d/1Bd7JVzgajA8-DozzeEE24mID_GLuz6iwj0g4TlcVJcs/edit?tab=t.0>
>> > >>>
>> > >>> Next Steps:
>> > >>> I plan to create POCs to benchmark the approaches described in
>> > >>> the document.
>> > >>>
>> > >>> Please review the proposal and share your feedback.
>> > >>>
>> > >>> Thanks,
>> > >>> Anurag
