Thank you, everyone, for the initial review comments. It is exciting to see so much interest in this proposal.
I am currently reviewing and responding to each comment. The general
themes of the feedback so far include:

- Including partial updates (column updates on a subset of rows in a table).
- Adding details on how SQL engines will write the update files.
- Adding details on split planning and row alignment for update files.

I will think through these points and update the design accordingly.

Best,
Anurag

On Tue, Jan 27, 2026 at 6:25 PM Anurag Mantripragada <[email protected]> wrote:

> Hi Xianjin,
>
> Happy to learn from your experience in supporting backfill use cases.
> Please feel free to review the proposal and add your comments. I will
> wait a couple more days to ensure everyone has a chance to review the
> proposal.
>
> ~ Anurag
>
> On Tue, Jan 27, 2026 at 6:42 AM Xianjin Ye <[email protected]> wrote:
>
>> Hi Anurag and Peter,
>>
>> It’s great to see that the partial column update has gained so much
>> interest in the community. I internally built a BackfillColumns action
>> to efficiently backfill columns (by writing only the partial columns
>> and copying the binary data of the other columns into a new DataFile).
>> The speedup could be 10x for wide tables, but the write amplification
>> is still there. I would be happy to collaborate on the work and
>> eliminate the write amplification.
>>
>> On 2026/01/27 10:12:54 Péter Váry wrote:
>> > Hi Anurag,
>> >
>> > It’s great to see how much interest there is in the community around
>> > this potential new feature. Gábor and I have actually submitted an
>> > Iceberg Summit talk proposal on this topic, and we would be very
>> > happy to collaborate on the work. I was mainly waiting for the File
>> > Format API to be finalized, as I believe this feature should build on
>> > top of it.
>> >
>> > For reference, our related work includes:
>> >
>> > - *Dev list thread:*
>> >   https://lists.apache.org/thread/h0941sdq9jwrb6sj0pjfjjxov8tx7ov9
>> > - *Proposal document* (not shared widely yet):
>> >   https://docs.google.com/document/d/1OHuZ6RyzZvCOQ6UQoV84GzwVp3UPiu_cfXClsOi03ww
>> > - *Performance testing PR for readers and writers:*
>> >   https://github.com/apache/iceberg/pull/13306
>> >
>> > During earlier discussions about possible metadata changes, another
>> > option came up that hasn’t been documented yet: separating planner
>> > metadata from reader metadata. Since the planner does not need to
>> > know about the actual files, we could store the file composition in a
>> > separate file (potentially a Puffin file). This file could hold the
>> > column_files metadata, while the manifest would reference the Puffin
>> > file and blob position instead of the data filename.
>> > This approach has the advantage of keeping the existing metadata
>> > largely intact, and it could also give us a natural place later to
>> > add file-level indexes or Bloom filters for use during reads or
>> > secondary filtering. The downsides are the additional files and the
>> > increased complexity of identifying files that are no longer
>> > referenced by the table, so this may not be an ideal solution.
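>> > To make that concrete, here is a minimal sketch of the shapes
>> > involved; all names below are illustrative, not part of any spec:
>> >
>> >   import java.util.List;
>> >
>> >   // What a manifest entry would carry instead of column-file paths:
>> >   // a pointer to a blob inside a Puffin file.
>> >   record ColumnFilesRef(String puffinPath, long blobOffset, long blobLength) {}
>> >
>> >   // What the blob would hold for the reader: the file composition.
>> >   // Each column file covers a set of field ids and is row-aligned
>> >   // with the base data file (same row count and order).
>> >   record ColumnFile(String path, List<Integer> fieldIds, long rowCount) {}
>> >   record ColumnFilesBlob(String baseDataFile, List<ColumnFile> columnFiles) {}
>> >
>> > The planner would only ever see the ColumnFilesRef; the reader would
>> > fetch and decode the blob to learn which files to stitch together.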
>> > I do have some concerns about the MoR metadata proposal described in
>> > the document. At first glance, it seems to complicate distributed
>> > planning, as all entries for a given file would need to be collected
>> > and merged to provide the information required by both the planner
>> > and the reader.
>> > Additionally, when a new column is added or updated, we would still
>> > need to add a new metadata entry for every existing data file. If we
>> > immediately write out the merged metadata, the total number of
>> > entries remains the same. The main benefit is avoiding rewriting
>> > statistics, which can be significant, but this comes at the cost of
>> > increased planning complexity. If we choose to store the merged
>> > statistics in the column_families entry, I don’t see much benefit in
>> > excluding the rest of the metadata, especially since including it
>> > would simplify the planning process.
>> >
>> > As Anton already pointed out, we should also discuss how this change
>> > would affect split handling, particularly how to avoid double reads
>> > when row groups are not aligned between the original data files and
>> > the new column files.
>> >
>> > Finally, I’d like to see some discussion of the Java API
>> > implications: in particular, what API changes are required and how
>> > SQL engines would perform updates. Since the new column files must
>> > have the same number of rows as the original data files, with a
>> > strict one-to-one relationship, SQL engines would need access to the
>> > source filename, position, and deletion status in the DataFrame in
>> > order to generate the new files. This is more involved than a simple
>> > update and deserves explicit consideration.
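>> > For illustration only, with Spark one could imagine assembling those
>> > inputs from Iceberg’s existing metadata columns (_file, _pos,
>> > _deleted); the final writer step is hypothetical, since no such API
>> > exists today:
>> >
>> >   import static org.apache.spark.sql.functions.col;
>> >   import org.apache.spark.sql.Dataset;
>> >   import org.apache.spark.sql.Row;
>> >
>> >   // Assumes an active SparkSession `spark` and a DataFrame
>> >   // `newScores` holding the recomputed column values keyed by `id`.
>> >   Dataset<Row> base = spark.read().table("db.wide_table")
>> >       .select(col("_file"), col("_pos"), col("_deleted"), col("id"));
>> >
>> >   Dataset<Row> updated = base.join(newScores, "id")
>> >       .repartition(col("_file"))            // one task per base data file
>> >       .sortWithinPartitions(col("_pos"));   // preserve one-to-one row order
>> >
>> >   // A (hypothetical) column-file writer would then emit one
>> >   // row-aligned column file per distinct _file value.
>> >
>> > Whether engines should rely on these metadata columns or on a new,
>> > dedicated API is exactly the kind of question the proposal should
>> > answer.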
>> > Looking forward to your thoughts.
>> > Best regards,
>> > Peter
>> >
>> > On Tue, Jan 27, 2026, 03:58 Anurag Mantripragada <[email protected]> wrote:
>> >
>> > > Thanks, Anton and others, for providing some initial feedback. I
>> > > will address all your comments soon.
>> > >
>> > > On Mon, Jan 26, 2026 at 11:10 AM Anton Okolnychyi <[email protected]> wrote:
>> > >
>> > >> I had a chance to see the proposal before it landed, and I think
>> > >> it is a cool idea; both presented approaches would likely work. I
>> > >> am looking forward to discussing the tradeoffs and would encourage
>> > >> everyone to push/polish each approach to see which issues can be
>> > >> mitigated and which are fundamental.
>> > >>
>> > >> [1] Iceberg-native approach: better visibility into column files
>> > >> from the metadata, potentially better concurrency for
>> > >> non-overlapping column updates, no dependency on Parquet.
>> > >> [2] Parquet-native approach: almost no changes to the table format
>> > >> metadata beyond tracking of base files.
>> > >>
>> > >> I think [1] sounds a bit better on paper, but I am worried about
>> > >> the complexity in writers and readers (especially around keeping
>> > >> row groups aligned and split planning). It would be great to cover
>> > >> this in detail in the proposal.
>> > >>
>> > >> On Mon, Jan 26, 2026 at 09:00 Anurag Mantripragada <[email protected]> wrote:
>> > >>
>> > >>> Hi all,
>> > >>>
>> > >>> "Wide tables" with thousands of columns present significant
>> > >>> challenges for AI/ML workloads, particularly when only a subset
>> > >>> of columns needs to be added or updated. Current Copy-on-Write
>> > >>> (COW) and Merge-on-Read (MOR) operations in Iceberg apply at the
>> > >>> row level, which leads to substantial write amplification in
>> > >>> scenarios such as:
>> > >>>
>> > >>> - Feature Backfilling & Column Updates: Adding new feature
>> > >>>   columns (e.g., model embeddings) to petabyte-scale tables.
>> > >>> - Model Score Updates: Refreshing prediction scores after
>> > >>>   retraining.
>> > >>> - Embedding Refresh: Updating vector embeddings, which currently
>> > >>>   triggers a rewrite of the entire row.
>> > >>> - Incremental Feature Computation: Daily updates to a small
>> > >>>   fraction of features in wide tables.
>> > >>>
>> > >>> With the Iceberg V4 proposal introducing single-file commits and
>> > >>> column stats improvements, this is an ideal time to address
>> > >>> column-level updates to better support these use cases.
>> > >>>
>> > >>> I have drafted a proposal that explores both table-format
>> > >>> enhancements and file-format (Parquet) changes to enable more
>> > >>> efficient updates.
>> > >>>
>> > >>> Proposal Details:
>> > >>> - GitHub Issue: #15146 <https://github.com/apache/iceberg/issues/15146>
>> > >>> - Design Document: Efficient Column Updates in Iceberg
>> > >>>   <https://docs.google.com/document/d/1Bd7JVzgajA8-DozzeEE24mID_GLuz6iwj0g4TlcVJcs/edit?tab=t.0>
>> > >>>
>> > >>> Next Steps:
>> > >>> I plan to create POCs to benchmark the approaches described in
>> > >>> the document.
>> > >>>
>> > >>> Please review the proposal and share your feedback.
>> > >>>
>> > >>> Thanks,
>> > >>> Anurag
