+1, I think this is a real problem, especially for streaming / frequent-append
workloads, where commit latency matters and metadata.json keeps growing with
every commit.

I also agree we probably shouldn’t remove the root metadata file
completely. Having one file that describes the whole table is really useful
for portability and debugging.

Of the options you listed, I like “offload pieces to external files” as a
first step. We would still write the root file on every commit, but it
wouldn’t grow as fast. The downside is extra maintenance/GC complexity.

A couple of questions/ideas:

   - Do we have any data on what parts of metadata.json grow the most
   (snapshots / history / refs)? Even a rough breakdown could help decide what
   to move out first.
   - Could we do a hybrid: still write the root file on every commit, but
   only keep a “recent window” of snapshots in it, and move older history to
   referenced files? (portable, but with bounded growth; rough sketch below)
   - For “optional on commit”, maybe make it a catalog capability (fast
   commits if the catalog can serve metadata), but still support an
   export/materialize step when portability is needed.
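
To make the hybrid idea concrete, here is a rough sketch of what a trimmed
root file could look like. This is purely illustrative pseudo-JSON from me,
not a spec proposal; in particular, the “offloaded-snapshots” field, its
layout, and the archive file name are all made up:

    {
      "format-version": 4,
      "current-snapshot-id": 3055729675574597004,
      "snapshots": [
        ...only the most recent N snapshots kept inline...
      ],
      "offloaded-snapshots": {
        "location": "s3://bucket/table/metadata/snapshot-archive-00001.avro",
        "snapshot-count": 9500,
        "max-snapshot-id": 3055729675574596000
      }
    }

Readers that only need recent history would never touch the archive file;
time travel past the window would load it lazily. Maintenance/GC would have
to rewrite or expire archive files, which is the extra complexity I mentioned
above.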

Thanks,
Huaxin

On Tue, Feb 10, 2026 at 2:58 PM Anton Okolnychyi <[email protected]>
wrote:

> I don't think we have any consensus or concrete plan. In fact, I don't
> know what my personal preference is at this point. The intention of this
> thread is to gain that clarity. I don't think removing the root metadata
> file entirely is a good idea. It is great to have a way to describe the
> entire state of a table in a file. We just need to find a solution for
> streaming appends that suffer from the increasing size of the root metadata
> file.
>
> Like I said, making the generation of the JSON file on commit optional is
> one way to solve this problem. We can also think about offloading pieces of
> it to external files (say, old snapshots). This would mean we still have to
> write the root file on each commit, but it will be smaller. One clear
> downside is more complicated maintenance.
>
> Any other ideas/thoughts/feedback? Do people see this as a problem?
>
>
> On Tue, Feb 10, 2026 at 2:18 PM Yufei Gu <[email protected]> wrote:
>
>> Hi Anton, thanks for raising this. I would really like to make this
>> optional and then build additional use cases on top of it. For example, a
>> catalog like IRC could completely eliminate storage IO during commit and
>> load, which is a big win. It could also provide better protection for
>> encrypted Iceberg tables, since metadata.json files are plain text today.
>>
>> That said, do we have consensus that metadata.json can be optional? There
>> are real portability concerns, and engine-side work also needs
>> consideration. For example, static tables and the Spark driver still expect
>> to read this file directly from storage. It feels like the first step here
>> is aligning on whether metadata.json can be optional at all, before we go
>> deeper into how we get rid of it. What do you think?
>>
>> Yufei
>>
>>
>> On Tue, Feb 10, 2026 at 11:23 AM Anton Okolnychyi <[email protected]>
>> wrote:
>>
>>> While it may be common knowledge among Iceberg devs that writing the
>>> root JSON file on commit is somewhat optional with the right catalog, what
>>> can we do in V4 to solve this problem for everyone? My problem is the
>>> suboptimal behavior that new users get by default with HMS or Hadoop
>>> catalogs and how this impacts their perception of Iceberg. We are doing a
>>> bunch of work for streaming (e.g. changelog scans, single file commits,
>>> etc.), but the need to write the root JSON file may cancel out all of that.
>>>
>>> Let me throw some ideas out there.
>>>
>>> - Describe in the spec how catalogs can make generation of the root
>>> metadata file optional. Ideally, implement that in a built-in catalog of
>>> choice as a reference implementation.
>>> - Offload portions of the root metadata file to external files and keep
>>> references to them.
>>>
>>> Thoughts?
>>>
>>> - Anton
>>>