Hey Micah,

For some reason, your email ended up in my spam box 😨
There is a reason for everything!

> .gz.metadata.json is quite uncommon and can't be read by most existing
> tools. Would it be better to support .metadata.json.gz and treat
> .gz.metadata.json as legacy for backward compatibility?

The Java client supports both
<https://github.com/apache/iceberg/blob/dc26b72ad016840b79d62bf8a84b7f2109e9b71b/core/src/test/java/org/apache/iceberg/TableMetadataParserCodecTest.java#L29-L40>.
I looked into this years ago, and if I recall correctly, the unusual
.gz.metadata.json naming was chosen to bypass Hadoop's decompressor
<https://github.com/apache/iceberg/pull/258/>. Hadoop would detect the .gz
extension and handle all the (de)compression, which we wanted to do ourselves.

> gzip is becoming increasingly outdated due to its lack of support for
> modern CPUs. New algorithms like zstd are gaining popularity, so should
> we consider allowing users to use .metadata.json.zst as well?

Yes, I think that would make a lot of sense.

Kind regards,
Fokko

On Mon, Apr 28, 2025 at 08:41, Xuanwo <xua...@apache.org> wrote:

> I've copied my comments from GitHub here for a broader discussion:
>
> Hi, I have two concerns about this change:
>
> - .gz.metadata.json is quite uncommon and can't be read by most
>   existing tools. Would it be better to support .metadata.json.gz and
>   treat .gz.metadata.json as legacy for backward compatibility?
> - gzip is becoming increasingly outdated due to its lack of support
>   for modern CPUs. New algorithms like zstd are gaining popularity, so
>   should we consider allowing users to use .metadata.json.zst as well?
>
> On Sun, Apr 27, 2025, at 07:36, Micah Kornfield wrote:
>
> > I created https://github.com/apache/iceberg/pull/12598 to document this
> > feature. Kevin Liu already took a look, but I would like to get more eyes
> > on it before starting a vote for merging.
> >
> > Thanks,
> > Micah
>
> Xuanwo
>
> https://xuanwo.io/
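
To make the naming question in this thread concrete, here is a rough sketch of how a reader could resolve the codec from a metadata file name if both gzip forms were accepted. It is not the Iceberg implementation: the function name `codec_from_file_name` and the returned labels are made up for illustration, and the `.metadata.json.zst` branch only reflects the proposal above, not something that is supported today.

```python
def codec_from_file_name(file_name: str) -> str:
    """Return 'gzip', 'zstd', or 'none' for a table metadata file name.

    Hypothetical helper for illustration only; not an Iceberg API.
    """
    # Legacy form: the codec sits in front of ".metadata.json".
    # It must be checked before the plain ".metadata.json" case,
    # because the legacy name also ends with ".metadata.json".
    if file_name.endswith(".gz.metadata.json"):
        return "gzip"
    # Conventional form: the codec is the final extension.
    if file_name.endswith(".metadata.json.gz"):
        return "gzip"
    # Proposed in this thread, assumed here for the sake of the example.
    if file_name.endswith(".metadata.json.zst"):
        return "zstd"
    if file_name.endswith(".metadata.json"):
        return "none"
    raise ValueError(f"not a recognized metadata file name: {file_name}")


if __name__ == "__main__":
    for name in (
        "00001-abc.gz.metadata.json",
        "00001-abc.metadata.json.gz",
        "00001-abc.metadata.json.zst",
        "00001-abc.metadata.json",
    ):
        print(name, "->", codec_from_file_name(name))
```

The ordering of the checks is the only subtle part: a plain suffix test for `.metadata.json` would also match the legacy gzip name, so the legacy form has to be tested first.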