I agree that this metadata is only for Iceberg and should not be read by other 
systems; Drill was just an example.
The main point is that having gz in the middle is confusing. I guess the 
expectation is that if a file name ends with the json suffix, it is plain JSON.
Maybe another option is to remove ".json" from metadata file names altogether; 
this might be less confusing.
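
To make that expectation concrete, here is a minimal Python sketch of 
suffix-based format detection. It is only an illustration, not Drill's or 
Iceberg's actual code; the helper name infer_format and the suffix table are 
my own assumptions.

```python
from pathlib import PurePosixPath

# Hypothetical table of compression suffixes (an assumption for this sketch).
COMPRESSION_SUFFIXES = {".gz": "gzip"}


def infer_format(file_name):
    """Guess (format, compression) from the trailing suffixes only."""
    suffixes = PurePosixPath(file_name).suffixes
    compression = None
    # A compression suffix is only recognized when it is the last one.
    if suffixes and suffixes[-1] in COMPRESSION_SUFFIXES:
        compression = COMPRESSION_SUFFIXES[suffixes.pop()]
    # Whatever suffix remains at the end is taken as the data format.
    fmt = suffixes[-1].lstrip(".") if suffixes else None
    return fmt, compression


print(infer_format("v1.metadata.json.gz"))
print(infer_format("v1.gz.metadata.json"))
```

With this kind of logic, "v1.metadata.json.gz" is recognized as 
gzip-compressed JSON, while "v1.gz.metadata.json" is treated as plain JSON, 
so a reader would try to parse compressed bytes as JSON text.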

Kind regards,
Arina

> On Jul 19, 2019, at 7:38 PM, Ryan Blue <rb...@netflix.com.INVALID> wrote:
> 
> The intent here was to make it easier to identify the format of a file, but 
> if this makes the files incompatible with other systems maybe we should 
> change it back.
> 
> I think the argument against changing it back is that I wouldn't expect 
> people to read these files with systems like Drill. Instead, we want to move 
> to using metadata tables to inspect table state, like the recently added 
> history, snapshots, manifests, and files tables.
> 
> On Fri, Jul 19, 2019 at 7:50 AM Arina Yelchiyeva <arina.yelchiy...@gmail.com> 
> wrote:
> Hi all,
> 
> Recent changes in metadata compression started adding ".gz" right after the 
> version in the metadata file name, not at the end as before.
> Before: v1.metadata.json.gz
> Now: v1.gz.metadata.json
> 
> It looks like this was done intentionally, but to me it is rather confusing, 
> since gz indicates a compressed file and is usually placed at the end. Plus, 
> it causes problems when reading such files with external tools.
> For example, Apache Drill cannot read "v1.gz.metadata.json" because it assumes 
> the file is plain JSON, but it can successfully read "v1.metadata.json.gz" 
> since it understands that it is a compressed JSON file.
> 
> 
> Any thoughts?
> 
> Kind regards,
> Arina
> 
> 
> -- 
> Ryan Blue
> Software Engineer
> Netflix
