+1 on Micah starting a doc and following up by commenting in it.

@Raphael, Wish Maple: agreed that changing the metadata representation is less important. Most engines can externalize and index metadata in some way, so one option is to propose a standard way to do that without changing the format. Adding new encodings, or making existing encodings more parallelizable, is something that has to happen in the format itself and is more useful.
On Tue, May 14, 2024 at 9:26 AM Antoine Pitrou <anto...@python.org> wrote:

> On Mon, 13 May 2024 16:10:24 +0100
> Raphael Taylor-Davies
> <r.taylordav...@googlemail.com.INVALID>
> wrote:
> >
> > I guess I wonder if rather than having a parquet format version 2, or
> > even a parquet format version 3, we could just document what features a
> > given parquet implementation actually supports. I believe Andrew intends
> > to pick up on where previous efforts here left off.
>
> I also believe documenting implementation status is strongly desirable,
> regardless of whether the discussion on "V3" goes anywhere.
>
> Regards
>
> Antoine.