On Tue, 14 May 2024 at 17:48, Julien Le Dem <jul...@apache.org> wrote:

> +1 on Micah starting a doc and following up by commenting in it.
>

+maybe a conference call where interested people can talk about it.



>
> @Raphael, Wish Maple: agreed that changing the metadata representation is
> less important. Most engines can externalize and index metadata in some
> way.


That works if queries against specific tables are always routed to those
servers, the indices fit in memory, and the servers stay up. Once things
become more agile, that no longer holds.

This is why I've not investigated the idea of having the filesystem
connector (s3a, abfs...) cache footers to local fs across multiple
streams/between opening files, even as they now all move to support some
form of footer caching to boost ORC/Parquet performance for apps which seek
to the end repeatedly. The larger the worker pool, the lower the probability
of reuse; the more files you have, the more space any caching takes up.
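
As a rough back-of-envelope illustration of the reuse point, here is a small
sketch. It assumes, purely for illustration, that every read of a given file
is routed to a worker uniformly at random and that a worker keeps a footer
cached indefinitely once it has read it. The function name and the numbers
are hypothetical, not taken from any connector.

```python
def expected_hit_rate(workers: int, reads: int) -> float:
    """Expected fraction of footer reads served from a worker-local cache.

    Assumption: each read of one file lands on a worker chosen uniformly
    at random, and a worker never evicts a footer once it has cached it.
    """
    if workers < 1 or reads < 1:
        raise ValueError("workers and reads must both be >= 1")
    # Every distinct worker touched pays one cold miss, so the expected
    # number of misses equals the expected number of distinct workers hit.
    expected_misses = workers * (1.0 - (1.0 - 1.0 / workers) ** reads)
    return 1.0 - expected_misses / reads


if __name__ == "__main__":
    # Same file read 8 times, against pools of increasing size.
    for workers in (1, 4, 16, 64, 256):
        rate = expected_hit_rate(workers, reads=8)
        print(f"{workers:4d} workers: expected hit rate {rate:.2f}")
```

Under those assumptions the hit rate drops quickly as the pool grows, while
the total local disk spent on cached footers still scales with the number of
files touched.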


> It is an option to propose a standard way to do it without changing
> the format.


+1


> Adding new encodings or make existing encodings more
> parallelizable is something that needs to be in the format and more useful.
>

One of the things I'd like to see from Micah's work is some list of what
new data types and encodings people think are needed.





>
> On Tue, May 14, 2024 at 9:26 AM Antoine Pitrou <anto...@python.org> wrote:
>
> > On Mon, 13 May 2024 16:10:24 +0100
> > Raphael Taylor-Davies
> > <r.taylordav...@googlemail.com.INVALID>
> > wrote:
> > >
> > > I guess I wonder if rather than having a parquet format version 2, or
> > > even a parquet format version 3, we could just document what features a
> > > given parquet implementation actually supports. I believe Andrew intends
> > > to pick up on where previous efforts here left off.
> >
> > I also believe documenting implementation status is strongly desirable,
> > regardless of whether the discussion on "V3" goes anywhere.
> >
> > Regards
> >
> > Antoine.
> >
> >
> >
>
