Re: Pitch for Pcodec in Parquet (again)

Martin Loncaric Wed, 02 Apr 2025 11:23:20 -0700

Thanks Micah - I'll be happy to add comments/suggestions if you have a
draft ready. I assume it'll focus on the main 3 things we discussed in the
meeting: language support, developer support, and benchmark results.


On Wed, Apr 2, 2025 at 2:48 AM Micah Kornfield <[email protected]>
wrote:

> Apologies, I've been delayed in drafting this. I should have something to
> share by the end of this week.
>
> On Thursday, March 20, 2025, Micah Kornfield <[email protected]>
> wrote:
>
> > Based on the in-person sync, I took an action item to try to write a draft
> > doc so we can come to a clear consensus on how to decide on new
> > encodings/compression.  I hope to have something to share next week, but it
> > will likely need further input from the community.
> >
> > Thanks,
> > Micah
> >
> > On Thu, Mar 20, 2025 at 3:33 AM Antoine Pitrou <[email protected]>
> > wrote:
> >
> >> On Tue, 18 Mar 2025 19:08:04 +0100
> >> Alkis Evlogimenos
> >> <[email protected]>
> >> wrote:
> >> > At the end it boils down to which dataset you think is more
> >> > representative of the world data.
> >>
> >> This sentence does not even have a precise meaning. Data is plural,
> >> there is no "representative" dataset.
> >>
> >> If someone tells you that the average animal on Earth is 2 millimeters
> >> long, is that "representative" of the characteristics of mammals?
> >>
> >> In the end, the question is whether a new encoding brings enough
> >> benefits in *some* cases to justify including it in Parquet. You may
> >> care primarily about Databricks customers, but some people don't. This
> >> is not a Databricks project.
> >>
> >> Regards
> >>
> >> Antoine.
> >>
> >>
> >>
>
