Thanks, Micah. I'll be happy to add comments or suggestions once you have a draft ready. I assume it will focus on the three main things we discussed in the meeting: language support, developer support, and benchmark results.

On Wed, Apr 2, 2025 at 2:48 AM Micah Kornfield <emkornfi...@gmail.com> wrote:

> Apologies, I've been delayed in drafting this, should have something by end of
> this week to share
>
> On Thursday, March 20, 2025, Micah Kornfield <emkornfi...@gmail.com>
> wrote:
>
> > Based on the in person sync, I took an action item to try to write a draft
> > doc so we can come to a clear consensus on how to decide on new
> > encodings/compression. I hope to have something to share next week but it
> > will likely need further input from the community.
> >
> > Thanks,
> > Micah
> >
> > On Thu, Mar 20, 2025 at 3:33 AM Antoine Pitrou <anto...@python.org> wrote:
> >
> >> On Tue, 18 Mar 2025 19:08:04 +0100
> >> Alkis Evlogimenos
> >> <alkis.evlogime...@databricks.com.INVALID>
> >> wrote:
> >> > At the end it boils down to which dataset you think is more representative
> >> > of the world data.
> >>
> >> This sentence does not even have a precise meaning. Data is plural,
> >> there is no "representative" dataset.
> >>
> >> If someone tells you that the average animal on Earth is 2 millimeters
> >> long, is that "representative" of the characteristics of mammals?
> >>
> >> In the end, the question is whether a new encoding brings enough
> >> benefits in *some* cases to justify including it in Parquet. You may
> >> care primarily about Databricks customers, but some people don't. This
> >> is not a Databricks project.
> >>
> >> Regards
> >>
> >> Antoine.