Thank you folks, sounds exciting!

I don't see an invite for the sync. Is it happening today?

On Wed, Jul 31, 2024 at 3:12 AM Julien Le Dem <jul...@apache.org> wrote:

> It sounds like everybody is happy with the proposal.
> Tomorrow is the Parquet sync, we can finalize then.
>
> On Wed, Jul 24, 2024 at 9:20 AM Julien Le Dem <jul...@apache.org> wrote:
>
> > Hi Alkis,
> > I saw you addressed and resolved the comments in the doc. Thank you.
> > This looks good to me.
> > I would recommend others that have been active in this conversation to
> > take a final look.
> > Best
> > Julien
> >
> > On Tue, Jul 23, 2024 at 3:06 PM Julien Le Dem <jul...@apache.org> wrote:
> >
> >> I am also OK with the proposed solution in the document.
> >> However I think the doc itself needs one last wording change.
> >> I have left more details in comments but here is the gist:
> >> This effort is driven by a group of people in the community and not one
> >> vendor in particular even if said people do sometimes work for vendors.
> >> To reflect this, instead of saying the UUID identifies a Vendor, we
> >> should describe it as an extension ID.
> >> Then I'd remove all instances of the word "Vendor" and instead
> >> refer to "Extensions" identified by this UUID.
> >> This might not change anything in the implementation, but it is important
> >> for the document to reflect how the community works.
> >>
> >> Specifically:
> >>
> >> "Vendor introduces a Flatbuffers variant of FileMetaData." => "This
> >> extension introduces a Flatbuffers variant of FileMetaData..."
> >>
> >> "The UUID is picked by the Vendor once and used throughout the
> >> experiments." => "The UUID is picked for this specific extension and
> >> used throughout the experiments."
> >>
> >> "At some point Vendor decides that this is amazing and should be shared
> >> with the world at large to advance Parquet. " => "At some point, the
> >> community decides this extension is ready and proposed for inclusion."
> >>
> >>
> >> On Mon, Jul 22, 2024 at 10:11 PM Micah Kornfield <emkornfi...@gmail.com>
> >> wrote:
> >>
> >>> Hi Alkis,
> >>> Thanks for the revision.  I'm OK with this as is, we can maybe wait a few
> >>> more days to see if anybody else has comments and then discuss
> >>> implementation of the extension mechanism?
> >>>
> >>> Cheers,
> >>> Micah
> >>>
> >>> On Thu, Jul 18, 2024 at 10:22 PM Alkis Evlogimenos
> >>> <alkis.evlogime...@databricks.com.invalid> wrote:
> >>>
> >>> > After Jul 17th's Parquet Sync feedback I have updated the extensions
> >>> > proposal to remove the "reservation" mechanism. The updates are already
> >>> > reflected in the document
> >>> > <https://docs.google.com/document/d/1KkoR0DjzYnLQXO-d0oRBv2k157IZU0_injqd4eV4WiI/edit>
> >>> > and the PR <https://github.com/apache/parquet-format/pull/254>.
> >>> >
> >>> > On Fri, Jun 28, 2024 at 10:02 AM Alkis Evlogimenos <
> >>> > alkis.evlogime...@databricks.com> wrote:
> >>> >
> >>> > > > I think we can at least have wording to encourage people doing
> >>> > > > extensions to post them publicly and as part of the "reservation"
> >>> > > > mechanism post a link to the repo that they are being developed in,
> >>> > > > if anyone is curious.
> >>> > >
> >>> > > Good point. I will try to come up with something in the PR - unless you
> >>> > > beat me to it :)
> >>> > >
> >>> > > On Fri, Jun 28, 2024 at 7:15 AM Micah Kornfield <emkornfi...@gmail.com>
> >>> > > wrote:
> >>> > >
> >>> > >> >
> >>> > >> > 1. experimentation/prototyping is more often than not faster to
> >>> > >> > iterate if it is closed. Allowing this model of development was a
> >>> > >> > primary goal of the design.
> >>> > >>
> >>> > >>
> >>> > >> I agree there are advantages here.  I think a large amount of speed
> >>> > >> comes from not having to gain consensus in the community.
> >>> > >>
> >>> > >> At the end of the day, I don't think there is any mechanism here to
> >>> > >> ensure everybody works in public, but I think we can at least have
> >>> > >> wording to encourage people doing extensions to post them publicly and
> >>> > >> as part of the "reservation" mechanism post a link to the repo that
> >>> > >> they are being developed in, if anyone is curious.  I think this would
> >>> > >> be particularly useful if there really is an intent for a number of
> >>> > >> organizations to experiment with new footer designs (but possibly also
> >>> > >> in others).
> >>> > >>
> >>> > >> Thanks,
> >>> > >> Micah
> >>> > >>
> >>> > >>
> >>> > >>
> >>> > >>
> >>> > >> On Wed, Jun 26, 2024 at 9:33 AM Alkis Evlogimenos
> >>> > >> <alkis.evlogime...@databricks.com.invalid> wrote:
> >>> > >>
> >>> > >> > Thank you for taking a look Micah.
> >>> > >> >
> >>> > >> > On the topic of openness there are various aspects that we have
> >>> > >> > considered.
> >>> > >> > 1. experimentation/prototyping is more often than not faster to
> >>> > >> > iterate if it is closed. Allowing this model of development was a
> >>> > >> > primary goal of the design.
> >>> > >> > 2. when the design is final, keeping the design closed should have
> >>> > >> > some drawbacks. Duplicating content to support old readers puts some
> >>> > >> > natural incentive to make extensions official because at that point
> >>> > >> > one can drop the fat from the files and move on. Another aspect of
> >>> > >> > the design is the choice of a single extension field-id which makes
> >>> > >> > the extension space tiny. This in turn means that it is difficult to
> >>> > >> > interop with others without breaking their extensions. Ergo the
> >>> > >> > easiest path to any interop is to open the extension.
> >>> > >> >
> >>> > >> > The above, while not enforcing work to happen in the open, strike
> >>> > >> > some balance in between.
> >>> > >> >
> >>> > >> > I am open to suggestions on how to further incentivize opening
> >>> > >> > extensions.
> >>> > >> >
> >>> > >> > On Wed, Jun 26, 2024 at 6:04 PM Micah Kornfield <emkornfi...@gmail.com>
> >>> > >> > wrote:
> >>> > >> >
> >>> > >> > > Hi Alkis,
> >>> > >> > > I'm generally in favor of this, my main concern/question is trying
> >>> > >> > > to encourage work to be in the open.  I don't think in the long run
> >>> > >> > > it is good for users to always have proprietary extensions inside of
> >>> > >> > > Parquet.
> >>> > >> > >
> >>> > >> > > IMO, I think the next steps would be to add implementations to write
> >>> > >> > > out the footer extension points.
> >>> > >> > >
> >>> > >> > > Thanks,
> >>> > >> > > Micah
> >>> > >> > >
> >>> > >> > > On Mon, Jun 24, 2024 at 1:24 PM Alkis Evlogimenos
> >>> > >> > > <alkis.evlogime...@databricks.com.invalid> wrote:
> >>> > >> > >
> >>> > >> > > > The snafus are fixed. The original should work now.
> >>> > >> > > >
> >>> > >> > > > On Sun, 23 Jun 2024, 17:58 Alkis Evlogimenos, <
> >>> > >> > > > alkis.evlogime...@databricks.com> wrote:
> >>> > >> > > >
> >>> > >> > > > > Due to some sharing snafus with automation, please request
> >>> > >> > > > > access to comment. If you are just reading I've published this here:
> >>> > >> > > > > https://docs.google.com/document/d/e/2PACX-1vThXkhHNozn_p1ZZWF-nCzOtoP1lKmkaV4Legq2FaRiIgwyY2XC9AmKpBtpeF8jbBB4wfjmQ6UTg03k/pub
> >>> > >> > > > >
> >>> > >> > > > > On Fri, Jun 21, 2024 at 10:29 AM Alkis Evlogimenos <
> >>> > >> > > > > alkis.evlogime...@databricks.com> wrote:
> >>> > >> > > > >
> >>> > >> > > > >> Hey folks.
> >>> > >> > > > >>
> >>> > >> > > > >> I want to move the extension PR
> >>> > >> > > > >> <https://github.com/apache/parquet-format/pull/254> forward.
> >>> > >> > > > >> Unfortunately the discussion was spread across the PR, other
> >>> > >> > > > >> threads and documents making it slow to progress. To avoid further
> >>> > >> > > > >> fragmentation I have put together a document
> >>> > >> > > > >> <https://docs.google.com/document/d/1KkoR0DjzYnLQXO-d0oRBv2k157IZU0_injqd4eV4WiI/edit>
> >>> > >> > > > >> discussing the extensions mechanism in isolation. I believe the
> >>> > >> > > > >> document addresses all the concerns/comments from the PR and mailing
> >>> > >> > > > >> list discussions brought forward so far.
> >>> > >> > > > >>
> >>> > >> > > > >> I propose we continue the discussion in the document and once
> >>> > >> > > > >> everything is addressed, we finalize the PR.
> >>> > >> > > > >>
> >>> > >> > > > >> Thank you,
> >>> > >> > > > >>