Re: [Parquet] ALP Encoding for Floating point data

PRATEEK GAUR Wed, 25 Feb 2026 09:50:18 -0800

@Micah Kornfield <[email protected]> : Got it.

@Andrew Lamb <[email protected]>



> Do you think it would be good to start moving the spec development into
> markdown format, in preparation for finalizing it?
>

Yes, I'll update the numbers for some of the examples I have in the spec
based on the updated header size. Then we should be good to go for the
markdown format.

Thanks everyone!


>
> Andrew
>
> On Tue, Feb 17, 2026 at 7:28 PM PRATEEK GAUR <[email protected]> wrote:
>
> > Hi team,
> >
> > 1) Andrew
> >
> >    - Thanks for working on test files. My PR did add all the test files I
> >    used to benchmark on datasets. Maybe we can combine them. This will
> >    also aid cross-language testing.
> >    - Kosta Tarasov is working on the Rust implementation. This is great.
> >    Thanks!
> >
> >
> > 2) Antoine
> >
> >    - Thanks a lot for reporting the numbers on AMD. Looks like you are
> >    getting 8X the decoding performance of BSS. This is amazing!
> >    - Thanks for acknowledging the sampling design.
> >    - I agree with you on FastLanes. In some crude experiments I didn't
> >    get a good perf benefit from it on Graviton3 (but maybe there was
> >    something wrong with my implementation).
> >    - Locking the 16-bit exception encoding for the spec in this case.
> >    - Awesome, I think we have resolved all open questions except the
> >    version byte :) (I will get back on this soon).
> >
> >
> > 3) Micah
> >
> >    - FastLanes: The current spec does allow for using FastLanes via the
> >    configurable enum value for layout. We should be able to plug in any
> >    layout in the current design.
> >
> >
> > Working on resolving all remaining open comments on the spec this week.
> >
> > Best
> > Prateek
> >
> >
> > On Tue, Feb 10, 2026 at 3:37 AM Steve Loughran <[email protected]>
> > wrote:
> >
> > > On Sun, 8 Feb 2026 at 18:12, Micah Kornfield <[email protected]>
> > > wrote:
> > >
> > > >
> > > >
> > > > It looks like the actual issue described for ORC in the paper is that
> > > > it has multiple sub-encodings in a batch.  This is different from the
> > > > design proposed here, where there is still a fixed encoding per page
> > > > in Parquet.  Given reasonably sized pages, I don't think branch
> > > > misprediction should be a big issue for new encodings.  I agree that
> > > > we should be conservative in general about adding new encodings.
> > > >
> > > >
> > > +1
> > >
> >
>
