Re: [VOTE][Format] Add Float16 type to specification

Ben Harkins Wed, 25 Oct 2023 08:52:34 -0700

For anyone who didn't see the [RESULT] thread [1], this vote has passed.

[1] https://lists.apache.org/thread/odm5pmxssyd9kw1wvgdkg8hd044czqnk


On Tue, Oct 10, 2023 at 7:01 AM Uwe L. Korn <uw...@xhochy.com> wrote:

> +1 (binding)
>
> On Sat, Oct 7, 2023, at 5:49 AM, Daniel Weeks wrote:
> > +1
> >
> > On Fri, Oct 6, 2023, 8:33 PM Gang Wu <ust...@gmail.com> wrote:
> >
> >> +1 (non-binding)
> >>
> >> Best,
> >> Gang
> >>
> >> On Sat, Oct 7, 2023 at 11:05 AM Micah Kornfield <emkornfi...@gmail.com> wrote:
> >>
> >> > I'm +1 (non-binding) for the proposal in general.
> >> >
> >> > I do have a concern that we should be implementing
> >> > https://issues.apache.org/jira/browse/PARQUET-2182 (ignoring stats for
> >> > logical types the reader doesn't understand) and its equivalent in other
> >> > libraries first, but given potential low usage we can possibly do that
> >> > as a follow-up.
> >> >
> >> >
> >> >
> >> >
> >> > On Fri, Oct 6, 2023 at 12:50 AM Gábor Szádovszky <ga...@apache.org> wrote:
> >> >
> >> > > +1
> >> > >
> >> > > About the naming. We already use INT_8, INT_16 etc. for logical types for
> >> > > integer values. What do you think about FLOAT_16 to be consistent?
> >> > >
> >> > > Cheers,
> >> > > Gabor
> >> > >
> >> > > On 2023/10/05 22:17:13 Ryan Blue wrote:
> >> > > > +1
> >> > > >
> >> > > > I'm all for adding a 2-byte floating point representation since even
> >> > > > 4-byte floats are quite expensive to store.
> >> > > >
> >> > > > On Thu, Oct 5, 2023 at 1:43 PM Xinli shang <sha...@uber.com.invalid> wrote:
> >> > > >
> >> > > > > +1
> >> > > > >
> >> > > > > On Thu, Oct 5, 2023 at 1:32 PM Antoine Pitrou <anto...@python.org> wrote:
> >> > > > >
> >> > > > > >
> >> > > > > > Hello,
> >> > > > > >
> >> > > > > > +1 from me (non-binding).
> >> > > > > >
> >> > > > > > Regards
> >> > > > > >
> >> > > > > > Antoine.
> >> > > > > >
> >> > > > > >
> >> > > > > > On Wed, 4 Oct 2023 16:14:00 -0400
> >> > > > > > Ben Harkins <b...@voltrondata.com.INVALID>
> >> > > > > > wrote:
> >> > > > > >
> >> > > > > > > Hi everyone,
> >> > > > > > >
> >> > > > > > > I would like to propose adding a half-precision floating point
> >> > > > > > > type to the Parquet format specification, in accordance with the
> >> > > > > > > active proposal here:
> >> > > > > > >
> >> > > > > > >
> >> > > > > > >    - https://github.com/apache/parquet-format/pull/184
> >> > > > > > >
> >> > > > > > > To summarize, the current proposal would introduce a Float16
> >> > > > > > > logical type, represented by a little-endian 2-byte
> >> > > > > > > FixedLenByteArray. The value's encoding would adhere to the
> >> > > > > > > IEEE-754 standard [1]. Furthermore, implementations should ensure
> >> > > > > > > that any value comparisons and ordering requirements (mainly for
> >> > > > > > > column statistics) emulate the behavior of native (i.e. physical)
> >> > > > > > > floating point types.
> >> > > > > > >
> >> > > > > > > As for how this would look in practice, there are currently
> >> > > > > > > several implementations of this proposal that are more or less
> >> > > > > > > complete:
> >> > > > > > >
> >> > > > > > >
> >> > > > > > >    - C++ (and Python): https://github.com/apache/arrow/pull/36073
> >> > > > > > >    - Java: https://github.com/apache/parquet-mr/pull/1142
> >> > > > > > >    - Go: https://github.com/apache/arrow/pull/37599
> >> > > > > > >
> >> > > > > > > Of course, we're prepared to make adjustments to the
> >> > > > > > > implementations as needed, since the format additions will need
> >> > > > > > > to be approved before those PRs are merged. I should also note
> >> > > > > > > that naming conventions haven't been extensively discussed, so
> >> > > > > > > feel free to chime in if you have a strong preference for HALF
> >> > > > > > > or HALF_FLOAT over FLOAT16!
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > This vote will be open for at least 72 hours.
> >> > > > > > >
> >> > > > > > > [ ] +1 Add this type to the format specification
> >> > > > > > > [ ] +0
> >> > > > > > > [ ] -1 Do not add this type to the format specification because...
> >> > > > > > >
> >> > > > > > > Thanks!
> >> > > > > > >
> >> > > > > > > Ben
> >> > > > > > >
> >> > > > > > > [1]: https://en.wikipedia.org/wiki/Half-precision_floating-point_format
> >> > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > >
> >> > > > > --
> >> > > > > Xinli Shang
> >> > > > >
> >> > > >
> >> > > >
> >> > > > --
> >> > > > Ryan Blue
> >> > > > Tabular
> >> > > >
> >> > >
> >> >
> >>
>
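
For readers who want to see the encoding from the quoted proposal concretely,
here is a minimal sketch in Python. It is only an illustration under stated
assumptions: it uses the standard library's struct format code "e" (IEEE-754
binary16) as a stand-in for what a Parquet writer would do, and the helper
names encode_float16 and decode_float16 are hypothetical, not taken from any of
the linked PRs. It also shows why comparisons need to emulate floating point
behavior: sorting the raw 2-byte values does not give numeric order.

```python
import struct


def encode_float16(value: float) -> bytes:
    """Pack a float into 2 little-endian bytes (IEEE-754 binary16).

    Hypothetical helper for illustration only.
    """
    return struct.pack("<e", value)


def decode_float16(raw: bytes) -> float:
    """Unpack 2 little-endian bytes back into a Python float."""
    if len(raw) != 2:
        raise ValueError("a Float16 value must be exactly 2 bytes")
    return struct.unpack("<e", raw)[0]


# Three sample values stored as 2-byte arrays.
samples = [encode_float16(v) for v in (1.0, -2.0, 0.5)]

# Ordering by raw bytes puts the negative value last, which is wrong
# for min/max statistics.
by_bytes = [decode_float16(r) for r in sorted(samples)]

# Ordering by decoded value gives the numeric order a reader expects.
by_value = sorted(decode_float16(r) for r in samples)

print(by_bytes)
print(by_value)
```

The sketch leaves out NaN and signed-zero handling, which any real statistics
code would have to decide on.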
