+1

About the naming. We already use INT_8, INT_16 etc. for logical types for integer values. What do you think about FLOAT_16 to be consistent?

Cheers,
Gabor

On 2023/10/05 22:17:13 Ryan Blue wrote:
> +1
>
> I'm all for adding a 2-byte floating point representation since even 4-byte
> floats are quite expensive to store.
>
> On Thu, Oct 5, 2023 at 1:43 PM Xinli shang <[email protected]> wrote:
>
> > +1
> >
> > On Thu, Oct 5, 2023 at 1:32 PM Antoine Pitrou <[email protected]> wrote:
> >
> > > Hello,
> > >
> > > +1 from me (non-binding).
> > >
> > > Regards
> > >
> > > Antoine.
> > >
> > >
> > > On Wed, 4 Oct 2023 16:14:00 -0400
> > > Ben Harkins <[email protected]>
> > > wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > I would like to propose adding a half-precision floating point type to
> > > > the Parquet format specification, in accordance with the active
> > > > proposal here:
> > > >
> > > >    - https://github.com/apache/parquet-format/pull/184
> > > >
> > > > To summarize, the current proposal would introduce a Float16 logical
> > > > type, represented by a little-endian 2-byte FixedLenByteArray. The
> > > > value's encoding would adhere to the IEEE-754 standard [1].
> > > > Furthermore, implementations should ensure that any value comparisons
> > > > and ordering requirements (mainly for column statistics) emulate the
> > > > behavior of native (i.e. physical) floating point types.
> > > >
> > > > As for how this would look in practice, there are currently several
> > > > implementations of this proposal that are more or less complete:
> > > >
> > > >    - C++ (and Python): https://github.com/apache/arrow/pull/36073
> > > >    - Java: https://github.com/apache/parquet-mr/pull/1142
> > > >    - Go: https://github.com/apache/arrow/pull/37599
> > > >
> > > > Of course, we're prepared to make adjustments to the implementations as
> > > > needed, since the format additions will need to be approved before those
> > > > PRs are merged. I should also note that naming conventions haven't been
> > > > extensively discussed, so feel free to chime in if you have a strong
> > > > preference for HALF or HALF_FLOAT over FLOAT16!
> > > >
> > > > This vote will be open for at least 72 hours.
> > > >
> > > > [ ] +1 Add this type to the format specification
> > > > [ ] +0
> > > > [ ] -1 Do not add this type to the format specification because...
> > > >
> > > > Thanks!
> > > >
> > > > Ben
> > > >
> > > > [1]: https://en.wikipedia.org/wiki/Half-precision_floating-point_format
> > > >
> >
> > --
> > Xinli Shang
> >
>
> --
> Ryan Blue
> Tabular
>
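
For anyone following the thread who wants to see what the quoted proposal means at the byte level, here is a minimal Python sketch. It assumes only what the summary states: an IEEE-754 half-precision value stored as 2 little-endian bytes, with comparisons following the numeric value. The helper names (encode_float16, decode_float16, float16_less_than) are hypothetical and do not come from any of the linked PRs, and the sketch does not attempt to cover NaN or signed-zero handling for statistics.

```python
import struct


def encode_float16(value: float) -> bytes:
    """Pack a Python float into 2 little-endian bytes (IEEE-754 binary16).

    Hypothetical helper for illustration; '<e' is the struct format
    code for a little-endian half-precision float.
    """
    return struct.pack("<e", value)


def decode_float16(raw: bytes) -> float:
    """Unpack 2 little-endian bytes back into a Python float."""
    if len(raw) != 2:
        raise ValueError("a Float16 value must be exactly 2 bytes")
    return struct.unpack("<e", raw)[0]


def float16_less_than(a: bytes, b: bytes) -> bool:
    """Compare two encoded values by numeric value, not by raw bytes.

    Assumption: this only illustrates why a byte-wise comparison of the
    fixed-length arrays is not enough; it ignores NaN and -0.0/+0.0.
    """
    return decode_float16(a) < decode_float16(b)


if __name__ == "__main__":
    one = encode_float16(1.0)
    minus_two = encode_float16(-2.0)
    print(one.hex(), minus_two.hex())
    # Byte-wise, minus_two sorts after one, but numerically it is smaller.
    print(minus_two > one, float16_less_than(minus_two, one))
```

The last two lines show why the ordering requirement matters: comparing the raw 2-byte arrays lexicographically puts -2.0 after 1.0, so min/max statistics have to be computed on the decoded values.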
