FYI: there was an effort from Jan (cc'd) to introduce a total order for floating-point numbers on the Parquet side: [1][2].

[1] https://github.com/apache/parquet-format/pull/221
[2] https://github.com/apache/parquet-format/pull/196

On Thu, Feb 27, 2025 at 4:24 AM Devin Smith <devinsm...@deephaven.io.invalid> wrote:

> The spec https://iceberg.apache.org/spec/#sorting says
>
>> Sorting floating-point numbers should produce the following behavior: -NaN
>> < -Infinity < -value < -0 < 0 < value < Infinity < NaN. This aligns with
>> the implementation of Java floating-point types comparisons.
>
> As far as I know, this does not align with the implementation of Java
> floating-point types comparison as there is no concept of -NaN. There may
> be some more explicit total ordering regimes, such as IEEE 754-2019 -
> Standard for Floating-Point Arithmetic
> <https://ieeexplore.ieee.org/document/8766229> (or maybe, IEEE 754-2008),
> but it's unclear if that was the intention of the Iceberg spec. If the
> intention is to use this IEEE 754 total ordering, it probably makes sense
> to link to the specification along with the implications (regarding qNan,
> sNan, sign-bit on NaN, etc). If the intention is to use the Java ordering,
> it probably makes sense to remove the reference to -NaN and to link to
> the relevant javadoc.
>
> https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Double.html#equivalenceRelation
>
> https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Double.html#compareTo(java.lang.Double)
>
> https://en.wikipedia.org/wiki/IEEE_754#Total-ordering_predicate
>
> What is the correct interpretation?
>
> Thanks,
> -Devin
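
For anyone following along, here is a rough Java sketch of the difference between the two interpretations raised in the quoted message. It is only an illustration, not what Iceberg or Parquet implement: the class name, the NEGATIVE_NAN constant, and the totalOrderCompare helper are made up for this example, and the helper uses a common bit-flipping trick to approximate the IEEE 754 totalOrder predicate. It also assumes the JVM preserves the sign bit of a quiet NaN built from raw bits, which is typical but worth verifying on your platform.

```java
public class FloatOrderingSketch {
    // A NaN with the sign bit set, built from raw bits (Java has no literal for it).
    // Assumption: the platform preserves this quiet-NaN bit pattern.
    static final double NEGATIVE_NAN = Double.longBitsToDouble(0xfff8000000000000L);

    // Hypothetical helper: orders doubles by the sign and magnitude of their raw
    // bits, in the style of the IEEE 754 totalOrder predicate.
    static int totalOrderCompare(double a, double b) {
        long x = Double.doubleToRawLongBits(a);
        long y = Double.doubleToRawLongBits(b);
        // For negative values, flip every bit except the sign so that a larger
        // magnitude sorts lower; positive values are left unchanged.
        x ^= (x >> 63) >>> 1;
        y ^= (y >> 63) >>> 1;
        return Long.compare(x, y);
    }

    public static void main(String[] args) {
        // Double.compare canonicalizes NaN, so the sign bit on a NaN is ignored.
        System.out.println(Double.compare(NEGATIVE_NAN, Double.NaN));
        // Double.compare does distinguish the two zeros.
        System.out.println(Double.compare(-0.0, 0.0));
        // The bitwise order places the sign-bit NaN below -Infinity.
        System.out.println(totalOrderCompare(NEGATIVE_NAN, Double.NEGATIVE_INFINITY));
    }
}
```

Under Double.compare every NaN sorts as a single value above Infinity, so the "-NaN" end of the chain in the spec has no counterpart there; a bitwise total order is one way to obtain the full chain as written.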