rdblue commented on pull request #2891: URL: https://github.com/apache/iceberg/pull/2891#issuecomment-890592412
@electrum, I was leaving this vague. My inclination is not to require canonicalization. It's more work and there may be legitimate uses of signalling NaNs. Plus, I just looked into signalling NaN values: their encodings are evidently not specified by IEEE 754 and differ between processors:

> Encodings of qNaN and sNaN are not specified in IEEE 754 and implemented differently on different processors. ([wikipedia](https://en.wikipedia.org/wiki/Single-precision_floating-point_format#Single-precision_examples))

@findepi, in IEEE 754, the sign bit is independent of whether a value is NaN (a value is NaN when the exponent bits are all set and the significand is non-zero). That means there are legitimate -NaN values, and it would be reasonable for another language to sort them as negative. I think there is no need to change the handling of -NaN values in the spec. It's also reasonable for Java not to distinguish between -NaN and NaN values and to produce only positive NaNs.

The only update I would make is to ensure that the Java implementation writes only positive NaN values, to align with its sort. In most cases, no changes are needed because we track NaN values through counts after identifying them with `Float.isNaN` or `Double.isNaN`. The only cases that would need to be updated are when engines write -NaN values into files *and* are sorting them. Those engines would be responsible for either converting -NaN to NaN or implementing the sort order according to the spec rather than using the default Java comparison.
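To illustrate the "convert -NaN to NaN" option, here is a minimal sketch of what such a conversion could look like on the writer side. The class `NaNCanonicalizer` and its methods are hypothetical names made up for this example and are not part of the Iceberg codebase; only `Float.isNaN`, `Double.isNaN`, `Float.floatToRawIntBits`, `Float.intBitsToFloat`, and `Float.compare` are real JDK calls. The bit pattern `0xFFC00000` is assumed here as one example of a NaN with the sign bit set.

```java
public final class NaNCanonicalizer {
  private NaNCanonicalizer() {}

  // Hypothetical helper: maps any NaN (sign bit set or not, any payload)
  // to the positive NaN constant Float.NaN; other values pass through unchanged.
  public static float canonicalize(float value) {
    return Float.isNaN(value) ? Float.NaN : value;
  }

  // Same idea for doubles.
  public static double canonicalize(double value) {
    return Double.isNaN(value) ? Double.NaN : value;
  }

  // True when the value is a NaN whose IEEE 754 sign bit is set.
  public static boolean isNegativeNaN(float value) {
    return Float.isNaN(value) && (Float.floatToRawIntBits(value) & 0x80000000) != 0;
  }

  public static void main(String[] args) {
    // Exponent bits all set, non-zero significand, sign bit set.
    float negNaN = Float.intBitsToFloat(0xFFC00000);

    System.out.println("isNaN: " + Float.isNaN(negNaN));
    System.out.println("sign bit set: " + isNegativeNaN(negNaN));

    // The default Java comparison does not distinguish -NaN from NaN
    // and orders every NaN above positive infinity.
    System.out.println("compare(-NaN, NaN): " + Float.compare(negNaN, Float.NaN));
    System.out.println("compare(-NaN, +Inf): " + Float.compare(negNaN, Float.POSITIVE_INFINITY));

    // After conversion, the value carries the bits of Float.NaN.
    float fixed = canonicalize(negNaN);
    System.out.println("sign bit set after conversion: " + isNegativeNaN(fixed));
  }
}
```

An engine that both writes -NaN values and sorts them could apply a conversion like this before writing, so that the stored values agree with the order the default Java comparison produces. The alternative is a custom comparator that inspects the sign bit, which this sketch does not attempt.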
