benibus commented on code in PR #1142:
URL: https://github.com/apache/parquet-mr/pull/1142#discussion_r1330570985


##########
parquet-hadoop/src/test/java/org/apache/parquet/format/converter/TestParquetMetadataConverter.java:
##########
@@ -990,6 +990,30 @@ private void testUseStatsWithSignedSortOrder(StatsHelper 
helper) {
     }
   }
 
+  @Test
+  public void testFloat16Stats() {
+    BinaryStatistics bStats = new BinaryStatistics();

Review Comment:
   I think we may need to define an independent `Float16Statistics` class since 
there's some special handling required for floating-point min/max values. For 
instance: 
https://github.com/apache/parquet-mr/blob/45a1ae2d3b4078d3b9e83919cacb03050c1b8c88/parquet-column/src/main/java/org/apache/parquet/column/statistics/Statistics.java#L83-L107
   
   The situation is a bit unusual, since `Statistics` classes are typically 
exclusive to physical types. However, I believe 
[`createStats`](https://github.com/apache/parquet-mr/blob/45a1ae2d3b4078d3b9e83919cacb03050c1b8c88/parquet-column/src/main/java/org/apache/parquet/column/statistics/Statistics.java#L194)
 is capable of constructing one based on the logical type without having to 
change the existing API. Ideally, `Float16Statistics` would simply extend 
`BinaryStatistics` (assuming there aren't any roadblocks to doing that)



##########
parquet-common/src/test/java/org/apache/parquet/util/TestFloat16.java:
##########
@@ -0,0 +1,89 @@
+/*
+ *  Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing,
+ *  software distributed under the License is distributed on an
+ *  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ *  KIND, either express or implied.  See the License for the
+ *  specific language governing permissions and limitations
+ *  under the License.
+ */
+
+package org.apache.parquet.util;
+
+import org.junit.Test;
+
+import static org.junit.Assert.assertEquals;
+import static org.apache.parquet.util.Float16.*;
+
+public class TestFloat16
+{
+  @Test
+  public void testFloat16ToFloat() {
+    // Zeroes, NaN and infinities
+    assertEquals(0.0f, toFloat(toFloat16(0.0f)), 0.0f);
+    assertEquals(-0.0f, toFloat(toFloat16(-0.0f)), 0.0f);
+    assertEquals(Float.NaN, toFloat(toFloat16(Float.NaN)), 0.0f);
+    assertEquals(Float.POSITIVE_INFINITY, 
toFloat(toFloat16(Float.POSITIVE_INFINITY)), 0.0f);
+    assertEquals(Float.NEGATIVE_INFINITY, 
toFloat(toFloat16(Float.NEGATIVE_INFINITY)), 0.0f);
+    // Known values
+    assertEquals(1.0009765625f, toFloat(toFloat16(1.0009765625f)), 0.0f);
+    assertEquals(-2.0f, toFloat(toFloat16(-2.0f)), 0.0f);
+    assertEquals(6.1035156e-5f, toFloat(toFloat16(6.10352e-5f)), 0.0f); // 
Inexact
+    assertEquals(65504.0f, toFloat(toFloat16(65504.0f)), 0.0f);
+    assertEquals(0.33325195f, toFloat(toFloat16(1.0f / 3.0f)), 0.0f); // 
Inexact
+    // Denormals (flushed to +/-0)
+    assertEquals(6.097555e-5f, toFloat(toFloat16(6.09756e-5f)), 0.0f);
+    assertEquals(5.9604645e-8f, toFloat(toFloat16(5.96046e-8f)), 0.0f);
+    assertEquals(-6.097555e-5f, toFloat(toFloat16(-6.09756e-5f)), 0.0f);
+    assertEquals(-5.9604645e-8f, toFloat(toFloat16(-5.96046e-8f)), 0.0f);
+  }
+
+  @Test
+  public void testFloatToFloat16() {
+    // Zeroes, NaN and infinities
+    assertEquals(POSITIVE_ZERO, toFloat16(0.0f));
+    assertEquals(NEGATIVE_ZERO, toFloat16(-0.0f));
+    assertEquals(NaN, toFloat16(Float.NaN));

Review Comment:
   To future-proof things, we should probably test at least one other NaN 
representation - e.g where the payload is exclusively in the lower mantissa 
bit(s).



##########
parquet-column/src/main/java/org/apache/parquet/schema/PrimitiveComparator.java:
##########
@@ -276,4 +279,24 @@ public String toString() {
       return "BINARY_AS_SIGNED_INTEGER_COMPARATOR";
     }
   };
+
+  /**
+   * This comparator is for comparing two float16 values represented in 2 
bytes binary.
+   */
+  static final PrimitiveComparator<Binary> BINARY_AS_FLOAT16_COMPARATOR = new 
BinaryComparator() {

Review Comment:
   Impl looks good, but we should add a test for the comparator 
[here](https://github.com/apache/parquet-mr/blob/45a1ae2d3b4078d3b9e83919cacb03050c1b8c88/parquet-column/src/test/java/org/apache/parquet/schema/TestPrimitiveComparator.java)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@parquet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to