ebyhr commented on code in PR #16639:
URL: https://github.com/apache/iceberg/pull/16639#discussion_r3339653482


##########
parquet/src/test/java/org/apache/iceberg/parquet/TestParquet.java:
##########
@@ -273,6 +273,97 @@ public void testColumnStatisticsEnabled() throws Exception {
     }
   }
 
+  @Test
+  public void testColumnStatisticsDisabledMultipleColumns() throws Exception {
+    Schema schema =
+        new Schema(
+            optional(1, "int_field", IntegerType.get()),
+            optional(2, "string_field", Types.StringType.get()),
+            optional(3, "double_field", Types.DoubleType.get()));
+
+    File file = createTempFile(temp);
+
+    List<GenericData.Record> records = Lists.newArrayListWithCapacity(5);
+    org.apache.avro.Schema avroSchema = AvroSchemaUtil.convert(schema.asStruct());
+    for (int i = 1; i <= 5; i++) {
+      GenericData.Record record = new GenericData.Record(avroSchema);
+      record.put("int_field", i);
+      record.put("string_field", "test");
+      record.put("double_field", i * 1.5);
+      records.add(record);
+    }
+
+    write(
+        file,
+        schema,
+        ImmutableMap.<String, String>builder()
+            .put(PARQUET_COLUMN_STATS_ENABLED_PREFIX + "int_field", "true")
+            .put(PARQUET_COLUMN_STATS_ENABLED_PREFIX + "string_field", "false")
+            .put(PARQUET_COLUMN_STATS_ENABLED_PREFIX + "double_field", "false")
+            .buildOrThrow(),
+        ParquetAvroWriter::buildWriter,
+        records.toArray(new GenericData.Record[] {}));
+
+    InputFile inputFile = Files.localInput(file);
+
+    try (ParquetFileReader reader = ParquetFileReader.open(ParquetIO.file(inputFile))) {
+      for (BlockMetaData block : reader.getFooter().getBlocks()) {
+        for (ColumnChunkMetaData column : block.getColumns()) {
+          boolean emptyStats = column.getStatistics().isEmpty();
+          if (column.getPath().toDotString().equals("int_field")) {
+            assertThat(emptyStats).as("int_field has statistics").isEqualTo(false);
+          } else if (column.getPath().toDotString().equals("string_field")) {
+            assertThat(emptyStats).as("string_field has statistics disabled").isEqualTo(true);
+          } else if (column.getPath().toDotString().equals("double_field")) {
+            assertThat(emptyStats).as("double_field has statistics disabled").isEqualTo(true);
+          }
+        }
+      }
+    }

Review Comment:
   This logic is a bit hard to read. Could you consider introducing a helper method? For example:
   ```java
       try (ParquetFileReader reader = ParquetFileReader.open(ParquetIO.file(inputFile))) {
         for (BlockMetaData block : reader.getFooter().getBlocks()) {
           assertStats(block, "int_field", false);
           assertStats(block, "string_field", true);
           assertStats(block, "double_field", true);
         }
       }
   ...
     private static void assertStats(BlockMetaData rowGroup, String columnName, boolean empty) {
       ColumnPath columnPath = ColumnPath.fromDotString(columnName);
       ColumnChunkMetaData chunkMetaData =
           getOnlyElement(
               rowGroup.getColumns().stream()
                   .filter(column -> columnPath.equals(column.getPath()))
                   .toList());
       assertThat(chunkMetaData.getStatistics().isEmpty()).isEqualTo(empty);
     }
   ```
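
   For illustration only, here is a stand-alone sketch of the same "find exactly one column, then assert" pattern. It assumes hypothetical stand-in names (`StatsLookupSketch`, `Chunk`, `onlyChunk`) in place of the Parquet and Guava classes so that it runs with just the JDK. It is not the code proposed for the PR.

   ```java
   import java.util.List;
   import java.util.stream.Collectors;

   public class StatsLookupSketch {
     // Hypothetical stand-in for ColumnChunkMetaData: a column path plus an "empty statistics" flag.
     static final class Chunk {
       final String path;
       final boolean emptyStats;

       Chunk(String path, boolean emptyStats) {
         this.path = path;
         this.emptyStats = emptyStats;
       }
     }

     // Returns the single chunk whose path equals columnName.
     // Throws if there is no such chunk or more than one, so a missing column cannot pass silently.
     static Chunk onlyChunk(List<Chunk> chunks, String columnName) {
       List<Chunk> matches =
           chunks.stream().filter(c -> c.path.equals(columnName)).collect(Collectors.toList());
       if (matches.size() != 1) {
         throw new IllegalStateException(
             "Expected exactly one chunk for " + columnName + " but found " + matches.size());
       }
       return matches.get(0);
     }

     public static void main(String[] args) {
       // Assumed sample data: one column with statistics, one without.
       List<Chunk> rowGroup =
           List.of(new Chunk("int_field", false), new Chunk("string_field", true));

       // Each expectation is a single lookup followed by a single check.
       if (onlyChunk(rowGroup, "int_field").emptyStats) {
         throw new AssertionError("int_field should have statistics");
       }
       if (!onlyChunk(rowGroup, "string_field").emptyStats) {
         throw new AssertionError("string_field should have statistics disabled");
       }
     }
   }
   ```

   In this sketch, looking up a column that is absent from the row group throws, whereas the if/else chain in the diff would skip the assertion for that column.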


