aihuaxu commented on code in PR #14279:
URL: https://github.com/apache/iceberg/pull/14279#discussion_r2422249274


##########
parquet/src/main/java/org/apache/iceberg/parquet/ParquetMetricsRowGroupFilter.java:
##########
@@ -379,7 +380,8 @@ public <T> Boolean in(BoundReference<T> ref, Set<T> 
literalSet) {
       // When filtering nested types notNull() is implicit filter passed even 
though complex

Review Comment:
   nit: correct the comment notNull() => in()



##########
parquet/src/main/java/org/apache/iceberg/parquet/ParquetMetricsRowGroupFilter.java:
##########
@@ -329,7 +329,8 @@ public <T> Boolean eq(BoundReference<T> ref, Literal<T> 
lit) {
       // When filtering nested types notNull() is implicit filter passed even 
though complex
       // filters aren't pushed down in Parquet. Leave all nested column type 
filters to be
       // evaluated post scan.
-      if (schema.findType(id) instanceof Type.NestedType) {
+      Type type = schema.findType(id);
+      if (type instanceof Type.NestedType || type.isVariantType()) {

Review Comment:
   Thanks for details. That helps. 



##########
parquet/src/main/java/org/apache/iceberg/parquet/ParquetMetricsRowGroupFilter.java:
##########
@@ -329,7 +329,8 @@ public <T> Boolean eq(BoundReference<T> ref, Literal<T> 
lit) {
       // When filtering nested types notNull() is implicit filter passed even 
though complex

Review Comment:
   nit: can you help correct the comment here? notNull() => eq()



##########
data/src/test/java/org/apache/iceberg/data/TestMetricsRowGroupFilter.java:
##########
@@ -1048,19 +1041,68 @@ public void testVariantFieldAllNullsNotNull() throws 
IOException {
       records.add(record);
     }
 
-    File parquetFile = writeParquetFile("test-variant-nulls", VARIANT_SCHEMA, 
records);
-    InputFile inFile = Files.localInput(parquetFile);
+    boolean shouldRead = shouldReadVariant(notNull("variant_field"), records);
 
-    try (ParquetFileReader reader = 
ParquetFileReader.open(parquetInputFile(inFile))) {
-      BlockMetaData blockMetaData = reader.getRowGroups().get(0);
-      MessageType fileSchema = reader.getFileMetaData().getSchema();
-      ParquetMetricsRowGroupFilter rowGroupFilter =
-          new ParquetMetricsRowGroupFilter(VARIANT_SCHEMA, 
notNull("variant_field"), true);
+    assertThat(shouldRead)
+        .as("Should read: variant notNull filters must be evaluated post scan 
even for all nulls")
+        .isTrue();
+  }
 
-      assertThat(rowGroupFilter.shouldRead(fileSchema, blockMetaData))
-          .as("Should read: variant notNull filters must be evaluated post 
scan even for all nulls")
-          .isTrue();
-    }
+  @TestTemplate
+  public void testVariantFieldEq() throws IOException {

Review Comment:
   These tests are very similar. Probably consider parameterized tests or 
extract the common part into helper function - seems the only difference is the 
test function and error message? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to