aihuaxu commented on code in PR #14261:
URL: https://github.com/apache/iceberg/pull/14261#discussion_r2414468949
##########
parquet/src/test/java/org/apache/iceberg/parquet/TestPruneColumns.java:
##########
@@ -270,4 +273,64 @@ public void testStructElementName() {
MessageType actual = ParquetSchemaUtil.pruneColumns(fileSchema,
projection);
assertThat(actual).as("Pruned schema should be
matched").isEqualTo(expected);
}
+
+ @Test
+ public void testVariant() {
Review Comment:
Actually in my separate PR I have a roundtrip test and notice this issue.
Huaxin or I can work on the test.
##########
parquet/src/main/java/org/apache/iceberg/parquet/TypeWithSchemaVisitor.java:
##########
@@ -64,7 +64,7 @@ public static <T> T visit(
} else if (annotation instanceof
LogicalTypeAnnotation.VariantLogicalTypeAnnotation
|| (iType != null && iType.isVariantType())) {
// when Parquet has a VARIANT logical type, use it here
- return visitVariant(iType.asVariantType(), group, visitor);
+ return visitVariant(iType != null ? iType.asVariantType() : null,
group, visitor);
Review Comment:
You are right. It's general for any visitors and we would hit NPE without
Iceberg type, but more for pruning.
##########
parquet/src/test/java/org/apache/iceberg/parquet/TestPruneColumns.java:
##########
@@ -270,4 +274,45 @@ public void testStructElementName() {
MessageType actual = ParquetSchemaUtil.pruneColumns(fileSchema,
projection);
assertThat(actual).as("Pruned schema should be
matched").isEqualTo(expected);
}
+
+ @Test
+ public void testVariant() {
+ MessageType fileSchema =
+ Types.buildMessage()
+ .addField(
+ Types.primitive(PrimitiveTypeName.INT32,
Type.Repetition.REQUIRED)
+ .id(1)
+ .named("id"))
+ .addField(buildVariantType(2, "variant_1"))
+ .addField(buildVariantType(3, "variant_2"))
+ .named("table");
+
+ Schema projection =
+ new Schema(
+ ImmutableList.of(
+ NestedField.required(1, "id", IntegerType.get()),
+ NestedField.required(2, "variant_1", VariantType.get())));
+ MessageType expected =
+ Types.buildMessage()
+ .addField(
+ Types.primitive(PrimitiveTypeName.INT32,
Type.Repetition.REQUIRED)
+ .id(1)
+ .named("id"))
+ .addField(buildVariantType(2, "variant_1"))
+ .named("table");
+
+ MessageType actual = ParquetSchemaUtil.pruneColumns(fileSchema,
projection);
+ assertThat(actual).as("Pruned schema should be
matched").isEqualTo(expected);
+ }
+
+ private static Type buildVariantType(int id, String name) {
Review Comment:
I separate into a helper method here.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]