rdblue commented on code in PR #5704:
URL: https://github.com/apache/iceberg/pull/5704#discussion_r985305158


##########
core/src/main/java/org/apache/iceberg/avro/PruneColumns.java:
##########
@@ -345,4 +345,27 @@ private static boolean 
isOptionSchemaWithNonNullFirstOption(Schema schema) {
     return AvroSchemaUtil.isOptionSchema(schema)
         && schema.getTypes().get(0).getType() != Schema.Type.NULL;
   }
+
+  // for primitive types, the visitResult will be null, we want to reuse the 
primitive types from
+  // the original
+  // schema, while for nested types, we want to use the visitResult because 
they have content from
+  // the previous
+  // recursive calls.

Review Comment:
   Can you fix line wrapping? Looks like this was auto-formatted.



##########
core/src/main/java/org/apache/iceberg/avro/PruneColumns.java:
##########
@@ -345,4 +345,27 @@ private static boolean 
isOptionSchemaWithNonNullFirstOption(Schema schema) {
     return AvroSchemaUtil.isOptionSchema(schema)
         && schema.getTypes().get(0).getType() != Schema.Type.NULL;
   }
+
+  // for primitive types, the visitResult will be null, we want to reuse the 
primitive types from
+  // the original
+  // schema, while for nested types, we want to use the visitResult because 
they have content from
+  // the previous
+  // recursive calls.
+  private static Schema copyUnion(Schema record, List<Schema> visitResults) {

Review Comment:
   Is there a better name for this? Maybe `pruneComplexUnion`?



##########
core/src/main/java/org/apache/iceberg/avro/PruneColumns.java:
##########
@@ -345,4 +345,27 @@ private static boolean 
isOptionSchemaWithNonNullFirstOption(Schema schema) {
     return AvroSchemaUtil.isOptionSchema(schema)
         && schema.getTypes().get(0).getType() != Schema.Type.NULL;
   }
+
+  // for primitive types, the visitResult will be null, we want to reuse the 
primitive types from
+  // the original
+  // schema, while for nested types, we want to use the visitResult because 
they have content from
+  // the previous
+  // recursive calls.
+  private static Schema copyUnion(Schema record, List<Schema> visitResults) {
+    List<Schema> branches = 
Lists.newArrayListWithExpectedSize(visitResults.size());
+    for (int i = 0; i < visitResults.size(); i++) {
+      if (visitResults.get(i) == null) {
+        branches.add(record.getTypes().get(i));

Review Comment:
   It looks like `record` is actually a `union` and not a record.



##########
core/src/main/java/org/apache/iceberg/avro/SchemaToType.java:
##########
@@ -38,6 +43,11 @@ class SchemaToType extends AvroSchemaVisitor<Type> {
     }
   }
 
+  SchemaToType(Schema root, boolean deriveNameMapping) {
+    this(root);
+    this.deriveNameMapping = deriveNameMapping;

Review Comment:
   I don't think this PR should build a name mapping. That can be added in a 
later PR, and it should not use a custom Avro property.
   
   Where possible, we avoid mixing jobs in the Iceberg project. This class 
converts a schema from Avro to Iceberg and should do only that. If you want to 
derive a mapping, I'd recommend building a visitor to do that.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to