rdblue commented on a change in pull request #3186:
URL: https://github.com/apache/iceberg/pull/3186#discussion_r741526482



##########
File path: spark/v2.4/spark/src/test/java/org/apache/iceberg/spark/source/TestPartitionValues.java
##########
@@ -402,14 +402,13 @@ public void testPartitionedByNestedString() throws Exception {
 
     // input data frame
     StructField[] structFields = {
-        new StructField("struct",
-            DataTypes.createStructType(
-                new StructField[] {
-                    new StructField("string", DataTypes.StringType, false, Metadata.empty())
-                }
-            ),
-            false, Metadata.empty()
-        )
+        new StructField("col1",
+          DataTypes.createStructType(
+            new StructField[] {
+                new StructField("f1", DataTypes.StringType, false, Metadata.empty())
+            }
+                    ), false, Metadata.empty()
+            )

Review comment:
       Nearly all of the changes in this file are formatting or naming changes. Can you remove any changes that are not strictly necessary for the fix you're making?

##########
File path: spark/v2.4/spark/src/main/java/org/apache/iceberg/spark/source/BaseDataReader.java
##########
@@ -196,4 +198,40 @@ protected static Object convertConstant(Type type, Object value) {
     }
     return value;
   }
+
+  protected void findMoreCompoundConstants(Schema tableSchema, Map<Integer, ?> idToConstant) {
+    List<Types.NestedField> columns = tableSchema.columns();
+
+    for (Types.NestedField nestedField : columns) {
+      visitCompoundTypesByDFS(nestedField, (Map<Integer, Object>) idToConstant);
+    }
+  }
+
+  protected void visitCompoundTypesByDFS(
+          Types.NestedField nestedField,
+          Map<Integer, Object> idToConstant) {
+    switch (nestedField.type().typeId()) {
+      case STRUCT:
+        List<Types.NestedField> childFieldsInStruct = nestedField.type().asStructType().fields();
+        List<Integer> childFieldID = new ArrayList<>();
+        for (Types.NestedField childField : childFieldsInStruct) {
+          visitCompoundTypesByDFS(childField, idToConstant);
+          childFieldID.add(childField.fieldId());
+        }
+        if (idToConstant.keySet().containsAll(childFieldID)) {

Review comment:
       Looks like this creates a struct constant for any struct whose children are all present in the constants map.
   
   I think this needs to be refactored so that the method is clear about what it is doing. Simply "visit" doesn't describe what is going on here: it is updating the constants map.
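   
   For illustration only, a rough sketch of a shape whose name states the side effect (the method name and the way the struct value is materialized are placeholders, not code from this PR; it assumes `org.apache.iceberg.types.Types` and Spark's `GenericInternalRow` are on the import list):
   
   ```java
   // Sketch: the name says this adds constants, and a struct constant is only
   // added when every child field already has a constant value.
   private static void addStructConstants(Types.NestedField field, Map<Integer, Object> idToConstant) {
     if (field.type().isStructType()) {
       List<Types.NestedField> children = field.type().asStructType().fields();
       for (Types.NestedField child : children) {
         addStructConstants(child, idToConstant);
       }
   
       boolean allChildrenConstant = children.stream()
           .allMatch(child -> idToConstant.containsKey(child.fieldId()));
       if (allChildrenConstant) {
         // assumes struct constants are represented as Spark InternalRow values
         Object[] values = children.stream()
             .map(child -> idToConstant.get(child.fieldId()))
             .toArray();
         idToConstant.put(field.fieldId(), new GenericInternalRow(values));
       }
     }
   }
   ```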

##########
File path: spark/v2.4/spark/src/main/java/org/apache/iceberg/spark/source/BaseDataReader.java
##########
@@ -196,4 +198,40 @@ protected static Object convertConstant(Type type, Object value) {
     }
     return value;
   }
+
+  protected void findMoreCompoundConstants(Schema tableSchema, Map<Integer, ?> idToConstant) {
+    List<Types.NestedField> columns = tableSchema.columns();
+
+    for (Types.NestedField nestedField : columns) {
+      visitCompoundTypesByDFS(nestedField, (Map<Integer, Object>) idToConstant);
+    }
+  }
+
+  protected void visitCompoundTypesByDFS(
+          Types.NestedField nestedField,
+          Map<Integer, Object> idToConstant) {

Review comment:
       I'm not sure that this should alter the incoming map. It would be better to build a new constants map with the extra fields.
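   
   A minimal sketch of that alternative, assuming the traversal above is kept (the method name is illustrative, and `Maps` is Iceberg's relocated Guava `Maps`):
   
   ```java
   // Sketch: copy the incoming constants and add derived struct constants to the
   // copy, so the caller's map is never modified.
   protected Map<Integer, Object> withStructConstants(Schema tableSchema, Map<Integer, ?> idToConstant) {
     Map<Integer, Object> result = Maps.newHashMap(idToConstant);
     for (Types.NestedField field : tableSchema.columns()) {
       addStructConstants(field, result);  // same recursion as in the sketch above, applied to the copy
     }
     return result;
   }
   ```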

##########
File path: spark/v2.4/spark/src/main/java/org/apache/iceberg/spark/source/BaseDataReader.java
##########
@@ -196,4 +198,40 @@ protected static Object convertConstant(Type type, Object value) {
     }
     return value;
   }
+
+  protected void findMoreCompoundConstants(Schema tableSchema, Map<Integer, ?> idToConstant) {
+    List<Types.NestedField> columns = tableSchema.columns();
+
+    for (Types.NestedField nestedField : columns) {
+      visitCompoundTypesByDFS(nestedField, (Map<Integer, Object>) idToConstant);
+    }
+  }
+
+  protected void visitCompoundTypesByDFS(
+          Types.NestedField nestedField,
+          Map<Integer, Object> idToConstant) {
+    switch (nestedField.type().typeId()) {
+      case STRUCT:

Review comment:
       In Iceberg, we separate schema traversal logic from other logic using visitors. I think you should use a type visitor to implement this rather than traversing the type tree and doing the work in the same recursive method.
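   
   For reference, a rough sketch of how that could look with `TypeUtil.visit` and a `TypeUtil.SchemaVisitor` (the class name is illustrative, and the `GenericInternalRow` representation is the same assumption as in the earlier sketch):
   
   ```java
   // Sketch: a schema visitor that returns whether each subtree is fully constant
   // and records a constant for every struct field whose children are all constant.
   private static class ConstantStructFinder extends TypeUtil.SchemaVisitor<Boolean> {
     private final Map<Integer, Object> constants;
   
     ConstantStructFinder(Map<Integer, ?> idToConstant) {
       this.constants = Maps.newHashMap(idToConstant);
     }
   
     Map<Integer, Object> constants() {
       return constants;
     }
   
     @Override
     public Boolean struct(Types.StructType struct, List<Boolean> fieldResults) {
       return fieldResults.stream().allMatch(Boolean.TRUE::equals);
     }
   
     @Override
     public Boolean field(Types.NestedField field, Boolean fieldResult) {
       if (constants.containsKey(field.fieldId())) {
         return true;
       }
   
       if (field.type().isStructType() && Boolean.TRUE.equals(fieldResult)) {
         Object[] values = field.type().asStructType().fields().stream()
             .map(child -> constants.get(child.fieldId()))
             .toArray();
         constants.put(field.fieldId(), new GenericInternalRow(values));
         return true;
       }
   
       return Boolean.TRUE.equals(fieldResult);
     }
   
     @Override
     public Boolean primitive(Type.PrimitiveType primitive) {
       return false;
     }
   }
   ```
   
   The caller would run `TypeUtil.visit(tableSchema, finder)` and read the result from `finder.constants()`, so the traversal stays in `TypeUtil` and only the constant-building logic lives in the visitor.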

##########
File path: spark/v3.0/spark/src/main/java/org/apache/iceberg/spark/source/BaseDataReader.java
##########
@@ -196,4 +198,39 @@ protected static Object convertConstant(Type type, Object value) {
     }
     return value;
   }
+
+  protected void findMoreCompoundConstants(Schema tableSchema, Map<Integer, ?> idToConstant) {

Review comment:
       Can you please update just one version of Spark, and then we'll port the changes to the other versions in separate PRs? That way, each PR is smaller and we don't have to review identical code multiple times.

##########
File path: spark/v2.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/ConstantColumnVector.java
##########
@@ -119,6 +122,8 @@ public UTF8String getUTF8String(int rowId) {
 
   @Override
   protected ColumnVector getChild(int ordinal) {
-    throw new UnsupportedOperationException("ConstantColumnVector only supports primitives");
+    DataType sparkType = ((StructType) type).fields()[ordinal].dataType();
+    Type childType = SparkSchemaUtil.convert(sparkType);
+    return new ConstantColumnVector(childType, batchSize, ((InternalRow) constant).get(ordinal, sparkType));

Review comment:
       How are these changes related to ORC?



