rdblue commented on a change in pull request #3186:
URL: https://github.com/apache/iceberg/pull/3186#discussion_r748883894



##########
File path: 
spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/ConstantColumnVector.java
##########
@@ -119,6 +122,8 @@ public UTF8String getUTF8String(int rowId) {
 
   @Override
   public ColumnVector getChild(int ordinal) {
-    throw new UnsupportedOperationException("ConstantColumnVector only 
supports primitives");
+    DataType sparkType = ((StructType) type).fields()[ordinal].dataType();
+    Type childType = SparkSchemaUtil.convert(sparkType);
+    return new ConstantColumnVector(childType, batchSize, ((InternalRow) 
constant).get(ordinal, sparkType));

Review comment:
       It looks like the approach in this PR is to use existing constant 
handling to fix nested column construction. I'm not sure that this is the right 
approach. There isn't anything specific to ORC here, so I'm not convinced that 
the ORC readers handle nested constant values correctly. Avoiding a problem is 
an okay solution as long as we're confident that it is always avoided.
   
   At a minimum, I think we should have tests for partitioning by nested fields 
in structs that are not constants and would not get created by 
`findMoreCompoundConstants`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to