Re: [PR] fix: Only maps FIXED_LEN_BYTE_ARRAY to String for uuid type [arrow-datafusion-comet]

via GitHub Wed, 03 Apr 2024 21:38:47 -0700


huaxingao commented on code in PR #238:
URL: https://github.com/apache/arrow-datafusion-comet/pull/238#discussion_r1550911908



##########
common/src/main/java/org/apache/comet/parquet/TypeUtil.java:
##########
@@ -196,7 +196,9 @@ && isUnsignedIntTypeMatched(logicalTypeAnnotation, 64)) {
             || canReadAsBinaryDecimal(descriptor, sparkType)
             || sparkType == DataTypes.BinaryType
             // for uuid, since iceberg maps uuid to StringType
-            || sparkType == DataTypes.StringType) {
+            || sparkType == DataTypes.StringType
+                && descriptor.getPrimitiveType().getLogicalTypeAnnotation()
+                    instanceof LogicalTypeAnnotation.UUIDLogicalTypeAnnotation) {

Review Comment:
   @viirya If the Spark type is `StringType` and the `LogicalTypeAnnotation` is
`UUID`, then this must be an Iceberg UUID column, because only Iceberg maps UUID
to Spark `StringType`. I think the change is safe. Alternatively, we can add an
extra parameter to
[getColumnReader](https://github.com/apache/arrow-datafusion-comet/blob/main/common/src/main/java/org/apache/comet/parquet/Utils.java#L39)
to indicate whether the ColumnReader is an Iceberg ColumnReader.
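
   To make the two options concrete, here is a minimal, self-contained sketch.
It does not use the real Spark or Parquet classes: `SparkType`,
`LogicalAnnotation`, `UuidMappingSketch`, and the `isIcebergReader` flag are
all hypothetical stand-ins, and the decimal branch of the real condition is
left out. The first method mirrors the shape of the condition in this diff
(`&&` binds tighter than `||`, so the UUID check only narrows the `StringType`
case). The second shows roughly what the extra-parameter alternative could
look like.

```java
// Hypothetical stand-ins for Spark's DataType and Parquet's
// LogicalTypeAnnotation, used only to keep this sketch self-contained.
enum SparkType { BINARY, STRING, INT }

enum LogicalAnnotation { NONE, UUID, DECIMAL }

class UuidMappingSketch {

    // Shape of the condition in this diff for FIXED_LEN_BYTE_ARRAY.
    // Because `&&` has higher precedence than `||`, this reads as:
    // BINARY, or (STRING and the column is annotated as UUID).
    static boolean canReadFixedLenByteArray(SparkType sparkType, LogicalAnnotation annotation) {
        return sparkType == SparkType.BINARY
            || sparkType == SparkType.STRING
                && annotation == LogicalAnnotation.UUID;
    }

    // Assumed variant of the alternative: the caller states explicitly
    // whether the reader is an Iceberg reader, instead of inferring it
    // from the STRING + UUID combination alone.
    static boolean canReadFixedLenByteArray(
            SparkType sparkType, LogicalAnnotation annotation, boolean isIcebergReader) {
        return sparkType == SparkType.BINARY
            || isIcebergReader
                && sparkType == SparkType.STRING
                && annotation == LogicalAnnotation.UUID;
    }

    public static void main(String[] args) {
        // STRING is accepted only when the column carries the UUID annotation.
        System.out.println(canReadFixedLenByteArray(SparkType.STRING, LogicalAnnotation.UUID));
        System.out.println(canReadFixedLenByteArray(SparkType.STRING, LogicalAnnotation.NONE));
        // With the explicit flag, a non-Iceberg reader rejects STRING even for UUID.
        System.out.println(canReadFixedLenByteArray(SparkType.STRING, LogicalAnnotation.UUID, false));
    }
}
```

   The first form needs no API change; the second is more explicit but
touches every caller of the reader factory.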



