danny0405 commented on code in PR #12519:
URL: https://github.com/apache/hudi/pull/12519#discussion_r1894546320


##########
hudi-client/hudi-spark-client/src/main/scala/org/apache/hudi/SparkFileFormatInternalRowReaderContext.scala:
##########
@@ -263,20 +263,23 @@ class SparkFileFormatInternalRowReaderContext(parquetFileReader: SparkParquetRea
   }
 
   override def castValue(value: Comparable[_], newType: Schema.Type): Comparable[_] = {
-    value match {
+    val valueToCast = if (value == null) 0 else value
+    valueToCast match {
       case v: Integer => newType match {
         case Type.INT => v
         case Type.LONG => v.longValue()
         case Type.FLOAT => v.floatValue()
         case Type.DOUBLE => v.doubleValue()
         case Type.STRING => UTF8String.fromString(v.toString)
+        case Type.FIXED => BigDecimal(v)

Review Comment:
   I see special handling for the Spark decimal type when precision <= 18 (`Decimal.MAX_LONG_DIGITS`) in the `UnsafeRow` decimal getter:
   
   ```java
   public Decimal getDecimal(int ordinal, int precision, int scale) {
       if (isNullAt(ordinal)) {
         return null;
       }
       if (precision <= Decimal.MAX_LONG_DIGITS()) {
         return Decimal.createUnsafe(getLong(ordinal), precision, scale);
       } else {
         byte[] bytes = getBinary(ordinal);
         BigInteger bigInteger = new BigInteger(bytes);
         BigDecimal javaDecimal = new BigDecimal(bigInteger, scale);
         return Decimal.apply(javaDecimal, precision, scale);
       }
     }
   ```
   
   Should we do it here too?
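   
   For illustration, a rough sketch of what mirroring those two branches could look like here. This is my own sketch, not code from the PR: the helper name is hypothetical, and it assumes the target `precision`/`scale` are known from the Avro decimal logical type and that the incoming `Integer` holds the unscaled value:
   
   ```scala
   import java.math.{BigDecimal => JBigDecimal, BigInteger}
   
   import org.apache.spark.sql.types.Decimal
   
   // Hypothetical helper (not in the PR): mirrors the precision-dependent
   // branches of UnsafeRow#getDecimal when casting an Integer to a decimal.
   def castIntToDecimal(v: Integer, precision: Int, scale: Int): Decimal = {
     if (precision <= Decimal.MAX_LONG_DIGITS) {
       // Small precisions are backed by a raw unscaled long in Spark's
       // unsafe row format, so build the Decimal the same cheap way.
       Decimal.createUnsafe(v.longValue(), precision, scale)
     } else {
       // Larger precisions go through java.math.BigDecimal, as UnsafeRow does.
       Decimal(new JBigDecimal(BigInteger.valueOf(v.longValue()), scale), precision, scale)
     }
   }
   ```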
   