rdblue commented on a change in pull request #1271:
URL: https://github.com/apache/iceberg/pull/1271#discussion_r466007312



##########
File path: spark/src/main/java/org/apache/iceberg/spark/data/SparkOrcValueReaders.java
##########
@@ -195,7 +197,15 @@ public Long nonNullRead(ColumnVector vector, int row) {
     @Override
     public Decimal nonNullRead(ColumnVector vector, int row) {
       HiveDecimalWritable value = ((DecimalColumnVector) vector).vector[row];
-      return new Decimal().set(value.serialize64(value.scale()), value.precision(), value.scale());
+
+      // The scale of a decimal read from a Hive ORC file may not be equal to the expected scale. For the data type
+      // decimal(10,3) and the value 10.100, the Hive ORC writer removes the trailing zeros and stores it
+      // as 101 * 10^(-1), adjusting its scale from 3 to 1. So we cannot assert that value.scale() == scale here;
+      // we also need to convert the Hive ORC decimal to a decimal with the expected precision and scale.
+      Preconditions.checkArgument(value.precision() <= precision,
+          "Cannot read value as decimal(%s,%s), too large: %s", precision, scale, value);

Review comment:
       I'm not sure we need to check the precision either. If we read a value, 
then we should return it, right?
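
       For reference, a minimal sketch of what the read could look like without the precision check, just rescaling to the expected scale on read. Here `precision` and `scale` are the reader's expected values, the same fields used in the new check above, and this assumes `serialize64` rescales to the requested scale (the existing call with `value.scale()` suggests it does). An illustration only, not necessarily the final change:

       ```java
       @Override
       public Decimal nonNullRead(ColumnVector vector, int row) {
         HiveDecimalWritable value = ((DecimalColumnVector) vector).vector[row];

         // Read the unscaled long at the expected scale rather than value.scale(),
         // so 10.100 stored as 101 * 10^(-1) still comes back as decimal(10,3).
         return new Decimal().set(value.serialize64(scale), precision, scale);
       }
       ```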






