yihua commented on code in PR #14120:
URL: https://github.com/apache/hudi/pull/14120#discussion_r2443875319


##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##########
@@ -1208,7 +1226,9 @@ public static Object rewritePrimaryType(Object oldValue, 
Schema oldSchema, Schem
           return oldValue;
         case LONG:
           if (oldSchema.getLogicalType() != newSchema.getLogicalType()) {
-            if (oldSchema.getLogicalType() instanceof 
LogicalTypes.TimestampMillis) {
+            if (skipLogicalTimestampEvolution || oldSchema.getLogicalType() == 
null || newSchema.getLogicalType() == null) {

Review Comment:
   However, for handling the timestamp issue this PR addresses, the ingestion 
writer needs to rewrite the schema from timestamp micros to millis.



##########
hudi-client/hudi-spark-client/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/HoodieParquetFileFormatHelper.scala:
##########
@@ -171,8 +234,14 @@ object HoodieParquetFileFormatHelper {
       }
     }
 
-    if (typeChangeInfos.isEmpty) {
+    if (typeChangeInfos.isEmpty && columnsToMultiply.isEmpty) {

Review Comment:
   Does this mean that the record-level projection overhead is only incurred if 
there are columns to apply multiplication?



##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##########
@@ -1208,7 +1226,9 @@ public static Object rewritePrimaryType(Object oldValue, 
Schema oldSchema, Schem
           return oldValue;
         case LONG:
           if (oldSchema.getLogicalType() != newSchema.getLogicalType()) {
-            if (oldSchema.getLogicalType() instanceof 
LogicalTypes.TimestampMillis) {
+            if (skipLogicalTimestampEvolution || oldSchema.getLogicalType() == 
null || newSchema.getLogicalType() == null) {

Review Comment:
   Based on my understanding, previously, 
`AvroSchemaCompatibility#calculateCompatibility` does not validate the logical 
timestamp evolution before, so timestamp micros to timestamp millis can happen 
which leads to precision loss, and such schema evolution should not be allowed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to