voonhous commented on code in PR #17581:
URL: https://github.com/apache/hudi/pull/17581#discussion_r2625594639


##########
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HiveAvroSerializer.java:
##########
@@ -493,12 +499,16 @@ private static void copyOldValueOrSetDefault(GenericRecord oldRecord, GenericRec
       Object newFieldValue;
       if (fieldValue instanceof GenericRecord) {
         GenericRecord record = (GenericRecord) fieldValue;
-        newFieldValue = rewriteRecordIgnoreResultCheck(record, AvroSchemaUtils.resolveUnionSchema(field.schema(), record.getSchema().getFullName()));
+        HoodieSchema nonNullFieldSchema = field.schema().getNonNullType();
+        if (!Objects.equals(nonNullFieldSchema.getFullName(), record.getSchema().getFullName())) {

Review Comment:
   I'm flattening the logic of `AvroSchemaUtils#resolveUnionSchema` here.
   
   The first part of `AvroSchemaUtils#resolveUnionSchema` is essentially the same as `HoodieSchema#getNonNullType()`: both check whether the schema is of **UNION** type and, if so, extract the non-null type from the union pair.
   
   The difference is that `AvroSchemaUtils#resolveUnionSchema` also seems to handle unions with more than 2 members, which `HoodieSchema#getNonNullType()` does not.
   
   I believe the larger-union case is unreachable here, which is why I added the sanity check back as a safeguard.
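
The difference described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the Hudi or Avro implementation: `SimpleSchema`, `nonNullType`, `resolveByFullName`, and `resolveWithSanityCheck` are hypothetical stand-ins. Two behaviours are assumptions made only for the sketch: the non-null unwrapping rejects unions with more than 2 members, and a failed sanity check throws.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in types; NOT the Avro Schema or HoodieSchema API.
final class UnionResolutionSketch {

  enum Kind { NULL, RECORD, UNION }

  static final class SimpleSchema {
    final Kind kind;
    final String fullName;             // meaningful for RECORD only
    final List<SimpleSchema> branches; // meaningful for UNION only

    private SimpleSchema(Kind kind, String fullName, List<SimpleSchema> branches) {
      this.kind = kind;
      this.fullName = fullName;
      this.branches = branches;
    }

    static SimpleSchema nullType() {
      return new SimpleSchema(Kind.NULL, "null", Collections.<SimpleSchema>emptyList());
    }

    static SimpleSchema record(String fullName) {
      return new SimpleSchema(Kind.RECORD, fullName, Collections.<SimpleSchema>emptyList());
    }

    static SimpleSchema union(SimpleSchema... branches) {
      return new SimpleSchema(Kind.UNION, "union", Arrays.asList(branches));
    }
  }

  // Two-member case: unwrap [null, T] (in either order) to T.
  // Assumption: anything else that is a union is rejected.
  static SimpleSchema nonNullType(SimpleSchema schema) {
    if (schema.kind != Kind.UNION) {
      return schema;
    }
    if (schema.branches.size() == 2) {
      SimpleSchema first = schema.branches.get(0);
      SimpleSchema second = schema.branches.get(1);
      if (first.kind == Kind.NULL) {
        return second;
      }
      if (second.kind == Kind.NULL) {
        return first;
      }
    }
    throw new IllegalStateException("expected a nullable union with exactly 2 members");
  }

  // Wider case: pick the non-null member whose full name matches,
  // which also works for unions with more than 2 members.
  static SimpleSchema resolveByFullName(SimpleSchema schema, String fullName) {
    if (schema.kind != Kind.UNION) {
      return schema;
    }
    for (SimpleSchema branch : schema.branches) {
      if (branch.kind != Kind.NULL && Objects.equals(branch.fullName, fullName)) {
        return branch;
      }
    }
    throw new IllegalArgumentException("no union member named " + fullName);
  }

  // Flattened form: unwrap the nullable union, then compare full names
  // as a safeguard. Throwing on mismatch is an assumption of this sketch.
  static SimpleSchema resolveWithSanityCheck(SimpleSchema fieldSchema, String recordFullName) {
    SimpleSchema nonNull = nonNullType(fieldSchema);
    if (!Objects.equals(nonNull.fullName, recordFullName)) {
      throw new IllegalStateException(
          "schema name mismatch: " + nonNull.fullName + " vs " + recordFullName);
    }
    return nonNull;
  }
}
```

For a plain `[null, record]` field, both paths return the same member in this sketch. They only diverge for a union with 3 or more members, which is the case believed to be unreachable at this call site.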
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
