danny0405 commented on code in PR #17953:
URL: https://github.com/apache/hudi/pull/17953#discussion_r2711459373
##########
hudi-common/src/main/java/org/apache/hudi/common/table/read/BufferedRecordMergerFactory.java:
##########
@@ -249,7 +249,7 @@ public BufferedRecord<T> finalMerge(BufferedRecord<T>
olderRecord, BufferedRecor
Comparable oldOrderingValue = olderRecord.getOrderingValue();
HoodieSchema newSchema =
recordContext.getSchemaFromBufferRecord(newerRecord);
if (!olderRecord.isCommitTimeOrderingDelete()
- && oldOrderingValue.compareTo(newOrderingValue) > 0) {
+ && OrderingValues.compare(oldOrderingValue, newOrderingValue) > 0) {
Review Comment:
> TLDR of the root cause: MercifulJsonConverter.StringProcessor returns
java.lang.String directly instead of wrapping it in Utf8. This is inconsistent
with Avro's standard behavior where strings should be Utf8 by default.
Fixing `MercifulJsonConverter.StringProcessor` to return Avro `Utf8`
looks like the right fix; the Flink and Spark engine row to Avro generic record
transformations both return `Utf8` strings as well.
Can we fix the test
`TestJsonKafkaSource.testJsonKafkaSourceWithEncodedDecimals` for the row
projection? I guess there is a gap in the Avro record to `InternalRow`
transformation. The other 2 UT failures look like an easy fix.
In any case, let's not add special handling (type conversion) in the
comparison; it is both costly and error-prone.
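To illustrate the hazard being avoided here: comparing mixed ordering-value types through a raw `Comparable.compareTo` fails at runtime with a `ClassCastException`. The sketch below uses stdlib types (`Integer` vs `String`) as a stand-in for the Avro `Utf8` vs `java.lang.String` mismatch; the class and method names are hypothetical, not from the PR.

```java
public class MixedOrderingDemo {

    // Compares two ordering values through the raw Comparable interface,
    // the way a generic merge path would; mismatched runtime types blow up.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static String compareRaw(Comparable older, Comparable newer) {
        try {
            return "compared: " + Integer.signum(older.compareTo(newer));
        } catch (ClassCastException e) {
            // The bridge method casts the argument to the receiver's type.
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        // Same runtime type: works as expected.
        System.out.println(compareRaw("a", "b"));   // compared: -1
        // Mismatched types (analogous to Utf8 vs String): fails at runtime,
        // which is why normalizing at the source beats converting in the comparator.
        System.out.println(compareRaw(42, "42"));   // ClassCastException
    }
}
```

This is why fixing the producer (`StringProcessor` returning `Utf8`) is preferable to adding per-comparison type conversion: the conversion would run on every merge and would have to anticipate every type pair.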
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]