danny0405 commented on code in PR #17953:
URL: https://github.com/apache/hudi/pull/17953#discussion_r2711459373
##########
hudi-common/src/main/java/org/apache/hudi/common/table/read/BufferedRecordMergerFactory.java:
##########
@@ -249,7 +249,7 @@ public BufferedRecord<T> finalMerge(BufferedRecord<T>
olderRecord, BufferedRecor
Comparable oldOrderingValue = olderRecord.getOrderingValue();
HoodieSchema newSchema =
recordContext.getSchemaFromBufferRecord(newerRecord);
if (!olderRecord.isCommitTimeOrderingDelete()
- && oldOrderingValue.compareTo(newOrderingValue) > 0) {
+ && OrderingValues.compare(oldOrderingValue, newOrderingValue) > 0) {
Review Comment:
> TLDR of the root cause: MercifulJsonConverter.StringProcessor returns
java.lang.String directly instead of wrapping it in Utf8. This is inconsistent
with Avro's standard behavior where strings should be Utf8 by default.
Fixing `MercifulJsonConverter.StringProcessor` to return Avro `Utf8`
looks like the right fix; the Flink and Spark engine row to Avro generic record
transformations both return `Utf8` strings as well.
Can we fix the test
`TestJsonKafkaSource.testJsonKafkaSourceWithEncodedDecimals` for the row
projection? I guess there is a gap in the Avro record to `InternalRow`
transformation. The other 2 UT failures look like an easy fix.
In any case, let's not add special handling (type conversion) in the
comparison; it is both costly and error-prone.
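To illustrate the hazard being avoided here: comparing mixed ordering-value types through a raw `Comparable.compareTo` fails at runtime with a `ClassCastException`. The sketch below uses stdlib types (`Integer` vs `String`) as a stand-in for the Avro `Utf8` vs `java.lang.String` mismatch; the class and method names are hypothetical, not from the PR.

```java
public class MixedOrderingDemo {

    // Compares two ordering values through the raw Comparable interface,
    // the way a generic merge path would; mismatched runtime types blow up.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static String compareRaw(Comparable older, Comparable newer) {
        try {
            return "compared: " + Integer.signum(older.compareTo(newer));
        } catch (ClassCastException e) {
            // The bridge method casts the argument to the receiver's type.
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        // Same runtime type: works as expected.
        System.out.println(compareRaw("a", "b"));   // compared: -1
        // Mismatched types (analogous to Utf8 vs String): fails at runtime,
        // which is why normalizing at the source beats converting in the comparator.
        System.out.println(compareRaw(42, "42"));   // ClassCastException
    }
}
```

This is why fixing the producer (`StringProcessor` returning `Utf8`) is preferable to adding per-comparison type conversion: the conversion would run on every merge and would have to anticipate every type pair.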
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]