nsivabalan commented on code in PR #18061:
URL: https://github.com/apache/hudi/pull/18061#discussion_r2751769694
##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieCreateRecordUtils.scala:
##########
@@ -153,8 +153,17 @@ object HoodieCreateRecordUtils {
           val orderingVal = OrderingValues.create(
             orderingFields,
             JFunction.toJavaFunction[String, Comparable[_]](
-              field => HoodieAvroUtils.getNestedFieldVal(avroRec, field, false,
-                consistentLogicalTimestampEnabled).asInstanceOf[Comparable[_]]))
+              field => {
+                val fieldVal = HoodieAvroUtils.getNestedFieldVal(avroRec, field, false,
+                  consistentLogicalTimestampEnabled)
+                if (fieldVal == null) {
+                  throw new IllegalArgumentException(
+                    s"Precombine/ordering field '$field' has null value for record key '${hoodieKey.getRecordKey}'. " +
+                    s"Please ensure all records have non-null values for the precombine field, " +
+                    s"or use a payload class that doesn't require ordering (e.g., OverwriteWithLatestAvroPayload).")
Review Comment:
In previous versions of Hudi, it was common for users to configure a precombine
field even with OverwriteWithLatestAvroPayload. In later versions we fully
relaxed that constraint: it is fine to have no precombine field configured, or
to have one configured while some records carry null values for it, as long as
the payload is OverwriteWithLatestAvroPayload or the merge mode is
RecordMergeMode.COMMIT_TIME_ORDERING.
So if we are going to throw an exception here, we at least need to check the
payload and merge mode first and throw accordingly.
For example, with OverwriteWithLatestAvroPayload or
RecordMergeMode.COMMIT_TIME_ORDERING, we should not throw.
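
A rough sketch of the gating I have in mind, not the actual patch. It is
self-contained so it runs without Hudi on the classpath: `MergeMode` is a local
stand-in for RecordMergeMode, and `nullOrderingAllowed` and
`validateOrderingValue` are hypothetical helper names. The payload class name
constant is assumed to match the real class, and how the caller handles a
permitted null afterwards is left open.

```scala
// Local stand-in for Hudi's RecordMergeMode; only the value names mirror it.
object MergeMode extends Enumeration {
  val COMMIT_TIME_ORDERING, EVENT_TIME_ORDERING, CUSTOM = Value
}

object OrderingNullCheck {
  // Assumed fully qualified name of the payload class discussed above.
  val OverwriteWithLatestPayload: String =
    "org.apache.hudi.common.model.OverwriteWithLatestAvroPayload"

  // Hypothetical helper: a null ordering value is tolerated when the ordering
  // field is not used to pick a winner during merging.
  def nullOrderingAllowed(payloadClass: String, mergeMode: MergeMode.Value): Boolean =
    payloadClass == OverwriteWithLatestPayload ||
      mergeMode == MergeMode.COMMIT_TIME_ORDERING

  // Hypothetical helper: returns the value unchanged, throwing only when it is
  // null and the payload/merge mode combination actually needs an ordering value.
  def validateOrderingValue(field: String,
                            recordKey: String,
                            fieldVal: AnyRef,
                            payloadClass: String,
                            mergeMode: MergeMode.Value): AnyRef = {
    if (fieldVal == null && !nullOrderingAllowed(payloadClass, mergeMode)) {
      throw new IllegalArgumentException(
        s"Precombine/ordering field '$field' has null value for record key '$recordKey'.")
    }
    fieldVal
  }
}
```

With this shape, the lambda in the diff would call the validation helper instead
of throwing unconditionally, so existing tables that rely on commit-time
ordering keep accepting records with a null precombine value.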