scxwhite commented on code in PR #7842:
URL: https://github.com/apache/hudi/pull/7842#discussion_r1098391650
##########
hudi-common/src/main/java/org/apache/hudi/common/model/PartialUpdateAvroPayload.java:
##########
@@ -196,30 +196,40 @@ protected Option<IndexedRecord>
mergeDisorderRecordsWithMetadata(
/**
* Returns whether the given record is newer than the record of this payload.
*
- * @param orderingVal
- * @param record The record
- * @param prop The payload properties
- *
+ * @param schema The schema
+ * @param record The record
+ * @param prop The payload properties
* @return true if the given record is newer
*/
- private static boolean isRecordNewer(Comparable orderingVal, IndexedRecord
record, Properties prop) {
+ private boolean isRecordNewer(Schema schema, IndexedRecord record,
Properties prop) throws IOException {
String orderingField =
prop.getProperty(HoodiePayloadProps.PAYLOAD_ORDERING_FIELD_PROP_KEY);
+ String preCombineField =
prop.getProperty("hoodie.datasource.write.precombine.field");
+
if (!StringUtils.isNullOrEmpty(orderingField)) {
boolean consistentLogicalTimestampEnabled =
Boolean.parseBoolean(prop.getProperty(
KeyGeneratorOptions.KEYGENERATOR_CONSISTENT_LOGICAL_TIMESTAMP_ENABLED.key(),
KeyGeneratorOptions.KEYGENERATOR_CONSISTENT_LOGICAL_TIMESTAMP_ENABLED.defaultValue()));
+ Comparable incomingOrderingVal;
+ if (orderingField.equals(preCombineField)) {
+ incomingOrderingVal = orderingVal;
Review Comment:
@danny0405 Thank you for your review.
If I understand correctly, the value of orderingVal variable defined in
BaseAvroPayload is precombine configuration. You can refer to this part of code.
https://github.com/apache/hudi/blob/6436ef3ee2cb887cf25643c5c99f1fe34a64ec68/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala#L1095
In the process of using Spark, I found that precombine is configured by
hoodie.datasource.write.precombine.field configuration item, and orderfield is
configured by hoodie.payload.ordering.field configuration item.There is no
restriction that precombine and orderfield must be consistent.
https://github.com/apache/hudi/blob/6436ef3ee2cb887cf25643c5c99f1fe34a64ec68/hudi-common/src/main/java/org/apache/hudi/common/model/DefaultHoodieRecordPayload.java#L126
And in the DefaultHoodieRecordPayload class, we can also find that the
orderingVal used is the value of the configuration item
hoodie.payload.ordering.field.The orderingVal variable in the BaseAvroPayload
class is not used.
As for the reason why I compare the fields of the two to see if they are
consistent, as you just mentioned, it is to avoid the time of data
serialization.
I look forward to your answer. Thank you again.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]