yihua commented on code in PR #12317:
URL: https://github.com/apache/hudi/pull/12317#discussion_r1857465552


##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala:
##########
@@ -755,7 +755,9 @@ class HoodieSparkSqlWriterInternal {
           .setPayloadClassName(payloadClass)
           
.setRecordMergeMode(RecordMergeMode.getValue(hoodieConfig.getString(HoodieWriteConfig.RECORD_MERGE_MODE)))
           .setRecordMergeStrategyId(recordMergerStrategy)
-          
.setPreCombineField(hoodieConfig.getStringOrDefault(PRECOMBINE_FIELD, null))
+          // we can't fetch preCombine field from hoodieConfig object, since 
it falls back to "ts" as default value,

Review Comment:
   The approach of this PR is much more complicated than required.  Previously, 
the `precombine` field is set as `ts` if the user does not configure it.  We 
can still keep that.  If the `ts` field does not exist, we use the natural 
ordering, i.e., setting `0` as the ordering value (in `else` branch which 
creates the record; second `DefaultHoodieRecordPayload` constructor fills the 0 
as the ordering value).
   ```
   val hoodieRecord = if (shouldCombine && !precombineEmpty) {
                 val orderingVal = HoodieAvroUtils.getNestedFieldVal(avroRec, 
precombine,
                   false, 
consistentLogicalTimestampEnabled).asInstanceOf[Comparable[_]]
                 DataSourceUtils.createHoodieRecord(processedRecord, 
orderingVal, hoodieKey,
                   config.getPayloadClass, recordLocation)
               } else {
                 DataSourceUtils.createHoodieRecord(processedRecord, hoodieKey,
                   config.getPayloadClass, recordLocation)
               }
   ```
   ```
     public DefaultHoodieRecordPayload(GenericRecord record, Comparable 
orderingVal) {
       super(record, orderingVal);
     }
   
     public DefaultHoodieRecordPayload(Option<GenericRecord> record) {
       this(record.isPresent() ? record.get() : null, 0); // natural order
     }
   ```
   Note that if `ts` or another configured ordering field exists in the schema, 
nothing should break, which is the behavior.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to