linliu-code commented on code in PR #14001:
URL: https://github.com/apache/hudi/pull/14001#discussion_r2385821443


##########
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestPayloadDeprecationFlow.scala:
##########
@@ -114,15 +129,65 @@ class TestPayloadDeprecationFlow extends SparkClientFunctionalTestHarness {
       option(HoodieCompactionConfig.INLINE_COMPACT.key(), "false").
       option(HoodieWriteConfig.WRITE_TABLE_VERSION.key(), "8").
       option(HoodieTableConfig.ORDERING_FIELDS.key(), originalOrderingFields).
+      option(HoodieCleanConfig.CLEANER_COMMITS_RETAINED.key(), "3").
+      option(HoodieCleanConfig.AUTO_CLEAN.key(), "false").
+      option(HoodieArchivalConfig.AUTO_ARCHIVE.key(), "true").
+      option(HoodieArchivalConfig.COMMITS_ARCHIVAL_BATCH_SIZE.key(), "1").
+      option(HoodieArchivalConfig.MIN_COMMITS_TO_KEEP.key(), "2").
+      option(HoodieArchivalConfig.MAX_COMMITS_TO_KEEP.key(), "3").
+      option(HoodieClusteringConfig.INLINE_CLUSTERING.key(), "true").
+      option(HoodieClusteringConfig.INLINE_CLUSTERING_MAX_COMMITS.key(), "2").
+      option(HoodieClusteringConfig.PLAN_STRATEGY_SMALL_FILE_LIMIT.key(), "512000").
+      option(HoodieClusteringConfig.PLAN_STRATEGY_TARGET_FILE_MAX_BYTES.key(), "512000").
       options(opts).
       mode(SaveMode.Append).
       save(basePath)
     // Validate table version.
-    metaClient = HoodieTableMetaClient.reload(metaClient)
+    metaClient = HoodieTableMetaClient.builder()
+      .setBasePath(basePath)
+      .setConf(storageConf())
+      .build()
     assertEquals(8, metaClient.getTableConfig.getTableVersion.versionCode())
     val firstUpdateInstantTime = metaClient.getActiveTimeline.getInstants.get(1).requestedTime()
 
+    // 2.5. Add mixed ordering test data to validate proper ordering handling

Review Comment:
   To really confirm this, we should swap the two records for rider-CC in this
batch. With the order swapped, commit_time-based and event_time-based payloads
would produce different results, so for the event_time-based table we would
know that the record with the 35.00 fare is chosen because of its ordering
value and not, by mistake, because of commit_time.
   I know the data validation will be a little more complex.
   Please think of a better way if possible.
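
   To illustrate why the swap matters, here is a minimal sketch in plain Scala. It is a simplified model of the two merge semantics, not Hudi's actual merge code. The `Rec` shape, the `ts` ordering values, and the 30.00 fare are hypothetical; only the rider-CC key and the 35.00 fare come from the test.

```scala
// Hypothetical record shape used only for this illustration.
final case class Rec(rider: String, ts: Long, fare: Double)

object MergeSketch {
  // Simplified commit-time semantics: the record that arrives last wins.
  def commitTimeMerge(records: Seq[Rec]): Rec = records.last

  // Simplified event-time semantics: the highest ordering value wins;
  // on a tie, the later arrival wins.
  def eventTimeMerge(records: Seq[Rec]): Rec =
    records.reduceLeft((cur, next) => if (next.ts >= cur.ts) next else cur)

  // Assumed values: the 35.00 record carries the higher ordering value.
  val higher: Rec = Rec("rider-CC", ts = 2L, fare = 35.00)
  val lower: Rec = Rec("rider-CC", ts = 1L, fare = 30.00)

  // Arrival order matches ordering value: both strategies pick 35.00,
  // so the test cannot tell them apart.
  val ascending: Seq[Rec] = Seq(lower, higher)

  // Swapped arrival order: the strategies now disagree.
  val swapped: Seq[Rec] = Seq(higher, lower)
}
```

   With `ascending`, both functions return the 35.00 record. With `swapped`, `eventTimeMerge` still returns 35.00 while `commitTimeMerge` returns 30.00, which is the difference the validation would need to observe.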



