linliu-code commented on code in PR #14001:
URL: https://github.com/apache/hudi/pull/14001#discussion_r2385821443
##########
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestPayloadDeprecationFlow.scala:
##########
@@ -114,15 +129,65 @@ class TestPayloadDeprecationFlow extends SparkClientFunctionalTestHarness {
option(HoodieCompactionConfig.INLINE_COMPACT.key(), "false").
option(HoodieWriteConfig.WRITE_TABLE_VERSION.key(), "8").
option(HoodieTableConfig.ORDERING_FIELDS.key(), originalOrderingFields).
+ option(HoodieCleanConfig.CLEANER_COMMITS_RETAINED.key(), "3").
+ option(HoodieCleanConfig.AUTO_CLEAN.key(), "false").
+ option(HoodieArchivalConfig.AUTO_ARCHIVE.key(), "true").
+ option(HoodieArchivalConfig.COMMITS_ARCHIVAL_BATCH_SIZE.key(), "1").
+ option(HoodieArchivalConfig.MIN_COMMITS_TO_KEEP.key(), "2").
+ option(HoodieArchivalConfig.MAX_COMMITS_TO_KEEP.key(), "3").
+ option(HoodieClusteringConfig.INLINE_CLUSTERING.key(), "true").
+ option(HoodieClusteringConfig.INLINE_CLUSTERING_MAX_COMMITS.key(), "2").
+ option(HoodieClusteringConfig.PLAN_STRATEGY_SMALL_FILE_LIMIT.key(), "512000").
+ option(HoodieClusteringConfig.PLAN_STRATEGY_TARGET_FILE_MAX_BYTES.key(), "512000").
options(opts).
mode(SaveMode.Append).
save(basePath)
// Validate table version.
- metaClient = HoodieTableMetaClient.reload(metaClient)
+ metaClient = HoodieTableMetaClient.builder()
+ .setBasePath(basePath)
+ .setConf(storageConf())
+ .build()
assertEquals(8, metaClient.getTableConfig.getTableVersion.versionCode())
val firstUpdateInstantTime = metaClient.getActiveTimeline.getInstants.get(1).requestedTime()
+ // 2.5. Add mixed ordering test data to validate proper ordering handling
Review Comment:
To really confirm this, we should swap the two records for rider-CC in this
batch. With the order swapped, commit_time-based and event_time-based payloads
would produce different results, so we would know that for the event_time-based
table the record with the 35.00 fare is chosen because of its ordering value,
and not by mistake because of commit_time.
I know this makes the data validation a little more complex.
Please think of a better way if possible.
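
To illustrate why the order matters, here is a minimal sketch in plain Scala.
It does not use Hudi's merger or payload APIs; `Rec`, `mergeByCommitTime`,
`mergeByEventTime`, the ordering values 10 and 5, and the 20.00 fare are all
hypothetical and only stand in for the test data. Only the rider-CC key and the
35.00 fare come from the test.

```scala
// Simplified, hypothetical model of a record: key, ordering value, fare.
final case class Rec(key: String, ts: Long, fare: Double)

object OrderingSketch {
  // Commit-time semantics: whichever record arrives last wins.
  def mergeByCommitTime(batch: Seq[Rec]): Map[String, Rec] =
    batch.foldLeft(Map.empty[String, Rec])((acc, r) => acc.updated(r.key, r))

  // Event-time semantics: the record with the larger ordering value wins;
  // on a tie the later arrival wins.
  def mergeByEventTime(batch: Seq[Rec]): Map[String, Rec] =
    batch.foldLeft(Map.empty[String, Rec]) { (acc, r) =>
      acc.get(r.key) match {
        case Some(prev) if prev.ts > r.ts => acc // keep the newer event
        case _                            => acc.updated(r.key, r)
      }
    }

  // Higher ordering value arrives last: both strategies pick 35.00,
  // so this order cannot tell them apart.
  val inOrder: Seq[Rec] =
    Seq(Rec("rider-CC", 5L, 20.00), Rec("rider-CC", 10L, 35.00))

  // Swapped: higher ordering value arrives first, so the strategies disagree.
  val swapped: Seq[Rec] = inOrder.reverse

  def main(args: Array[String]): Unit = {
    println(mergeByCommitTime(swapped)("rider-CC").fare)
    println(mergeByEventTime(swapped)("rider-CC").fare)
  }
}
```

With the swapped batch, only the event-time merge keeps the 35.00 record, so an
assertion on 35.00 would fail if the table fell back to commit_time ordering.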