nsivabalan commented on code in PR #13022:
URL: https://github.com/apache/hudi/pull/13022#discussion_r2010388289
##########
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestMORDataSource.scala:
##########
@@ -426,6 +427,138 @@ class TestMORDataSource extends HoodieSparkClientTestBase
with SparkDatasetMixin
}
}
+ /**
+ * Test to ensure that the metadata compaction works as expected after a
downgrade.
+ * The test also validates that the delete blocks are correctly read post
downgrade.
+ */
+ @Test
+ def testMetadataCompactionWithDeleteBlockPostDowngrade() : Unit = {
Review Comment:
do you think we should move this to UpgradeDowngrade tests?
##########
hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieFileGroupReaderSchemaHandler.java:
##########
@@ -191,7 +190,7 @@ private static String[]
getMandatoryFieldsForMerging(HoodieTableConfig cfg, Type
cfg.getPayloadClass(),
cfg.getRecordMergeStrategyId(),
cfg.getPreCombineField(),
- HoodieTableVersion.current());
+ cfg.getTableVersion());
Review Comment:
can you remove this. this was attempted as a fix for the merge strategy
conflict. and my hunch is, this is not the full fix
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/upgrade/UpgradeDowngrade.java:
##########
@@ -176,6 +186,33 @@ public void run(HoodieTableVersion toVersion, String
instantTime) {
HoodieTableConfig.update(metaClient.getStorage(),
metaClient.getMetaPath(), metaClient.getTableConfig().getProps());
+
+ if (metaClient.getTableConfig().isMetadataTableAvailable() &&
toVersion.equals(HoodieTableVersion.SIX) && isDowngrade) {
+ // NOTE: Add empty deltacommit to metadata table. The compaction instant
format has changed in version 8.
+ // It no longer has a suffix of "001" for the compaction instant.
Due to that, the timeline instant
+ // comparison logic in metadata table will fail after LSM timeline
downgrade.
+ // To avoid that, we add an empty deltacommit to metadata table in
the downgrade step.
+ TypedProperties typedProperties = config.getProps();
+
typedProperties.setProperty(HoodieMetadataConfig.ENABLE_METADATA_INDEX_COLUMN_STATS.key(),
"false");
+
typedProperties.setProperty(HoodieMetadataConfig.ENABLE_METADATA_INDEX_PARTITION_STATS.key(),
"false");
Review Comment:
can we remove disabling col stats and partition stats in this patch.
we need to brainstorm as a team and align on that.
i.e for table version 6 in 1.0 writer, are we going to enable col stats and
PSI by default or no.
if the ans is no, those partitions would have already been deleted during
downgrade by the time we reach here.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]