zhangyue19921010 commented on a change in pull request #5186:
URL: https://github.com/apache/hudi/pull/5186#discussion_r839712711
##########
File path:
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/HoodieTable.java
##########
@@ -781,6 +782,53 @@ public HoodieEngineContext getContext() {
     return Option.empty();
   }
+  /**
+   * Initialize hoodie.table.metadata.enable in hoodie.proterties;
+   * Check if current metadata table flag in hoodieWriteConfig is the same as recorded in hoodie.proterties:
+   * 1. If the flag in hoodie.proterties is true but it is false in hoodieWriteConfig, It means users turn off MDT and we need to clean up MDT.
+   * 2. Update hoodie.table.metadata.enable in hoodie.proterties based on HoodieWriteConfig if the value is different.
+   */
+  protected void verifyMetadataTableInTableConfig() {
+    // ignore the hoodie.properties in MDT
+    if (metaClient.getBasePath().contains(HoodieTableMetaClient.METADATA_TABLE_FOLDER_PATH)) {
+      return;
+    }
+
+    try {
+      if (metaClient.getTableConfig().contains(HoodieTableConfig.METADATA_TABLE_ENABLE)) {
+        Boolean enableMDTInTableConfig = metaClient.getTableConfig().getBoolean(HoodieTableConfig.METADATA_TABLE_ENABLE);
+        Path mdtBasePath = new Path(HoodieTableMetadata.getMetadataTableBasePath(config.getBasePath()));
+        // MDT flag in TableConfig is true;
+        // MDT flag in write config is false;
+        // ===> Users turn off MDT, so that we need to clean up this metadata table in case out-of-sync issue.
+        // the condition order in if(xx) is important.
+        if (enableMDTInTableConfig && !config.isMetadataTableEnabled() && metaClient.getFs().exists(mdtBasePath)) {
+          LOG.info("Deleting metadata table because of disabled in writer.");
+          metaClient.getFs().delete(mdtBasePath, true);
+        }
+
+        // update table config if necessary
+        if (Boolean.compare(enableMDTInTableConfig, config.isMetadataTableEnabled()) != 0) {
+          updateMetadataTableEnableInTableConfig();
+        }
+      } else {
+        // initial METADATA_TABLE_ENABLE.key in table config
+        updateMetadataTableEnableInTableConfig();
Review comment:
Nice catch! We do need to handle the upgrade path.
Maybe we can set `METADATA_TABLE_ENABLE` in ThreeToFourUpgradeHandler, and
also clean up the MDT during upgrade when the user has disabled it.
That way:
1. Every table upgraded from an older version to master has the
`METADATA_TABLE_ENABLE` config in hoodie.properties.
2. When `METADATA_TABLE_ENABLE` is false, no $basePath/.hoodie/metadata is
left on storage.
3. We keep the number of FS calls (fs.exists or fs.delete) as low as
possible.
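
A rough sketch of what such an upgrade step could look like, to make the
three points above concrete. This is only an illustration under stated
assumptions: `MetadataFlagUpgradeSketch` and its `upgrade` method are
hypothetical names, it uses plain `java.io.File` on a local directory
instead of Hudi's `FileSystem`, and it does not reflect the actual
ThreeToFourUpgradeHandler signature.

```java
import java.io.File;
import java.util.Properties;

/** Hypothetical stand-in for an upgrade step; not Hudi's real upgrade handler API. */
final class MetadataFlagUpgradeSketch {

  // Property key as named in the Javadoc of the diff under review.
  static final String METADATA_TABLE_ENABLE_KEY = "hoodie.table.metadata.enable";

  /**
   * Removes $basePath/.hoodie/metadata when the writer has MDT disabled and
   * returns the property that should be persisted to hoodie.properties.
   */
  static Properties upgrade(File basePath, boolean metadataEnabledInWriteConfig) {
    File mdtPath = new File(new File(basePath, ".hoodie"), "metadata");

    // Check the cheap in-memory flag first so the exists() call is skipped
    // entirely when MDT is enabled.
    if (!metadataEnabledInWriteConfig && mdtPath.exists()) {
      deleteRecursively(mdtPath);
    }

    // Always record the flag, so upgraded tables carry it from now on.
    Properties toPersist = new Properties();
    toPersist.setProperty(METADATA_TABLE_ENABLE_KEY, String.valueOf(metadataEnabledInWriteConfig));
    return toPersist;
  }

  private static void deleteRecursively(File file) {
    File[] children = file.listFiles(); // null when 'file' is not a directory
    if (children != null) {
      for (File child : children) {
        deleteRecursively(child);
      }
    }
    if (!file.delete()) {
      throw new IllegalStateException("Could not delete " + file);
    }
  }
}
```

Doing the cleanup once in the upgrade step, rather than on every table
initialization, is what would keep the existence check and the delete off
the regular write path.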