nsivabalan commented on a change in pull request #3836:
URL: https://github.com/apache/hudi/pull/3836#discussion_r744011894



##########
File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/SparkRDDWriteClient.java
##########
@@ -95,10 +95,19 @@ public SparkRDDWriteClient(HoodieEngineContext context, HoodieWriteConfig writeC
   public SparkRDDWriteClient(HoodieEngineContext context, HoodieWriteConfig writeConfig,
                              Option<EmbeddedTimelineService> timelineService) {
     super(context, writeConfig, timelineService);
+    bootstrapMetadataTable();
+  }
+
+  private void bootstrapMetadataTable() {
     if (config.isMetadataTableEnabled()) {
-      // If the metadata table does not exist, it should be bootstrapped here
-      // TODO: Check if we can remove this requirement - auto bootstrap on commit
-      SparkHoodieBackedTableMetadataWriter.create(context.getHadoopConf().get(), config, context);
+      // Defer bootstrap if upgrade / downgrade is pending
+      HoodieTableMetaClient metaClient = createMetaClient(true);
+      UpgradeDowngrade upgradeDowngrade = new UpgradeDowngrade(
+          metaClient, config, context, SparkUpgradeDowngradeHelper.getInstance());
+      if (!upgradeDowngrade.needsUpgradeOrDowngrade(HoodieTableVersion.current())) {

Review comment:
   Got it. I'm wondering if we can move the instantiation of the metadata table
   ```
   SparkHoodieBackedTableMetadataWriter.create(context.getHadoopConf().get(), config, context);
   ```
   to just after the upgrade/downgrade step within getTableAndInitCtx,
   so that we don't miss bootstrapping the table even when an upgrade is required.
   
   Here is what I am thinking: I would assume users typically stop all processes and then upgrade to the next Hudi version. Once the first commit goes through, users might want to get started with multiple writers. So I would prefer to do the bootstrap on the first write operation, where the upgrade is triggered.
   Open to hearing your thoughts.
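   
To make the proposed ordering concrete, here is a minimal, self-contained sketch. It deliberately does not use Hudi classes: `FakeTable`, `CURRENT_VERSION`, and the `getTableAndInitCtx` method below are hypothetical stand-ins that only model the sequence suggested above (run the upgrade first, then bootstrap the metadata table in the same call). It is an illustration of the idea, not the actual change.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: every type and constant here is a hypothetical stand-in, not a Hudi class.
public class BootstrapOrderingSketch {

  // Hypothetical "current" table version used by this model.
  static final int CURRENT_VERSION = 3;

  /** Stand-in for the on-disk table state that upgrade and bootstrap act on. */
  static class FakeTable {
    int version;
    boolean metadataBootstrapped;

    FakeTable(int version) {
      this.version = version;
    }
  }

  /**
   * Models the proposed flow: the upgrade/downgrade step runs first, and the
   * metadata table is bootstrapped right after it, so a pending upgrade never
   * causes the bootstrap to be skipped. Returns the steps taken, in order.
   */
  static List<String> getTableAndInitCtx(FakeTable table, boolean metadataEnabled) {
    List<String> steps = new ArrayList<>();
    if (table.version != CURRENT_VERSION) {
      table.version = CURRENT_VERSION; // stand-in for running the upgrade
      steps.add("upgrade");
    }
    if (metadataEnabled && !table.metadataBootstrapped) {
      table.metadataBootstrapped = true; // stand-in for creating the metadata writer
      steps.add("bootstrap");
    }
    return steps;
  }

  public static void main(String[] args) {
    FakeTable table = new FakeTable(2);
    // First write on an old table: both steps happen, in this order.
    System.out.println(getTableAndInitCtx(table, true)); // [upgrade, bootstrap]
    // Later writes: nothing left to do.
    System.out.println(getTableAndInitCtx(table, true)); // []
  }
}
```

In this model, the first write after an upgrade leaves the table both upgraded and bootstrapped, which is the state users would need before enabling multiple writers.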
   





