prashantwason commented on code in PR #8684:
URL: https://github.com/apache/hudi/pull/8684#discussion_r1203518705


##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/SparkRDDWriteClient.java:
##########
@@ -337,15 +337,33 @@ protected void initMetadataTable(Option<String> instantTime) {
    *
    * @param inFlightInstantTimestamp - The in-flight action responsible for the metadata table initialization
    */
-  private void initializeMetadataTable(Option<String> inFlightInstantTimestamp) {
-    if (config.isMetadataTableEnabled()) {
-      HoodieTableMetadataWriter writer = SparkHoodieBackedTableMetadataWriter.create(context.getHadoopConf().get(), config,
-          context, Option.empty(), inFlightInstantTimestamp);
-      try {
-        writer.close();
-      } catch (Exception e) {
-        throw new HoodieException("Failed to instantiate Metadata table ", e);
+  private void initializeMetadataTable(WriteOperationType operationType, Option<String> inFlightInstantTimestamp) {
+    if (!config.isMetadataTableEnabled()) {
+      return;
+    }
+
+    try (HoodieTableMetadataWriter writer = SparkHoodieBackedTableMetadataWriter.create(context.getHadoopConf().get(), config,
+            context, Option.empty(), inFlightInstantTimestamp)) {
+      if (writer.isInitialized()) {
+        // Optimize the metadata table, which involves compaction, cleaning, etc. This should only be called from writers.
+        switch (operationType) {
+          case INSERT:
+          case INSERT_PREPPED:
+          case UPSERT:
+          case UPSERT_PREPPED:
+          case BULK_INSERT:
+          case BULK_INSERT_PREPPED:
+          case DELETE:

Review Comment:
   On second thought, the switch is not necessary. The code above runs within
a transaction lock, so multiple writers should not conflict when optimizing the
MDT (metadata table) at the same time. The checks within performTableServices
should be light enough; if they are not, we can optimize them.
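
   A minimal sketch of what the method could look like without the switch.
This is not the PR's code: MetadataWriter and RecordingWriter are hypothetical
stand-ins for HoodieTableMetadataWriter, java.util.Optional replaces Hudi's
Option, the boolean flag replaces config.isMetadataTableEnabled(), and TXN_LOCK
only models the transaction lock that the caller is assumed to hold already.

```java
import java.util.Optional;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

class MetadataInitSketch {

  /** Hypothetical stand-in for HoodieTableMetadataWriter; not the real Hudi interface. */
  public interface MetadataWriter extends AutoCloseable {
    boolean isInitialized();

    void performTableServices(Optional<String> inFlightInstantTimestamp);

    @Override
    void close(); // narrowed to throw no checked exception, for brevity
  }

  /** Hypothetical recording implementation so the control flow can be observed. */
  public static class RecordingWriter implements MetadataWriter {
    private final boolean initialized;
    public int tableServiceCalls = 0;
    public boolean closed = false;

    public RecordingWriter(boolean initialized) {
      this.initialized = initialized;
    }

    @Override
    public boolean isInitialized() {
      return initialized;
    }

    @Override
    public void performTableServices(Optional<String> inFlightInstantTimestamp) {
      tableServiceCalls++;
    }

    @Override
    public void close() {
      closed = true;
    }
  }

  // Assumption: models the transaction lock held around this code path.
  private static final ReentrantLock TXN_LOCK = new ReentrantLock();

  public static void initializeMetadataTable(boolean metadataEnabled,
                                             Supplier<MetadataWriter> writerFactory,
                                             Optional<String> inFlightInstantTimestamp) {
    if (!metadataEnabled) {
      return; // no writer is created when the metadata table is disabled
    }
    TXN_LOCK.lock();
    try (MetadataWriter writer = writerFactory.get()) {
      // No switch on the operation type: the lock serializes writers, so any
      // write operation may trigger the table-service checks.
      if (writer.isInitialized()) {
        writer.performTableServices(inFlightInstantTimestamp);
      }
    } finally {
      TXN_LOCK.unlock();
    }
  }
}
```

   With this shape, every operation type takes the same path, and the cost of
the extra calls depends only on how cheap the checks inside
performTableServices are.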


