xushiyan commented on code in PR #10099:
URL: https://github.com/apache/hudi/pull/10099#discussion_r1396262488


##########
hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HiveSyncTool.java:
##########
@@ -103,15 +103,38 @@ public class HiveSyncTool extends HoodieSyncTool implements AutoCloseable {
 
   public HiveSyncTool(Properties props, Configuration hadoopConf) {
     super(props, hadoopConf);
-    String metastoreUris = props.getProperty(METASTORE_URIS.key());
+    /*String metastoreUris = props.getProperty(METASTORE_URIS.key());
     // Give precedence to HiveConf.ConfVars.METASTOREURIS if it is set.
     // Else if user has provided HiveSyncConfigHolder.METASTORE_URIS, then set that in hadoop conf.
     if (isNullOrEmpty(hadoopConf.get(HiveConf.ConfVars.METASTOREURIS.varname)) && nonEmpty(metastoreUris)) {
       LOG.info(String.format("Setting %s = %s", HiveConf.ConfVars.METASTOREURIS.varname, metastoreUris));
       hadoopConf.set(HiveConf.ConfVars.METASTOREURIS.varname, metastoreUris);
     }
-    HiveSyncConfig config = new HiveSyncConfig(props, hadoopConf);
-    this.config = config;
+    HiveSyncConfig config = new HiveSyncConfig(props, hadoopConf);*/
+
+    String configuredMetastoreUris = props.getProperty(METASTORE_URIS.key());
+    String existingHadoopConfMetastoreUris = hadoopConf.get(HiveConf.ConfVars.METASTOREURIS.varname);
+
+    final Configuration hadoopConfForSync; // the configuration to use for this instance of the sync tool
+    if (nonEmpty(configuredMetastoreUris)) {
+      // if the metastore uris from the provided hadoop configuration exist and are equal to the user provided URIs, then we can use the provided configuration
+      if (configuredMetastoreUris.equals(existingHadoopConfMetastoreUris)) {
+        hadoopConfForSync = hadoopConf;
+      } else if (isNullOrEmpty(existingHadoopConfMetastoreUris)) {
+        // if there is no value already set in the provided configuration, update the existing configuration to avoid making copies of the configuration per instance of this tool
+        hadoopConf.set(HiveConf.ConfVars.METASTOREURIS.varname, configuredMetastoreUris);
+        hadoopConfForSync = hadoopConf;
+      } else {
+        // if the metastore uris exist in the provided hadoop configuration but differ from the user provided URIs, then we need to create a new configuration with the value set
+        hadoopConfForSync = new Configuration(hadoopConf);
+        hadoopConfForSync.set(HiveConf.ConfVars.METASTOREURIS.varname, configuredMetastoreUris);
+      }
+    } else {
+      // if the user did not provide any URIs, then we can use the provided configuration
+      hadoopConfForSync = hadoopConf;

Review Comment:
   I'd be really cautious about this change, given that the conf loading part
has caused regressions before with Glue sync. The new logic also seems to make
the situation more complicated and hard to test. If the goal is to sync to
multiple HMS instances for multiDS, it would be cleaner to create multiple
sync tools and keep the conf loading part straightforward.
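
To make the testability concern concrete, the branching in the hunk above can be isolated as a pure function and exercised without a metastore. This is only a minimal sketch under stated assumptions: a plain `Map<String, String>` stands in for the Hadoop `Configuration`, and the class name `MetastoreUriResolution` and the method `resolve` are hypothetical, not part of Hudi. The key `hive.metastore.uris` is the property name behind `HiveConf.ConfVars.METASTOREURIS.varname`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the constructor logic in the diff.
// A Map plays the role of the Hadoop Configuration (assumption for illustration).
final class MetastoreUriResolution {
  // Property name behind HiveConf.ConfVars.METASTOREURIS.varname.
  static final String METASTORE_URIS_KEY = "hive.metastore.uris";

  private MetastoreUriResolution() {
  }

  static boolean isNullOrEmpty(String value) {
    return value == null || value.isEmpty();
  }

  // Returns the configuration the sync tool should use, mirroring the
  // branches in the diff: reuse, mutate in place, or copy and override.
  static Map<String, String> resolve(Map<String, String> conf, String configuredUris) {
    if (isNullOrEmpty(configuredUris)) {
      // No user-provided URIs: use the provided configuration unchanged.
      return conf;
    }
    String existingUris = conf.get(METASTORE_URIS_KEY);
    if (configuredUris.equals(existingUris)) {
      // Same value already present: reuse the provided configuration.
      return conf;
    }
    if (isNullOrEmpty(existingUris)) {
      // Nothing set yet: mutate the shared configuration to avoid a copy.
      conf.put(METASTORE_URIS_KEY, configuredUris);
      return conf;
    }
    // Conflicting value: copy, then override only in the copy.
    Map<String, String> copy = new HashMap<>(conf);
    copy.put(METASTORE_URIS_KEY, configuredUris);
    return copy;
  }

  public static void main(String[] args) {
    Map<String, String> shared = new HashMap<>();
    shared.put(METASTORE_URIS_KEY, "thrift://hms-a:9083");
    Map<String, String> resolved = resolve(shared, "thrift://hms-b:9083");
    System.out.println("copied: " + (resolved != shared));
    System.out.println("shared value: " + shared.get(METASTORE_URIS_KEY));
    System.out.println("resolved value: " + resolved.get(METASTORE_URIS_KEY));
  }
}
```

Note that the second branch mutates the caller's configuration, so the result depends on the order in which tools are constructed against a shared configuration. That is one reason separate sync tools with straightforward conf loading may be easier to reason about.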



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
