the-other-tim-brown commented on code in PR #13249:
URL: https://github.com/apache/hudi/pull/13249#discussion_r2102730083


##########
hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HiveSyncTool.java:
##########
@@ -233,19 +237,28 @@ protected void syncHoodieTable(String tableName, boolean useRealtimeInputFormat,
     LOG.info("Trying to sync hoodie table " + tableName + " with base path " + syncClient.getBasePath()
         + " of type " + syncClient.getTableType());
 
-    // create database if needed
-    checkAndCreateDatabase();
-
     final boolean tableExists = syncClient.tableExists(tableName);
-    // Get the parquet schema for this table looking at the latest commit
-    MessageType schema = syncClient.getStorageSchema(!config.getBoolean(HIVE_SYNC_OMIT_METADATA_FIELDS));
     // if table exists and location of the metastore table doesn't match the hoodie base path, recreate the table
     if (tableExists && !FSUtils.comparePathsWithoutScheme(syncClient.getBasePath(), syncClient.getTableLocation(tableName))) {
       LOG.info("basepath is updated for the table {}", tableName);
       recreateAndSyncHiveTable(tableName, useRealtimeInputFormat, readAsOptimized);
       return;
     }
 
+    // Check if any sync is required
+    if (tableExists && isIncrementalSync()) {
+      Option<String> lastCommitTimeSynced = syncClient.getLastCommitTimeSynced(tableName);
+      Option<String> lastCommitCompletionTimeSynced = syncClient.getLastCommitCompletionTimeSynced(tableName);
+      if (lastCommitTimeSynced.isPresent()) {
+        if (TimelineUtils.getCommitsTimelineAfter(syncClient.getMetaClient(), lastCommitTimeSynced.get(), lastCommitCompletionTimeSynced).countInstants() == 0) {

Review Comment:
   We have a use case where we run meta-sync on restart of our continuous 
   ingestion job, to cover the case where the job shut down before syncing. 
   The job can also be run in standalone mode, so this is an optimization to 
   reduce the load in these cases.
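
   To make the intent concrete, here is a minimal sketch of the early exit: if the table already exists and no commits completed after the last synced one, there is nothing to sync. The class and method names (`SyncSkipSketch`, `commitsAfter`, `canSkipSync`) are hypothetical and do not use the real `HiveSyncTool` or `TimelineUtils` code. The sketch assumes commit times are timestamp strings that sort lexicographically, and it ignores the completion-time argument that the actual change also passes.

```java
import java.util.List;
import java.util.Optional;

public class SyncSkipSketch {

  // Hypothetical stand-in for the timeline lookup: counts completed commits
  // that are strictly newer than the last synced commit time.
  // Assumption: commit times are fixed-width timestamp strings, so string
  // comparison matches chronological order.
  static long commitsAfter(List<String> completedCommitTimes, String lastSynced) {
    return completedCommitTimes.stream()
        .filter(t -> t.compareTo(lastSynced) > 0)
        .count();
  }

  // Returns true only when the table exists, a last synced commit is recorded,
  // and no newer commits are present. Otherwise a sync is still needed.
  static boolean canSkipSync(boolean tableExists,
                             Optional<String> lastSynced,
                             List<String> completedCommitTimes) {
    return tableExists
        && lastSynced.isPresent()
        && commitsAfter(completedCommitTimes, lastSynced.get()) == 0;
  }

  public static void main(String[] args) {
    List<String> commits = List.of("20250101120000", "20250102120000");
    // Restart with nothing new since the last sync.
    System.out.println(canSkipSync(true, Optional.of("20250102120000"), commits));
    // Job shut down before syncing the latest commit.
    System.out.println(canSkipSync(true, Optional.of("20250101120000"), commits));
    // Table was never synced.
    System.out.println(canSkipSync(true, Optional.empty(), commits));
  }
}
```

   On a restart where nothing changed, the check is a cheap lookup instead of a full schema and partition sync, which is where the load reduction comes from.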


