CTTY commented on code in PR #11853:
URL: https://github.com/apache/hudi/pull/11853#discussion_r1804064169


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/StreamSync.java:
##########
@@ -328,24 +328,54 @@ public StreamSync(HoodieStreamer.Config cfg, SparkSession 
sparkSession,
   }
 
   /**
-   * Refresh Timeline.
+   * Creates a meta client for the table, and refresh timeline.
    *
    * @throws IOException in case of any IOException
    */
-  public void refreshTimeline() throws IOException {
+  public HoodieTableMetaClient initializeMetaClientAndRefreshTimeline() throws 
IOException {

Review Comment:
   We can just throw `HoodieIOException` instead of `IOException`



##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/StreamSync.java:
##########
@@ -328,24 +328,54 @@ public StreamSync(HoodieStreamer.Config cfg, SparkSession 
sparkSession,
   }
 
   /**
-   * Refresh Timeline.
+   * Creates a meta client for the table, and refresh timeline.
    *
    * @throws IOException in case of any IOException
    */
-  public void refreshTimeline() throws IOException {
+  public HoodieTableMetaClient initializeMetaClientAndRefreshTimeline() throws 
IOException {
+    return initializeMetaClient(true);
+  }
+
+  /**
+   * Creates a meta client for the table without refreshing timeline.
+   *
+   * @throws IOException in case of any IOException
+   */
+  private HoodieTableMetaClient initializeMetaClient() throws IOException {
+    return initializeMetaClient(false);
+  }
+
+  /**
+   * Creates a meta client for the table, and refresh timeline if specified.
+   * If the table does not yet exist, it will be initialized.
+   *
+   * @param refreshTimeline when set to true, loads the active timeline and 
updates the {@link #commitsTimelineOpt} and {@link #allCommitsTimelineOpt} 
values
+   * @return a meta client for the table
+   * @throws IOException in case of any IOException
+   */
+  private HoodieTableMetaClient initializeMetaClient(boolean refreshTimeline) 
throws IOException {
     if (storage.exists(new StoragePath(cfg.targetBasePath))) {

Review Comment:
   This can be further optimized by using try-catch to assume the existence of 
`targetBasePath` and handle the new table case by catching 
`TableNotFoundException`, so we don't need to check target path for every sync. 
But I think we can use a separate PR to fix this



##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/StreamSync.java:
##########
@@ -1219,14 +1243,13 @@ private Pair<HoodieWriteConfig, Schema> 
getHoodieClientConfigAndWriterSchema(Sch
     return Pair.of(config, returnSchema);
   }
 
-  private Schema getSchemaForWriteConfig(Schema targetSchema) {
+  private Schema getSchemaForWriteConfig(Schema targetSchema, 
HoodieTableMetaClient metaClient) {

Review Comment:
   Would it make sense to make `metaClient` a member variable of `StreamSync` 
instead of passing it around?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to