[GitHub] [hudi] voonhous commented on a diff in pull request #6566: [HUDI-4766] Fix HoodieFlinkClusteringJob

GitBox Fri, 02 Sep 2022 22:03:08 -0700


voonhous commented on code in PR #6566:
URL: https://github.com/apache/hudi/pull/6566#discussion_r962104627



##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/clustering/FlinkClusteringConfig.java:
##########
@@ -69,13 +83,14 @@ public class FlinkClusteringConfig extends Configuration {
       required = false)
   public Integer archiveMaxCommits = 30;
 
-  @Parameter(names = {"--schedule", "-sc"}, description = "Not recommended. Schedule the clustering plan in this job.\n"
-      + "There is a risk of losing data when scheduling clustering outside the writer job.\n"
-      + "Scheduling clustering in the writer job and only let this job do the clustering execution is recommended.\n"
-      + "Default is true", required = false)
-  public Boolean schedule = true;
+  @Parameter(names = {"--schedule", "-sc"}, description = "Schedule the clustering plan in this job.\n"
+      + "Default is false", required = false)
+  public Boolean schedule = false;
+
+  @Parameter(names = {"--instant-time", "-it"}, description = "Clustering Instant time")
+  public String clusteringInstantTime = null;

Review Comment:
   From `HoodieClusteringJob.java`
   ```
   @Parameter(names = {"--instant-time", "-it"}, description = "Clustering Instant time, only used when set --mode execute. "
       + "If the instant time is not provided with --mode execute, "
       + "the earliest scheduled clustering instant time is used by default. "
       + "When set \"--mode scheduleAndExecute\" this instant-time will be ignored.")
   public String clusteringInstantTime = null;
   ```
   
   Should we standardise the parameter? The Spark job (`HoodieClusteringJob`) already uses `--instant-time`, so the Flink and Spark parameters should be kept the same to avoid confusion.
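
   For reference, the behaviour described in the Spark description above could be sketched roughly as follows. This is only an illustration: `InstantTimeResolver`, `resolve`, and the `pendingInstants` list are hypothetical names, not existing Hudi code, and the sketch assumes that instant times are timestamp strings that sort lexicographically.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of Hudi: mirrors the semantics spelled out in
// the --instant-time description quoted from HoodieClusteringJob.java.
public final class InstantTimeResolver {
  private InstantTimeResolver() {
  }

  /**
   * @param mode             "execute" or "scheduleAndExecute" (values taken from the description)
   * @param requestedInstant value passed via --instant-time, or null if not provided
   * @param pendingInstants  scheduled clustering instants (assumed to sort lexicographically)
   * @return the instant to execute, or empty if the option is ignored or nothing is pending
   */
  public static Optional<String> resolve(String mode, String requestedInstant, List<String> pendingInstants) {
    if ("scheduleAndExecute".equals(mode)) {
      // The description says --instant-time is ignored in this mode.
      return Optional.empty();
    }
    if (requestedInstant != null) {
      // An explicit instant wins in execute mode.
      return Optional.of(requestedInstant);
    }
    // Otherwise fall back to the earliest scheduled clustering instant.
    return pendingInstants.stream().min(Comparator.naturalOrder());
  }
}
```

   If the Flink job adopts the same wording, a shared rule like this keeps `--instant-time` meaning the same thing in both jobs.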



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

