nsivabalan commented on code in PR #6574:
URL: https://github.com/apache/hudi/pull/6574#discussion_r969073144


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/cluster/ClusteringPlanActionExecutor.java:
##########
@@ -63,6 +63,7 @@ protected Option<HoodieClusteringPlan> createClusteringPlan() {
     int commitsSinceLastClustering = table.getActiveTimeline().getCommitsTimeline().filterCompletedInstants()
         .findInstantsAfter(lastClusteringInstant.map(HoodieInstant::getTimestamp).orElse("0"), Integer.MAX_VALUE)
         .countInstants();
+
     if (config.inlineClusteringEnabled() && config.getInlineClusterMaxCommits() > commitsSinceLastClustering) {

Review Comment:
   One thing I find hard to understand, though: with clustering, either both 
planning and execution are inline, or both are async, at least with the 
Spark datasource writer. So I am not sure how the user ended up in a state 
where clustering was only scheduled and never ran to completion. 
   



##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/cluster/ClusteringPlanActionExecutor.java:
##########
@@ -63,6 +63,7 @@ protected Option<HoodieClusteringPlan> createClusteringPlan() {
     int commitsSinceLastClustering = table.getActiveTimeline().getCommitsTimeline().filterCompletedInstants()
         .findInstantsAfter(lastClusteringInstant.map(HoodieInstant::getTimestamp).orElse("0"), Integer.MAX_VALUE)
         .countInstants();
+
     if (config.inlineClusteringEnabled() && config.getInlineClusterMaxCommits() > commitsSinceLastClustering) {

Review Comment:
   OK, so the issue we are trying to solve is this:
   
   There is a regular writer that only schedules clustering, and a separate 
async clustering job that executes it. 
   
   If clustering is pending (perhaps waiting to be executed by the async 
clustering job), every new successful commit from the regular writer keeps 
adding a new replacecommit.requested. 
   
   If so, the fix makes sense to me. 
   @yihua @danny0405: WDYT? 
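
   To make the scenario above concrete, here is a minimal, self-contained 
sketch of it. It uses no Hudi classes: `Timeline`, `commitAndMaybeSchedule`, 
`simulate`, and the `skipWhenPending` flag are all hypothetical names for 
illustration, and the guard is only an assumption about the kind of check the 
fix might add, since the quoted hunk does not show the actual change. The 
model assumes the commit counter resets only when a clustering completes, 
which never happens here because the async job has not run.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: not Hudi code, and not the PR's actual change.
final class ClusteringScheduleSketch {

    // Hypothetical stand-in for the table timeline.
    static final class Timeline {
        // Completed commits since the last COMPLETED clustering.
        int commitsSinceLastClustering = 0;
        // Clustering plans that were scheduled but never executed.
        final List<String> pendingClusteringPlans = new ArrayList<>();
    }

    // One successful commit by the regular writer, then a scheduling attempt.
    static void commitAndMaybeSchedule(Timeline t, int maxCommits, boolean skipWhenPending) {
        t.commitsSinceLastClustering++;
        // Same shape as the check in the hunk: too few commits, so no plan.
        if (maxCommits > t.commitsSinceLastClustering) {
            return;
        }
        // Assumed guard: do not schedule while an earlier plan is still pending.
        if (skipWhenPending && !t.pendingClusteringPlans.isEmpty()) {
            return;
        }
        t.pendingClusteringPlans.add("replacecommit.requested");
    }

    // Runs `commits` writer commits with no clustering execution in between
    // and returns how many pending plans have piled up.
    static int simulate(int commits, int maxCommits, boolean skipWhenPending) {
        Timeline t = new Timeline();
        for (int i = 0; i < commits; i++) {
            commitAndMaybeSchedule(t, maxCommits, skipWhenPending);
        }
        return t.pendingClusteringPlans.size();
    }

    public static void main(String[] args) {
        System.out.println(simulate(6, 4, false)); // prints 3
        System.out.println(simulate(6, 4, true));  // prints 1
    }
}
```

   With a threshold of 4 and six commits, the unguarded run schedules a plan 
on each of commits 4, 5, and 6, while the guarded run leaves a single pending 
plan for the async job to pick up. 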
   
   
   
   


