cshuo commented on code in PR #18018:
URL: https://github.com/apache/hudi/pull/18018#discussion_r2744985557


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/partitioner/index/RecordLevelIndexBackend.java:
##########
@@ -106,9 +106,9 @@ public void onCheckpoint(long checkpointId) {
   }
 
   @Override
-  public void onCheckpointComplete(Correspondent correspondent) {
+  public void onCheckpointComplete(Correspondent correspondent, long currentCheckpointId) {
     Map<Long, String> inflightInstants = correspondent.requestInflightInstants();
-    recordIndexCache.clean(inflightInstants.keySet().stream().min(Long::compareTo).orElse(Long.MAX_VALUE));
+    recordIndexCache.markCleanable(inflightInstants.keySet().stream().min(Long::compareTo).orElse(currentCheckpointId));

Review Comment:
   Not always. Consider the following case:
   -> The coordinator receives all the write events and commits the instant 
successfully. The events buffer is now empty.
   -> No writer is flushing its write buffer, so no new instant is requested 
from the coordinator. The events buffer stays empty.
   -> The JobManager notifies checkpoint completion and the bucket assigner 
requests the inflight instants from the coordinator; the returned result is empty.
   
   In that case, we can use the current checkpoint id in the bucket assigner as 
a fallback.
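
The fallback can be illustrated with a minimal, standalone sketch. It assumes the keys of the inflight-instants map are checkpoint ids, as the comparison with `currentCheckpointId` in the diff suggests. The class `CleanableCheckpointSketch` and the helper `cleanableCheckpointId` are hypothetical names used only for illustration and are not part of Hudi; only the stream expression mirrors the diff.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class CleanableCheckpointSketch {

  // Hypothetical helper (not a Hudi API): returns the smallest inflight
  // checkpoint id, or the current checkpoint id when nothing is inflight.
  static long cleanableCheckpointId(Map<Long, String> inflightInstants, long currentCheckpointId) {
    return inflightInstants.keySet().stream()
        .min(Long::compareTo)
        .orElse(currentCheckpointId);
  }

  public static void main(String[] args) {
    // Two instants are still inflight: the smallest checkpoint id wins.
    Map<Long, String> inflight = new HashMap<>();
    inflight.put(7L, "instant-a");
    inflight.put(5L, "instant-b");
    System.out.println(cleanableCheckpointId(inflight, 9L));

    // The coordinator has committed everything and no writer has flushed,
    // so the map is empty and the current checkpoint id is used instead.
    System.out.println(cleanableCheckpointId(Collections.emptyMap(), 9L));
  }
}
```

With `Long.MAX_VALUE` as the default, the empty-map case would yield a bound above every checkpoint id; falling back to the current checkpoint id keeps the bound tied to the checkpoint that just completed.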



