LoveHeat commented on code in PR #22806:
URL: https://github.com/apache/flink/pull/22806#discussion_r1246058742


##########
flink-runtime/src/main/java/org/apache/flink/runtime/source/coordinator/SourceCoordinator.java:
##########
@@ -195,9 +195,19 @@ void announceCombinedWatermark() {
                 "Distributing maxAllowedWatermark={} to subTaskIds={}",
                 maxAllowedWatermark,
                 subTaskIds);
-        for (Integer subtaskId : subTaskIds) {
-            context.sendEventToSourceOperator(
-                    subtaskId, new 
WatermarkAlignmentEvent(maxAllowedWatermark));
+        // Because of Java-ThreadPoolExecutor will not schedule the period task
+        // if it throws an exception, so we should handle the potential 
exception like
+        // "subtask xx is not ready yet to receive events" to increase 
robustness.
+        try {
+            for (Integer subtaskId : subTaskIds) {
+                context.sendEventToSourceOperator(
+                        subtaskId, new 
WatermarkAlignmentEvent(maxAllowedWatermark));
+            }

Review Comment:
   How about this: remove `context.getCoordinatorExecutor() `, and provide a 
function to schedule period task like below, PeriodTaskHook is a hook to let 
caller decide whether ignore the failure or fail job.
   
   
   
        interface PeriodTaskHook {
          
          static PeriodTaskHook DO_NOTHING_HOOK = new PeriodTaskHook() {};
   
           default void taskFailure(Throwable t, SourceCoordinatorContext<?> 
context) {}
   
           default void taskSuccess() {}
       }
   
       ScheduledFuture<?> schedulePeriodTask(
               Runnable command,
               long initDelay,
               long period,
               TimeUnit unit,
               PeriodTaskHook taskHook) {
           return coordinatorExecutor.scheduleAtFixedRate(
                   () -> {
                       try {
                           command.run();
                           taskHook.taskSuccess();
                       } catch (Throwable t) {
                           taskHook.taskFailure(t, this);
                       }
                   },
                   initDelay,
                   period,
                   unit);
       }
   
   
   For SourceAlignment, its auto recoverable after task become ready, so we can 
ignore these failure? @pnowojski 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to