LoveHeat commented on code in PR #22806:
URL: https://github.com/apache/flink/pull/22806#discussion_r1246058742
##########
flink-runtime/src/main/java/org/apache/flink/runtime/source/coordinator/SourceCoordinator.java:
##########
@@ -195,9 +195,19 @@ void announceCombinedWatermark() {
"Distributing maxAllowedWatermark={} to subTaskIds={}",
maxAllowedWatermark,
subTaskIds);
- for (Integer subtaskId : subTaskIds) {
- context.sendEventToSourceOperator(
- subtaskId, new WatermarkAlignmentEvent(maxAllowedWatermark));
+ // Because of Java-ThreadPoolExecutor will not schedule the period task
+ // if it throws an exception, so we should handle the potential exception like
+ // "subtask xx is not ready yet to receive events" to increase robustness.
+ try {
+ for (Integer subtaskId : subTaskIds) {
+ context.sendEventToSourceOperator(
+ subtaskId, new WatermarkAlignmentEvent(maxAllowedWatermark));
+ }
Review Comment:
How about this: remove `context.getCoordinatorExecutor()` and instead provide a
method for scheduling periodic tasks, like the one below. `PeriodTaskHook` is a
hook that lets the caller decide whether to ignore a failure or fail the job.
    interface PeriodTaskHook {
        static PeriodTaskHook DO_NOTHING_HOOK = new PeriodTaskHook() {};

        default void taskFailure(Throwable t, SourceCoordinatorContext<?> context) {}

        default void taskSuccess() {}
    }

    ScheduledFuture<?> schedulePeriodTask(
            Runnable command,
            long initDelay,
            long period,
            TimeUnit unit,
            PeriodTaskHook taskHook) {
        return coordinatorExecutor.scheduleAtFixedRate(
                () -> {
                    try {
                        command.run();
                        taskHook.taskSuccess();
                    } catch (Throwable t) {
                        taskHook.taskFailure(t, this);
                    }
                },
                initDelay,
                period,
                unit);
    }
For source alignment, the failure recovers automatically once the task becomes
ready, so can we ignore these failures? @pnowojski
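
To illustrate why the wrapper matters, here is a minimal standalone sketch using only the JDK, no Flink types. `PeriodicTaskDemo`, `FailureHook`, `guarded` and `runsObserved` are hypothetical names made up for this example; `FailureHook` is only a stand-in for the proposed `PeriodTaskHook`. It relies on the documented behavior of `scheduleAtFixedRate`: if any execution throws, subsequent executions are suppressed.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PeriodicTaskDemo {

    /** Hypothetical stand-in for the proposed PeriodTaskHook. */
    interface FailureHook {
        void onFailure(Throwable t);
    }

    /** Wraps a command so that an exception never escapes into the executor. */
    static Runnable guarded(Runnable command, FailureHook hook) {
        return () -> {
            try {
                command.run();
            } catch (Throwable t) {
                hook.onFailure(t);
            }
        };
    }

    /**
     * Schedules a task that throws on its first run only and returns how many
     * times it had run when observation stopped.
     */
    static int runsObserved(boolean guard) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger runs = new AtomicInteger();
        CountDownLatch threeRuns = new CountDownLatch(3);
        Runnable flaky = () -> {
            int n = runs.incrementAndGet();
            threeRuns.countDown();
            if (n == 1) {
                throw new IllegalStateException("subtask is not ready yet to receive events");
            }
        };
        // The guarded variant ignores the failure, like DO_NOTHING_HOOK would.
        Runnable task = guard ? guarded(flaky, t -> { }) : flaky;
        ScheduledFuture<?> future =
                executor.scheduleAtFixedRate(task, 0, 10, TimeUnit.MILLISECONDS);
        try {
            if (guard) {
                // The task keeps being rescheduled, so wait for three runs.
                threeRuns.await(5, TimeUnit.SECONDS);
            } else {
                // The future of a periodic task completes only on failure or cancel.
                future.get();
            }
        } catch (ExecutionException expected) {
            // Unguarded case: the first run threw, so the executor dropped the task.
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            executor.shutdownNow();
        }
        return runs.get();
    }

    public static void main(String[] args) {
        System.out.println("unguarded runs: " + runsObserved(false));
        System.out.println("guarded runs:   " + runsObserved(true));
    }
}
```

The unguarded task runs once and is never rescheduled, while the guarded one keeps running after the same first failure. Whether the hook should ignore the failure or fail the job is left to the caller, as suggested above.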