tillrohrmann commented on a change in pull request #15466:
URL: https://github.com/apache/flink/pull/15466#discussion_r607640453
##########
File path:
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/DefaultCheckpointPlanCalculator.java
##########
@@ -147,11 +147,7 @@ private void checkAllTasksInitiated() throws
CheckpointException {
*/
private void checkTasksStarted(List<Execution> toTrigger) throws
CheckpointException {
for (Execution execution : toTrigger) {
- if (execution.getState() == ExecutionState.CREATED
- || execution.getState() == ExecutionState.RECOVERING
- || execution.getState() == ExecutionState.SCHEDULED
- || execution.getState() == ExecutionState.DEPLOYING) {
-
+ if (execution.getState() != ExecutionState.RUNNING) {
Review comment:
Didn't @gaoyunhaii say that we also need to trigger tasks if they are in
state `FINISHED`? I think here he said it:
https://issues.apache.org/jira/browse/FLINK-22003?focusedCommentId=17312877&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17312877
##########
File path:
flink-tests/src/test/java/org/apache/flink/test/checkpointing/UnalignedCheckpointITCase.java
##########
@@ -260,6 +261,10 @@ public UnalignedCheckpointITCase(
.setChannelTypes(channelType)
.setExpectedFailures(5)
.setFailuresAfterSourceFinishes(1)
+ // prevent test from timing out in case when a
failover happens concurrently
+ // with triggering a checkpoint (some execution status
can change right
+ // after triggering)
+ .setCheckpointTimeout(Duration.ofMinutes(1))
Review comment:
Should we set the timeout to a smaller value? 1 minutes sounds still
quite long. How long does the UnalignedCheckpointITCase take? Do we have to
change any threshold to tolerate checkpoint failures in case they time out?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]