rkhachatryan commented on a change in pull request #12269:
URL: https://github.com/apache/flink/pull/12269#discussion_r428630224
##########
File path:
flink-runtime/src/test/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinatorTest.java
##########
@@ -262,6 +251,40 @@ public void failJobDueToTaskFailure(Throwable cause, ExecutionAttemptID failingT
}
}
+ @Test
+ public void testExpiredCheckpointExceedsTolerableFailureNumber() {
+ // create some mock Execution vertices that receive the checkpoint trigger messages
+ ExecutionVertex vertex1 = mockExecutionVertex(new ExecutionAttemptID());
+ ExecutionVertex vertex2 = mockExecutionVertex(new ExecutionAttemptID());
+
+ final String errorMsg = "Exceeded checkpoint failure tolerance number!";
+ CheckpointFailureManager checkpointFailureManager = getCheckpointFailureManager(errorMsg);
+ CheckpointCoordinator coord = getCheckpointCoordinator(new JobID(), vertex1, vertex2, checkpointFailureManager);
+
+ try {
+ // trigger the checkpoint. this should succeed
+ final CompletableFuture<CompletedCheckpoint> checkPointFuture = coord.triggerCheckpoint(false);
+ manuallyTriggeredScheduledExecutor.triggerAll();
+ assertFalse(checkPointFuture.isCompletedExceptionally());
+
+ coord.abortPendingCheckpoints(new CheckpointException(CHECKPOINT_EXPIRED));
+
+ fail("Test failed.");
+ }
+ catch (Exception e) {
+ //expected
+ assertTrue(e instanceof RuntimeException);
+ assertEquals(errorMsg, e.getMessage());
Review comment:
I think using `org.junit.Test#expected` (and a specific exception class)
would be more expressive and less verbose here.
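
A minimal sketch of what this suggestion amounts to, written in plain Java so it runs without JUnit on the classpath. `ToleranceExceededException`, `expectThrows`, and the stand-in `abortPendingCheckpoints` are hypothetical names for illustration only and are not part of Flink or this PR. In the real test the same effect would come from `@Test(expected = SomeSpecificException.class)`, with whatever exception class the failure manager actually throws.

```java
public class ExpectedExceptionSketch {

    /** Hypothetical specific exception type; an assumption, not a Flink class. */
    static class ToleranceExceededException extends RuntimeException {
        ToleranceExceededException(String message) {
            super(message);
        }
    }

    /**
     * Hand-rolled stand-in for the check that JUnit 4's @Test(expected = X.class)
     * performs: the body must throw an instance of the expected type.
     */
    static <T extends Throwable> T expectThrows(Class<T> expected, Runnable body) {
        try {
            body.run();
        } catch (Throwable t) {
            if (expected.isInstance(t)) {
                return expected.cast(t);
            }
            throw new AssertionError("Unexpected exception type: " + t.getClass().getName(), t);
        }
        throw new AssertionError("Expected " + expected.getName() + " but nothing was thrown");
    }

    /** Stand-in for the call under test; it always fails with the specific type. */
    static void abortPendingCheckpoints() {
        throw new ToleranceExceededException("Exceeded checkpoint failure tolerance number!");
    }

    public static void main(String[] args) {
        // No try/catch/fail() scaffolding: the expectation is stated in one place.
        ToleranceExceededException e = expectThrows(
            ToleranceExceededException.class,
            ExpectedExceptionSketch::abortPendingCheckpoints);
        System.out.println(e.getMessage());
    }
}
```

One trade-off to keep in mind: `expected` only checks the exception type, so the `assertEquals(errorMsg, e.getMessage())` check from the diff would be dropped unless the message is asserted some other way.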
##########
File path:
flink-runtime/src/test/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinatorTest.java
##########
@@ -262,6 +251,40 @@ public void failJobDueToTaskFailure(Throwable cause, ExecutionAttemptID failingT
}
}
+ @Test
+ public void testExpiredCheckpointExceedsTolerableFailureNumber() {
+ // create some mock Execution vertices that receive the checkpoint trigger messages
+ ExecutionVertex vertex1 = mockExecutionVertex(new ExecutionAttemptID());
+ ExecutionVertex vertex2 = mockExecutionVertex(new ExecutionAttemptID());
+
+ final String errorMsg = "Exceeded checkpoint failure tolerance number!";
+ CheckpointFailureManager checkpointFailureManager = getCheckpointFailureManager(errorMsg);
+ CheckpointCoordinator coord = getCheckpointCoordinator(new JobID(), vertex1, vertex2, checkpointFailureManager);
+
+ try {
+ // trigger the checkpoint. this should succeed
Review comment:
nit: this comment basically repeats what the code does; I think it's
unnecessary.
##########
File path:
flink-runtime/src/test/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinatorTest.java
##########
@@ -262,6 +251,40 @@ public void failJobDueToTaskFailure(Throwable cause, ExecutionAttemptID failingT
}
}
+ @Test
+ public void testExpiredCheckpointExceedsTolerableFailureNumber() {
+ // create some mock Execution vertices that receive the checkpoint trigger messages
+ ExecutionVertex vertex1 = mockExecutionVertex(new ExecutionAttemptID());
+ ExecutionVertex vertex2 = mockExecutionVertex(new ExecutionAttemptID());
+
+ final String errorMsg = "Exceeded checkpoint failure tolerance number!";
+ CheckpointFailureManager checkpointFailureManager = getCheckpointFailureManager(errorMsg);
+ CheckpointCoordinator coord = getCheckpointCoordinator(new JobID(), vertex1, vertex2, checkpointFailureManager);
+
+ try {
+ // trigger the checkpoint. this should succeed
+ final CompletableFuture<CompletedCheckpoint> checkPointFuture = coord.triggerCheckpoint(false);
+ manuallyTriggeredScheduledExecutor.triggerAll();
+ assertFalse(checkPointFuture.isCompletedExceptionally());
+
+ coord.abortPendingCheckpoints(new CheckpointException(CHECKPOINT_EXPIRED));
+
+ fail("Test failed.");
+ }
+ catch (Exception e) {
+ //expected
+ assertTrue(e instanceof RuntimeException);
+ assertEquals(errorMsg, e.getMessage());
+ } finally {
+ try {
+ coord.shutdown(JobStatus.FINISHED);
+ } catch (Exception e) {
+ e.printStackTrace();
+ fail(e.getMessage());
Review comment:
Why do we need to handle this error? Won't its stack trace be printed
and the test fail anyway?
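
A small sketch of the behaviour this question relies on, in plain Java with hypothetical names (`Coordinator`, `testBody`); it is not the Flink `CheckpointCoordinator`. An exception thrown inside `finally` propagates to the caller by itself, so a test runner would report it, stack trace included, without an inner `try`/`catch` that prints and calls `fail`.

```java
public class FinallyPropagationSketch {

    /** Hypothetical resource whose shutdown fails; an assumption for illustration. */
    static class Coordinator {
        void shutdown() throws Exception {
            throw new Exception("shutdown failed");
        }
    }

    /** Test-like body that declares the exception instead of handling it. */
    static void testBody() throws Exception {
        Coordinator coord = new Coordinator();
        try {
            // ... exercise the coordinator here ...
        } finally {
            // No inner try/catch: a failure here reaches the caller unchanged.
            coord.shutdown();
        }
    }

    public static void main(String[] args) {
        try {
            testBody();
            System.out.println("no exception");
        } catch (Exception e) {
            // Plays the role of the test runner reporting the failure.
            System.out.println("propagated: " + e.getMessage());
        }
    }
}
```

Note that in Java an exception thrown from `finally` replaces any exception already in flight from the `try` block. That is also true of the `fail(...)` call in the diff, so dropping the inner handler does not lose information in that respect.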
##########
File path:
flink-runtime/src/test/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinatorTest.java
##########
@@ -262,6 +251,40 @@ public void failJobDueToTaskFailure(Throwable cause, ExecutionAttemptID failingT
}
}
+ @Test
+ public void testExpiredCheckpointExceedsTolerableFailureNumber() {
+ // create some mock Execution vertices that receive the checkpoint trigger messages
+ ExecutionVertex vertex1 = mockExecutionVertex(new ExecutionAttemptID());
+ ExecutionVertex vertex2 = mockExecutionVertex(new ExecutionAttemptID());
+
+ final String errorMsg = "Exceeded checkpoint failure tolerance number!";
+ CheckpointFailureManager checkpointFailureManager = getCheckpointFailureManager(errorMsg);
+ CheckpointCoordinator coord = getCheckpointCoordinator(new JobID(), vertex1, vertex2, checkpointFailureManager);
+
+ try {
+ // trigger the checkpoint. this should succeed
+ final CompletableFuture<CompletedCheckpoint> checkPointFuture = coord.triggerCheckpoint(false);
+ manuallyTriggeredScheduledExecutor.triggerAll();
+ assertFalse(checkPointFuture.isCompletedExceptionally());
Review comment:
I think this check is not necessary here because triggering should be
tested separately
(and the next check should fail anyway if the trigger failed).
##########
File path:
flink-runtime/src/test/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinatorTest.java
##########
@@ -2292,6 +2315,22 @@ private CheckpointCoordinator getCheckpointCoordinator() {
.build();
}
+ private CheckpointFailureManager getCheckpointFailureManager(String errorMsg) {
Review comment:
:+1: for extracting shared code