[GitHub] [flink] tillrohrmann commented on a change in pull request #12611: [FLINK-18137] Handle discarding of triggering checkpoint correctly

GitBox Fri, 12 Jun 2020 06:14:24 -0700


tillrohrmann commented on a change in pull request #12611:
URL: https://github.com/apache/flink/pull/12611#discussion_r439410491




##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinatorTriggeringTest.java
##########
@@ -529,6 +537,48 @@ public void testTriggerCheckpointSnapshotMasterHookFailed() throws Exception {
                assertEquals(0, checkpointCoordinator.getTriggerRequestQueue().size());
        }
 
+       /**
+        * This test only fails eventually.
+        */
+       @Test
+       public void discardingTriggeringCheckpointWillExecuteNextCheckpointRequest() throws Exception {
+               final ExecutionVertex executionVertex = mockExecutionVertex(new ExecutionAttemptID());
+
+               final ScheduledExecutorService scheduledExecutorService = Executors.newSingleThreadScheduledExecutor();
+               final CheckpointCoordinator checkpointCoordinator = new CheckpointCoordinatorTestingUtils.CheckpointCoordinatorBuilder()
+                       .setTasks(new ExecutionVertex[]{executionVertex})
+                       .setTimer(new ScheduledExecutorServiceAdapter(scheduledExecutorService))
+                       .setCheckpointCoordinatorConfiguration(CheckpointCoordinatorConfiguration.builder()
+                               .build())
+                       .build();
+
+               final CompletableFuture<String> masterHookCheckpointFuture = new CompletableFuture<>();
+               final OneShotLatch triggerCheckpointLatch = new OneShotLatch();
+               checkpointCoordinator.addMasterHook(new TestingMasterHook(masterHookCheckpointFuture, triggerCheckpointLatch));
+
+               try {
+                       checkpointCoordinator.triggerCheckpoint(false);
+                       final CompletableFuture<CompletedCheckpoint> secondCheckpoint = checkpointCoordinator.triggerCheckpoint(false);
+
+                       triggerCheckpointLatch.await();
+                       masterHookCheckpointFuture.complete("Completed");
+
+                       // discard triggering checkpoint
+                       checkpointCoordinator.abortPendingCheckpoints(new CheckpointException(CheckpointFailureReason.CHECKPOINT_DECLINED));
+
+                       try {
+                               // verify that the second checkpoint request will be executed and eventually times out
+                               secondCheckpoint.get();

Review comment:
       The test case should fail eventually. Concurrency-related bugs often require a specific thread interleaving, which in this case is not easy to reproduce. I will check whether the test fails on my machine.
   
   Concerning the `manuallyTriggeredSchedulerExecutor`, I deliberately did not use it because it leaks internal implementation details into the test. Concretely, the test has to know how many tasks are enqueued before one reaches the `whenCompleteAsync` call. I think this is very brittle and should be avoided if possible. If someone changes the implementation of the `CheckpointCoordinator` so that another stage is scheduled, the test would break immediately without an obvious reason.
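
   To illustrate the brittleness, here is a minimal, self-contained sketch. It does not use Flink's classes: `ManualExecutor` is a hypothetical stand-in for a manually triggered executor, and the pipeline is a plain `CompletableFuture` chain, not the actual `CheckpointCoordinator` code. It counts how many manual triggers are needed before the `whenCompleteAsync` callback runs.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;

final class ManualExecutorBrittleness {

    /** Hypothetical stand-in for a manually triggered executor: tasks only run when the test asks. */
    static final class ManualExecutor implements Executor {
        private final Queue<Runnable> queue = new ArrayDeque<>();

        @Override
        public void execute(Runnable command) {
            queue.add(command);
        }

        /** Runs exactly one queued task; returns false if the queue was empty. */
        boolean triggerNext() {
            Runnable next = queue.poll();
            if (next == null) {
                return false;
            }
            next.run();
            return true;
        }
    }

    /**
     * Builds a chain with the given number of async stages in front of
     * whenCompleteAsync and returns how many manual triggers are needed
     * before the callback has run.
     */
    static int triggersUntilCallback(int intermediateStages) {
        ManualExecutor executor = new ManualExecutor();
        AtomicBoolean callbackRan = new AtomicBoolean(false);

        CompletableFuture<String> stage = CompletableFuture.supplyAsync(() -> "snapshot", executor);
        for (int i = 0; i < intermediateStages; i++) {
            // Each additional async stage enqueues one more task on the executor.
            stage = stage.thenApplyAsync(s -> s + "+", executor);
        }
        stage.whenCompleteAsync((result, error) -> callbackRan.set(true), executor);

        int triggers = 0;
        while (!callbackRan.get() && executor.triggerNext()) {
            triggers++;
        }
        return triggers;
    }

    public static void main(String[] args) {
        System.out.println("no intermediate stage:  " + triggersUntilCallback(0));
        System.out.println("one intermediate stage: " + triggersUntilCallback(1));
    }
}
```

   A test that hard-codes the trigger count for the first variant stops reaching the callback as soon as someone adds the intermediate stage, even though the observable behavior of the pipeline has not changed.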




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
