gaoyunhaii commented on PR #19464: URL: https://github.com/apache/flink/pull/19464#issuecomment-1101106904
Many thanks @pltbkd for the PR! I think the test could be simplified a bit:

1. In this test, we should not need an actual `OperatorCoordinator`. We could use a mock one and check the order of triggering / abort. In this case, the expected sequence is "abort / triggering checkpoint 2", and the bad sequence before the fix is "abort / triggering checkpoint 1 / triggering checkpoint 2". We could detect whether the error occurs by checking the received sequence. Then we could remove the actions related to `manuallyTriggeredScheduledExecutor`. For the mock `OperatorCoordinatorContext`, I think we could simply add a callback to `abortTriggering` in `MockOperatorCheckpointCoordinatorContextBuilder`.
2. We could also use `manuallyTriggeredScheduledExecutor` for `setIoExecutor`. Then the actions could be simplified to

   ```java
   checkpointCoordinator.triggerCheckpoint(false);
   // create checkpoint plan
   manuallyTriggeredScheduledExecutor.trigger();
   // create pending checkpoint
   manuallyTriggeredScheduledExecutor.trigger();

   declineCheckpoint(1L, checkpointCoordinator, jobVertexID, graph);

   // At this point the queued actions include the action to acquire the
   // checkpoint location, followed by the actions related to aborting.
   // getLocation is executed first, which adds more actions after the
   // aborting-related actions.
   manuallyTriggeredScheduledExecutor.triggerAll();
   checkState(!checkpointCoordinator.isTriggering());

   // The second checkpoint
   checkpointCoordinator.triggerCheckpoint(false);
   manuallyTriggeredScheduledExecutor.triggerAll();

   // Finally we could verify there is no triggering checkpoint 1 here.
   ```
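To make the sequence check in point 1 concrete, here is a minimal sketch of the kind of recorder the mock callbacks could feed. `CheckpointEventRecorder` and its methods are hypothetical names used only for illustration and are not part of Flink. The assumption is that the mock coordinator's trigger callback would call `onTrigger` and the `abortTriggering` callback would call `onAbort`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a Flink class): records, in arrival order,
// the callbacks that a mock coordinator receives.
final class CheckpointEventRecorder {
    private final List<String> events = new ArrayList<>();

    // Assumed to be invoked from the mock's checkpoint-triggering callback.
    void onTrigger(long checkpointId) {
        events.add("trigger-" + checkpointId);
    }

    // Assumed to be invoked from the mock's abort callback.
    void onAbort() {
        events.add("abort");
    }

    // Returns a snapshot copy of the recorded sequence.
    List<String> events() {
        return new ArrayList<>(events);
    }

    // True if the given checkpoint was triggered after an abort was recorded,
    // which corresponds to the bad sequence described above for checkpoint 1.
    boolean triggeredAfterAbort(long checkpointId) {
        int abortIndex = events.indexOf("abort");
        int triggerIndex = events.lastIndexOf("trigger-" + checkpointId);
        return abortIndex >= 0 && triggerIndex > abortIndex;
    }
}
```

With such a recorder, the test would assert that the recorded sequence is exactly "abort, trigger-2" and that `triggeredAfterAbort(1L)` is false. Before the fix, the sequence would contain "trigger-1" after "abort" and the check would fail.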
