zentol commented on PR #19968:
URL: https://github.com/apache/flink/pull/19968#issuecomment-1157673594

   The tests for the savepoint operations are scattered around quite a bit.
   We unfortunately can't fully cover it in the `StopWithSavepointTest` because 
that requires an actual execution graph. Creating that ourselves isn't really 
an option (because there are barely any contracts; everything just relies on 
existing behavior of the scheduler), and we also lack good test utils. Moving 
away from the ExecutionGraph, while technically possible, can't be done quickly 
because so many re-used components expect an execution graph.
   
   Waiting for the savepoint to complete in 
`onFailure`/`onGloballyTerminalState` is covered by the newly added cases in 
`StopWithSavepointTest`.
   
   Not accidentally triggering 2 state transitions from the state is now 
enforced by 71f72cf57d820ed62560f07f62259408e3a18b52; this on its own would've 
failed tests in `StopWithSavepointTest`, like 
`testJobFailedAndSavepointOperationFails`. We likely would've noticed the issue 
sooner if we had this earlier.
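   
   To illustrate the idea, here is a minimal sketch of such a guard, assuming 
the check amounts to remembering that a transition already happened and 
failing loudly on a second one. `SingleTransitionGuard` and `transitionTo` are 
hypothetical names for illustration, not the code from the commit:

```java
/**
 * Hypothetical guard (not Flink's actual implementation): allows exactly one
 * transition out of a state and throws if a second one is attempted.
 */
final class SingleTransitionGuard {
    // Name of the state we already transitioned to, or null if none yet.
    private String target;

    void transitionTo(String targetState) {
        if (target != null) {
            // A second transition indicates a bug in the state's logic.
            throw new IllegalStateException(
                    "Already transitioned to " + target
                            + "; rejecting transition to " + targetState);
        }
        target = targetState;
    }
}
```

   With a check like this, a state that reacts to both a job failure and a 
failed savepoint by transitioning twice fails immediately in tests instead of 
misbehaving silently.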
   
   As for other pre-existing tests:
   
   The `AdaptiveSchedulerTest` contains tests for the proper archiving of 
errors that occurred during `StopWithSavepoint`. These make sure we don't 
accidentally drop task failures.
   
   The `AdaptiveSchedulerITCase` contains high-level tests for the happy path 
and certain errors on the TM side. These make sure the savepoint operation does 
complete if a task failed.
   


