gyfora opened a new pull request, #236: URL: https://github.com/apache/flink-kubernetes-operator/pull/236
Fixes a very important corner case where the latest checkpoint was not observed correctly after a job was restored. The current logic relies on correctly observing the latest checkpoint/savepoint when the job reaches a terminal state and recording it in the status. Previously this was done based entirely on the checkpoint history, without accounting for the possibility that the history is empty if the job fails after restore but before completing its first checkpoint. As a result, this bug could cause jobs to fall back to an invalid earlier checkpoint in some cases.
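To illustrate the failure mode described above, here is a minimal sketch of one way to pick the checkpoint to record when the history may be empty. It is not the code from this pull request: `LatestCheckpointResolver`, `Checkpoint`, and `resolve` are hypothetical names, and it assumes the observer can also see which checkpoint the job was restored from.

```java
import java.util.Optional;

/** Hypothetical helper; the names here are illustrative, not the operator's actual classes. */
final class LatestCheckpointResolver {

    /** Minimal stand-in for a checkpoint or savepoint: an id plus its external path. */
    static final class Checkpoint {
        final long id;
        final String externalPath;

        Checkpoint(long id, String externalPath) {
            this.id = id;
            this.externalPath = externalPath;
        }
    }

    private LatestCheckpointResolver() {}

    /**
     * Picks the checkpoint to record in the status for a job in a terminal state.
     *
     * @param latestCompleted newest completed checkpoint/savepoint in the history;
     *                        empty if the job failed before its first checkpoint
     * @param restoredFrom    checkpoint the job was last restored from, if any
     *                        (assumed to be observable separately from the history)
     */
    static Optional<Checkpoint> resolve(
            Optional<Checkpoint> latestCompleted, Optional<Checkpoint> restoredFrom) {
        // A checkpoint completed after the restore is newer than the restore point.
        if (latestCompleted.isPresent()) {
            return latestCompleted;
        }
        // Empty history: fall back to the restore point rather than to whatever
        // older checkpoint was recorded earlier.
        return restoredFrom;
    }
}
```

With an empty history and a known restore point, `resolve` returns the restore point; if neither is known it returns an empty `Optional`, leaving the caller to decide how to proceed.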
