[GitHub] [flink-kubernetes-operator] gyfora opened a new pull request, #236: [hotfix] Fix last checkpoint observe with empty history

GitBox Sun, 22 May 2022 03:24:30 -0700


gyfora opened a new pull request, #236:
URL: https://github.com/apache/flink-kubernetes-operator/pull/236


   Fixes a very important corner case where the latest checkpoint wasn't 
observed correctly after a job restore.
   
   The current logic relies on correctly observing the latest 
checkpoint/savepoint when the job reaches a terminal state and recording it 
in the status.
   
   Previously this was based entirely on the checkpoint history, without 
accounting for the possibility that the history may be empty if the job fails 
after a restore but before completing its first checkpoint.
   
   Without this fix, the bug could cause jobs to fall back to an invalid 
earlier checkpoint in some cases.
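   
   As a rough illustration of the failure mode described above, the following 
Java sketch shows one way an observer could handle an empty history: take the 
newest entry from the checkpoint history when there is one, and otherwise keep 
the checkpoint the job was restored from. All names here 
(`LastCheckpointSketch`, `Checkpoint`, `findLastCheckpoint`, `restoredFrom`) 
are hypothetical and not taken from the operator's code, and the fallback to 
the restore point is an assumption about how such a fix could look, not a 
description of the actual change in this PR.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch only; these are not the operator's real classes.
final class LastCheckpointSketch {

    /** Minimal stand-in for a completed checkpoint or savepoint. */
    static final class Checkpoint {
        final long triggerTimestamp;
        final String externalPath;

        Checkpoint(long triggerTimestamp, String externalPath) {
            this.triggerTimestamp = triggerTimestamp;
            this.externalPath = externalPath;
        }
    }

    /**
     * Returns the newest checkpoint in the history. If the history is empty
     * (for example, the job failed after a restore and before its first
     * checkpoint), falls back to the checkpoint the job was restored from.
     *
     * @param history completed checkpoints reported for the current job run
     * @param restoredFrom checkpoint the job was restored from, or null
     */
    static Optional<Checkpoint> findLastCheckpoint(
            List<Checkpoint> history, Checkpoint restoredFrom) {
        Optional<Checkpoint> latest =
                history.stream()
                        .max(Comparator.comparingLong(
                                (Checkpoint c) -> c.triggerTimestamp));
        // Assumed fallback: an empty history must not hide the restore point.
        return latest.isPresent() ? latest : Optional.ofNullable(restoredFrom);
    }
}
```

   With an empty history and a non-null `restoredFrom`, the sketch returns the 
restore point rather than an empty result, so a caller would not be pushed 
back to an older, stale checkpoint recorded earlier in the status.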


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

