gyfora commented on code in PR #821:
URL:
https://github.com/apache/flink-kubernetes-operator/pull/821#discussion_r1667757174
##########
flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/reconciler/deployment/AbstractJobReconciler.java:
##########
@@ -266,19 +304,31 @@ protected void restoreJob(
Optional<String> savepointOpt = Optional.empty();
if (spec.getJob().getUpgradeMode() != UpgradeMode.STATELESS) {
- savepointOpt =
- Optional.ofNullable(
- ctx.getResource()
- .getStatus()
- .getJobStatus()
- .getSavepointInfo()
- .getLastSavepoint())
- .flatMap(s -> Optional.ofNullable(s.getLocation()));
+ if (FlinkStateSnapshotUtils.shouldCreateSnapshotResource(
+ ctx.getOperatorConfig(), deployConfig)) {
+ savepointOpt = getLatestSavepointPathFromFlinkStateSnapshots(ctx);
+ } else {
+ savepointOpt =
+ Optional.ofNullable(
+ ctx.getResource()
+ .getStatus()
+ .getJobStatus()
+ .getSavepointInfo()
+ .getLastSavepoint())
+ .flatMap(s -> Optional.ofNullable(s.getLocation()));
Review Comment:
@mateczagany I think the only good solution is to introduce a separate
`upgradeSnapshotReference` that is only updated during the upgrade cycle (and
during terminal job observe). Any snapshot observed this way is logically
speaking always the latest.
A manual / periodic snapshot should never override this; doing so would
surely be an error in the logic, so this is not a problem. This way we
actually separate the upgrade snapshot handling from savepoint taking /
management, which have been mixed together until now.
In other words, we don't need any of the "3 solutions" :)
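To make the idea concrete, here is a rough sketch of what I mean. It is not the operator's actual code: `UpgradeSnapshotSketch`, `JobStatusSketch`, `Origin`, `recordSnapshot` and `restorePath` are all hypothetical names, and only `upgradeSnapshotReference` comes from the proposal above. It assumes the reference is a plain location string.

```java
import java.util.Optional;

/** Hypothetical sketch; none of these types exist in the operator code base. */
public class UpgradeSnapshotSketch {

    /** Who produced a snapshot (assumed classification, for illustration only). */
    enum Origin { UPGRADE, TERMINAL_OBSERVE, MANUAL, PERIODIC }

    /** Stand-in for the job status that would carry the proposed reference. */
    static final class JobStatusSketch {
        // Null until an upgrade cycle or a terminal job observe records a snapshot.
        private String upgradeSnapshotReference;

        /**
         * Updates the reference only for snapshots seen during the upgrade cycle
         * or terminal job observe. Returns true if the reference was updated.
         */
        boolean recordSnapshot(Origin origin, String location) {
            if (origin != Origin.UPGRADE && origin != Origin.TERMINAL_OBSERVE) {
                // Manual / periodic snapshots never touch the upgrade reference.
                return false;
            }
            upgradeSnapshotReference = location;
            return true;
        }

        /** Path to restore from on the next deploy, if any was recorded. */
        Optional<String> restorePath() {
            return Optional.ofNullable(upgradeSnapshotReference);
        }
    }

    public static void main(String[] args) {
        JobStatusSketch status = new JobStatusSketch();
        status.recordSnapshot(Origin.UPGRADE, "s3://bucket/sp-1");
        status.recordSnapshot(Origin.PERIODIC, "s3://bucket/sp-2"); // ignored
        System.out.println(status.restorePath().orElse("<none>")); // prints s3://bucket/sp-1
    }
}
```

With something like this, `restoreJob` would only ever read the upgrade reference, and the manual / periodic savepoint bookkeeping can evolve independently.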