tweise commented on code in PR #356:
URL:
https://github.com/apache/flink-kubernetes-operator/pull/356#discussion_r959559030
##########
flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/reconciler/deployment/ApplicationReconciler.java:
##########
@@ -98,12 +99,39 @@ protected Optional<UpgradeMode> getAvailableUpgradeMode(
.OPERATOR_JOB_UPGRADE_LAST_STATE_FALLBACK_ENABLED)
&& FlinkUtils.isKubernetesHAActivated(deployConfig)
&& FlinkUtils.isKubernetesHAActivated(observeConfig)
- && flinkService.isHaMetadataAvailable(deployConfig)
&& !flinkVersionChanged(
ReconciliationUtils.getDeployedSpec(deployment),
deployment.getSpec())) {
- LOG.info(
- "Job is not running but HA metadata is available for last state restore, ready for upgrade");
- return Optional.of(UpgradeMode.LAST_STATE);
+
+ if (!flinkService.isHaMetadataAvailable(deployConfig)) {
+ if (deployment.getStatus().getReconciliationStatus().getLastStableSpec() == null) {
+ // initial deployment failure, reset to allow for spec change to proceed
+ flinkService.deleteClusterDeployment(
+ deployment.getMetadata(), deployment.getStatus(), false);
+ flinkService.waitForClusterShutdown(deployConfig);
+ // in case the deployment succeeded between check and delete, fall through to
+ // the upgrade path
+ if (!flinkService.isHaMetadataAvailable(deployConfig)) {
+ LOG.info(
+ "Job never entered stable state. Clearing previous spec to reset for initial deploy");
+ // TODO: lastSpecWithMeta.f1.isFirstDeployment() is false
Review Comment:
I think there could be something wrong with the first deployment logic (or I
don't understand it). But it would be better to deal with that outside of this
PR.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.