Thomas Weise created FLINK-29100:
------------------------------------
Summary: Deployment with last-state upgrade mode stuck after initial error
Key: FLINK-29100
URL: https://issues.apache.org/jira/browse/FLINK-29100
Project: Flink
Issue Type: Bug
Components: Kubernetes Operator
Affects Versions: kubernetes-operator-1.1.0
Reporter: Thomas Weise
Assignee: Thomas Weise
A deployment with last-state upgrade mode that never succeeds will be stuck in
the deploying state because no HA data exists. This can be reproduced by creating a
deployment with an invalid image or an exception in the entry point. An update to
the CR that corrects the issue won't be reconciled due to
"o.a.f.k.o.r.d.ApplicationReconciler [INFO ]
[default.basic-checkpoint-ha-example] Job is not running yet and HA metadata is
not available, waiting for upgradeable state". This forces manual intervention
to delete the CR.
Instead, the operator should check whether this is the initial deployment and,
if so, skip the HA metadata check.
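
The proposed check can be illustrated with a minimal sketch. It is written in Python for brevity and is not the operator's actual (Java) code. The names DeploymentState, can_reconcile_spec_change, and has_deployed_successfully are hypothetical, and the "job running or HA metadata available" condition is an assumption inferred from the log message quoted above.

```python
from dataclasses import dataclass


@dataclass
class DeploymentState:
    """Hypothetical snapshot of what the operator knows about one deployment."""
    job_running: bool
    ha_metadata_available: bool
    # Assumption: the operator can tell whether any deployment ever came up.
    has_deployed_successfully: bool


def can_reconcile_spec_change(state: DeploymentState,
                              skip_check_on_initial: bool = True) -> bool:
    """Decide whether an updated CR may be reconciled in last-state mode."""
    # Behaviour implied by the log message: a running job or existing
    # HA metadata means the deployment is in an upgradeable state.
    if state.job_running or state.ha_metadata_available:
        return True
    # Proposed fix: an initial deployment that never succeeded has no
    # state to restore, so the HA metadata check can be skipped.
    if skip_check_on_initial and not state.has_deployed_successfully:
        return True
    # Otherwise keep waiting for an upgradeable state.
    return False
```

With skip_check_on_initial=False the sketch mirrors the stuck behaviour described in this report: a first deployment with an invalid image never becomes upgradeable. A previously successful deployment whose HA metadata is missing is still held back in both modes, since skipping the check there could lose state.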