[jira] [Created] (FLINK-29100) Deployment with last-state upgrade mode stuck after initial error

Thomas Weise (Jira) Wed, 24 Aug 2022 18:50:08 -0700

Thomas Weise created FLINK-29100:
------------------------------------

             Summary: Deployment with last-state upgrade mode stuck after 
initial error
                 Key: FLINK-29100
                 URL: https://issues.apache.org/jira/browse/FLINK-29100
             Project: Flink
          Issue Type: Bug
          Components: Kubernetes Operator
    Affects Versions: kubernetes-operator-1.1.0
            Reporter: Thomas Weise
            Assignee: Thomas Weise



A deployment with last-state upgrade mode that never succeeds will be stuck in 
the deploying state because no HA data exists. This can be reproduced by 
creating a deployment with an invalid image or an exception in the entry point. 
An update to the CR that corrects the issue won't be reconciled due to 
"o.a.f.k.o.r.d.ApplicationReconciler [INFO ] 
[default.basic-checkpoint-ha-example] Job is not running yet and HA metadata is 
not available, waiting for upgradeable state". This forces manual intervention 
to delete the CR.

Instead, the operator should check whether this is the initial deployment and, 
if so, skip the HA metadata check.
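
The proposed check could look roughly like the sketch below. It is not the 
operator's actual code: the ReconcileState type, its fields, and the 
can_reconcile_spec_change function are hypothetical names chosen for 
illustration, and how the operator would detect an initial deployment is left 
as an assumption. Only last-state upgrade mode is modelled.

```python
from dataclasses import dataclass


@dataclass
class ReconcileState:
    """Hypothetical, simplified view of what the reconciler knows."""
    job_running: bool            # the job has reached a running state
    ha_metadata_available: bool  # HA data exists for this deployment
    is_initial_deployment: bool  # assumption: no deployment has succeeded yet


def can_reconcile_spec_change(state: ReconcileState) -> bool:
    """Decide whether a CR update may be applied in last-state mode."""
    # Proposed behaviour: the first deployment has no state to preserve,
    # so the HA metadata check is skipped entirely.
    if state.is_initial_deployment:
        return True
    # Behaviour described by the log message: wait for an upgradeable state
    # while the job is not running and HA metadata is not available.
    return state.job_running or state.ha_metadata_available
```

With this check, a failed first deployment (job not running, no HA metadata) 
accepts the corrected CR, while a later upgrade in the same condition still 
waits for an upgradeable state.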



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
