Ahmed Hamdy created FLINK-31998:
-----------------------------------
Summary: Flink Operator Deadlock on run job Failure
Key: FLINK-31998
URL: https://issues.apache.org/jira/browse/FLINK-31998
Project: Flink
Issue Type: Bug
Components: Kubernetes Operator
Affects Versions: kubernetes-operator-1.4.0, kubernetes-operator-1.3.0,
kubernetes-operator-1.2.0
Reporter: Ahmed Hamdy
Fix For: kubernetes-operator-1.5.0
Attachments: gleek-m6pLe3Wy--IpCKQavAQwBQ.png
h2. Description
The Flink Operator reconciler enters a deadlock in which it never updates the
session job to DEPLOYED if {{deploy}} fails.
The attached sequence diagram shows the issue: the FlinkSessionJob stays stuck in
UPGRADING indefinitely.
h2. Proposed fix
The reconciler should roll back its changes to the CR if {{deploy()}} fails while
{{reconciliationStatus.isBeforeFirstDeployment()}} is true.
[diagram|https://issues.apache.org/7239bb39-60d8-48a0-9052-f3231947edbe]
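
The proposed rollback could look roughly like the sketch below. It is a simplified model, not the operator's actual code: the {{Status}}, {{State}} and {{Deployer}} types and the {{reconcile}} method are hypothetical stand-ins, and it assumes the reconciler records the new spec and the UPGRADING state on the CR before calling {{deploy()}}. Only {{isBeforeFirstDeployment()}}, {{deploy()}}, UPGRADING and DEPLOYED are names taken from this report.

```java
public class FirstDeployRollbackSketch {

    /** Hypothetical states; only the two named in the report are modeled. */
    public enum State { UPGRADING, DEPLOYED }

    /** Hypothetical stand-in for the reconciliation status stored on the CR. */
    public static final class Status {
        public State state;               // null until the first reconcile attempt
        public String lastReconciledSpec; // null until a spec has been recorded

        public boolean isBeforeFirstDeployment() {
            return lastReconciledSpec == null;
        }
    }

    /** Hypothetical hook for the step that submits the job. */
    public interface Deployer {
        void deploy(String spec) throws Exception;
    }

    public static void reconcile(Status status, String desiredSpec, Deployer deployer)
            throws Exception {
        // Snapshot the status so it can be restored if the first deploy fails.
        boolean firstDeployment = status.isBeforeFirstDeployment();
        State previousState = status.state;
        String previousSpec = status.lastReconciledSpec;

        // Assumption: the spec and UPGRADING state are recorded before deploying.
        status.lastReconciledSpec = desiredSpec;
        status.state = State.UPGRADING;
        try {
            deployer.deploy(desiredSpec);
            status.state = State.DEPLOYED;
        } catch (Exception e) {
            if (firstDeployment) {
                // Roll back so the next reconcile retries the deployment
                // instead of waiting in UPGRADING forever.
                status.state = previousState;
                status.lastReconciledSpec = previousSpec;
            }
            throw e;
        }
    }
}
```

In this sketch a failed first {{deploy()}} leaves the status exactly as it was before the attempt, so {{isBeforeFirstDeployment()}} is still true on the next reconcile loop and the deployment is retried. Failures on later upgrades are deliberately left untouched here, since the report only covers the first-deployment case.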
--
This message was sent by Atlassian Jira
(v8.20.10#820010)