[
https://issues.apache.org/jira/browse/FLINK-31998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gyula Fora updated FLINK-31998:
-------------------------------
Fix Version/s: (was: kubernetes-operator-1.6.0)
> Flink Operator Deadlock on run job Failure
> ------------------------------------------
>
> Key: FLINK-31998
> URL: https://issues.apache.org/jira/browse/FLINK-31998
> Project: Flink
> Issue Type: Bug
> Components: Kubernetes Operator
> Affects Versions: kubernetes-operator-1.2.0, kubernetes-operator-1.3.0,
> kubernetes-operator-1.4.0
> Reporter: Ahmed Hamdy
> Priority: Major
> Attachments: gleek-m6pLe3Wy--IpCKQavAQwBQ.png
>
>
> h2. Description
> The Flink Operator reconciler enters a deadlock in which it never updates the
> session job to DEPLOYED/ROLLED_BACK if {{deploy}} fails.
> The attached sequence diagram shows the issue, where the FlinkSessionJob is
> stuck in UPGRADING indefinitely.
> h2. Proposed fix
> The reconciler should roll back the CR changes if {{deploy()}} fails while
> {{reconciliationStatus.isBeforeFirstDeployment()}} is true.
> [diagram|https://issues.apache.org/7239bb39-60d8-48a0-9052-f3231947edbe]
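The proposed fix can be illustrated with a small, self-contained Java sketch. It is not the operator's code: the class `RollbackSketch`, the `Status` and `Deployer` types, and the string-valued spec are hypothetical stand-ins, and only the state names and `isBeforeFirstDeployment()` are taken from the issue text. The sketch assumes that "rolling back" on a failed first deployment means moving the status out of UPGRADING into ROLLED_BACK so that the next reconcile loop can make progress.

```java
public class RollbackSketch {
    // State names taken from the issue text; this is not the operator's real enum.
    public enum State { UPGRADING, DEPLOYED, ROLLED_BACK }

    // Hypothetical stand-in for the session job's reconciliation status.
    public static final class Status {
        public State state = State.UPGRADING;
        public String lastReconciledSpec; // null until a deployment has succeeded
        public String error;

        public boolean isBeforeFirstDeployment() {
            return lastReconciledSpec == null;
        }
    }

    // Hypothetical deployment hook; the real deploy call may look different.
    public interface Deployer {
        void deploy(String spec) throws Exception;
    }

    public static void reconcile(Status status, String desiredSpec, Deployer deployer) {
        boolean firstDeployment = status.isBeforeFirstDeployment();
        status.state = State.UPGRADING;
        try {
            deployer.deploy(desiredSpec);
            status.lastReconciledSpec = desiredSpec;
            status.state = State.DEPLOYED;
            status.error = null;
        } catch (Exception e) {
            status.error = e.getMessage();
            if (firstDeployment) {
                // Assumed fix: leave UPGRADING instead of waiting forever.
                status.state = State.ROLLED_BACK;
            }
            // Failures after a first successful deployment are outside the
            // scope of this sketch and leave the state unchanged.
        }
    }

    public static void main(String[] args) {
        Status status = new Status();
        reconcile(status, "spec-v1", spec -> { throw new Exception("submission failed"); });
        System.out.println(status.state + " (" + status.error + ")");
    }
}
```

The point of the sketch is only the `catch` branch: without it, a failure on the very first `deploy()` leaves the status in UPGRADING with nothing to trigger a later transition, which matches the stuck state described above.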
--
This message was sent by Atlassian Jira
(v8.20.10#820010)