Gyula Fora created FLINK-30437:
----------------------------------
Summary: Last-state upgrade combined with state incompatibility
causes state loss
Key: FLINK-30437
URL: https://issues.apache.org/jira/browse/FLINK-30437
Project: Flink
Issue Type: Bug
Components: Kubernetes Operator
Affects Versions: kubernetes-operator-1.3.0, kubernetes-operator-1.2.0
Reporter: Gyula Fora
Assignee: Gyula Fora
Even though we set:
execution.shutdown-on-application-finish: false
execution.submit-failed-job-on-application-error: true
the JobManager still marks the job as failed, cleans up the HA metadata and
restarts itself when there is a state incompatibility. This is very concerning
behaviour, but we have to fix it on the operator side to at least guarantee no
state loss.
The solution is to properly harden the HA metadata check (as we tried, and
failed, to do in the past :) )
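
A rough sketch of what a stricter check could look like, for illustration
only. It is not the operator's actual code: the class, enum and method names
are hypothetical, and the "checkpointID-" key prefix is an assumption about
how checkpoint pointers appear in the HA metadata. The idea is that the mere
existence of HA metadata is not enough; a last-state upgrade is only allowed
when at least one non-empty checkpoint pointer is still present.

```java
import java.util.Map;

/** Hypothetical sketch of a hardened HA metadata check (not operator code). */
final class LastStateUpgradeGuard {

    /** Assumed key prefix for checkpoint pointers in the HA metadata. */
    static final String CHECKPOINT_KEY_PREFIX = "checkpointID-";

    enum Decision { LAST_STATE_ALLOWED, MANUAL_RECOVERY_REQUIRED }

    private LastStateUpgradeGuard() {}

    /** Returns true if at least one non-blank checkpoint pointer exists. */
    static boolean hasCheckpointPointer(Map<String, String> haData) {
        if (haData == null) {
            return false;
        }
        for (Map.Entry<String, String> entry : haData.entrySet()) {
            String value = entry.getValue();
            if (entry.getKey().startsWith(CHECKPOINT_KEY_PREFIX)
                    && value != null
                    && !value.trim().isEmpty()) {
                return true;
            }
        }
        return false;
    }

    /**
     * Refuses a last-state upgrade when the HA metadata is missing or has
     * been cleaned up, so that the job is never silently restarted from
     * empty state.
     */
    static Decision decide(Map<String, String> haData) {
        return hasCheckpointPointer(haData)
                ? Decision.LAST_STATE_ALLOWED
                : Decision.MANUAL_RECOVERY_REQUIRED;
    }
}
```

With a check like this, HA metadata that exists but no longer holds any
checkpoint pointer (the situation after the JobManager cleanup described
above) yields MANUAL_RECOVERY_REQUIRED instead of a fresh start.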