Gyula Fora created FLINK-30437:
----------------------------------

             Summary: Last-state upgrade combined with state incompatibility causes state loss
                 Key: FLINK-30437
                 URL: https://issues.apache.org/jira/browse/FLINK-30437
             Project: Flink
          Issue Type: Bug
          Components: Kubernetes Operator
    Affects Versions: kubernetes-operator-1.3.0, kubernetes-operator-1.2.0
            Reporter: Gyula Fora
            Assignee: Gyula Fora

Even though we set:

execution.shutdown-on-application-finish: false
execution.submit-failed-job-on-application-error: true

if there is a state incompatibility, the JobManager marks the job as failed, cleans up the HA metadata, and restarts itself.

This is very concerning behaviour, but we have to fix it on the operator side to at least guarantee no state loss.

The solution is to harden the HA metadata check properly (like we tried but failed to do in the past :) ).

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
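
One way to picture a "hardened" HA metadata check is sketched below. This is only an illustration, not the operator's actual fix: the class name, method names, and returned action strings are hypothetical, and the "checkpointID-" key prefix is an assumption about how checkpoint pointers are keyed in the HA ConfigMaps. The idea is that the mere existence of HA ConfigMaps is not treated as proof that state is recoverable; a last-state upgrade is allowed only if at least one checkpoint pointer entry is actually present.

```java
import java.util.List;
import java.util.Map;

/**
 * Hypothetical helper (not part of the operator code base) that decides
 * whether a last-state upgrade is safe, given the data sections of the
 * HA ConfigMaps found for a deployment.
 */
final class HaMetadataCheck {

    // Assumption: checkpoint pointers are stored under keys with this prefix.
    static final String CHECKPOINT_KEY_PREFIX = "checkpointID-";

    private HaMetadataCheck() {}

    /** Returns true only if some ConfigMap holds at least one checkpoint pointer. */
    static boolean hasCheckpointPointer(List<Map<String, String>> haConfigMaps) {
        for (Map<String, String> data : haConfigMaps) {
            if (data == null) {
                continue; // a ConfigMap without a data section carries no state
            }
            for (String key : data.keySet()) {
                if (key.startsWith(CHECKPOINT_KEY_PREFIX)) {
                    return true;
                }
            }
        }
        return false;
    }

    /**
     * Illustrative decision: proceed with last-state only when a checkpoint
     * pointer exists; otherwise refuse rather than silently start from empty state.
     */
    static String chooseUpgradeAction(List<Map<String, String>> haConfigMaps) {
        return hasCheckpointPointer(haConfigMaps)
                ? "LAST_STATE"
                : "ABORT_NEEDS_MANUAL_RESTORE";
    }
}
```

With this sketch, a cluster whose HA metadata was cleaned up after the failed job (ConfigMaps missing, or present but without checkpoint entries) yields "ABORT_NEEDS_MANUAL_RESTORE" instead of being redeployed with empty state.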