[
https://issues.apache.org/jira/browse/FLINK-38290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18050806#comment-18050806
]
Royston Tauro commented on FLINK-38290:
---------------------------------------
Any update on this, facing the same issue
> Application cluster: FINISHED FlinkDeployment falls back to RECONCILING if JM
> pod is lost/recreated
> ---------------------------------------------------------------------------------------------------
>
> Key: FLINK-38290
> URL: https://issues.apache.org/jira/browse/FLINK-38290
> Project: Flink
> Issue Type: Bug
> Components: Client / Job Submission, Deployment / Kubernetes
> Affects Versions: 1.20.1, kubernetes-operator-1.12.1
> Reporter: Urs Schoenenberger
> Priority: Major
>
> Hi folks,
> we are encountering the following issue, and I believe it's a bug.
> One-line Summary: In an ApplicationCluster, the Operator queries JobManager
> REST API for job status. This API does not have information about FINISHED
> jobs if the JM leader changed / JM restarted. This leads to the Job being
> reset to RECONCILING where it gets stuck.
> Steps to reproduce:
> * Deploy the example FlinkDeployment (
> [https://raw.githubusercontent.com/apache/flink-kubernetes-operator/release-1.12/examples/basic.yaml]
> ) with a bounded job (e.g. examples/streaming/WordCount.jar) and configure
> high-availability.type: "kubernetes" and a high-availability.storageDir.
> * Wait for the FlinkDeployment to reach FINISHED.
> * Kill the JobManager pod. (The way this happens in production use cases is
> e.g. if a node is tainted and scheduled for deletion due to being underused /
> a spot instance goes down / etc).
> Observed behaviour:
> * A new JobManager is started.
> * The new pod checks the HA dir and realizes that the job is already
> completed. Log from StandaloneDispatcher: "Ignoring JobGraph submission (...)
> because the job already reached a globally-terminal state (...).
> * The operator tries to reconcile the job. In JobStatusObserver, it queries
> the JobManager's REST API (/jobs/overview), but it receives a "not found".
> ** This is because the backend here does not check the HA store, but the
> JobStore instead. This is backed by RAM or a local file, so it is not
> recovered on JM restart.
> * This leads the k8s operator to believe something is wrong with the
> FlinkDeployment, and the FlinkDeployment goes back to state RECONCILING and
> gets stuck there.
>
> This messes with monitoring and alerting among other things.
> We are aware of the HistoryServer and have configured it, but since the
> Operator only checks the JM API, this does not resolve the problem. Could we
> make the JM expose the HA store with finished job information for this
> purpose?
--
This message was sent by Atlassian Jira
(v8.20.10#820010)