[ https://issues.apache.org/jira/browse/FLINK-19206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17211508#comment-17211508 ]
Yang Wang commented on FLINK-19206:
-----------------------------------
Hi [~mikekap], thanks for creating this ticket.
In Flink 1.10, we used the internal service as the owner reference. From 1.11
on, we changed the owner to the JobManager deployment, since a Deployment is a
first-class resource in Kubernetes and using it as the owner is more natural.
Let me try to understand your problem. You are not specifying a fixed
cluster-id via {{-Dkubernetes.cluster-id=flink-k8s-cluster1}}, so the Flink
client generates a random name for your cluster every time. When a
higher-level retry happens, a new Flink cluster is started, which yields a lot
of "stale clusters".
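For example, a fixed cluster-id for a session cluster could be set like this
(the image name here is only an illustration):
{code:bash}
# Start a native Kubernetes session cluster with a fixed, reusable cluster-id.
./bin/kubernetes-session.sh \
  -Dkubernetes.cluster-id=flink-k8s-cluster1 \
  -Dkubernetes.container.image=flink:1.11
{code}
With a fixed cluster-id, retried submissions can target the same cluster via
the same cluster-id instead of creating a new one each time.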
The root cause is that you are using a Flink session cluster, whose liveness is
independent of the job. If you rely on an external retry mechanism (e.g.
Airflow), an application cluster is more appropriate for you[2], although I
strongly suggest enabling HA[1] for Flink rather than retrying externally. When
the only job fails, the Flink cluster is destroyed automatically.
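An application cluster could be started like this (the image name and jar path
are placeholders for your own build):
{code:bash}
# Deploy in application mode: the cluster lives and dies with the single job.
./bin/flink run-application \
  --target kubernetes-application \
  -Dkubernetes.cluster-id=flink-k8s-cluster1 \
  -Dkubernetes.container.image=my-registry/my-flink-job:1.11 \
  local:///opt/flink/usrlib/my-flink-job.jar
{code}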
Moreover, in Flink we do not set the owner reference to a pod. If the
JobManager pod fails, is deleted, or crashes, a new one is launched by the
Kubernetes Deployment, and the Flink job can recover from the latest checkpoint
if HA is configured.
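A minimal HA sketch, assuming a reachable ZooKeeper quorum and a durable
storage directory (both addresses below are placeholders):
{code:bash}
# Enable ZooKeeper-based HA so a restarted JobManager recovers the job
# from the latest checkpoint.
./bin/flink run-application \
  --target kubernetes-application \
  -Dkubernetes.cluster-id=flink-k8s-cluster1 \
  -Dhigh-availability=zookeeper \
  -Dhigh-availability.zookeeper.quorum=zk-quorum:2181 \
  -Dhigh-availability.storageDir=hdfs:///flink/recovery \
  -Dkubernetes.container.image=my-registry/my-flink-job:1.11 \
  local:///opt/flink/usrlib/my-flink-job.jar
{code}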
[1] https://ci.apache.org/projects/flink/flink-docs-master/ops/jobmanager_high_availability.html
[2] https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/native_kubernetes.html#flink-kubernetes-application
> Add an ability to set ownerReference manually in Kubernetes
> -----------------------------------------------------------
>
> Key: FLINK-19206
> URL: https://issues.apache.org/jira/browse/FLINK-19206
> Project: Flink
> Issue Type: Improvement
> Components: Deployment / Kubernetes
> Reporter: Mike Kaplinskiy
> Priority: Minor
>
> The current Kubernetes deployment creates a service that is the
> ownerReference of all the sub-objects (the JM & TM deployments & the rest
> service). However, something presumably has to start the cluster in the first
> place. If you are using a job cluster, that can be something like a
> Kubernetes Job, a CronJob, or a tool like Airflow. Unfortunately any failures
> in the Flink job can cause retries from these higher-level primitives, which
> can yield a lot of "stale clusters" that aren't GCed.
> The proposal here is to add a configuration option to set the ownerReference
> of the Flink Service. This way the service (and by proxy, all the cluster
> components) gets deleted when the "parent" decides - including if the parent
> is itself a Kubernetes pod. For reference, Spark does something similar via
> {{spark.kubernetes.driver.pod.name}} (documented at
> [https://spark.apache.org/docs/latest/running-on-kubernetes.html#client-mode-executor-pod-garbage-collection]).
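Until such an option exists, the requested behaviour could be approximated
out-of-band; a sketch with kubectl, where the parent Job name, cluster-id, and
service name are all hypothetical:
{code:bash}
# Make an existing parent Job the owner of the Flink service, so deleting
# the Job garbage-collects the service and, transitively, the cluster
# components it owns. Owner and dependent must be in the same namespace.
PARENT_UID=$(kubectl get job flink-submitter -o jsonpath='{.metadata.uid}')
kubectl patch service flink-k8s-cluster1 --type=merge -p "{
  \"metadata\": {
    \"ownerReferences\": [{
      \"apiVersion\": \"batch/v1\",
      \"kind\": \"Job\",
      \"name\": \"flink-submitter\",
      \"uid\": \"${PARENT_UID}\"
    }]
  }
}"
{code}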