Hi,
Can we have multiple replicas with ZK HA in K8s as well? In this case, how do
Task Managers and clients recover the JobManager RPC address? Are they updated
in ZK? Also, since there are 3 replicas behind the same service endpoint and
only one of them is the leader, how should clients reach the leader JobManager?
On Wednesday, 20 January, 2021, 07:41:20 am IST, Yang Wang
<[email protected]> wrote:
If you do not want to run multiple JobManagers simultaneously, then I think
a "Job" for the application cluster with HA enabled is enough. K8s will also
launch a new pod/container when the old one terminates exceptionally.
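To make the relaunch behavior above concrete, here is a minimal sketch of a JobManager run as a K8s "Job" for an application cluster. The image tag, labels, and job class name are illustrative placeholders, not taken from this thread; HA itself is configured separately in flink-conf.yaml.

```yaml
# Hypothetical manifest: application-cluster JobManager as a K8s Job.
# Names, image, and the job classname are illustrative assumptions.
apiVersion: batch/v1
kind: Job
metadata:
  name: flink-jobmanager
spec:
  template:
    metadata:
      labels:
        app: flink
        component: jobmanager
    spec:
      # OnFailure lets K8s relaunch the container when it exits
      # abnormally, which is the behavior described above.
      restartPolicy: OnFailure
      containers:
        - name: jobmanager
          image: flink:1.12  # illustrative tag
          args: ["standalone-job", "--job-classname", "com.example.MyJob"]
```

With `restartPolicy: OnFailure`, a crashed JobManager container is restarted by K8s, and the HA service lets the new instance pick up the job from the latest checkpoint.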
Best,
Yang
Yang Wang <[email protected]> 于2021年1月20日周三 上午10:08写道:
Yes. Using a "Deployment" instead of a "Job" for the application cluster also
makes sense. Actually, in the native K8s integration, we always use a
Deployment for the JobManager.
But please note that the Deployment may relaunch the JobManager pod even after
you cancel the Flink job.
Best,
Yang
Ashish Nigam <[email protected]> wrote on Wed, Jan 20, 2021 at 5:29 AM:
Yang,
For application clusters, does it make sense to deploy the JobManager as a
"Deployment" rather than as a "Job", as suggested in the docs? I am asking
because I am thinking of deploying the JobManager in HA mode even for
application clusters.
Thanks,
Ashish
On Tue, Jan 19, 2021 at 6:16 AM Yang Wang <[email protected]> wrote:
Usually, you do not need to start multiple JobManagers simultaneously. The
JobManager is a Deployment; a new pod/container will be launched once the old
one terminates exceptionally.
If you still want to start multiple JobManagers to get a faster recovery, you
could set the replica count greater than 1 for a standalone cluster on K8s [1].
For the native integration [2], such a configuration is not yet supported [3].
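For the standalone case, raising the replica count is a one-line change to the JobManager Deployment from the resource definitions in [1]. The fragment below is an illustrative sketch; it only makes sense when an HA service is configured, since otherwise the extra replicas cannot coordinate leadership.

```yaml
# Illustrative fragment of the standalone JobManager Deployment from [1],
# with the replica count raised so a standby is already running on failover.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flink-jobmanager
spec:
  replicas: 2  # one leader plus one standby; requires an HA service
  selector:
    matchLabels:
      app: flink
      component: jobmanager
  # ...pod template as in the standalone resource definitions [1]
```

Only the leader serves the REST endpoint and runs the job; the standby simply takes over leadership faster than a fresh pod could be scheduled and started.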
Please note that the key point of enabling HA is not starting multiple
JobManagers simultaneously or sequentially. You need to configure the
ZooKeeperHAService [4] or KubernetesHAService [5] to ensure the Flink job can
recover from the latest successful checkpoint.
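The two HA services are enabled through flink-conf.yaml. The sketch below shows both variants for Flink 1.12; the ZooKeeper quorum, storage directory, and cluster id are placeholders you would replace with your own values.

```yaml
# Illustrative flink-conf.yaml fragments; hostnames, storage paths,
# and the cluster id are placeholder assumptions.

# Option A: ZooKeeper HA [4]
high-availability: zookeeper
high-availability.zookeeper.quorum: zk-1:2181,zk-2:2181,zk-3:2181
high-availability.storageDir: s3://my-bucket/flink/ha  # any durable FS

# Option B: Kubernetes HA [5] (Flink >= 1.12, no ZooKeeper needed)
# high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
# kubernetes.cluster-id: my-flink-cluster
# high-availability.storageDir: s3://my-bucket/flink/ha
```

In both cases the leader address and job metadata pointers are written to the HA backend (ZooKeeper znodes or K8s ConfigMaps), which is how TaskManagers and clients discover the current leader after a failover.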
[1]. https://ci.apache.org/projects/flink/flink-docs-master/deployment/resource-providers/standalone/kubernetes.html#session-cluster-resource-definitions
[2]. https://ci.apache.org/projects/flink/flink-docs-master/deployment/resource-providers/native_kubernetes.html
[3]. https://issues.apache.org/jira/browse/FLINK-17707
[4]. https://ci.apache.org/projects/flink/flink-docs-master/deployment/ha/zookeeper_ha.html
[5]. https://ci.apache.org/projects/flink/flink-docs-master/deployment/ha/kubernetes_ha.html
Best,
Yang
Amit Bhatia <[email protected]> wrote on Tue, Jan 19, 2021 at 8:45 PM:
Hi,
I am deploying Flink 1.12 on K8s. Can anyone confirm whether we can deploy
multiple JobManager pods in K8s for HA, or whether there should always be only
a single JobManager pod?
Regards,
Amit Bhatia