Thanks for preparing this FLIP, @Yang.

In general, I'm +1 for this new feature. Leveraging Kubernetes's built-in
ConfigMap for Flink's HA services should significantly reduce the
maintenance overhead compared to deploying a ZK cluster. I think this is an
attractive feature for users.

Concerning the proposed design, I have some questions. They might not be
problems; I am just trying to understand.

## Architecture

Why does the leader election need two ConfigMaps (`lock for contending
leader`, and `leader RPC address`)? What happens if the two ConfigMaps are
not updated consistently? E.g., a TM learns about a new JM becoming leader
(lock for contending leader updated), but still gets the old leader's
address when trying to read `leader RPC address`?
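
To make the concern concrete, here is a toy sketch of the window I have in
mind. It is not taken from the FLIP: two plain Python dicts stand in for the
two ConfigMaps, and the names `lock_cm`, `address_cm`, `become_leader`, and
`tm_view` are made up for illustration.

```python
# Toy model only: two dicts stand in for the two ConfigMaps.
lock_cm = {"leader": "jm-1"}
address_cm = {"address": "jm-1:6123"}


def become_leader(new_id, new_address, crash_between_updates=False):
    """Update the two stores in two separate, non-atomic steps."""
    lock_cm["leader"] = new_id
    if crash_between_updates:
        return  # the second write never happens
    address_cm["address"] = new_address


def tm_view():
    """What a TM sees when it reads the two stores independently."""
    return lock_cm["leader"], address_cm["address"]


# jm-2 wins the election but fails before publishing its address.
become_leader("jm-2", "jm-2:6123", crash_between_updates=True)
leader, address = tm_view()
inconsistent = leader not in address  # new leader id, old leader address
```

If both values lived in a single ConfigMap and were written in one update,
this particular window would not exist, which is why I am asking about the
split.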

## HA storage > Lock and release

It seems to me that the owner needs to explicitly release the lock so that
other peers can write/remove the stored object. What if the previous owner
fails to release the lock (e.g., it dies before releasing)? Would there be
any problem?
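
A small sketch of the scenario, again purely hypothetical: `LockedEntry` and
its methods are my own names, not the FLIP's, and the "current leader may
take over a stale lock" rule is just one possible answer, assumed here to
show what I am asking about.

```python
class LockedEntry:
    """Hypothetical stored object guarded by an owner annotation."""

    def __init__(self):
        self.owner = None
        self.value = None

    def release(self, who):
        """Only the owner can release; a dead owner never calls this."""
        if self.owner == who:
            self.owner = None

    def write(self, who, value, current_leader=None):
        """Write if unlocked, if `who` owns the lock, or (assumed rule)
        if `who` is the current leader and the recorded owner is stale."""
        owner_is_stale = current_leader == who and self.owner != who
        if self.owner not in (None, who) and not owner_is_stale:
            return False
        self.owner = who
        self.value = value
        return True


entry = LockedEntry()
entry.write("jm-1", "checkpoint-7")  # jm-1 takes the lock, then dies

# Without any takeover rule, jm-2 is blocked forever.
blocked = entry.write("jm-2", "checkpoint-8")

# With the assumed rule, the new leader can take over the stale lock.
recovered = entry.write("jm-2", "checkpoint-8", current_leader="jm-2")
```

Without something like the takeover rule, the first `jm-2` write stays
rejected indefinitely, so I would like to understand how the design handles
this case.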

## HA storage > HA data clean up

If the ConfigMap is destroyed on `kubectl delete deploy <ClusterID>`, how
is the HA data retained?


Thank you~

Xintong Song



On Tue, Sep 15, 2020 at 11:26 AM Yang Wang <danrtsey...@gmail.com> wrote:

> Hi devs and users,
>
> I would like to start the discussion about FLIP-144[1], which will
> introduce a new native high availability service for Kubernetes.
>
> Currently, Flink provides a ZooKeeper HA service, which is widely used
> in production environments. It can be integrated into standalone, Yarn,
> and Kubernetes deployments. However, using the ZooKeeper HA service on K8s
> incurs additional cost, since we need to manage a ZooKeeper cluster.
> Meanwhile, K8s provides public APIs for leader election[2] and
> configuration storage (i.e. ConfigMap[3]). We could leverage these
> features to make running an HA-configured Flink cluster on K8s more
> convenient.
>
> Both standalone-on-K8s and native K8s deployments could benefit from the
> newly introduced KubernetesHaService.
>
> [1].
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-144%3A+Native+Kubernetes+HA+for+Flink
> [2].
> https://kubernetes.io/blog/2016/01/simple-leader-election-with-kubernetes/
> [3]. https://kubernetes.io/docs/concepts/configuration/configmap/
>
> Looking forward to your feedback.
>
> Best,
> Yang
>
