[
https://issues.apache.org/jira/browse/FLINK-31780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matthias Pohl updated FLINK-31780:
----------------------------------
Component/s: Runtime / Coordination
> Allow users to disable "Ensemble tracking" for ZooKeeper
> --------------------------------------------------------
>
> Key: FLINK-31780
> URL: https://issues.apache.org/jira/browse/FLINK-31780
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Reporter: Oleksandr Nitavskyi
> Assignee: Oleksandr Nitavskyi
> Priority: Major
> Labels: pull-request-available
>
> Apache Curator added an option to skip ensemble tracking in version
> 5.0.0 ([CURATOR-568|https://issues.apache.org/jira/browse/CURATOR-568]).
> This is useful in scenarios where CuratorFramework accesses ZooKeeper
> clusters via a load balancer or virtual IPs.
> So if a Flink user's ZooKeeper runs behind a load balancer or virtual IP,
> it should be possible to disable ensemble tracking there as well.
> When ZooKeeper is hidden behind a VIP, it can return a URL during ensemble
> tracking, which can lead to an Unresolved Host Exception inside Curator
> Framework. On the Flink level, this leads to a cluster restart.
> Currently, HA with ZooKeeper can even lead to a JobManager failure. The
> failure scenario is as follows:
> # Flink connects to ZooKeeper via the configured URL.
> # Ensemble tracking receives a new ensemble URL, which is not necessarily
> accessible to Flink because ZooKeeper is behind a VIP.
> # On reconnect, Flink fails to connect to ZooKeeper; moreover, due to the
> "UnresolvedHostException", Flink's JobManager is killed.
> *Acceptance Criteria:* Users of Apache Flink have a ZooKeeper config option to
> disable ensemble tracking for ZooKeeper.
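
The acceptance criterion above could be met roughly as sketched below. This is a minimal illustration, not the change actually made in Flink: the class `EnsembleTrackingOption`, the method `isEnabled`, and the key string are assumptions chosen for the example. The only external API referred to (in a comment) is `CuratorFrameworkFactory.Builder#ensembleTracker(boolean)`, which Curator provides since 5.0.0. The sketch defaults to `true` so that existing deployments keep the current behaviour unless they opt out.

```java
import java.util.Properties;

// Hypothetical helper: reads a boolean switch for ZooKeeper ensemble tracking.
class EnsembleTrackingOption {

    // Assumed option name, used here for illustration only.
    static final String KEY = "high-availability.zookeeper.client.ensemble-tracker";

    // Returns true unless the option is explicitly set to something other than
    // "true" (case-insensitive), so tracking stays on by default.
    static boolean isEnabled(Properties conf) {
        String raw = conf.getProperty(KEY, "true");
        return Boolean.parseBoolean(raw.trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(KEY, "false"); // e.g. ZooKeeper sits behind a VIP

        boolean tracking = isEnabled(conf);

        // With Curator >= 5.0.0 on the classpath, the value would be passed on as:
        //   CuratorFrameworkFactory.builder()
        //       .connectString(...)
        //       .retryPolicy(...)
        //       .ensembleTracker(tracking)
        //       .build();
        System.out.println("ensemble tracking enabled: " + tracking);
    }
}
```

With tracking disabled, the client would keep using the configured connect string on reconnect instead of addresses reported by the ensemble, which is the behaviour wanted behind a load balancer or VIP.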