[ https://issues.apache.org/jira/browse/FLINK-18677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168458#comment-17168458 ]
Till Rohrmann commented on FLINK-18677:
---------------------------------------
Good work [~mapohl]. According to FLINK-10052, the
{{ZooKeeperLeaderElectionService}} revokes the leader's leadership when the
connection becomes {{SUSPENDED}}. Hence, as a first step, I believe it makes
sense to have the {{ZooKeeperLeaderRetrievalService}} behave symmetrically,
i.e. invalidate the known leader when the connection is {{SUSPENDED}}.
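
To illustrate the symmetric behaviour, here is a minimal, self-contained sketch. It is not the actual Flink or Curator code: {{ConnectionState}}, {{LeaderListener}}, {{RetrievalService}} and the {{invalidateOnSuspended}} flag are hypothetical stand-ins, and invalidating on {{LOST}} unconditionally is an assumption of the sketch. A {{null}} address is used here to mean "no known leader".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

public class SuspendedConnectionSketch {

    /** Hypothetical stand-in for the ZooKeeper client's connection states. */
    enum ConnectionState { CONNECTED, SUSPENDED, RECONNECTED, LOST }

    /** Hypothetical stand-in for the leader listener; null means "no known leader". */
    interface LeaderListener {
        void notifyLeaderAddress(String leaderAddress);
    }

    /** Sketch of a retrieval service that can invalidate the leader on SUSPENDED. */
    static final class RetrievalService {
        private final LeaderListener listener;
        private final boolean invalidateOnSuspended; // assumed config switch
        private String lastLeaderAddress;

        RetrievalService(LeaderListener listener, boolean invalidateOnSuspended) {
            this.listener = listener;
            this.invalidateOnSuspended = invalidateOnSuspended;
        }

        /** Forwards a leader change, skipping notifications that change nothing. */
        void onLeaderChanged(String address) {
            if (!Objects.equals(address, lastLeaderAddress)) {
                lastLeaderAddress = address;
                listener.notifyLeaderAddress(address);
            }
        }

        /** Reacts to connection state changes reported by the client. */
        void onConnectionStateChanged(ConnectionState newState) {
            switch (newState) {
                case SUSPENDED:
                    // We may miss a leader change while suspended, so
                    // optionally tell the listener that the leader is unknown.
                    if (invalidateOnSuspended) {
                        onLeaderChanged(null);
                    }
                    break;
                case LOST:
                    // Assumption of this sketch: a lost connection always invalidates.
                    onLeaderChanged(null);
                    break;
                default:
                    // CONNECTED / RECONNECTED: wait for the next leader notification.
                    break;
            }
        }
    }

    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();
        RetrievalService service = new RetrievalService(seen::add, true);
        service.onLeaderChanged("akka://jm-1");
        service.onConnectionStateChanged(ConnectionState.SUSPENDED);
        System.out.println(seen);
    }
}
```

With the flag enabled, the listener first sees the leader address and then a {{null}} notification once the connection is suspended. With the flag disabled, the listener keeps the old address, which is the behaviour described in the issue below.
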
> ZooKeeperLeaderRetrievalService does not invalidate leader in case of
> SUSPENDED connection
> ------------------------------------------------------------------------------------------
>
> Key: FLINK-18677
> URL: https://issues.apache.org/jira/browse/FLINK-18677
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination
> Affects Versions: 1.10.1, 1.12.0, 1.11.1
> Reporter: Till Rohrmann
> Priority: Major
> Fix For: 1.12.0
>
>
> The {{ZooKeeperLeaderRetrievalService}} does not invalidate the leader if the
> ZooKeeper connection gets SUSPENDED. This means that a {{TaskManager}} won't
> cancel its running tasks even though it might miss a leader change. I think
> we should at least make it configurable whether in such a situation the
> leader listener should be informed about the lost leadership. Otherwise, we
> might run into the situation where an old and a newly recovered instance of a
> {{Task}} can run at the same time.