[
https://issues.apache.org/jira/browse/FLINK-26522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Niklas Semmler updated FLINK-26522:
-----------------------------------
Component/s: Runtime / Coordination
> Refactoring code for multiple component leader election
> -------------------------------------------------------
>
> Key: FLINK-26522
> URL: https://issues.apache.org/jira/browse/FLINK-26522
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Affects Versions: 1.16.0
> Reporter: Niklas Semmler
> Priority: Major
>
> The current implementation of the multiple component leader election has a
> number of issues. Most of them stem from an attempt to make the multiple
> component leader election work the same way as the single component leader
> election.
> An attempt at listing the issues follows:
> * *Naming* MultipleComponentLeaderElectionService has a name similar to
> LeaderElectionService, but in function it is closer to the
> LeaderElectionDriver.
> * *Similarity* The interfaces LeaderElectionService, LeaderElectionDriver and
> MultipleComponentLeaderElectionDriver are very similar to each other.
> * *Cyclic dependency* DefaultMultipleComponentLeaderElectionService holds a
> reference to the ZooKeeperMultipleComponentLeaderElectionDriver
> (MultipleComponentLeaderElectionDriver), which in turn holds a reference to
> the DefaultMultipleComponentLeaderElectionService (LeaderLatchListener).
> * *Unclear contract* With single component leader election drivers such as
> ZooKeeperLeaderElectionDriver, a call to LeaderElectionService#stop from
> JobMasterServiceLeadershipRunner#closeAsync implies giving up the leadership
> of the JobMaster. With the multiple component leader election this is no
> longer the case: the leadership is held until the HighAvailabilityServices
> shut down. This logic may be difficult to understand from the perspective of
> one of the components (e.g., the Dispatcher).
> * *Long call hierarchy*
> DefaultLeaderElectionService->MultipleComponentLeaderElectionDriverAdapter->MultipleComponentLeaderElectionService->ZooKeeperMultipleComponentLeaderElectionDriver
> * *Long prefix* "MultipleComponentLeaderElection" is quite a long prefix, yet
> it is shared by many classes.
> * *Adapter as primary implementation* All non-testing non-multiple-component
> leadership drivers are deprecated. The primary implementation of
> LeaderElectionDriver is the adapter
> MultipleComponentLeaderElectionDriverAdapter.
> * *Possible redundancy* We currently have similar methods for the Dispatcher,
> ResourceManager, JobMaster and WebMonitorEndpoint (e.g., for granting
> leadership). As these methods are called at the same time due to the multiple
> component leader election, it may make sense to combine this logic into a
> single object.
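
To make the "Unclear contract" and "Possible redundancy" points more concrete, the following is a minimal Java sketch of one possible shape for a single object that owns the shared leadership and fans grant/revoke notifications out to all components. It is not Flink code and not necessarily the refactoring intended by this issue: Contender, SharedLeadership, RecordingContender and all method names are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/** Hypothetical sketch; none of the nested types are Flink classes. */
public class LeadershipFanOutSketch {

    /** Hypothetical callback that a component (e.g. a Dispatcher) would implement. */
    interface Contender {
        void grantLeadership(UUID sessionId);

        void revokeLeadership();
    }

    /** Single object that holds the one shared leadership and notifies every component. */
    static final class SharedLeadership {
        // Insertion-ordered so that notifications happen in registration order.
        private final Map<String, Contender> contenders = new LinkedHashMap<>();
        private UUID currentSession; // null while this process is not the leader

        synchronized void register(String componentId, Contender contender) {
            contenders.put(componentId, contender);
            if (currentSession != null) {
                // Late joiners learn about the leadership that is already held.
                contender.grantLeadership(currentSession);
            }
        }

        /** Removing one component notifies only it; the shared leadership is kept. */
        synchronized void unregister(String componentId) {
            Contender removed = contenders.remove(componentId);
            if (removed != null && currentSession != null) {
                removed.revokeLeadership();
            }
        }

        /** Would be called by the (not shown) driver when leadership is acquired. */
        synchronized void onLeadershipAcquired() {
            currentSession = UUID.randomUUID();
            for (Contender contender : contenders.values()) {
                contender.grantLeadership(currentSession);
            }
        }

        /** Would be called by the (not shown) driver when leadership is lost. */
        synchronized void onLeadershipLost() {
            currentSession = null;
            for (Contender contender : contenders.values()) {
                contender.revokeLeadership();
            }
        }

        synchronized boolean hasLeadership() {
            return currentSession != null;
        }
    }

    /** Test helper that records the notifications it receives. */
    static final class RecordingContender implements Contender {
        private final String name;
        private final List<String> events;

        RecordingContender(String name, List<String> events) {
            this.name = name;
            this.events = events;
        }

        @Override
        public void grantLeadership(UUID sessionId) {
            events.add(name + " granted");
        }

        @Override
        public void revokeLeadership() {
            events.add(name + " revoked");
        }
    }

    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        SharedLeadership leadership = new SharedLeadership();
        leadership.register("dispatcher", new RecordingContender("dispatcher", events));
        leadership.register("resource_manager", new RecordingContender("resource_manager", events));

        leadership.onLeadershipAcquired();
        leadership.unregister("dispatcher");

        // prints [dispatcher granted, resource_manager granted, dispatcher revoked]
        System.out.println(events);
        // prints true: stopping one component did not give up the shared leadership
        System.out.println(leadership.hasLeadership());
    }
}
```

In this sketch the driver would only need a reference to SharedLeadership, and components only implement Contender, so no object has to hold a reference back to its own listener. Whether that fits the existing HighAvailabilityServices lifecycle is an open question.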
--
This message was sent by Atlassian Jira
(v8.20.1#820001)