[
https://issues.apache.org/jira/browse/FLINK-31837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17739035#comment-17739035
]
Matthias Pohl edited comment on FLINK-31837 at 6/30/23 11:28 AM:
-----------------------------------------------------------------
{quote}
With the {{MultipleComponentLeaderElection*}} classes we added a circular
dependency between the {{DefaultLeaderElectionService}} and the
{{DefaultMultipleComponentLeaderElectionService}} which calls the
{{DefaultLeaderElectionService.onGrantLeadership}} while registering the
service in
[DefaultMultipleComponentLeaderElectionService:152|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/leaderelection/DefaultMultipleComponentLeaderElectionService.java#L152].
This call will result in accessing the {{DefaultLeaderElectionService}}
instance which is still in instantiation phase.
We're losing the circular dependency with the
{{MultipleComponentsLeaderElectionDriverAdapter}} class going away. Hence, we
can go ahead and move the driver instantiation into the constructor.
{quote}
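The quoted problem (a callback reaching an instance that is still being constructed) can be illustrated with a minimal sketch. All names below ({{ConstructorEscapeSketch}}, {{EagerRegistry}}, {{Service}}, {{Listener}}) are hypothetical stand-ins and not Flink's actual classes; the only assumption carried over from the description is that registration triggers the leadership callback immediately.

```java
/** Hypothetical illustration only; these are not Flink's actual classes. */
final class ConstructorEscapeSketch {

    /** Stand-in for the callback interface of an election service. */
    interface Listener {
        void onGrantLeadership();
    }

    /** Stand-in for a component that calls back immediately on registration. */
    static class EagerRegistry {
        void register(Listener listener) {
            // The callback fires before register() returns.
            listener.onGrantLeadership();
        }
    }

    /** Stand-in for a service that registers itself inside its constructor. */
    static class Service implements Listener {
        private final String name;
        boolean sawUninitializedState = false;

        Service(EagerRegistry registry) {
            // 'this' escapes here, before the remaining fields are assigned.
            registry.register(this);
            this.name = "election-service";
        }

        @Override
        public void onGrantLeadership() {
            // During construction the final field has not been assigned yet.
            sawUninitializedState = (name == null);
        }
    }

    public static void main(String[] args) {
        Service service = new Service(new EagerRegistry());
        System.out.println("callback saw uninitialized state: "
                + service.sawUninitializedState);
    }
}
```

In this sketch the callback runs while the constructor is still executing, so it observes a field that has not been assigned yet. That is the kind of access the description refers to.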
Instead of creating the driver in the constructor, we should instantiate it as
late as possible (i.e. only when the first contender is registered). We ran into
a problem when we tried to move the {{DefaultLeaderElectionService}}
instantiation into the constructors of the {{HighAvailabilityServices}}
implementations. This blocked certain tests because instantiating the driver
makes the corresponding JVM process participate in the leader election
unconditionally (even if the process, like the TaskManager, is not meant to
participate in any leader election). In a scenario where the TaskManager starts
first, it would block the JobManager from acquiring the leadership.
I'm going to update the Jira issue's description and title accordingly.
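A rough sketch of the lazy approach, assuming a driver factory is handed to the service and the driver is only created on the first registration. The names ({{LazyDriverSketch}}, {{ElectionService}}, {{Driver}}, {{participatesInElection}}) are hypothetical and heavily simplified; this is not the actual {{DefaultLeaderElectionService}} API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical illustration only; these are not Flink's actual classes. */
final class LazyDriverSketch {

    /** Stand-in for a driver; creating one means joining the election. */
    interface Driver {
        void close();
    }

    /** Stand-in for an election service that creates its driver lazily. */
    static class ElectionService {
        private final Supplier<Driver> driverFactory;
        private final List<String> contenders = new ArrayList<>();
        private Driver driver; // stays null until the first contender registers

        ElectionService(Supplier<Driver> driverFactory) {
            // Only the factory is stored. A process that never registers a
            // contender therefore never joins the election.
            this.driverFactory = driverFactory;
        }

        synchronized void register(String contenderId) {
            if (driver == null) {
                // The first registration triggers the driver creation.
                driver = driverFactory.get();
            }
            contenders.add(contenderId);
        }

        synchronized boolean participatesInElection() {
            return driver != null;
        }

        synchronized void close() {
            if (driver != null) {
                driver.close();
                driver = null;
            }
            contenders.clear();
        }
    }

    public static void main(String[] args) {
        ElectionService service = new ElectionService(() -> () -> { });
        System.out.println("after construction: " + service.participatesInElection());
        service.register("contender-1");
        System.out.println("after first register: " + service.participatesInElection());
        service.close();
    }
}
```

With this shape, a process that only constructs the service (the TaskManager in the scenario above) never creates a driver and therefore cannot compete with the JobManager for leadership.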
> Move LeaderElectionDriver instantiation into
> DefaultLeaderElectionService.register to instantiate the driver lazily
> ------------------------------------------------------------------------------------------------------------------
>
> Key: FLINK-31837
> URL: https://issues.apache.org/jira/browse/FLINK-31837
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Coordination
> Reporter: Matthias Pohl
> Priority: Major
>
> With the {{MultipleComponentLeaderElection*}} classes we added a circular
> dependency between the {{DefaultLeaderElectionService}} and the
> {{DefaultMultipleComponentLeaderElectionService}} which calls the
> {{DefaultLeaderElectionService.onGrantLeadership}} while registering the
> service in
> [DefaultMultipleComponentLeaderElectionService:152|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/leaderelection/DefaultMultipleComponentLeaderElectionService.java#L152].
> This call will result in accessing the {{DefaultLeaderElectionService}}
> instance which is still in instantiation phase.
> We're losing the circular dependency with the
> {{MultipleComponentsLeaderElectionDriverAdapter}} class going away. Hence, we
> can go ahead and move the driver instantiation into the constructor.