Thanks Yang and Chesnay for your feedback. I'm gonna go ahead and start a
voting thread as this discussion thread is already open for some time.

Best,
Matthias

On Wed, Jan 25, 2023 at 4:08 AM Yang Wang <wangyang0...@apache.org> wrote:

> Having the *start()* in *LeaderContender* interface and bringing back the
> *LeaderElection* with some new methods make sense to me.
>
> I have no more concerns now.
>
>
> >    - *LeaderContender*: The LeaderContender is integrated as usual except
> >    that it accesses the LeaderElection instead of the
> LeaderElectionService.
> >    It's going to call startLeaderElection(LeaderContender) where,
> previously,
> >    LeaderElectionService.start(LeaderContender) was called.
> >
> > nit: we call the *LeaderElection#startLeaderElection()*, not the
> *LeaderElection#startLeaderElection(LeaderContender)*. Because we have
> already set the leaderContender in the
> *LeaderElection#register(LeaderContender)*.
>
>
> Best,
> Yang
>
>
> Chesnay Schepler <ches...@apache.org> 于2023年1月23日周一 23:16写道:
>
> > Thanks for updating the design. From my side this looks good.
> >
> > On 18/01/2023 17:59, Matthias Pohl wrote:
> > > After another round of discussion, I came up with a (hopefully) final
> > > proposal. The previously discussed approach was still not optimal
> because
> > > the contender ID lived in the LeaderContender even though it is
> actually
> > > LeaderElectionService-internal knowledge. Fixing that helped fix the
> > > overall architecture. Additionally, it brought back the LeaderElection
> > > interface with slightly different methods.
> > >
> > > I updated the "Code Cleanup: Merge
> MultipleComponentLeaderElectionService
> > > into LeaderElectionService" section and moved the old proposal into the
> > > section for rejected alternatives. Feel free to have another look at
> the
> > > updated version [1].
> > >
> > > Matthias
> > >
> > > [1]
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-285%3A+Refactoring+LeaderElection+to+make+Flink+support+multi-component+leader+election+out-of-the-box
> > >
> > > On Wed, Jan 18, 2023 at 1:40 PM Matthias Pohl <matthias.p...@aiven.io>
> > > wrote:
> > >
> > >> Thanks for participating in the discussion, Yang & Chesnay.
> > LeaderElection
> > >> interface extension gave me a headache as well. I added it initially
> > >> because I thought it would be of more value. But essentially, it
> doesn't
> > >> help but make the code harder to understand (as your questions
> > rightfully
> > >> point out). I agree that the FLIP is good enough without this
> > extension. I
> > >> moved it into the Rejected Alternatives section of the FLIP and would
> > >> propose going ahead without it.
> > >>
> > >> I will answer your questions about the LeaderElection extension,
> anyway:
> > >>
> > >> BTW, if the *LeaderElectionService#register(return LeaderElection)*
> and
> > >>> *LeaderElectionService#onGrantLeadership* are guarded by a same lock,
> > then
> > >>> we could ensure that the leaderElection in *LeaderContender* is
> always
> > >>> non-null when it tries to confirm the leadership. And then we do not
> > need
> > >>> the
> > >>> *LeaderContender#initializeLeaderElection*. Right?
> > >>
> > >> No, we still would need LeaderContender#initializeLeaderElection
> because
> > >> the LeaderElectionService needs to be capable of setting the
> > LeaderElection
> > >> within the LeaderContender before triggering the process for granting
> > the
> > >> leadership. This all needs to happen within the
> > >> LeaderElectionService#register(LeaderContender). It's indepent of the
> > lock.
> > >>
> > >> With the extension, how does the leader contender get access to the
> > >>> LeaderElection? I would've assumed that LEService returns a
> > LeaderElection
> > >>> when register is called, but according to the diagram this method
> > doesn't
> > >>> return anything. Is that what initiateLeaderElection is doing?
> > >>
> > >> Correct. My initial plan was to make
> > >> LeaderElectionService#register(LeaderContender) return the
> > LeaderElection
> > >> instance. That method could have been called within the
> LeaderContender.
> > >> But this approach has the flaw that LeaderContender would be in charge
> > >> within this control flow where, actually, we would want
> > >> LeaderElectionService to be still in charge to trigger the process for
> > >> granting the leadership. This required the
> > >> LeaderContender.initializeLeaderElection(LeaderElection) method to be
> > added
> > >> to enable the LeaderElectionService to do the initialization. I added
> a
> > >> comment to the corresponding class diagram to make this clearer.
> > >>
> > >> The DefaultLeaderElection will rely on package-private methods of the
> > >>> DLEService to handle confirm/hasLeadership calls?
> > >>
> > >> Correct. I added the missing package-private methods to the class
> > diagram
> > >> in the FLIP to clear things up.
> > >>
> > >> On Wed, Jan 18, 2023 at 11:47 AM Chesnay Schepler <ches...@apache.org
> >
> > >> wrote:
> > >>
> > >>> There are a lot of good things in this, and until the Extension bit
> I'm
> > >>> fully on board.
> > >>>
> > >>> With the extension, how does the leader contender get access to the
> > >>> LeaderElection? I would've assumed that LEService returns a
> > >>> LeaderElection when register is called, but according to the diagram
> > >>> this method doesn't return anything. Is that what
> > initiateLeaderElection
> > >>> is doing?
> > >>>
> > >>> The DefaultLeaderElection will rely on package-private methods of the
> > >>> DLEService to handle confirm/hasLeadership calls?
> > >>>
> > >>> I don't fully understand why LContender#initializeLeaderElection is
> > >>> required.
> > >>>
> > >>> On 05/01/2023 14:49, Matthias Pohl wrote:
> > >>>> Hi everyone,
> > >>>> I brought up FLINK-26522 [1] in the mailing list discussion about
> > >>>> consolidating the HighAvailabilityServices interfaces [2],
> previously.
> > >>>> There, it was concluded that the community still wants the ability
> to
> > >>> have
> > >>>> per-component leader election and, therefore, keep the
> > >>>> HighAvailabilityServices interface as is. I went back to work on
> > >>>> FLINK-26522 [1] to figure out how we can simplify the current
> codebase
> > >>>> keeping the decision in mind.
> > >>>>
> > >>>> I wanted to handle FLINK-26522 [1] as a follow-up cleanup task of
> > >>>> FLINK-24038 [3]. But while working on it, I realized that even
> > >>> FLINK-24038
> > >>>> [3] shouldn't have been handled without a FLIP. The per-process
> leader
> > >>>> election which was introduced in FLINK-24038 [3] changed the
> ownership
> > >>> of
> > >>>> certain components. This is actually a change that should have been
> > >>>> discussed in the mailing list and deserved a FLIP. To overcome this
> > >>>> shortcoming of FLINK-24038 [3], I decided to prepare FLIP-285 [4] to
> > >>>> provide proper documentation of what happened in FLINK-24038 and
> what
> > >>> will
> > >>>> be manifested with resolving its follow-up FLINK-26522 [1].
> > >>>>
> > >>>> Conceptually, this FLIP proposes moving away from Flink's support
> for
> > >>>> single-contender LeaderElectionServices and introducing
> > multi-contender
> > >>>> support by disconnecting the HA-backend leader election lifecycle
> from
> > >>> the
> > >>>> LeaderContender's lifecycle. This allows us to provide
> LeaderElection
> > >>> per
> > >>>> component (as it was requested in [2]) but also enables us to
> utilize
> > a
> > >>>> single leader election for multiple components/contenders as well
> > >>> without
> > >>>> the complexity of the code that was introduced by FLINK-24038 [3].
> > >>>>
> > >>>> I'm looking forward to your comments.
> > >>>>
> > >>>> Matthias
> > >>>>
> > >>>> [1] https://issues.apache.org/jira/browse/FLINK-26522
> > >>>> [2]
> https://lists.apache.org/thread/9oy2ml9s3j1v6r77h31sndhc3gw57cfm
> > >>>> [3] https://issues.apache.org/jira/browse/FLINK-24038
> > >>>> [4]
> > >>>>
> > >>>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-285%3A+Refactoring+LeaderElection+to+make+Flink+support+multi-component+leader+election+out-of-the-box
> > >>>
> >
> >
>

Reply via email to