[ 
https://issues.apache.org/jira/browse/FLINK-32422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17738107#comment-17738107
 ] 

Matthias Pohl edited comment on FLINK-32422 at 6/28/23 1:05 PM:
----------------------------------------------------------------

I thought about it once more: The {{LeaderElectionService}} should handle the 
revoke leadership event immediately, because the event means that the contender 
is being deregistered and won't be able to handle any outstanding leadership 
events afterwards anyway. As a consequence, we have to accept that a revoke 
event can be sent to the contender twice.

It's implemented in the same way in the {{DefaultLeaderElectionService}} (see 
[DefaultLeaderElectionService:233|https://github.com/apache/flink/blob/caa5f181598658403d081a0d8b733330c70ec51c/flink-runtime/src/main/java/org/apache/flink/runtime/leaderelection/DefaultLeaderElectionService.java#L233]).

I'm downgrading this issue to Minor. I will add a test case in 
{{EmbeddedLeaderServiceTest}}.
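
Accepting a duplicate revoke implies that the contender's revoke handling is idempotent. A minimal sketch of that idea in plain Java follows. {{TolerantContender}} and all of its members are hypothetical names chosen for illustration. The class only mirrors the grant/revoke shape of a contender and is not Flink's implementation.

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical contender (illustration only): a second revoke is a no-op. */
class TolerantContender {
    // null means "no leadership held"
    private final AtomicReference<UUID> leaderSessionId = new AtomicReference<>();
    // counts only the revocations that actually ended a leadership session
    private final AtomicInteger effectiveRevocations = new AtomicInteger();

    void grantLeadership(UUID sessionId) {
        leaderSessionId.set(sessionId);
    }

    void revokeLeadership() {
        // getAndSet returns the previous value; null means there was nothing to revoke
        if (leaderSessionId.getAndSet(null) != null) {
            effectiveRevocations.incrementAndGet();
        }
    }

    boolean hasLeadership() {
        return leaderSessionId.get() != null;
    }

    int effectiveRevocations() {
        return effectiveRevocations.get();
    }
}
```

With this shape, calling {{revokeLeadership()}} twice after a single grant changes the contender's state only once, and a revoke without a preceding grant changes nothing.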


was (Author: mapohl):
I thought about it once more: The revoke leadership event should be immediately 
handled by the {{LeaderElectionService}} because it means that the contender is 
deregistered and wouldn't be able to handle any outstanding leadership events 
anymore, anyway. In this way, we have to accept that a revoke event can be sent 
to the contender twice.

It's implemented in the same way in the {{DefaultLeaderElectionService}} (see 
[DefaultLeaderElectionService:233|https://github.com/apache/flink/blob/caa5f181598658403d081a0d8b733330c70ec51c/flink-runtime/src/main/java/org/apache/flink/runtime/leaderelection/DefaultLeaderElectionService.java#L233]).

I'm downgrading this issue to Major. I will add a test case in 
{{EmbeddedLeaderServiceTest}}.

> EmbeddedLeaderService doesn't handle the leader events properly in edge cases
> -----------------------------------------------------------------------------
>
>                 Key: FLINK-32422
>                 URL: https://issues.apache.org/jira/browse/FLINK-32422
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Coordination
>    Affects Versions: 1.18.0
>            Reporter: Matthias Pohl
>            Priority: Minor
>
> Leadership is granted when the first contender is registered. Registration 
> sets the leadership flag within the EmbeddedLeaderService (see 
> [EmbeddedLeaderService:312ff|https://github.com/apache/flink/blob/033aca7566a0a561410b3c0e1ae8dca856cd26ce/flink-runtime/src/main/java/org/apache/flink/runtime/highavailability/nonha/embedded/EmbeddedLeaderService.java#L312]).
> The grantLeadershipCall, which informs the contender about its leadership, 
> is only triggered afterwards. In the meantime, close can be called on the 
> contender. This deregisters the contender and calls revoke on it, even though 
> the contender never had the chance to gain the leadership.
> This issue was introduced by FLINK-30765.
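
The interleaving described in the issue can be reproduced deterministically in a small, self-contained Java sketch. Everything here is an assumption made for illustration: {{SketchService}}, {{RecordingContender}} and the pending-notification queue are hypothetical stand-ins and do not reflect the actual code of EmbeddedLeaderService. The queue models an executor that has not yet run the deferred grant notification.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class EmbeddedRaceSketch {

    /** Hypothetical contender that records the events it receives. */
    static final class RecordingContender {
        final List<String> events = new ArrayList<>();

        void grantLeadership() { events.add("grant"); }

        void revokeLeadership() { events.add("revoke"); }
    }

    /** Hypothetical, heavily simplified service (not Flink's implementation). */
    static final class SketchService {
        // stands in for an executor whose tasks have not run yet
        private final Queue<Runnable> pendingNotifications = new ArrayDeque<>();
        private RecordingContender contender;
        private boolean hasLeadership;

        void register(RecordingContender newContender) {
            contender = newContender;
            hasLeadership = true;                                    // flag is set right away
            pendingNotifications.add(newContender::grantLeadership); // notification is deferred
        }

        void close() {
            if (contender != null && hasLeadership) {
                contender.revokeLeadership(); // revoke is delivered immediately
            }
            hasLeadership = false;
            contender = null;
        }
    }

    /** Registers a contender and closes it before the deferred grant has run. */
    static List<String> reproduce() {
        SketchService service = new SketchService();
        RecordingContender contender = new RecordingContender();
        service.register(contender);
        service.close();
        return new ArrayList<>(contender.events);
    }
}
```

In this sketch, {{reproduce()}} returns a list containing only {{"revoke"}}: the contender is told to give up a leadership it was never informed about.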



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
