[ https://issues.apache.org/jira/browse/KUDU-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17030205#comment-17030205 ]

ASF subversion and git services commented on KUDU-2149:
-------------------------------------------------------

Commit b32283d2e5ce3d88e4f6afdeedf1c616721cce3a in kudu's branch 
refs/heads/master from Adar Dembo
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=b32283d ]

KUDU-2155: disable failure detector around elections

This is a more complete fix for KUDU-2149: it disables the failure
detector completely around a leader election.

There are several changes to make this happen:
1. The FD is changed to use a one-shot timer, which automatically disables
   itself upon firing.
2. Because all elections are guaranteed to reach DoElectionCallback, that's
   where we reenable the FD.
3. We provide a special case for pre-elections where FD reenabling is
   deferred until after the subsequent real election finishes.

I'm still not convinced this is the cleanest approach, but it seems to work.
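
The three changes above can be sketched in a few lines of Python. This is
only an illustration, not the Kudu code (which is C++): the class and method
names are hypothetical, the timer is driven by hand, and re-enabling the FD
after a lost pre-election is an assumption, since the commit message only
describes the case where a real election follows.

```python
class OneShotFailureDetector:
    """Hypothetical one-shot FD: firing disables it until enable() is called."""

    def __init__(self, on_failure):
        self._on_failure = on_failure
        self.enabled = True

    def timer_expired(self):
        # A disabled detector ignores timer expiry, so no election can stack.
        if not self.enabled:
            return False
        self.enabled = False  # change 1: disable automatically upon firing
        self._on_failure()
        return True

    def enable(self):
        self.enabled = True


class Replica:
    """Hypothetical replica that records which elections it has started."""

    def __init__(self):
        self.elections_started = []
        self.fd = OneShotFailureDetector(
            lambda: self.start_election(pre_election=True))

    def start_election(self, pre_election):
        kind = "pre-election" if pre_election else "election"
        self.elections_started.append(kind)

    def election_callback(self, pre_election, won):
        # Stand-in for DoElectionCallback, which every election reaches.
        if pre_election and won:
            # Change 3: defer re-enabling until the real election finishes.
            self.start_election(pre_election=False)
            return
        # Change 2: re-enable the FD once the election is over.
        # (Assumption: a lost pre-election also re-enables it.)
        self.fd.enable()


def demo():
    replica = Replica()
    observed = []
    observed.append(replica.fd.timer_expired())  # fires, starts pre-election
    observed.append(replica.fd.timer_expired())  # ignored, FD is disabled
    replica.election_callback(pre_election=True, won=True)
    observed.append(replica.fd.enabled)          # still disabled
    replica.election_callback(pre_election=False, won=True)
    observed.append(replica.fd.enabled)          # re-enabled
    return observed, replica.elections_started
```

In this sketch the second timer expiry is dropped because the detector is
still disabled, and the detector only becomes active again after the real
election's callback has run.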

Change-Id: Idcd311cee028c48e908f290d60c474e8a4557d97
Reviewed-on: http://gerrit.cloudera.org:8080/8134
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin
Reviewed-by: Andrew Wong


> New failure detector implementation can lead to election stacking
> -----------------------------------------------------------------
>
>                 Key: KUDU-2149
>                 URL: https://issues.apache.org/jira/browse/KUDU-2149
>             Project: Kudu
>          Issue Type: Bug
>          Components: consensus
>    Affects Versions: 1.5.0
>            Reporter: Adar Dembo
>            Assignee: Adar Dembo
>            Priority: Critical
>             Fix For: 1.6.0
>
>
> A new failure detector (FD) implementation was merged in commit 21b0f3d and 
> is part of Kudu 1.5. One of the key changes is that the detection logic runs 
> on a reactor thread rather than on a dedicated per-replica thread. But, 
> because reactor threads are shared, the election started in the event of a 
> failure must be thunked to the Raft thread pool (starting an election means 
> casting a vote, which generally means performing IO, which is verboten on a 
> reactor thread).
> Because of this thunking, the FD rearms immediately; the previous
> implementation did not do this. If there's a lot of outstanding IO (e.g.
> during an election storm
> across thousands of tablets), it's possible for the FD to fire again while 
> the first election task is still waiting to cast its vote. The new election 
> task will try to acquire the consensus lock and block on it (it's held by the 
> first election task). And so on. When the original IO finally completes, all 
> of the follow-on elections will get unblocked at the same time.
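
The stacking described above can be illustrated with a toy model in Python.
It is a rough sketch under simplifying assumptions that are not taken from
Kudu: the FD fires at a fixed period (hypothetical time units), the first
election's IO takes a fixed duration, and every election task submitted
while that IO is outstanding blocks on the consensus lock.

```python
def elections_submitted(io_duration, fd_period, rearm_immediately):
    """Count election tasks submitted before the first one finishes its IO.

    The first FD firing happens at time fd_period and submits the first
    election task, whose IO completes io_duration later. If the FD rearms
    immediately, it keeps firing every fd_period and each firing submits
    another task that blocks behind the first.
    """
    submitted = 0
    now = fd_period
    first_done = fd_period + io_duration
    while now < first_done:
        submitted += 1
        if not rearm_immediately:
            # The FD stays quiet until the election is over.
            break
        now += fd_period
    return submitted
```

With `io_duration=10` and `fd_period=3`, the model submits 4 election tasks
when the FD rearms immediately (the original plus 3 stacked behind the lock)
and only 1 when it does not.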



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
