Mike Percy created KUDU-2486:
--------------------------------

             Summary: Leader should back off heartbeating to failed followers
                 Key: KUDU-2486
                 URL: https://issues.apache.org/jira/browse/KUDU-2486
             Project: Kudu
          Issue Type: Improvement
          Components: consensus
    Affects Versions: 1.7.1
            Reporter: Mike Percy


At the time of writing, the replica leader -> follower heartbeat path has no 
backoff built in. The leader simply sends a heartbeat once per configured 
period (say, 500ms), regardless of whether earlier requests failed. If a server 
is offline, this causes log spam until that replica is evicted, and if a server 
is overloaded, the steady stream of requests contributes to the overload.

Since we now have pre-election support, having leaders slow down their 
heartbeat attempts when follower requests are returning errors should not cause 
unnecessary leader elections, so backing off is feasible.
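
To make the proposal concrete, here is a minimal sketch of one possible 
backoff policy: capped exponential backoff keyed on the number of consecutive 
failed requests to a given follower. This is an illustration only, not Kudu 
code. The function name, the doubling factor, the 30-second cap, and the 
optional jitter are all assumptions chosen for the example; only the 500ms 
base period comes from the description above.

```python
import random


def next_heartbeat_delay_ms(consecutive_failures, base_ms=500, max_ms=30000,
                            jitter=0.0, rng=random):
    """Return the delay (ms) before the next heartbeat to one follower.

    Hypothetical helper: all names and defaults are illustrative.
    consecutive_failures is the number of failed requests to this follower
    since its last successful response.
    """
    if consecutive_failures <= 0:
        # Healthy follower: keep the normal configured heartbeat period.
        return float(base_ms)
    # Double the period per consecutive failure; cap the exponent so the
    # intermediate value stays small, then cap the delay itself.
    exponent = min(consecutive_failures, 32)
    delay = float(min(max_ms, base_ms * (2 ** exponent)))
    if jitter > 0:
        # Shorten the delay by a random fraction (up to `jitter`) so that many
        # tablets on one leader do not retry the same peer in lockstep.
        delay *= 1.0 - jitter * rng.random()
    return delay


if __name__ == "__main__":
    for failures in range(8):
        print(failures, next_heartbeat_delay_ms(failures))
```

In this sketch the failure counter would reset to zero on the first 
successful response, so a recovered follower immediately returns to the 
normal heartbeat period.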



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
