Mike Percy created KUDU-2160:
--------------------------------

             Summary: Reduce UpdateConsensus RPC timeouts
                 Key: KUDU-2160
                 URL: https://issues.apache.org/jira/browse/KUDU-2160
             Project: Kudu
          Issue Type: Bug
          Components: consensus
    Affects Versions: 1.5.0
            Reporter: Mike Percy


We often see many UpdateConsensus() RPC calls time out when disks are slow.
We need to investigate this further to understand the dynamics, then find a
solution.

When the local disks on a Kudu cluster are overloaded, the RaftConsensus
metadata fsyncs triggered by Raft votes and term changes take longer, so the
RaftConsensus lock is held for longer. UpdateConsensus() RPCs then "stack" up
behind that lock, resulting in timeouts.
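
To make the "stacking" effect concrete, here is a rough toy model, not Kudu
code. It assumes a single lock, FIFO service, RPCs arriving at a fixed
interval, and a metadata fsync that holds the lock from t=0. The function name
and all parameters and numbers are hypothetical and chosen only for
illustration.

```python
def simulate_stacking(fsync_ms, rpc_interval_ms, rpc_work_ms,
                      rpc_timeout_ms, num_rpcs):
    """Count RPCs whose total latency exceeds the timeout in a toy model.

    Assumptions (illustrative only, not taken from Kudu):
      - At t=0 a vote or term change takes the lock and holds it for
        fsync_ms while flushing consensus metadata.
      - RPC i arrives at i * rpc_interval_ms and needs the same lock
        for rpc_work_ms.
      - Waiters are served one at a time in arrival order.
      - An RPC that exceeds the timeout still occupies the lock, since
        the server does not know the caller has given up.
    """
    lock_free_at = fsync_ms
    timed_out = 0
    for i in range(num_rpcs):
        arrival = i * rpc_interval_ms
        start = max(arrival, lock_free_at)   # wait until the lock is free
        finish = start + rpc_work_ms
        if finish - arrival > rpc_timeout_ms:
            timed_out += 1
        lock_free_at = finish
    return timed_out


if __name__ == "__main__":
    # Hypothetical numbers: 1 s timeout, one RPC every 100 ms, 1 ms of work.
    for fsync_ms in (5, 3000):
        n = simulate_stacking(fsync_ms, 100, 1, 1000, 30)
        print(f"fsync={fsync_ms} ms -> {n} of 30 RPCs time out")
```

In this model a single slow fsync delays every RPC that queues behind it, not
just the first one, which is one plausible reading of the behavior described
above. Whether the real lock hand-off behaves this way is part of what needs
investigating.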




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
