Todd Lipcon created KUDU-1585:
---------------------------------
Summary: Leader should back off consensus request batch sizes after follower throttling kicks in
Key: KUDU-1585
URL: https://issues.apache.org/jira/browse/KUDU-1585
Project: Kudu
Issue Type: Bug
Components: consensus
Affects Versions: 0.10.0
Reporter: Todd Lipcon
I've pushed a cluster to the point of overload: most of the servers are deep in
the memory-limit zone and are completing writes at a very low rate (only a few
per second). In some cases, I see a leader sending batches of 70+ operations to
a follower that rejects the whole batch most of the time and only occasionally
accepts one or two of the operations. Nonetheless, the leader keeps sending
several MB of data to the follower every second.

The overall network traffic on this cluster is quite high even though the
cluster is making very little progress, and I'm guessing most of that traffic
comes from these retries.
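
A minimal sketch of one possible back-off policy, to make the request concrete. This is not Kudu's implementation: `BatchSizer`, `next_batch`, and `on_response` are hypothetical names, and the policy (halve the per-request operation limit when the follower accepts fewer operations than were sent, grow it by one after a fully accepted request) is an assumption chosen for illustration.

```python
class BatchSizer:
    """Hypothetical per-follower limit on operations per consensus request.

    Multiplicative decrease when the follower throttles, additive
    increase once it accepts a whole request again.
    """

    def __init__(self, max_ops=64, min_ops=1):
        if not 1 <= min_ops <= max_ops:
            raise ValueError("require 1 <= min_ops <= max_ops")
        self.max_ops = max_ops
        self.min_ops = min_ops
        self.limit = max_ops  # start optimistic: send full batches

    def next_batch(self, pending_ops):
        """Number of operations to put in the next request."""
        return min(pending_ops, self.limit)

    def on_response(self, sent, accepted):
        """Update the limit from the follower's response and return it."""
        if accepted < sent:
            # Follower rejected some or all of the batch: halve the limit,
            # but never go below min_ops so progress is still possible.
            self.limit = max(self.min_ops, self.limit // 2)
        else:
            # Whole request accepted: probe upward slowly.
            self.limit = min(self.max_ops, self.limit + 1)
        return self.limit


if __name__ == "__main__":
    sizer = BatchSizer(max_ops=64)
    pending = 70
    # Simulate a throttled follower that rejects every request outright.
    for _ in range(8):
        sent = sizer.next_batch(pending)
        sizer.on_response(sent, accepted=0)
        print("sent", sent, "ops; next limit", sizer.limit)
```

With a policy along these lines, a follower that keeps rejecting requests would receive one operation per retry after a handful of round trips instead of 70+ each time. Limiting by operation count is a simplification; a real fix might cap request bytes instead, since the cost described above is MB on the wire.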