Adar Dembo has posted comments on this change.

Change subject: consensus_peers: replace bespoke Raft heartbeat logic with periodic timers
......................................................................

Patch Set 2:

I reran Todd's experiments from https://gerrit.cloudera.org/#/c/7331.

With this patch series:

  leader thread count: 122
  512 heartbeats/sec received by each replica (based on replica RPC metrics)
  2025 voluntary ctx switch/sec on leader (based on metrics)
  214 ms/sec user CPU on leader (based on metrics)
  78 ms/sec system CPU on leader (based on metrics)

perf stat -I1000 on leader as sanity check:

     5.003299495         205.429271      task-clock
     5.003299495              5,654      context-switches
     5.003299495                 65      cpu-migrations
     5.003299495                  0      page-faults
     5.003299495        380,464,010      cycles
     5.003299495        125,979,277      instructions
     5.003299495         23,021,050      branches
     5.003299495          1,442,498      branch-misses

Couple interesting things:

1. As noted in the commit description, the average heartbeat period has changed again. It's slightly longer, which accounts for the drop from 638 hb/s to 512 (~25%).
2. The thread count is much lower because I also tested with the fd consolidation patch enabled.
3. Same goes for voluntary ctx switch rate, which I assume has changed for the same reason.

I also repeated the "kill -STOP, sleep 5s, kill -CONT" to make sure the heartbeats remain jittered. You can see a screenshot of the trace here: https://ibb.co/h1agK5.

--
To view, visit http://gerrit.cloudera.org:8080/7734
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5f7e1761d9f36dc6a25bd8e3e7d7a3b5c402afbf
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Dan Burkert <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-HasComments: No
