Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/10076 )
Change subject: KUDU-2287 Expose election failures as a metric ...................................................................... Patch Set 1: (7 comments) Needs a test as well. http://gerrit.cloudera.org:8080/#/c/10076/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10076/1//COMMIT_MSG@9 PS1, Line 9: Exposing the number of election failures since there was a stable : leader. nit: Could you rewrite this as complete sentence: "This patch exposes as a metric the number of election failures since the last successful election attempt." http://gerrit.cloudera.org:8080/#/c/10076/1/src/kudu/consensus/raft_consensus.h File src/kudu/consensus/raft_consensus.h: http://gerrit.cloudera.org:8080/#/c/10076/1/src/kudu/consensus/raft_consensus.h@839 PS1, Line 839: The number of times this node has called and lost a leader election This is different than what the patch is claiming the metric measures. failed_elections_since_stable_leader_ counts the number of times *this node* has called an election and not become leader since the last time it heard from a leader. Other nodes could have called elections and not won, too, and this metric doesn't count those. As a practical matter, the randomization of election timeouts more-or-less guarantees this node will call elections if it isn't hearing from a leader, so the difference might not matter too much depending on the application of the metric. http://gerrit.cloudera.org:8080/#/c/10076/1/src/kudu/consensus/raft_consensus.h@855 PS1, Line 855: > > nit: While you're here, the space between the >'s isn't necessary in C++11. http://gerrit.cloudera.org:8080/#/c/10076/1/src/kudu/consensus/raft_consensus.cc File src/kudu/consensus/raft_consensus.cc: http://gerrit.cloudera.org:8080/#/c/10076/1/src/kudu/consensus/raft_consensus.cc@156 PS1, Line 156: "Number of failed elections since there was a stable leader. Not exactly correct according to my prev comment. http://gerrit.cloudera.org:8080/#/c/10076/1/src/kudu/consensus/raft_consensus.cc@157 PS1, Line 157: reset resets http://gerrit.cloudera.org:8080/#/c/10076/1/src/kudu/consensus/raft_consensus.cc@2290 PS1, Line 2290: failed_elections_since_stable_leader_ Do you mean failed_elections_metric_? It should be non-null because of the FindOrCreate call, so I don't think the check here is necessary. http://gerrit.cloudera.org:8080/#/c/10076/1/src/kudu/consensus/raft_consensus.cc@2453 PS1, Line 2453: failed_elections_since_stable_leader_ Same. -- To view, visit http://gerrit.cloudera.org:8080/10076 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b25df258cdba7bdae7bb2d7b4eb3d73b53425c3 Gerrit-Change-Number: 10076 Gerrit-PatchSet: 1 Gerrit-Owner: Attila Bukor <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-Reviewer: Will Berkeley <[email protected]> Gerrit-Comment-Date: Mon, 16 Apr 2018 20:10:39 +0000 Gerrit-HasComments: Yes
