[kudu-CR] KUDU-2287 Expose election failures as a metric

Will Berkeley (Code Review) Mon, 16 Apr 2018 13:11:31 -0700

Will Berkeley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10076 )


Change subject: KUDU-2287 Expose election failures as a metric
......................................................................


Patch Set 1:

(7 comments)

Needs a test as well.

http://gerrit.cloudera.org:8080/#/c/10076/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10076/1//COMMIT_MSG@9
PS1, Line 9: Exposing the number of election failures since there was a stable
           : leader.
nit: Could you rewrite this as complete sentence:

"This patch exposes as a metric the number of election failures since the last 
successful election attempt."


http://gerrit.cloudera.org:8080/#/c/10076/1/src/kudu/consensus/raft_consensus.h
File src/kudu/consensus/raft_consensus.h:

http://gerrit.cloudera.org:8080/#/c/10076/1/src/kudu/consensus/raft_consensus.h@839
PS1, Line 839: The number of times this node has called and lost a leader 
election
This is different than what the patch is claiming the metric measures. 
failed_elections_since_stable_leader_ counts the number of times *this node* 
has called an election and not become leader since the last time it heard from 
a leader. Other nodes could have called elections and not won, too, and this 
metric doesn't count those.

As a practical matter, the randomization of election timeouts more-or-less 
guarantees this node will call elections if it isn't hearing from a leader, so 
the difference might not matter too much depending on the application of the 
metric.


http://gerrit.cloudera.org:8080/#/c/10076/1/src/kudu/consensus/raft_consensus.h@855
PS1, Line 855: > >
nit: While you're here, the space between the >'s isn't necessary in C++11.


http://gerrit.cloudera.org:8080/#/c/10076/1/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

http://gerrit.cloudera.org:8080/#/c/10076/1/src/kudu/consensus/raft_consensus.cc@156
PS1, Line 156: "Number of failed elections since there was a stable leader.
Not exactly correct according to my prev comment.


http://gerrit.cloudera.org:8080/#/c/10076/1/src/kudu/consensus/raft_consensus.cc@157
PS1, Line 157: reset
resets


http://gerrit.cloudera.org:8080/#/c/10076/1/src/kudu/consensus/raft_consensus.cc@2290
PS1, Line 2290: failed_elections_since_stable_leader_
Do you mean failed_elections_metric_? It should be non-null because of the 
FindOrCreate call, so I don't think the check here is necessary.


http://gerrit.cloudera.org:8080/#/c/10076/1/src/kudu/consensus/raft_consensus.cc@2453
PS1, Line 2453: failed_elections_since_stable_leader_
Same.



--
To view, visit http://gerrit.cloudera.org:8080/10076
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1b25df258cdba7bdae7bb2d7b4eb3d73b53425c3
Gerrit-Change-Number: 10076
Gerrit-PatchSet: 1
Gerrit-Owner: Attila Bukor <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Will Berkeley <[email protected]>
Gerrit-Comment-Date: Mon, 16 Apr 2018 20:10:39 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2287 Expose election failures as a metric

Reply via email to