Hello Kudu Jenkins,
I'd like you to reexamine a change. Please visit
to look at the new patch set (#8).
Change subject: KUDU-763 consensus queue metrics on followers are messed up
KUDU-763 consensus queue metrics on followers are messed up
On follower tablet replicas, the majority_done_ops and
in_progress_ops metrics are wrong. They are computed as:
majority_done_ops = committed_index - all_replicated_opid
in_progress_ops = last_appended - committed_index
The values are wrong on followers for two reasons:
1) Followers do not update their consensus queue's committed index.
2) Followers do not maintain a correct value for all_replicated_opid,
since their queues generally only track the local peer and the leader
does not notify followers when ops are all-replicated.
This patch fixes 1 by having consensus notify the follower queues of
the new committed index whenever the consensus committed index is
updated. This makes in_progress_ops meaningful for followers. Note
that a follower queue's committed index is not used for anything
besides the metrics.
Fixing 2 would require having the leader notify followers when
operations are all-replicated. This isn't needed for consensus, and
would be used by the followers just for the majority_done_ops metric,
so I think it's best just to zero the metric for followers and
document that it is not meaningful in that case.
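
The resulting metric behavior can be sketched roughly as follows. This
is only an illustration of the two formulas above plus the proposed
zeroing on followers, not the code in this patch: the struct and
function names are hypothetical, and clamping negative differences to
zero is an assumption made for the sketch.

```cpp
#include <cstdint>

// Hypothetical snapshot of the indexes a consensus queue tracks.
// These names are illustrative and are not Kudu's actual types.
struct QueueIndexesSketch {
  int64_t last_appended;    // index of the last op appended locally
  int64_t committed_index;  // committed index as known to this queue
  int64_t all_replicated;   // index of the last all-replicated op
};

// Hypothetical holder for the two metric values discussed above.
struct QueueMetricsSketch {
  int64_t majority_done_ops;
  int64_t in_progress_ops;
};

// Assumption for this sketch: never report a negative count.
constexpr int64_t NonNegativeDiff(int64_t a, int64_t b) {
  return a > b ? a - b : 0;
}

// majority_done_ops = committed_index - all_replicated (leader only;
//                     zeroed on followers, which do not track it)
// in_progress_ops   = last_appended - committed_index
constexpr QueueMetricsSketch ComputeQueueMetrics(const QueueIndexesSketch& q,
                                                 bool is_leader) {
  return QueueMetricsSketch{
      is_leader ? NonNegativeDiff(q.committed_index, q.all_replicated) : 0,
      NonNegativeDiff(q.last_appended, q.committed_index)};
}
```

For example, with last_appended = 10, committed_index = 7 and
all_replicated = 4, a leader would report majority_done_ops = 3 and
in_progress_ops = 3, while a follower with the same last_appended and
committed_index would report majority_done_ops = 0 and
in_progress_ops = 3.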
4 files changed, 54 insertions(+), 9 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/01/3501/8
To view, visit http://gerrit.cloudera.org:8080/3501
Gerrit-Owner: Will Berkeley <wdberke...@gmail.com>
Gerrit-Reviewer: David Ribeiro Alves <dral...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Will Berkeley <wdberke...@gmail.com>