Hello Mike Percy, Kudu Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/6772
to look at the new patch set (#11).
Change subject: KUDU-1860: ksck doesn't identify tablets that are evicted but
still in config
......................................................................
KUDU-1860: ksck doesn't identify tablets that are evicted but still in config
This patch enhances ksck to gather consensus info from every
tablet. It compares this info with the master's view and, if
there are any conflicts, outputs the master's config along with
every conflicting config. To do this efficiently, it
reimplements the GetAllConsensusState RPC so that it gathers
info about every replica's consensus state.
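
The comparison described above can be pictured with a small
sketch. This is not the patch's code: ConsensusView and
find_conflicts are hypothetical names, and the real ksck works
on the protobufs returned by the RPC. It only illustrates the
idea of flagging every replica whose reported config differs
from the master's.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class ConsensusView:
    """One party's view of a tablet's config (hypothetical structure)."""
    members: FrozenSet[str]   # UUIDs of the members in the config
    leader: Optional[str]     # UUID of the believed leader, if any
    pending: bool = False     # True if the config is not yet committed


def find_conflicts(master: ConsensusView,
                   replicas: Dict[str, ConsensusView]) -> List[str]:
    """Return the UUIDs of replicas whose view differs from the master's."""
    return sorted(uuid for uuid, view in replicas.items() if view != master)


# Assumed example: the leader ts-a holds a pending config that drops ts-c,
# while the master and the other replicas still see the committed config.
master = ConsensusView(frozenset({"ts-a", "ts-b", "ts-c"}), leader="ts-a")
replicas = {
    "ts-a": ConsensusView(frozenset({"ts-a", "ts-b"}), leader="ts-a", pending=True),
    "ts-b": master,
    "ts-c": master,
}
conflicts = find_conflicts(master, replicas)
```

In this assumed scenario only ts-a is reported as conflicting,
which is the case where the tool would print the master's config
next to the leader's.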
This will catch at least the two problems identified in
KUDU-1860:
1. The leader has a pending config to remove a tablet, but it is
   not committed, so the master does not see this config. This
   can hide an unhealthy tablet if, e.g., one pending config
   member is down and the pending-to-be-kicked-out member is up,
   so 1/2 replicas are alive in the leader's active config but
   the master thinks 2/3 are alive.
2. No replica is leader, but the master believes there is a
   leader because its cache is stale.
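
The arithmetic in problem 1 can be checked with a short sketch.
The helper has_majority and the server names are hypothetical
and are not part of ksck. The sketch assumes a three-member
committed config in which the leader's pending config drops one
member.

```python
from typing import Iterable, Set


def has_majority(config: Iterable[str], live: Set[str]) -> bool:
    """True if strictly more than half of the config's members are alive."""
    members = list(config)
    alive = sum(1 for uuid in members if uuid in live)
    return alive > len(members) // 2


# Assumed scenario: ts-b is down; ts-c is up but about to be removed.
live_servers = {"ts-a", "ts-c"}
master_config = ["ts-a", "ts-b", "ts-c"]   # committed config known to the master
leader_active_config = ["ts-a", "ts-b"]    # leader's pending config without ts-c

master_thinks_healthy = has_majority(master_config, live_servers)
leader_is_healthy = has_majority(leader_active_config, live_servers)
```

Under these assumptions the master counts 2 of 3 members alive
and considers the tablet healthy, while the leader's active
config has only 1 of 2 alive and cannot reach a majority.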
Sample output showing #1:
https://gist.github.com/wdberkeley/d2606698e4f2e8ca3ef70d4dcef7ba9a
Change-Id: I16e4de09821b372c3773b4ade3fd9e37ab818808
---
M src/kudu/consensus/consensus.proto
M src/kudu/integration-tests/cluster_itest_util.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
M src/kudu/tools/ksck-test.cc
M src/kudu/tools/ksck.cc
M src/kudu/tools/ksck.h
M src/kudu/tools/ksck_remote.cc
M src/kudu/tools/ksck_remote.h
M src/kudu/tools/tool_action_cluster.cc
M src/kudu/tserver/tablet_replica_lookup.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tablet_service.h
M src/kudu/tserver/ts_tablet_manager.h
14 files changed, 521 insertions(+), 71 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/72/6772/11
--
To view, visit http://gerrit.cloudera.org:8080/6772
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I16e4de09821b372c3773b4ade3fd9e37ab818808
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <[email protected]>
Gerrit-Reviewer: Jean-Daniel Cryans <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Will Berkeley <[email protected]>