Alexey Serbin created KUDU-2709:
-----------------------------------
Summary: ksck should not report short transient election states as problematic
Key: KUDU-2709
URL: https://issues.apache.org/jira/browse/KUDU-2709
Project: Kudu
Issue Type: Improvement
Components: CLI, ksck
Affects Versions: 1.7.1, 1.8.0, 1.7.0, 1.9.0
Reporter: Alexey Serbin
Currently, when {{ksck}} captures a tablet's replicas in the process of Raft
leader election, it might report the tablet as unavailable if the captured Raft
configurations differ between
Below is an example of output from the {{kudu cluster ksck}} tool (version 1.8):
{noformat}
Tablet ab548e2415854f3b8bee49b59fb66d6b of table 'default.loadgen_auto_16ffa6f3c4c948459d3c86ab550d628d' is conflicted: Tablet ab548e2415854f3b8bee49b59fb66d6b of table 'default.loadgen_auto_16ffa6f3c4c948459d3c86ab550d628d' replicas' active configs disagree with the master's
  1d707658cd6b4cdb9c58ec2a811c2c6e (quasar-swnlpg-4.vpc.cloudera.com:7050): RUNNING [LEADER]
  b0da2997f80447afbcb094456ac20fa6 (quasar-swnlpg-3.vpc.cloudera.com:7050): RUNNING
  b78b856b8a94446c9609325e66b9295f (quasar-swnlpg-2.vpc.cloudera.com:7050): RUNNING
All reported replicas are:
A = 1d707658cd6b4cdb9c58ec2a811c2c6e
B = b0da2997f80447afbcb094456ac20fa6
C = b78b856b8a94446c9609325e66b9295f
The consensus matrix is:
 Config source | Replicas     | Current term | Config index | Committed?
---------------+--------------+--------------+--------------+------------
 master        | A* B C       |              |              | Yes
 A             | A B C        | 2            | -1           | Yes
 B             | A B C        | 2            | -1           | Yes
 C             | A* B C       | 1            | -1           | Yes
Summary by table
 Name                                                  | RF | Status             | Total Tablets | Healthy | Recovering | Under-replicated | Unavailable
-------------------------------------------------------+----+--------------------+---------------+---------+------------+------------------+-------------
 default.loadgen_auto_16ffa6f3c4c948459d3c86ab550d628d | 3  | CONSENSUS_MISMATCH | 8             | 7       | 0          | 0                | 1
{noformat}
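To make the situation in the consensus matrix above concrete, here is a minimal Python sketch of one possible heuristic for telling a transient election apart from a real membership conflict. It is not how {{ksck}} is implemented: the {{ConfigView}} structure and the {{looks_like_transient_election}} function are hypothetical names introduced only for illustration. The assumption is that a mismatch is likely transient when every config source reports the same set of replicas and the views differ only in the leader or the current term.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class ConfigView:
    """One row of the consensus matrix (hypothetical structure, not ksck's)."""
    source: str
    replicas: FrozenSet[str]
    leader: Optional[str]   # None when the source reports no leader
    term: Optional[int]     # None for the master, which reports no term


def looks_like_transient_election(views: List[ConfigView]) -> bool:
    """Assumed heuristic: same membership everywhere, leader or term differs."""
    if not views:
        return False
    same_members = len({v.replicas for v in views}) == 1
    leaders = {v.leader for v in views}
    terms = {v.term for v in views if v.term is not None}
    leadership_in_flux = len(leaders) > 1 or len(terms) > 1
    return same_members and leadership_in_flux


# The matrix from the ksck output above: A and B are at term 2 with no
# leader, while C and the master still consider A the leader.
members = frozenset({"A", "B", "C"})
matrix = [
    ConfigView("master", members, "A", None),
    ConfigView("A", members, None, 2),
    ConfigView("B", members, None, 2),
    ConfigView("C", members, "A", 1),
]
result = looks_like_transient_election(matrix)
print(result)
```

Under this assumption, a tool could re-sample such a tablet after a short delay before reporting it, instead of immediately counting it as unavailable.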