Alexey Serbin created KUDU-2709:
-----------------------------------

             Summary: ksck should not report short transient election states as problematic
                 Key: KUDU-2709
                 URL: https://issues.apache.org/jira/browse/KUDU-2709
             Project: Kudu
          Issue Type: Improvement
          Components: CLI, ksck
    Affects Versions: 1.7.1, 1.8.0, 1.7.0, 1.9.0
            Reporter: Alexey Serbin

Currently, when {{ksck}} captures a tablet's replicas in the process of Raft leader election, it might report the tablet as unavailable if the captured Raft configurations differ between

Below is an example of output from the {{kudu cluster ksck}} tool (version 1.8):

{noformat}
Tablet ab548e2415854f3b8bee49b59fb66d6b of table 'default.loadgen_auto_16ffa6f3c4c948459d3c86ab550d628d' is conflicted: Tablet ab548e2415854f3b8bee49b59fb66d6b of table 'default.loadgen_auto_16ffa6f3c4c948459d3c86ab550d628d' replicas' active configs disagree with the master's
  1d707658cd6b4cdb9c58ec2a811c2c6e (quasar-swnlpg-4.vpc.cloudera.com:7050): RUNNING [LEADER]
  b0da2997f80447afbcb094456ac20fa6 (quasar-swnlpg-3.vpc.cloudera.com:7050): RUNNING
  b78b856b8a94446c9609325e66b9295f (quasar-swnlpg-2.vpc.cloudera.com:7050): RUNNING
All reported replicas are:
  A = 1d707658cd6b4cdb9c58ec2a811c2c6e
  B = b0da2997f80447afbcb094456ac20fa6
  C = b78b856b8a94446c9609325e66b9295f
The consensus matrix is:
 Config source |   Replicas   | Current term | Config index | Committed?
---------------+--------------+--------------+--------------+------------
 master        | A*  B   C    |              |              | Yes
 A             | A   B   C    | 2            | -1           | Yes
 B             | A   B   C    | 2            | -1           | Yes
 C             | A*  B   C    | 1            | -1           | Yes

Summary by table
                         Name                          | RF |       Status       | Total Tablets | Healthy | Recovering | Under-replicated | Unavailable
-------------------------------------------------------+----+--------------------+---------------+---------+------------+------------------+-------------
 default.loadgen_auto_16ffa6f3c4c948459d3c86ab550d628d | 3  | CONSENSUS_MISMATCH | 8             | 7       | 0          | 0                | 1
{noformat}
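
To make the pattern in the consensus matrix above concrete, here is a minimal Python sketch of a heuristic that tells an election in progress apart from a real configuration conflict. It is an illustration only: ksck is written in C++, and every name here ({{ConfigView}}, {{looks_like_transient_election}}, {{MATRIX}}) is hypothetical and not part of Kudu. The assumption is that a snapshot taken mid-election shows the same committed membership from every source, with only the known leader or the current term differing.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class ConfigView:
    """One row of the consensus matrix (hypothetical type, not a Kudu one)."""
    source: str               # "master" or a replica label such as "A"
    members: FrozenSet[str]   # replica labels listed in the reported config
    leader: Optional[str]     # label marked with '*', or None if no leader is known
    term: Optional[int]       # current term; the master row reports none
    committed: bool


def looks_like_transient_election(views: List[ConfigView]) -> bool:
    """Return True if membership agrees everywhere and only leader/term differ."""
    if not views:
        return False
    same_members = len({v.members for v in views}) == 1
    all_committed = all(v.committed for v in views)
    leaders = {v.leader for v in views}
    terms = {v.term for v in views if v.term is not None}
    leader_or_term_differs = len(leaders) > 1 or len(terms) > 1
    return same_members and all_committed and leader_or_term_differs


ABC = frozenset({"A", "B", "C"})

# The consensus matrix from the ksck output above.
MATRIX = [
    ConfigView("master", ABC, leader="A", term=None, committed=True),
    ConfigView("A", ABC, leader=None, term=2, committed=True),
    ConfigView("B", ABC, leader=None, term=2, committed=True),
    ConfigView("C", ABC, leader="A", term=1, committed=True),
]

print(looks_like_transient_election(MATRIX))
```

For the matrix above the heuristic fires because A and B have moved to term 2 without a known leader while C and the master still see A as leader in term 1. A checker built on this idea would presumably re-sample the tablet after a short delay and report a problem only if the mismatch persists; that follow-up step is an assumption of this sketch, not something the issue specifies.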