Will Berkeley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9528 )

Change subject: WIP KUDU-2129 make ksck less scary when copying

Patch Set 1:


Commit Message:

PS1, Line 25: Report "non-HEALTHY" instead of "bad" in the above log
This is the one change I have an issue with because it makes it less clear what 
ksck is saying. It tells you what the thing is not instead of what it is, which 
makes it harder to know how to respond. Admittedly "bad" isn't great either, 
since "bad" tables may actually be self-healing. Maybe we need to break down 
the results further into HEALTHY, RECOVERING, NEEDS_INTERVENTION, or something?

File src/kudu/tools/ksck-test.cc:

PS1, Line 322: with non-HEALTHY statuses
Howabout "are not healthy"?

PS1, Line 402: "Table Summary\n"
             :       " Name | Status  | Total Tablets | Healthy | Recovering | 
Under-replicated | Unavailable\n"
             :       " test | HEALTHY | 3             | 3       | 0          | 
0                | 0");
Are these the long lines you're complaining about? Maybe we can add a helper 
like ExpectedKsckTableSummary() that takes a container of expected CheckResults 
or just an expected count of each kind, and returns the table as a string.

PS1, Line 420: "The consensus matrix is:\n"
             :       " Config source |     Replicas     | Current term | Config 
index | Committed?\n"
             :       " master        | A*  B   C        |              |        
      | Yes\n"
             :       " A             | A*  B   C   D    | 0            |        
      | Yes\n"
             :       " B             | A*  B   C        | 0            |        
      | Yes\n"
             :       " C             | A*  B   C        | 0            |        
      | Yes");
On the other hand I think the consensus matrix literals are kind of a feature, 
since as we test for detecting more situations they serve as a reference for 
how certain situations look in ksck.

Also it'd be harder to write a helper.

File src/kudu/tools/ksck.h:

PS1, Line 461: // There was a discrepancy among the tablets' consensus configs 
and the master's.
Can you add a blurb like this for all non-OK statuses? Maybe like

// The tablet has less replicas than its table's replication factor.
// The tablet is missing a majority of its replicas and is unavailable for 
writes. If a majority cannot be brought back online then the tablet requires 
manual intervention to recover.
// The tablet is missing replicas but is in the process of self-healing.

I also have weak preferences for:
1. Renaming OK to HEALTHY.
2. Reordering the enum values to go from better to worse. IIRC it's

PS1, Line 489: if (recovering_tablets > 0) {
             :         return CheckResult::RECOVERING;
             :       }
             :       if (underreplicated_tablets > 0) {
             :         return CheckResult::UNDER_REPLICATED;
             :       }
Isn't under-replicated worse than recovering?

To view, visit http://gerrit.cloudera.org:8080/9528
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iba0c1f5b5a7083cbef99d3674dfebe1457075ce0
Gerrit-Change-Number: 9528
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <wdberke...@gmail.com>
Gerrit-Comment-Date: Wed, 07 Mar 2018 18:04:32 +0000
Gerrit-HasComments: Yes

Reply via email to