Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/10151 )
Change subject: [tools] ksck improvements [6/n]: Refactor printing ...................................................................... Patch Set 5: (16 comments) http://gerrit.cloudera.org:8080/#/c/10151/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10151/5//COMMIT_MSG@17 PS5, Line 17: - Removes the --consensus flag. ksck is far less useful without it and : it doesn't seem like having it brings any benefit. > I second Alexey's concern with putting this into a different patch. I also Yeah I'm just a curmudgeon. it was pretty easy to preserve --consensus. I realized I wanted to remove it and make ksck fail fast and obviously if consensus state gathering fails due to no perms, since too often I get ksck output that's useless b/c they didn't run as the kudu admin user and didn't look for the error messages at the top. That change is bigger and merits a test, so will do in a follow-up. http://gerrit.cloudera.org:8080/#/c/10151/9//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10151/9//COMMIT_MSG@7 PS9, Line 7: 6/n]: Refactor pr > nit: Maybe this started out being the case, but I think this patch delves a You're right, I wrote a placeholder initial message here before the patch became as much of a refactor as it did. It's more than a nit too: having a good commit message, and especially a good subject, is very important. http://gerrit.cloudera.org:8080/#/c/10151/9//COMMIT_MSG@9 PS9, Line 9: It separates where : checks are done from where results are printed > It'd be helpful at a glance to describe what this separation is. A stab at Done http://gerrit.cloudera.org:8080/#/c/10151/9//COMMIT_MSG@15 PS9, Line 15: running all ksck checks > nit: could you flesh out what this means a little bit? Not having read thro Done http://gerrit.cloudera.org:8080/#/c/10151/9/src/kudu/tools/ksck-test.cc File src/kudu/tools/ksck-test.cc: http://gerrit.cloudera.org:8080/#/c/10151/9/src/kudu/tools/ksck-test.cc@632 PS9, Line 632: onsensus ma > nit: spacing here and elsewhere with similar error messages above and below This is actually intended because it's not 3 arguments- it's two, with the first being a long string literal broken over two lines. I couldn't find anything in the GSG about how many spaces to indent in this case. Anyway, I switched it to have each arg on a new, so the line is still annoyingly long but not lint-breaking. I wish there were some tool like go's gofmt that just did the formatting and that's it. http://gerrit.cloudera.org:8080/#/c/10151/9/src/kudu/tools/ksck-test.cc@674 PS9, Line 674: EST_F(KsckTest, Tes > nit: spacing here and elsewhere in similar summary messages Done http://gerrit.cloudera.org:8080/#/c/10151/9/src/kudu/tools/ksck-test.cc@808 PS9, Line 808: / KUDU-2113 > nit: spacing here and in the below runtime error assertion Do you have some kind of magic code review spectacles that let you see these 1 space differences in tightly paked lines of code http://gerrit.cloudera.org:8080/#/c/10151/9/src/kudu/tools/ksck.h File src/kudu/tools/ksck.h: http://gerrit.cloudera.org:8080/#/c/10151/9/src/kudu/tools/ksck.h@464 PS9, Line 464: : // Perform all > nit: is there some heuristic that specifies what the proper order is? "Do everything that fetches before you do stuff that checks, except you have to CheckClusterRunning to connect to the leader master before fetching. Oh, and you can run the master health check on its own, and the master consensus check after that. Also, table checks come after tablet checks." Part of the reason for adding this method is so no other clients of Ksck need to remember that. It's encoded in one place- the body of the method itself. http://gerrit.cloudera.org:8080/#/c/10151/9/src/kudu/tools/ksck.cc File src/kudu/tools/ksck.cc: http://gerrit.cloudera.org:8080/#/c/10151/9/src/kudu/tools/ksck.cc@321 PS9, Line 321: Substitute("failed to gather info for all tablet > nit: we should probably keep the parens consistent Done http://gerrit.cloudera.org:8080/#/c/10151/9/src/kudu/tools/ksck.cc@637 PS9, Line 637: : int num_results = 0; > nit: spacing auto to the rescue. http://gerrit.cloudera.org:8080/#/c/10151/9/src/kudu/tools/ksck_results.h File src/kudu/tools/ksck_results.h: http://gerrit.cloudera.org:8080/#/c/10151/9/src/kudu/tools/ksck_results.h@105 PS9, Line 105: boost::optional<int64_t> opid_index; : boost::optional<std::string> l > Making sure I understand: this field would be non-optional, but masters, wh Correct. And done. http://gerrit.cloudera.org:8080/#/c/10151/9/src/kudu/tools/ksck_results.h@141 PS9, Line 141: > nit: extra . Done http://gerrit.cloudera.org:8080/#/c/10151/9/src/kudu/tools/ksck_results.h@246 PS9, Line 246: std::vector<KsckTabletSummary> tabl > How do we reconcile what checks correspond to what error messages? I guess By reading them. These are meant for display, to quickly tell you at the end what to look for in the output above. The ordering is not specified, but it works out to be the order in which we run the checks, which is also the order we print results, so you can use the relative order to search the output for more detailed info about errors. http://gerrit.cloudera.org:8080/#/c/10151/9/src/kudu/tools/ksck_results.cc File src/kudu/tools/ksck_results.cc: http://gerrit.cloudera.org:8080/#/c/10151/9/src/kudu/tools/ksck_results.cc@99 PS9, Line 99: const map<string, char> replica_labels, : DataTable* table) { : const string replicas = format_replicas(replica_ > nit: spacing Done http://gerrit.cloudera.org:8080/#/c/10151/9/src/kudu/tools/kudu-admin-test.cc File src/kudu/tools/kudu-admin-test.cc: http://gerrit.cloudera.org:8080/#/c/10151/9/src/kudu/tools/kudu-admin-test.cc@341 PS9, Line 341: EnableKudu1097AndDownTS > nit: this seems easy enough to put into a separate patch It's literally the smallest possible patch. http://gerrit.cloudera.org:8080/#/c/10151/9/src/kudu/tools/tool_action_tablet.cc File src/kudu/tools/tool_action_tablet.cc: http://gerrit.cloudera.org:8080/#/c/10151/9/src/kudu/tools/tool_action_tablet.cc@413 PS9, Line 413: // We retry since occasionally ksck returns bad status because of transient : // issues like leader elections. : RETURN_NOT_OK_PREPEND(WaitForCleanKsck(master_addresses, : tablet_id, : MonoDelta::FromSeconds(5)), > nit: this seems like it should be a separate patch as well Done -- To view, visit http://gerrit.cloudera.org:8080/10151 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id8de619996b6cd753e6a9c01b1b60810a873e609 Gerrit-Change-Number: 10151 Gerrit-PatchSet: 5 Gerrit-Owner: Will Berkeley <wdberke...@gmail.com> Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com> Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley <wdberke...@gmail.com> Gerrit-Comment-Date: Fri, 27 Apr 2018 04:04:25 +0000 Gerrit-HasComments: Yes