[kudu-CR] [tools] ksck improvements [2/n]: KUDU-2306: ksck master health check
Will Berkeley has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9882 ) Change subject: [tools] ksck improvements [2/n]: KUDU-2306: ksck master health check .. [tools] ksck improvements [2/n]: KUDU-2306: ksck master health check This patch adds support for checking the health of each master passed to ksck. Currently, ksck only checks that it can connect to the cluster by connecting to a leader master. Just like the tablet server health summary introduced recently, there's now a master health summary: $ build/latest/bin/kudu cluster ksck localhost:7051,localhost:7052,localhost:7053 WARNING: Unable to connect to master (localhost:7052): Network error: could not get status from master: Client connection negotiation failed: client connection to 127.0.0.1:7052: connect: Connection refused (error 61) Master Summary UUID |Address | Status --++- 2096af644c1946e98628b7e9356f9ecf | localhost:7053 | HEALTHY 99179f76035447ffaee00d15bfc89ebf | localhost:7051 | HEALTHY | localhost:7052 | UNAVAILABLE Change-Id: Ie0066468a0adc563979dcd2c61cd65d66978420c Reviewed-on: http://gerrit.cloudera.org:8080/9882 Reviewed-by: Andrew WongTested-by: Will Berkeley --- M src/kudu/integration-tests/cluster_verifier.cc M src/kudu/tools/ksck-test.cc M src/kudu/tools/ksck.cc M src/kudu/tools/ksck.h M src/kudu/tools/ksck_remote-test.cc M src/kudu/tools/ksck_remote.cc M src/kudu/tools/ksck_remote.h M src/kudu/tools/tool_action_cluster.cc 8 files changed, 404 insertions(+), 117 deletions(-) Approvals: Andrew Wong: Looks good to me, approved Will Berkeley: Verified -- To view, visit http://gerrit.cloudera.org:8080/9882 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ie0066468a0adc563979dcd2c61cd65d66978420c Gerrit-Change-Number: 9882 Gerrit-PatchSet: 6 Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley
[kudu-CR] [tools] ksck improvements [2/n]: KUDU-2306: ksck master health check
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/9882 ) Change subject: [tools] ksck improvements [2/n]: KUDU-2306: ksck master health check .. Patch Set 5: Verified+1 Unrelated failures. -- To view, visit http://gerrit.cloudera.org:8080/9882 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie0066468a0adc563979dcd2c61cd65d66978420c Gerrit-Change-Number: 9882 Gerrit-PatchSet: 5 Gerrit-Owner: Will BerkeleyGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Thu, 05 Apr 2018 21:05:08 + Gerrit-HasComments: No
[kudu-CR] [tools] ksck improvements [2/n]: KUDU-2306: ksck master health check
Will Berkeley has removed Kudu Jenkins from this change. ( http://gerrit.cloudera.org:8080/9882 ) Change subject: [tools] ksck improvements [2/n]: KUDU-2306: ksck master health check .. Removed reviewer Kudu Jenkins with the following votes: * Verified-1 by Kudu Jenkins (120) -- To view, visit http://gerrit.cloudera.org:8080/9882 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: deleteReviewer Gerrit-Change-Id: Ie0066468a0adc563979dcd2c61cd65d66978420c Gerrit-Change-Number: 9882 Gerrit-PatchSet: 5 Gerrit-Owner: Will BerkeleyGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley
[kudu-CR] [tools] ksck improvements [2/n]: KUDU-2306: ksck master health check
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/9882 ) Change subject: [tools] ksck improvements [2/n]: KUDU-2306: ksck master health check .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck.h File src/kudu/tools/ksck.h: http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck.h@240 PS3, Line 240: enum class KsckFetchState { : // Information has not yet been fetched. : UNINITIALIZED, : // The attempt to fetch information failed. : FETCH_FAILED, : // Infor > Right, but does it matter whether we're FETCH_FAILED vs UNINITIALIZED? It s Yeah, UNINITIALIZED v FETCH_FAILED is useful for preconditions to check for programmer error. It's not a meaningful distinction for the ksck checks. -- To view, visit http://gerrit.cloudera.org:8080/9882 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie0066468a0adc563979dcd2c61cd65d66978420c Gerrit-Change-Number: 9882 Gerrit-PatchSet: 5 Gerrit-Owner: Will BerkeleyGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Thu, 05 Apr 2018 21:04:31 + Gerrit-HasComments: Yes
[kudu-CR] [tools] ksck improvements [2/n]: KUDU-2306: ksck master health check
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9882 ) Change subject: [tools] ksck improvements [2/n]: KUDU-2306: ksck master health check .. Patch Set 5: Code-Review+2 (3 comments) http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck.h File src/kudu/tools/ksck.h: http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck.h@240 PS3, Line 240: enum class KsckFetchState { : // Information has not yet been fetched. : UNINITIALIZED, : // The attempt to fetch information failed. : FETCH_FAILED, : // Infor > UNINITIALIZED means we haven't tried to fetch. FETCH_FAILED means we tried Right, but does it matter whether we're FETCH_FAILED vs UNINITIALIZED? It seems like we won't know anything unless we're at FETCHED, and if that's not the case then is appropriate regardless. Although this is probably useful for catching programming errors. http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck.cc File src/kudu/tools/ksck.cc: http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck.cc@126 PS3, Line 126: std::ostream& operator<<(std::ostream& lhs, KsckFetchState state) { : switch (state) { : case KsckFetchState::UNINITIALIZED: : lhs << "UNINITIALIZED"; : break; : case KsckFetchState::FETCH_FAILED: : lhs << "FETCH_FAILED"; : break; : case KsckFetchState::FETCHED: : lhs << "FETCHED"; : break; : default: : LOG(FATAL) << "unknown KsckFetchState"; : } : return lhs; : } > There's a comment on its declaration in ksck.h. It's required for using Ksc Ack http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck_remote-test.cc File src/kudu/tools/ksck_remote-test.cc: http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck_remote-test.cc@a217 PS3, Line 217: > Mind if I pass on this one? If I do another rebase pass and remember I will Yea, np -- To view, visit http://gerrit.cloudera.org:8080/9882 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie0066468a0adc563979dcd2c61cd65d66978420c Gerrit-Change-Number: 9882 Gerrit-PatchSet: 5 Gerrit-Owner: Will BerkeleyGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Thu, 05 Apr 2018 20:55:39 + Gerrit-HasComments: Yes
[kudu-CR] [tools] ksck improvements [2/n]: KUDU-2306: ksck master health check
Hello Tidy Bot, Attila Bukor, Kudu Jenkins, Andrew Wong, Todd Lipcon, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9882 to look at the new patch set (#5). Change subject: [tools] ksck improvements [2/n]: KUDU-2306: ksck master health check .. [tools] ksck improvements [2/n]: KUDU-2306: ksck master health check This patch adds support for checking the health of each master passed to ksck. Currently, ksck only checks that it can connect to the cluster by connecting to a leader master. Just like the tablet server health summary introduced recently, there's now a master health summary: $ build/latest/bin/kudu cluster ksck localhost:7051,localhost:7052,localhost:7053 WARNING: Unable to connect to master (localhost:7052): Network error: could not get status from master: Client connection negotiation failed: client connection to 127.0.0.1:7052: connect: Connection refused (error 61) Master Summary UUID |Address | Status --++- 2096af644c1946e98628b7e9356f9ecf | localhost:7053 | HEALTHY 99179f76035447ffaee00d15bfc89ebf | localhost:7051 | HEALTHY | localhost:7052 | UNAVAILABLE Change-Id: Ie0066468a0adc563979dcd2c61cd65d66978420c --- M src/kudu/integration-tests/cluster_verifier.cc M src/kudu/tools/ksck-test.cc M src/kudu/tools/ksck.cc M src/kudu/tools/ksck.h M src/kudu/tools/ksck_remote-test.cc M src/kudu/tools/ksck_remote.cc M src/kudu/tools/ksck_remote.h M src/kudu/tools/tool_action_cluster.cc 8 files changed, 404 insertions(+), 117 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/82/9882/5 -- To view, visit http://gerrit.cloudera.org:8080/9882 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie0066468a0adc563979dcd2c61cd65d66978420c Gerrit-Change-Number: 9882 Gerrit-PatchSet: 5 Gerrit-Owner: Will BerkeleyGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley
[kudu-CR] [tools] ksck improvements [2/n]: KUDU-2306: ksck master health check
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/9882 ) Change subject: [tools] ksck improvements [2/n]: KUDU-2306: ksck master health check .. Patch Set 3: (10 comments) http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck.h File src/kudu/tools/ksck.h: http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck.h@240 PS3, Line 240: // Information has not yet been fetched. : UNINITIALIZED, : // The attempt to fetch information failed. : FETCH_FAILED, : // Information was fetched successfully. : FETCHED, > Might be missing something, but it seems like we don't gain much from the e UNINITIALIZED means we haven't tried to fetch. FETCH_FAILED means we tried and failed. It's a useful distinction because a lot of method calls on Ksck{Master,TabletServer} have fetching as a precondition. http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck.h@289 PS3, Line 289: masters > nit: "uninitialized masters" or "masters from which we have not successfull Done http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck.h@338 PS3, Line 338: CHECK_NE(state_, KsckFetchState::UNINITIALIZED); > nit: swap order here and elsewhere Done http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck.h@533 PS3, Line 533: ServerHealthScore > Doc this? Done http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck.h@535 PS3, Line 535: Summarize > nit: Summarizes Done http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck.h@575 PS3, Line 575: static Status PrintServerHealthSummaries(const std::string& summary_type, > Doc this. Also what are expected summary types? Maybe we should use an enum Done http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck.cc File src/kudu/tools/ksck.cc: http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck.cc@126 PS3, Line 126: std::ostream& operator<<(std::ostream& lhs, KsckFetchState state) { : switch (state) { : case KsckFetchState::UNINITIALIZED: : lhs << "UNINITIALIZED"; : break; : case KsckFetchState::FETCH_FAILED: : lhs << "FETCH_FAILED"; : break; : case KsckFetchState::FETCHED: : lhs << "FETCHED"; : break; : default: : LOG(FATAL) << "unknown KsckFetchState"; : } : return lhs; : } > Hrmm I was going to suggest maybe a ToString() method, but I can't seem to There's a comment on its declaration in ksck.h. It's required for using KsckFetchState in CHECK* macros. http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck.cc@186 PS3, Line 186: if (!s.ok()) { : Warn() << Substitute("Unable to connect to master $0: $1", :master->ToString(), s.ToString()) << endl; : } : if (!s.ok()) bad_masters++; > nit: merge these? Done http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck_remote-test.cc File src/kudu/tools/ksck_remote-test.cc: http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck_remote-test.cc@a217 PS3, Line 217: > nit: probably belongs in CR [1/n]? Mind if I pass on this one? If I do another rebase pass and remember I will switch it but it's kind of a pain in the butt for a small thing like this. http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck_remote.cc File src/kudu/tools/ksck_remote.cc: http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck_remote.cc@383 PS3, Line 383: std::static_pointer_cast(master)->Init(), > What do you think about putting Init() into the KsckMaster base class? That Done -- To view, visit http://gerrit.cloudera.org:8080/9882 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie0066468a0adc563979dcd2c61cd65d66978420c Gerrit-Change-Number: 9882 Gerrit-PatchSet: 3 Gerrit-Owner: Will BerkeleyGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Tue, 03 Apr 2018 20:44:55 + Gerrit-HasComments: Yes
[kudu-CR] [tools] ksck improvements [2/n]: KUDU-2306: ksck master health check
Hello Tidy Bot, Attila Bukor, Kudu Jenkins, Andrew Wong, Todd Lipcon, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9882 to look at the new patch set (#4). Change subject: [tools] ksck improvements [2/n]: KUDU-2306: ksck master health check .. [tools] ksck improvements [2/n]: KUDU-2306: ksck master health check This patch adds support for checking the health of each master passed to ksck. Currently, ksck only checks that it can connect to the cluster by connecting to a leader master. Just like the tablet server health summary introduced recently, there's now a master health summary: $ build/latest/bin/kudu cluster ksck localhost:7051,localhost:7052,localhost:7053 WARNING: Unable to connect to master (localhost:7052): Network error: could not get status from master: Client connection negotiation failed: client connection to 127.0.0.1:7052: connect: Connection refused (error 61) Master Summary UUID |Address | Status --++- 2096af644c1946e98628b7e9356f9ecf | localhost:7053 | HEALTHY 99179f76035447ffaee00d15bfc89ebf | localhost:7051 | HEALTHY | localhost:7052 | UNAVAILABLE Change-Id: Ie0066468a0adc563979dcd2c61cd65d66978420c --- M src/kudu/integration-tests/cluster_verifier.cc M src/kudu/tools/ksck-test.cc M src/kudu/tools/ksck.cc M src/kudu/tools/ksck.h M src/kudu/tools/ksck_remote-test.cc M src/kudu/tools/ksck_remote.cc M src/kudu/tools/ksck_remote.h M src/kudu/tools/tool_action_cluster.cc 8 files changed, 402 insertions(+), 116 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/82/9882/4 -- To view, visit http://gerrit.cloudera.org:8080/9882 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie0066468a0adc563979dcd2c61cd65d66978420c Gerrit-Change-Number: 9882 Gerrit-PatchSet: 4 Gerrit-Owner: Will BerkeleyGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley
[kudu-CR] [tools] ksck improvements [2/n]: KUDU-2306: ksck master health check
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9882 ) Change subject: [tools] ksck improvements [2/n]: KUDU-2306: ksck master health check .. Patch Set 3: (10 comments) http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck.h File src/kudu/tools/ksck.h: http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck.h@240 PS3, Line 240: // Information has not yet been fetched. : UNINITIALIZED, : // The attempt to fetch information failed. : FETCH_FAILED, : // Information was fetched successfully. : FETCHED, Might be missing something, but it seems like we don't gain much from the extra expressiveness of UNINITIALIZED vs FETCH_FAILED. Couldn't we get by with some: bool fetch_succeeded; ...or something? http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck.h@289 PS3, Line 289: masters nit: "uninitialized masters" or "masters from which we have not successfully fetched info" or something http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck.h@338 PS3, Line 338: CHECK_NE(state_, KsckFetchState::UNINITIALIZED); nit: swap order here and elsewhere http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck.h@533 PS3, Line 533: ServerHealthScore Doc this? http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck.h@535 PS3, Line 535: Summarize nit: Summarizes http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck.h@575 PS3, Line 575: static Status PrintServerHealthSummaries(const std::string& summary_type, Doc this. Also what are expected summary types? Maybe we should use an enum for that. Ah I see, maybe rename this to "server_type" or "server_name" or "daemon_type"? http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck.cc File src/kudu/tools/ksck.cc: http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck.cc@126 PS3, Line 126: std::ostream& operator<<(std::ostream& lhs, KsckFetchState state) { : switch (state) { : case KsckFetchState::UNINITIALIZED: : lhs << "UNINITIALIZED"; : break; : case KsckFetchState::FETCH_FAILED: : lhs << "FETCH_FAILED"; : break; : case KsckFetchState::FETCHED: : lhs << "FETCHED"; : break; : default: : LOG(FATAL) << "unknown KsckFetchState"; : } : return lhs; : } Hrmm I was going to suggest maybe a ToString() method, but I can't seem to find where this is being used. http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck.cc@186 PS3, Line 186: if (!s.ok()) { : Warn() << Substitute("Unable to connect to master $0: $1", :master->ToString(), s.ToString()) << endl; : } : if (!s.ok()) bad_masters++; nit: merge these? http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck_remote-test.cc File src/kudu/tools/ksck_remote-test.cc: http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck_remote-test.cc@a217 PS3, Line 217: nit: probably belongs in CR [1/n]? http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck_remote.cc File src/kudu/tools/ksck_remote.cc: http://gerrit.cloudera.org:8080/#/c/9882/3/src/kudu/tools/ksck_remote.cc@383 PS3, Line 383: std::static_pointer_cast(master)->Init(), What do you think about putting Init() into the KsckMaster base class? That way we can avoid this static casting? And maybe use auto instead of value_type? -- To view, visit http://gerrit.cloudera.org:8080/9882 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie0066468a0adc563979dcd2c61cd65d66978420c Gerrit-Change-Number: 9882 Gerrit-PatchSet: 3 Gerrit-Owner: Will BerkeleyGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Tue, 03 Apr 2018 05:17:37 + Gerrit-HasComments: Yes
[kudu-CR] [tools] ksck improvements [2/n]: KUDU-2306: ksck master health check
Hello Tidy Bot, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9882 to look at the new patch set (#3). Change subject: [tools] ksck improvements [2/n]: KUDU-2306: ksck master health check .. [tools] ksck improvements [2/n]: KUDU-2306: ksck master health check This patch adds support for checking the health of each master passed to ksck. Currently, ksck only checks that it can connect to the cluster by connecting to a leader master. Just like the tablet server health summary introduced recently, there's now a master health summary: $ build/latest/bin/kudu cluster ksck localhost:7051,localhost:7052,localhost:7053 WARNING: Unable to connect to master (localhost:7052): Network error: could not get status from master: Client connection negotiation failed: client connection to 127.0.0.1:7052: connect: Connection refused (error 61) Master Summary UUID |Address | Status --++- 2096af644c1946e98628b7e9356f9ecf | localhost:7053 | HEALTHY 99179f76035447ffaee00d15bfc89ebf | localhost:7051 | HEALTHY | localhost:7052 | UNAVAILABLE Change-Id: Ie0066468a0adc563979dcd2c61cd65d66978420c --- M src/kudu/integration-tests/cluster_verifier.cc M src/kudu/tools/ksck-test.cc M src/kudu/tools/ksck.cc M src/kudu/tools/ksck.h M src/kudu/tools/ksck_remote-test.cc M src/kudu/tools/ksck_remote.cc M src/kudu/tools/ksck_remote.h M src/kudu/tools/tool_action_cluster.cc 8 files changed, 372 insertions(+), 116 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/82/9882/3 -- To view, visit http://gerrit.cloudera.org:8080/9882 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie0066468a0adc563979dcd2c61cd65d66978420c Gerrit-Change-Number: 9882 Gerrit-PatchSet: 3 Gerrit-Owner: Will BerkeleyGerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] [tools] ksck improvements [2/n]: KUDU-2306: ksck master health check
Will Berkeley has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9882 Change subject: [tools] ksck improvements [2/n]: KUDU-2306: ksck master health check .. [tools] ksck improvements [2/n]: KUDU-2306: ksck master health check This patch adds support for checking the health of each master passed to ksck. Currently, ksck only checks that it can connect to the cluster by connecting to a leader master. Just like the tablet server health summary introduced recently, there's now a master health summary: $ build/latest/bin/kudu cluster ksck localhost:7051,localhost:7052,localhost:7053 WARNING: Unable to connect to master (localhost:7052): Network error: could not get status from master: Client connection negotiation failed: client connection to 127.0.0.1:7052: connect: Connection refused (error 61) Master Summary UUID |Address | Status --++- 2096af644c1946e98628b7e9356f9ecf | localhost:7053 | HEALTHY 99179f76035447ffaee00d15bfc89ebf | localhost:7051 | HEALTHY | localhost:7052 | UNAVAILABLE Change-Id: Ie0066468a0adc563979dcd2c61cd65d66978420c --- M src/kudu/integration-tests/cluster_verifier.cc M src/kudu/tools/ksck-test.cc M src/kudu/tools/ksck.cc M src/kudu/tools/ksck.h M src/kudu/tools/ksck_remote-test.cc M src/kudu/tools/ksck_remote.cc M src/kudu/tools/ksck_remote.h M src/kudu/tools/tool_action_cluster.cc 8 files changed, 372 insertions(+), 117 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/82/9882/1 -- To view, visit http://gerrit.cloudera.org:8080/9882 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ie0066468a0adc563979dcd2c61cd65d66978420c Gerrit-Change-Number: 9882 Gerrit-PatchSet: 1 Gerrit-Owner: Will Berkeley