[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/9787 ) Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. Patch Set 14: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 14 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Thu, 29 Mar 2018 08:28:36 + Gerrit-HasComments: No
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Will Berkeley has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9787 ) Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. KUDU-2364 Add extra check in ksck for tserver ID ksck did not validate tablet server ID when checking connectivity. Whenever the TabletServer was nuked and readded, ksck would report the connection was successful to the old tablet servers. Now it will report an error like below: ./bin/kudu cluster ksck localhost Connected to the Master WARNING: Unable to connect to Tablet Server d92197fa2f034b33aa0bf998c41f637b (va1022.halxg.cloudera.com:7073): Remote error: ID reported by tablet server (50effcf1fe284ab693e7d1d43c5f18ad) doesn't match the expected ID: d92197fa2f034b33aa0bf998c41f637b Tablet Server Summary UUID | RPC Address | Status --++--- 22ec2c07d8aa4e1ba33a4eb42d4c3a21 | va1022.halxg.cloudera.com:7072 | HEALTHY 50effcf1fe284ab693e7d1d43c5f18ad | va1022.halxg.cloudera.com:7073 | HEALTHY a05c93549cca4ceebc275651d8117065 | va1022.halxg.cloudera.com:7074 | HEALTHY d92197fa2f034b33aa0bf998c41f637b | va1022.halxg.cloudera.com:7073 | WRONG_SERVER_UUID WARNING: Fetched info from 3 Tablet Servers, 1 weren't reachable The cluster doesn't have any matching tables == Errors: == error fetching info from tablet servers: Could not gather complete information from all tablet servers FAILED Runtime error: ksck discovered errors Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Reviewed-on: http://gerrit.cloudera.org:8080/9787 Reviewed-by: Adar Dembo Tested-by: Kudu Jenkins Reviewed-by: Will Berkeley --- M src/kudu/tools/ksck-test.cc M src/kudu/tools/ksck.cc M src/kudu/tools/ksck.h M src/kudu/tools/ksck_remote-test.cc M src/kudu/tools/ksck_remote.cc 5 files changed, 138 insertions(+), 4 deletions(-) Approvals: Adar Dembo: Looks good to me, but someone else must approve Kudu Jenkins: Verified Will Berkeley: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 15 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Will Berkeley
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/9787 ) Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. Patch Set 14: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 14 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Wed, 28 Mar 2018 19:44:35 + Gerrit-HasComments: No
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Hello Will Berkeley, Tidy Bot, Dan Burkert, Kudu Jenkins, Adar Dembo, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9787 to look at the new patch set (#14). Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. KUDU-2364 Add extra check in ksck for tserver ID ksck did not validate tablet server ID when checking connectivity. Whenever the TabletServer was nuked and readded, ksck would report the connection was successful to the old tablet servers. Now it will report an error like below: ./bin/kudu cluster ksck localhost Connected to the Master WARNING: Unable to connect to Tablet Server d92197fa2f034b33aa0bf998c41f637b (va1022.halxg.cloudera.com:7073): Remote error: ID reported by tablet server (50effcf1fe284ab693e7d1d43c5f18ad) doesn't match the expected ID: d92197fa2f034b33aa0bf998c41f637b Tablet Server Summary UUID | RPC Address | Status --++--- 22ec2c07d8aa4e1ba33a4eb42d4c3a21 | va1022.halxg.cloudera.com:7072 | HEALTHY 50effcf1fe284ab693e7d1d43c5f18ad | va1022.halxg.cloudera.com:7073 | HEALTHY a05c93549cca4ceebc275651d8117065 | va1022.halxg.cloudera.com:7074 | HEALTHY d92197fa2f034b33aa0bf998c41f637b | va1022.halxg.cloudera.com:7073 | WRONG_SERVER_UUID WARNING: Fetched info from 3 Tablet Servers, 1 weren't reachable The cluster doesn't have any matching tables == Errors: == error fetching info from tablet servers: Could not gather complete information from all tablet servers FAILED Runtime error: ksck discovered errors Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 --- M src/kudu/tools/ksck-test.cc M src/kudu/tools/ksck.cc M src/kudu/tools/ksck.h M src/kudu/tools/ksck_remote-test.cc M src/kudu/tools/ksck_remote.cc 5 files changed, 138 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/87/9787/14 -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 14 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Will Berkeley
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/9787 ) Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. Patch Set 13: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 13 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Wed, 28 Mar 2018 19:42:42 + Gerrit-HasComments: No
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/9787 ) Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. Patch Set 13: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 13 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Wed, 28 Mar 2018 19:27:56 + Gerrit-HasComments: No
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/9787 ) Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. Patch Set 13: > Patch Set 12: Code-Review+2 > > Thanks for accepting the scope creep and adding the tablet server summary. It > will be useful in more ways in the future, I'm sure. thanks for accepting it :) needed to push a slight change due to IWYU -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 13 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Wed, 28 Mar 2018 19:10:05 + Gerrit-HasComments: No
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Hello Will Berkeley, Tidy Bot, Dan Burkert, Kudu Jenkins, Adar Dembo, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9787 to look at the new patch set (#13). Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. KUDU-2364 Add extra check in ksck for tserver ID ksck did not validate tablet server ID when checking connectivity. Whenever the TabletServer was nuked and readded, ksck would report the connection was successful to the old tablet servers. Now it will report an error like below: ./bin/kudu cluster ksck localhost Connected to the Master WARNING: Unable to connect to Tablet Server d92197fa2f034b33aa0bf998c41f637b (va1022.halxg.cloudera.com:7073): Remote error: ID reported by tablet server (50effcf1fe284ab693e7d1d43c5f18ad) doesn't match the expected ID: d92197fa2f034b33aa0bf998c41f637b Tablet Server Summary UUID | RPC Address | Status --++--- 22ec2c07d8aa4e1ba33a4eb42d4c3a21 | va1022.halxg.cloudera.com:7072 | HEALTHY 50effcf1fe284ab693e7d1d43c5f18ad | va1022.halxg.cloudera.com:7073 | HEALTHY a05c93549cca4ceebc275651d8117065 | va1022.halxg.cloudera.com:7074 | HEALTHY d92197fa2f034b33aa0bf998c41f637b | va1022.halxg.cloudera.com:7073 | WRONG_SERVER_UUID WARNING: Fetched info from 3 Tablet Servers, 1 weren't reachable The cluster doesn't have any matching tables == Errors: == error fetching info from tablet servers: Could not gather complete information from all tablet servers FAILED Runtime error: ksck discovered errors Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 --- M src/kudu/tools/ksck-test.cc M src/kudu/tools/ksck.cc M src/kudu/tools/ksck.h M src/kudu/tools/ksck_remote-test.cc M src/kudu/tools/ksck_remote.cc 5 files changed, 138 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/87/9787/13 -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 13 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Will Berkeley
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/9787 ) Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. Patch Set 12: Code-Review+2 Thanks for accepting the scope creep and adding the tablet server summary. It will be useful in more ways in the future, I'm sure. -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 12 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Wed, 28 Mar 2018 18:43:31 + Gerrit-HasComments: No
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/9787 ) Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. Patch Set 12: (1 comment) I changed the tests to check for Status::IsNetworkError() instead of Status::IsRemoteError() as it's 'converted' to NetworkError. http://gerrit.cloudera.org:8080/#/c/9787/11/src/kudu/tools/ksck_remote-test.cc File src/kudu/tools/ksck_remote-test.cc: http://gerrit.cloudera.org:8080/#/c/9787/11/src/kudu/tools/ksck_remote-test.cc@247 PS11, Line 247: "($0) doesn't match the expected ID: $1"; > Nit: align this with the double quote in "Remote error..." Done -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 12 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Wed, 28 Mar 2018 18:26:02 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Hello Will Berkeley, Tidy Bot, Dan Burkert, Kudu Jenkins, Adar Dembo, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9787 to look at the new patch set (#12). Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. KUDU-2364 Add extra check in ksck for tserver ID ksck did not validate tablet server ID when checking connectivity. Whenever the TabletServer was nuked and readded, ksck would report the connection was successful to the old tablet servers. Now it will report an error like below: ./bin/kudu cluster ksck localhost Connected to the Master WARNING: Unable to connect to Tablet Server d92197fa2f034b33aa0bf998c41f637b (va1022.halxg.cloudera.com:7073): Remote error: ID reported by tablet server (50effcf1fe284ab693e7d1d43c5f18ad) doesn't match the expected ID: d92197fa2f034b33aa0bf998c41f637b Tablet Server Summary UUID | RPC Address | Status --++--- 22ec2c07d8aa4e1ba33a4eb42d4c3a21 | va1022.halxg.cloudera.com:7072 | HEALTHY 50effcf1fe284ab693e7d1d43c5f18ad | va1022.halxg.cloudera.com:7073 | HEALTHY a05c93549cca4ceebc275651d8117065 | va1022.halxg.cloudera.com:7074 | HEALTHY d92197fa2f034b33aa0bf998c41f637b | va1022.halxg.cloudera.com:7073 | WRONG_SERVER_UUID WARNING: Fetched info from 3 Tablet Servers, 1 weren't reachable The cluster doesn't have any matching tables == Errors: == error fetching info from tablet servers: Could not gather complete information from all tablet servers FAILED Runtime error: ksck discovered errors Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 --- M src/kudu/tools/ksck-test.cc M src/kudu/tools/ksck.cc M src/kudu/tools/ksck.h M src/kudu/tools/ksck_remote-test.cc M src/kudu/tools/ksck_remote.cc 5 files changed, 139 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/87/9787/12 -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 12 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Will Berkeley
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/9787 ) Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. Patch Set 11: Code-Review+1 (1 comment) Will probably wants to re-review this. http://gerrit.cloudera.org:8080/#/c/9787/11/src/kudu/tools/ksck_remote-test.cc File src/kudu/tools/ksck_remote-test.cc: http://gerrit.cloudera.org:8080/#/c/9787/11/src/kudu/tools/ksck_remote-test.cc@247 PS11, Line 247: "($0) doesn't match the expected ID: $1"; Nit: align this with the double quote in "Remote error..." -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 11 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Wed, 28 Mar 2018 18:00:41 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/9787 ) Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. Patch Set 11: (8 comments) http://gerrit.cloudera.org:8080/#/c/9787/10/src/kudu/tools/ksck.h File src/kudu/tools/ksck.h: http://gerrit.cloudera.org:8080/#/c/9787/10/src/kudu/tools/ksck.h@529 PS10, Line 529: : static Status PrintTabletServerSummaries( : const std::vector& tablet_server_summaries, : std::ostream& out); > Nit: when a declaration gets too long like this, rather than split the type Done http://gerrit.cloudera.org:8080/#/c/9787/10/src/kudu/tools/ksck.cc File src/kudu/tools/ksck.cc: http://gerrit.cloudera.org:8080/#/c/9787/10/src/kudu/tools/ksck.cc@42 PS10, Line 42: #include "kudu/gutil/strings/util.h" > Nit: alphabetical sorting would dictate that this be above gutil/strings/.. Done http://gerrit.cloudera.org:8080/#/c/9787/10/src/kudu/tools/ksck.cc@258 PS10, Line 258: auto& ts : tablet_s > Nit: could use 'auto' here instead. More terse. Done http://gerrit.cloudera.org:8080/#/c/9787/10/src/kudu/tools/ksck_remote-test.cc File src/kudu/tools/ksck_remote-test.cc: http://gerrit.cloudera.org:8080/#/c/9787/10/src/kudu/tools/ksck_remote-test.cc@242 PS10, Line 242: string new_uuid = new_tablet_server.uuid(); > Nit: don't need std:: prefixing here or below. Done http://gerrit.cloudera.org:8080/#/c/9787/10/src/kudu/tools/ksck_remote-test.cc@245 PS10, Line 245: > Don't need this anymore. Done http://gerrit.cloudera.org:8080/#/c/9787/10/src/kudu/tools/ksck_remote.cc File src/kudu/tools/ksck_remote.cc: http://gerrit.cloudera.org:8080/#/c/9787/10/src/kudu/tools/ksck_remote.cc@107 PS10, Line 107: RETURN_NOT_OK_PREPEND(generic_proxy_->GetStatus(req, &resp, &rpc), : "could not get status from server"); > Combine these? No need for 's' anymore. Done http://gerrit.cloudera.org:8080/#/c/9787/10/src/kudu/tools/ksck_remote.cc@109 PS10, Line 109: string response_uuid = resp.status().node_instance().permanent_uuid(); > Nit: no std:: prefix. Done http://gerrit.cloudera.org:8080/#/c/9787/10/src/kudu/tools/ksck_remote.cc@123 PS10, Line 123: RETURN_NOT_OK_PREPEND(ts_proxy_->ListTablets(req, &resp, &rpc), > Don't need this line anymore. Done -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 11 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Wed, 28 Mar 2018 17:39:05 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Hello Will Berkeley, Tidy Bot, Dan Burkert, Kudu Jenkins, Adar Dembo, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9787 to look at the new patch set (#11). Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. KUDU-2364 Add extra check in ksck for tserver ID ksck did not validate tablet server ID when checking connectivity. Whenever the TabletServer was nuked and readded, ksck would report the connection was successful to the old tablet servers. Now it will report an error like below: ./bin/kudu cluster ksck localhost Connected to the Master WARNING: Unable to connect to Tablet Server d92197fa2f034b33aa0bf998c41f637b (va1022.halxg.cloudera.com:7073): Remote error: ID reported by tablet server (50effcf1fe284ab693e7d1d43c5f18ad) doesn't match the expected ID: d92197fa2f034b33aa0bf998c41f637b Tablet Server Summary UUID | RPC Address | Status --++--- 22ec2c07d8aa4e1ba33a4eb42d4c3a21 | va1022.halxg.cloudera.com:7072 | HEALTHY 50effcf1fe284ab693e7d1d43c5f18ad | va1022.halxg.cloudera.com:7073 | HEALTHY a05c93549cca4ceebc275651d8117065 | va1022.halxg.cloudera.com:7074 | HEALTHY d92197fa2f034b33aa0bf998c41f637b | va1022.halxg.cloudera.com:7073 | WRONG_SERVER_UUID WARNING: Fetched info from 3 Tablet Servers, 1 weren't reachable The cluster doesn't have any matching tables == Errors: == error fetching info from tablet servers: Could not gather complete information from all tablet servers FAILED Runtime error: ksck discovered errors Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 --- M src/kudu/tools/ksck-test.cc M src/kudu/tools/ksck.cc M src/kudu/tools/ksck.h M src/kudu/tools/ksck_remote-test.cc M src/kudu/tools/ksck_remote.cc 5 files changed, 139 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/87/9787/11 -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 11 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Will Berkeley
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/9787 ) Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. Patch Set 10: (8 comments) http://gerrit.cloudera.org:8080/#/c/9787/10/src/kudu/tools/ksck.h File src/kudu/tools/ksck.h: http://gerrit.cloudera.org:8080/#/c/9787/10/src/kudu/tools/ksck.h@529 PS10, Line 529: : static Status PrintTabletServerSummaries(const std::vector& : tablet_server_summaries, :std::ostream& out); Nit: when a declaration gets too long like this, rather than split the type and the variable name (which can be confusing; here it looks like the function takes three parameters and not two), format like this: static Status PrintTabletServerSummaries( const std::vector& tablet_server_summaries, std::ostream& out); http://gerrit.cloudera.org:8080/#/c/9787/10/src/kudu/tools/ksck.cc File src/kudu/tools/ksck.cc: http://gerrit.cloudera.org:8080/#/c/9787/10/src/kudu/tools/ksck.cc@42 PS10, Line 42: #include "kudu/gutil/spinlock.h" Nit: alphabetical sorting would dictate that this be above gutil/strings/... http://gerrit.cloudera.org:8080/#/c/9787/10/src/kudu/tools/ksck.cc@258 PS10, Line 258: TabletServerSummary Nit: could use 'auto' here instead. More terse. http://gerrit.cloudera.org:8080/#/c/9787/10/src/kudu/tools/ksck_remote-test.cc File src/kudu/tools/ksck_remote-test.cc: http://gerrit.cloudera.org:8080/#/c/9787/10/src/kudu/tools/ksck_remote-test.cc@242 PS10, Line 242: std::string new_uuid = new_tablet_server.uuid(); Nit: don't need std:: prefixing here or below. http://gerrit.cloudera.org:8080/#/c/9787/10/src/kudu/tools/ksck_remote-test.cc@245 PS10, Line 245: Status s = ksck_->FetchInfoFromTabletServers(); Don't need this anymore. http://gerrit.cloudera.org:8080/#/c/9787/10/src/kudu/tools/ksck_remote.cc File src/kudu/tools/ksck_remote.cc: http://gerrit.cloudera.org:8080/#/c/9787/10/src/kudu/tools/ksck_remote.cc@107 PS10, Line 107: Status s = generic_proxy_->GetStatus(req, &resp, &rpc); : RETURN_NOT_OK_PREPEND(s, "could not get status from server"); Combine these? No need for 's' anymore. http://gerrit.cloudera.org:8080/#/c/9787/10/src/kudu/tools/ksck_remote.cc@109 PS10, Line 109: std::string response_uuid = resp.status().node_instance().permanent_uuid(); Nit: no std:: prefix. http://gerrit.cloudera.org:8080/#/c/9787/10/src/kudu/tools/ksck_remote.cc@123 PS10, Line 123: Status s = ts_proxy_->ListTablets(req, &resp, &rpc); Don't need this line anymore. -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 10 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Wed, 28 Mar 2018 17:31:22 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/9787 ) Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. Patch Set 10: (16 comments) http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck-test.cc File src/kudu/tools/ksck-test.cc: http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck-test.cc@337 PS8, Line 337: ASSERT_TRUE(ksck_->FetchInfoFromTabletServers().IsRemoteError()); : ASSERT_STR_CONTAINS(e > Nit: combine and and assert on the particular kind of error. Done http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck-test.cc@339 PS8, Line 339: "Tablet Server Summary\n" > Was this only for debugging? If so, please remove. Done http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck-test.cc@341 PS8, Line 341: "-+-+---\n" > Nit: got tabs here; please convert to spaces. Done http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck.h File src/kudu/tools/ksck.h: http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck.h@482 PS8, Line 482: HEALTHY, > Nit: should use 2-char indentation here, not 4. Basically the same as in Ch Done http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck.cc File src/kudu/tools/ksck.cc: http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck.cc@193 PS8, Line 193: CHECK_OK(pool->SubmitFunc([&]() { > I think you should use Kudu's simple_spinlock class here instead. The lock Done http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck.cc@215 PS8, Line 215: pool->Wait(); > Use std::lock_guard to acquire/release the lock in an RAII way. Done http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck.cc@272 PS8, Line 272: break; > I'd make this its own statement and do a LOG(FATAL) or something, since if Done http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck_remote-test.cc File src/kudu/tools/ksck_remote-test.cc: http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck_remote-test.cc@52 PS8, Line 52: #include "kudu/util/test_util.h" > Nit: shouldn't this go just after monotime.h to preserve alphabetical sort Done http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck_remote-test.cc@228 PS8, Line 228: TEST_F(RemoteKsckTest, TestTabletServerMismatchUUID) { > Nit: 2 char indentation. Done http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck_remote-test.cc@232 PS8, Line 232: tserver::MiniTabletServer* tablet_server = mini_cluster_->mini_tablet_server(0); > We have "using std::string" on L68 so you needn't qualify strings with the Done http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck_remote-test.cc@238 PS8, Line 238: tserver::MiniTabletServer new_tablet_server(root_dir, address); : ASSERT_OK(new_tablet_server.Start( > These return a Status, right? If so you should ASSERT_OK them. Done http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck_remote-test.cc@248 PS8, Line 248: "($0) doesn't match the expected ID: $1"; > Can we assert on the particular kind of error? Like, ASSERT_TRUE(s.IsRemote Done http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck_remote.cc File src/kudu/tools/ksck_remote.cc: http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck_remote.cc@102 PS8, Line 102: { > Nit: should be 2 char indentation. Done http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck_remote.cc@111 PS8, Line 111: return Status::RemoteError(Substitute("ID reported by tablet server ($0) doesn't " : "match the expected ID: $1", > Nit: these continuation lines should be aligned with the double quote in "I Done http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck_remote.cc@113 PS8, Line 113: response_uuid, uuid())); > Nit: since 'message' is used in just one place, you could embed Substitute( Done http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck_remote.cc@123 PS8, Line 123: Status s = ts_proxy_->ListTablets(req, &resp, &rpc); : RETURN_NOT_OK_PREPEND(ts_proxy_->ListTablets(req, &resp, &rpc), : "could not list tablets"); : if (resp.has_error()) { : > This can be simplified: Done -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 10 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Will Berkeley Ge
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Hello Will Berkeley, Tidy Bot, Dan Burkert, Kudu Jenkins, Adar Dembo, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9787 to look at the new patch set (#10). Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. KUDU-2364 Add extra check in ksck for tserver ID ksck did not validate tablet server ID when checking connectivity. Whenever the TabletServer was nuked and readded, ksck would report the connection was successful to the old tablet servers. Now it will report an error like below: ./bin/kudu cluster ksck localhost Connected to the Master WARNING: Unable to connect to Tablet Server d92197fa2f034b33aa0bf998c41f637b (va1022.halxg.cloudera.com:7073): Remote error: ID reported by tablet server (50effcf1fe284ab693e7d1d43c5f18ad) doesn't match the expected ID: d92197fa2f034b33aa0bf998c41f637b Tablet Server Summary UUID | RPC Address | Status --++--- 22ec2c07d8aa4e1ba33a4eb42d4c3a21 | va1022.halxg.cloudera.com:7072 | HEALTHY 50effcf1fe284ab693e7d1d43c5f18ad | va1022.halxg.cloudera.com:7073 | HEALTHY a05c93549cca4ceebc275651d8117065 | va1022.halxg.cloudera.com:7074 | HEALTHY d92197fa2f034b33aa0bf998c41f637b | va1022.halxg.cloudera.com:7073 | WRONG_SERVER_UUID WARNING: Fetched info from 3 Tablet Servers, 1 weren't reachable The cluster doesn't have any matching tables == Errors: == error fetching info from tablet servers: Could not gather complete information from all tablet servers FAILED Runtime error: ksck discovered errors Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 --- M src/kudu/tools/ksck-test.cc M src/kudu/tools/ksck.cc M src/kudu/tools/ksck.h M src/kudu/tools/ksck_remote-test.cc M src/kudu/tools/ksck_remote.cc 5 files changed, 141 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/87/9787/10 -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 10 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Will Berkeley
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/9787 ) Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. Patch Set 9: (16 comments) http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck-test.cc File src/kudu/tools/ksck-test.cc: http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck-test.cc@337 PS8, Line 337: Status s = ksck_->FetchInfoFromTabletServers(); : ASSERT_TRUE(!s.ok()); Nit: combine and and assert on the particular kind of error. http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck-test.cc@339 PS8, Line 339: LOG(INFO) << err_stream_.str(); Was this only for debugging? If so, please remove. http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck-test.cc@341 PS8, Line 341: "Tablet Server Summary\n" Nit: got tabs here; please convert to spaces. http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck.h File src/kudu/tools/ksck.h: http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck.h@482 PS8, Line 482: // The tablet server is healthy Nit: should use 2-char indentation here, not 4. Basically the same as in CheckResult just above. http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck.cc File src/kudu/tools/ksck.cc: http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck.cc@193 PS8, Line 193: int i = 0; I think you should use Kudu's simple_spinlock class here instead. The lock is acquired for a very short amount of time; in the worst case, it's held for a memory allocation as the array grows. So a sleeping mutex isn't necessary. http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck.cc@215 PS8, Line 215: tablet_server_summaries_lock.unlock(); Use std::lock_guard to acquire/release the lock in an RAII way. http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck.cc@272 PS8, Line 272: break; I'd make this its own statement and do a LOG(FATAL) or something, since if it fires it's a programming error: a sign that other code in ksck.cc needs to be updated. http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck_remote-test.cc File src/kudu/tools/ksck_remote-test.cc: http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck_remote-test.cc@52 PS8, Line 52: #include "kudu/util/net/net_util.h" Nit: shouldn't this go just after monotime.h to preserve alphabetical sort order? http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck_remote-test.cc@228 PS8, Line 228: ASSERT_OK(ksck_->CheckMasterRunning()); Nit: 2 char indentation. http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck_remote-test.cc@232 PS8, Line 232: std::string old_uuid = tablet_server->uuid(); We have "using std::string" on L68 so you needn't qualify strings with the std:: prefix. http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck_remote-test.cc@238 PS8, Line 238: new_tablet_server.Start(); : new_tablet_server.WaitStarted(); These return a Status, right? If so you should ASSERT_OK them. http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck_remote-test.cc@248 PS8, Line 248: ASSERT_TRUE(!s.ok()); Can we assert on the particular kind of error? Like, ASSERT_TRUE(s.IsRemoteError()) or some such? Plus, since 's' is only used here, you could combine the ASSERT_TRUE into the FetchInfoFromTabletServers() call. http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck_remote.cc File src/kudu/tools/ksck_remote.cc: http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck_remote.cc@102 PS8, Line 102: server::GetStatusRequestPB req; Nit: should be 2 char indentation. http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck_remote.cc@111 PS8, Line 111:"match the expected ID: $1", :response_uuid, uuid()); Nit: these continuation lines should be aligned with the double quote in "ID reported...". http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck_remote.cc@113 PS8, Line 113: return Status::RemoteError(message); Nit: since 'message' is used in just one place, you could embed Substitute() inside Status::RemoteError(...) and avoid declaring 'message' at all. http://gerrit.cloudera.org:8080/#/c/9787/8/src/kudu/tools/ksck_remote.cc@123 PS8, Line 123: Status s = ts_proxy_->ListTablets(req, &resp, &rpc); : RETURN_NOT_OK_PREPEND(s, "could not list tablets"); : if (s.ok() && resp.has_error()) { : RETURN_NOT_OK(StatusFromPB(resp.error().status())); : } This can be simplified: RETURN_NOT_OK_PREPEND(ts_proxy_->ListTablets(req, &resp, &rpc), "could not list tablets"); if (resp.has_error()) { return StatusFromPB(resp.error().status()) } B
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Hello Will Berkeley, Tidy Bot, Dan Burkert, Kudu Jenkins, Adar Dembo, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9787 to look at the new patch set (#9). Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. KUDU-2364 Add extra check in ksck for tserver ID ksck did not validate tablet server ID when checking connectivity. Whenever the TabletServer was nuked and readded, ksck would report the connection was successful to the old tablet servers. Now it will report an error like below: ./bin/kudu cluster ksck localhost Connected to the Master WARNING: Unable to connect to Tablet Server d92197fa2f034b33aa0bf998c41f637b (va1022.halxg.cloudera.com:7073): Remote error: ID reported by tablet server (50effcf1fe284ab693e7d1d43c5f18ad) doesn't match the expected ID: d92197fa2f034b33aa0bf998c41f637b Tablet Server Summary UUID | RPC Address | Status --++--- 22ec2c07d8aa4e1ba33a4eb42d4c3a21 | va1022.halxg.cloudera.com:7072 | HEALTHY 50effcf1fe284ab693e7d1d43c5f18ad | va1022.halxg.cloudera.com:7073 | HEALTHY a05c93549cca4ceebc275651d8117065 | va1022.halxg.cloudera.com:7074 | HEALTHY d92197fa2f034b33aa0bf998c41f637b | va1022.halxg.cloudera.com:7073 | WRONG_SERVER_UUID WARNING: Fetched info from 3 Tablet Servers, 1 weren't reachable The cluster doesn't have any matching tables == Errors: == error fetching info from tablet servers: Could not gather complete information from all tablet servers FAILED Runtime error: ksck discovered errors Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 --- M src/kudu/tools/ksck-test.cc M src/kudu/tools/ksck.cc M src/kudu/tools/ksck.h M src/kudu/tools/ksck_remote-test.cc M src/kudu/tools/ksck_remote.cc 5 files changed, 147 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/87/9787/9 -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 9 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Will Berkeley
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/9787 ) Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/9787/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9787/2//COMMIT_MSG@21 PS2, Line 21: rver Summary > You could make a TabletServerErrors struct or something, and tabulate the O Done -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 8 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Wed, 28 Mar 2018 16:15:29 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Hello Will Berkeley, Dan Burkert, Kudu Jenkins, Adar Dembo, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9787 to look at the new patch set (#8). Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. KUDU-2364 Add extra check in ksck for tserver ID ksck did not validate tablet server ID when checking connectivity. Whenever the TabletServer was nuked and readded, ksck would report the connection was successful to the old tablet servers. Now it will report an error like below: ./bin/kudu cluster ksck localhost Connected to the Master WARNING: Unable to connect to Tablet Server d92197fa2f034b33aa0bf998c41f637b (va1022.halxg.cloudera.com:7073): Remote error: ID reported by tablet server (50effcf1fe284ab693e7d1d43c5f18ad) doesn't match the expected ID: d92197fa2f034b33aa0bf998c41f637b Tablet Server Summary UUID | RPC Address | Status --++--- 22ec2c07d8aa4e1ba33a4eb42d4c3a21 | va1022.halxg.cloudera.com:7072 | HEALTHY 50effcf1fe284ab693e7d1d43c5f18ad | va1022.halxg.cloudera.com:7073 | HEALTHY a05c93549cca4ceebc275651d8117065 | va1022.halxg.cloudera.com:7074 | HEALTHY d92197fa2f034b33aa0bf998c41f637b | va1022.halxg.cloudera.com:7073 | WRONG_SERVER_UUID WARNING: Fetched info from 3 Tablet Servers, 1 weren't reachable The cluster doesn't have any matching tables == Errors: == error fetching info from tablet servers: Could not gather complete information from all tablet servers FAILED Runtime error: ksck discovered errors Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 --- M src/kudu/tools/ksck-test.cc M src/kudu/tools/ksck.cc M src/kudu/tools/ksck.h M src/kudu/tools/ksck_remote-test.cc M src/kudu/tools/ksck_remote.cc 5 files changed, 149 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/87/9787/8 -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 8 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/9787 ) Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/9787/5/src/kudu/tools/ksck_remote-test.cc File src/kudu/tools/ksck_remote-test.cc: http://gerrit.cloudera.org:8080/#/c/9787/5/src/kudu/tools/ksck_remote-test.cc@231 PS5, Line 231: std::string root_dir = mini_cluster_->GetTabletServerFsRoot(0) + std::string("2"); > given that we're using InternalMiniCluster everywhere else in ksck_remote-t Yeah, I meant implement a method like InternalMiniCluster::DeleteFromDisk that's like ExternalMiniCluster::DeleteFromDisk. But, on second thought, I think it's ok just to do what you're already doing, so nevermind. -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 5 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Tue, 27 Mar 2018 19:36:06 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/9787 ) Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. Patch Set 7: (8 comments) http://gerrit.cloudera.org:8080/#/c/9787/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9787/7//COMMIT_MSG@10 PS7, Line 10: TabletServer > nit: "tablet server" is fine. Done http://gerrit.cloudera.org:8080/#/c/9787/7/src/kudu/server/server_base.proto File src/kudu/server/server_base.proto: http://gerrit.cloudera.org:8080/#/c/9787/7/src/kudu/server/server_base.proto@38 PS7, Line 38: optional bytes uuid = 5; > The NodeInstancePB already contains the uuid. See common/wire_protocol.prot thanks, removing it http://gerrit.cloudera.org:8080/#/c/9787/7/src/kudu/tools/ksck-test.cc File src/kudu/tools/ksck-test.cc: http://gerrit.cloudera.org:8080/#/c/9787/7/src/kudu/tools/ksck-test.cc@327 PS7, Line 327: /* > I think the test on the previous file does that via starting up a new tserv yep, I committed this test by mistake, removing it http://gerrit.cloudera.org:8080/#/c/9787/7/src/kudu/tools/ksck-test.cc@335 PS7, Line 335: TabletServer > Just "tablet server". I know ksck says "Tablet Server" in a lot of places b Done http://gerrit.cloudera.org:8080/#/c/9787/5/src/kudu/tools/ksck_remote-test.cc File src/kudu/tools/ksck_remote-test.cc: http://gerrit.cloudera.org:8080/#/c/9787/5/src/kudu/tools/ksck_remote-test.cc@231 PS5, Line 231: tserver::MiniTabletServer* tablet_server = mini_cluster_->mini_tablet_server(0); > I think it'd be nicer to implement a method like ExternalMiniCluster::Delet given that we're using InternalMiniCluster everywhere else in ksck_remote-test I assume I should keep this on InternalMiniCluster as well, right? If it stays on InternalMiniCluster, shouldn't it be InternalMiniCluster::DeleteFromDisk instead()? http://gerrit.cloudera.org:8080/#/c/9787/5/src/kudu/tools/ksck_remote-test.cc@242 PS5, Line 242: ASSERT_TRUE(!s.ok()); > Can you add a check that the expected + actual tserver uuids appear in the Done http://gerrit.cloudera.org:8080/#/c/9787/7/src/kudu/tools/ksck_remote-test.cc File src/kudu/tools/ksck_remote-test.cc: http://gerrit.cloudera.org:8080/#/c/9787/7/src/kudu/tools/ksck_remote-test.cc@232 PS7, Line 232: std::string("2") > I think you should be able to simplify this to thanks, don't know why I used std::string("2") here... http://gerrit.cloudera.org:8080/#/c/9787/7/src/kudu/tools/ksck_remote.cc File src/kudu/tools/ksck_remote.cc: http://gerrit.cloudera.org:8080/#/c/9787/7/src/kudu/tools/ksck_remote.cc@122 PS7, Line 122: req.set_dest_uuid(uuid()); > Is it necessary to add the new UUID check in ListTablets now that you also removing it, thanks -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 7 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Tue, 27 Mar 2018 19:26:09 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Dan Burkert has posted comments on this change. ( http://gerrit.cloudera.org:8080/9787 ) Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. Patch Set 7: (3 comments) http://gerrit.cloudera.org:8080/#/c/9787/7/src/kudu/tools/ksck-test.cc File src/kudu/tools/ksck-test.cc: http://gerrit.cloudera.org:8080/#/c/9787/7/src/kudu/tools/ksck-test.cc@327 PS7, Line 327: /* > Oops, looks like this test is incomplete and commented out. I think the test on the previous file does that via starting up a new tserver bound to the same RPC addresses. http://gerrit.cloudera.org:8080/#/c/9787/7/src/kudu/tools/ksck_remote-test.cc File src/kudu/tools/ksck_remote-test.cc: http://gerrit.cloudera.org:8080/#/c/9787/7/src/kudu/tools/ksck_remote-test.cc@232 PS7, Line 232: std::string("2") I think you should be able to simplify this to string root_dir = mini_cluster_->GetTabletServerFsRoot(0) + "2"; std::string already has a 'using' clause, and string::operator+ is defined for string literal arguments (const char*). http://gerrit.cloudera.org:8080/#/c/9787/7/src/kudu/tools/ksck_remote.cc File src/kudu/tools/ksck_remote.cc: http://gerrit.cloudera.org:8080/#/c/9787/7/src/kudu/tools/ksck_remote.cc@122 PS7, Line 122: req.set_dest_uuid(uuid()); Is it necessary to add the new UUID check in ListTablets now that you also check it in the above block? If it's not adding any coverage here it may be easier to just skip adding that. -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 7 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Tue, 27 Mar 2018 18:40:27 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/9787 ) Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. Patch Set 7: (10 comments) http://gerrit.cloudera.org:8080/#/c/9787/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9787/2//COMMIT_MSG@21 PS2, Line 21: 2ff411da5bbe177631ace5d > I'm not sure how this should work. Currently ksck counts the number of !s.o You could make a TabletServerErrors struct or something, and tabulate the OKs and other errors in it, along with uuid/hostport, then print those out as a small table at the end. I'd say it should go above the tablet summary table since it's usually less interesting. http://gerrit.cloudera.org:8080/#/c/9787/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9787/7//COMMIT_MSG@10 PS7, Line 10: TabletServer nit: "tablet server" is fine. http://gerrit.cloudera.org:8080/#/c/9787/7/src/kudu/server/server_base.proto File src/kudu/server/server_base.proto: http://gerrit.cloudera.org:8080/#/c/9787/7/src/kudu/server/server_base.proto@38 PS7, Line 38: optional bytes uuid = 5; The NodeInstancePB already contains the uuid. See common/wire_protocol.proto. http://gerrit.cloudera.org:8080/#/c/9787/7/src/kudu/tools/ksck-test.cc File src/kudu/tools/ksck-test.cc: http://gerrit.cloudera.org:8080/#/c/9787/7/src/kudu/tools/ksck-test.cc@327 PS7, Line 327: /* Oops, looks like this test is incomplete and commented out. You'll want to make a function like CreateClusterWithMismatchedTSUuid that reaches into the KsckCluster object and replaces one of its tablet servers with one with a different uuid, and replace CreateOneSmallReplicatedTable() with a call to that function. http://gerrit.cloudera.org:8080/#/c/9787/7/src/kudu/tools/ksck-test.cc@335 PS7, Line 335: TabletServer Just "tablet server". I know ksck says "Tablet Server" in a lot of places but I think the capitalization is a mistake and I'll be changing it whenever I add to or modify ksck. http://gerrit.cloudera.org:8080/#/c/9787/5/src/kudu/tools/ksck_remote-test.cc File src/kudu/tools/ksck_remote-test.cc: http://gerrit.cloudera.org:8080/#/c/9787/5/src/kudu/tools/ksck_remote-test.cc@231 PS5, Line 231: tserver::MiniTabletServer* tablet_server = mini_cluster_->mini_tablet_server(0); I think it'd be nicer to implement a method like ExternalMiniCluster::DeleteFromDisk for InternalMiniCluster daemons and use that. As an side, I think we should be using the ExternalMiniCluster for these tests anyway since they should be black-box. I'll have to put that on my todo. http://gerrit.cloudera.org:8080/#/c/9787/5/src/kudu/tools/ksck_remote-test.cc@242 PS5, Line 242: ASSERT_TRUE(!s.ok()); Can you add a check that the expected + actual tserver uuids appear in the message? They should be available as methods on the minicluster daemon objects. http://gerrit.cloudera.org:8080/#/c/9787/2/src/kudu/tools/ksck_remote.cc File src/kudu/tools/ksck_remote.cc: http://gerrit.cloudera.org:8080/#/c/9787/2/src/kudu/tools/ksck_remote.cc@116 PS2, Line 116: { > ok, moved this check for client side. Should I leave the uuid check in the Nope. http://gerrit.cloudera.org:8080/#/c/9787/5/src/kudu/tools/ksck_remote.cc File src/kudu/tools/ksck_remote.cc: http://gerrit.cloudera.org:8080/#/c/9787/5/src/kudu/tools/ksck_remote.cc@109 PS5, Line 109: std:: The namespace qualifier isn't needed. http://gerrit.cloudera.org:8080/#/c/9787/5/src/kudu/tools/ksck_remote.cc@121 PS5, Line 121: req.set_need_schema_info(f If we're doing the check in GetStatus I think we should remove it here and not put the uuid in ListTablets. Maybe at a later date we'll make most RPC endpoints optionally check uuid. ListTablets can wait for that patch. -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 7 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Tue, 27 Mar 2018 17:50:56 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Hello Will Berkeley, Dan Burkert, Kudu Jenkins, Adar Dembo, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9787 to look at the new patch set (#6). Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. KUDU-2364 Add extra check in ksck for tserver ID ksck did not validate tablet server ID when checking connectivity. Whenever the TabletServer was nuked and readded, ksck would report the connection was successful to the old TabletServer. Now it will report an error like below: ./bin/kudu cluster ksck localhost Connected to the Master WARNING: Unable to connect to Tablet Server b68c8e3352ff411da5bbe177631ace5d (chargers-macbook-pro.local:7050): Remote error: ID reported by TabletServer (5e929ae0b09447cc9444a4b635a0d5ac) doesn't match the expected ID: b68c8e3352ff411da5bbe177631ace5d WARNING: Fetched info from 1 Tablet Servers, 1 weren't reachable The cluster doesn't have any matching tables == Errors: == error fetching info from tablet servers: Could not gather complete information from all tablet servers FAILED Runtime error: ksck discovered errors Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 --- M src/kudu/server/server_base.cc M src/kudu/server/server_base.proto M src/kudu/tools/ksck-test.cc M src/kudu/tools/ksck_remote-test.cc M src/kudu/tools/ksck_remote.cc M src/kudu/tserver/tablet_service.cc M src/kudu/tserver/tserver.proto 7 files changed, 67 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/87/9787/6 -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 6 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/9787 ) Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. Patch Set 5: (3 comments) http://gerrit.cloudera.org:8080/#/c/9787/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9787/2//COMMIT_MSG@21 PS2, Line 21: 2ff411da5bbe177631ace5d > We'll need a more specific error message for the uuid mismatch case, and it I'm not sure how this should work. Currently ksck counts the number of !s.ok()-s and if it's larger than 0, it will print this message. Should I just count the number of RemoteErrors separately? Furthermore, how this method (Status Ksck::FetchInfoFromTabletServers()) returns a single Status, should I just change the message to something else than "Could not gather complete information from all tablet servers"? What if a TabletServer is not reachable and another is reachable but with a mismatched UUID? An old TabletServer is technically not reachable if this happens, there's just a new one on the same host:port. http://gerrit.cloudera.org:8080/#/c/9787/2/src/kudu/tools/ksck_remote.cc File src/kudu/tools/ksck_remote.cc: http://gerrit.cloudera.org:8080/#/c/9787/2/src/kudu/tools/ksck_remote.cc@116 PS2, Line 116: tserver::ListTabletsRequestPB req; > I would argue for checking the UUID via GetStatus(). I think it makes sense ok, moved this check for client side. Should I leave the uuid check in the ListTablets then? http://gerrit.cloudera.org:8080/#/c/9787/2/src/kudu/tserver/tablet_service.cc File src/kudu/tserver/tablet_service.cc: http://gerrit.cloudera.org:8080/#/c/9787/2/src/kudu/tserver/tablet_service.cc@1366 PS2, Line 1366: > Extra \? Done -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 5 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Mon, 26 Mar 2018 16:31:26 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Hello Will Berkeley, Dan Burkert, Kudu Jenkins, Adar Dembo, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9787 to look at the new patch set (#5). Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. KUDU-2364 Add extra check in ksck for tserver ID ksck did not validate tablet server ID when checking connectivity. Whenever the TabletServer was nuked and readded, ksck would report the connection was successful to the old TabletServer. Now it will report an error like below: ./bin/kudu cluster ksck localhost Connected to the Master WARNING: Unable to connect to Tablet Server b68c8e3352ff411da5bbe177631ace5d (chargers-macbook-pro.local:7050): Remote error: ID reported by TabletServer (5e929ae0b09447cc9444a4b635a0d5ac) doesn't match the expected ID: b68c8e3352ff411da5bbe177631ace5d WARNING: Fetched info from 1 Tablet Servers, 1 weren't reachable The cluster doesn't have any matching tables == Errors: == error fetching info from tablet servers: Could not gather complete information from all tablet servers FAILED Runtime error: ksck discovered errors Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 --- M src/kudu/server/server_base.cc M src/kudu/server/server_base.proto M src/kudu/tools/ksck-test.cc M src/kudu/tools/ksck_remote-test.cc M src/kudu/tools/ksck_remote.cc M src/kudu/tserver/tablet_service.cc M src/kudu/tserver/tserver.proto 7 files changed, 65 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/87/9787/5 -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 5 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Hello Will Berkeley, Dan Burkert, Kudu Jenkins, Adar Dembo, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9787 to look at the new patch set (#4). Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. KUDU-2364 Add extra check in ksck for tserver ID ksck did not validate tablet server ID when checking connectivity. Whenever the TabletServer was nuked and readded, ksck would report the connection was successful to the old TabletServer. Now it will report an error like below: ./bin/kudu cluster ksck localhost Connected to the Master WARNING: Unable to connect to Tablet Server b68c8e3352ff411da5bbe177631ace5d (chargers-macbook-pro.local:7050): Remote error: ID reported by TabletServer (5e929ae0b09447cc9444a4b635a0d5ac) doesn't match the expected ID: b68c8e3352ff411da5bbe177631ace5d WARNING: Fetched info from 1 Tablet Servers, 1 weren't reachable The cluster doesn't have any matching tables == Errors: == error fetching info from tablet servers: Could not gather complete information from all tablet servers FAILED Runtime error: ksck discovered errors Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 --- M src/kudu/mini-cluster/internal_mini_cluster.h M src/kudu/server/server_base.cc M src/kudu/server/server_base.proto M src/kudu/tools/ksck-test.cc M src/kudu/tools/ksck_remote-test.cc M src/kudu/tools/ksck_remote.cc M src/kudu/tserver/tablet_service.cc M src/kudu/tserver/tserver.proto 8 files changed, 66 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/87/9787/4 -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 4 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/9787 ) Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. Patch Set 3: > Patch Set 2: > > (3 comments) > > This also needs a test, maybe using the mock tablet server. Added a test, please check if it's a good test -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 3 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Mon, 26 Mar 2018 16:14:14 + Gerrit-HasComments: No
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Hello Will Berkeley, Dan Burkert, Kudu Jenkins, Adar Dembo, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9787 to look at the new patch set (#3). Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. KUDU-2364 Add extra check in ksck for tserver ID ksck did not validate tablet server ID when checking connectivity. Whenever the TabletServer was nuked and readded, ksck would report the connection was successful to the old TabletServer. Now it will report an error like below: ./bin/kudu cluster ksck localhost Connected to the Master WARNING: Unable to connect to Tablet Server b68c8e3352ff411da5bbe177631ace5d (chargers-macbook-pro.local:7050): Remote error: ID reported by TabletServer (5e929ae0b09447cc9444a4b635a0d5ac) doesn't match the expected ID: b68c8e3352ff411da5bbe177631ace5d WARNING: Fetched info from 1 Tablet Servers, 1 weren't reachable The cluster doesn't have any matching tables == Errors: == error fetching info from tablet servers: Could not gather complete information from all tablet servers FAILED Runtime error: ksck discovered errors Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 --- M src/kudu/mini-cluster/internal_mini_cluster.h M src/kudu/server/server_base.cc M src/kudu/server/server_base.proto M src/kudu/tools/ksck-test.cc M src/kudu/tools/ksck_remote-test.cc M src/kudu/tools/ksck_remote.cc M src/kudu/tserver/tablet_service.cc M src/kudu/tserver/tserver.proto 8 files changed, 66 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/87/9787/3 -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 3 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/9787 ) Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. Patch Set 2: (3 comments) This also needs a test, maybe using the mock tablet server. http://gerrit.cloudera.org:8080/#/c/9787/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9787/2//COMMIT_MSG@21 PS2, Line 21: Fetched info from 1 Tablet Servers, 1 weren't reachable We'll need a more specific error message for the uuid mismatch case, and it'll also need to appear at the bottom with the summary information. http://gerrit.cloudera.org:8080/#/c/9787/2/src/kudu/tools/ksck_remote.cc File src/kudu/tools/ksck_remote.cc: http://gerrit.cloudera.org:8080/#/c/9787/2/src/kudu/tools/ksck_remote.cc@116 PS2, Line 116: req.set_dest_uuid(uuid()); > Couldn't we do the UUID verification client-side by issuing an additional G I would argue for checking the UUID via GetStatus(). I think it makes sense to have the Ping call just above this one be replaced with GetStatus, for the following reasons: 1. It's a clean separation between two different things: getting the tablets and getting server info. We might still want to get the tablet information even if the uuids mismatch, for example. 2. GetStatus() has other info we could use, for example version information. It'd be nice for ksck to check for version heterogeneity (but I'm not asking for that in this patch). I think it might be good, in general, for client to be able to set a destination uuid and have servers check it, so I do agree with Adar that there's nothing wrong with extending ListTablets as you have. But for a ksck uuid check I think GetStatus is most appropriate. http://gerrit.cloudera.org:8080/#/c/9787/2/src/kudu/tserver/tablet_service.cc File src/kudu/tserver/tablet_service.cc: http://gerrit.cloudera.org:8080/#/c/9787/2/src/kudu/tserver/tablet_service.cc@1366 PS2, Line 1366: \ Extra \? -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 2 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Mon, 26 Mar 2018 00:45:04 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/9787 ) Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. Patch Set 2: > Patch Set 2: > > (1 comment) Actually I thought about removing the ping too as it's an unnecessary call, ListTablets will fail if the ping fails so there's no need to do a separate ping beforehand, but I decided it was out of scope for this change. Adding a GetStatus() would just add a second/third RPC call, so that's why I thought it would be better to "re-use" ListTablets(), plus it already had a method to compare UUIDs. -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 2 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Sat, 24 Mar 2018 07:54:05 + Gerrit-HasComments: No
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/9787 ) Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/9787/2/src/kudu/tools/ksck_remote.cc File src/kudu/tools/ksck_remote.cc: http://gerrit.cloudera.org:8080/#/c/9787/2/src/kudu/tools/ksck_remote.cc@116 PS2, Line 116: req.set_dest_uuid(uuid()); Couldn't we do the UUID verification client-side by issuing an additional GetStatus() RPC and comparing the returned UUID with uuid()? On the other hand, there's nothing inherently wrong with augmenting ListTablets() in this way; which approach do you think makes more sense? -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 2 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Fri, 23 Mar 2018 18:11:37 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Hello Will Berkeley, Dan Burkert, Kudu Jenkins, Adar Dembo, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9787 to look at the new patch set (#2). Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. KUDU-2364 Add extra check in ksck for tserver ID ksck did not validate tablet server ID when checking connectivity. Whenever the TabletServer was nuked and readded, ksck would report the connection was successful to the old TabletServer. Now it will report an error like below: Connected to the Master WARNING: Unable to connect to Tablet Server 71167ec2051f4c1bb96b6a2cb54cc31c (va1022.halxg.cloudera.com:7050): Invalid argument: ListTablets: Wrong destination UUID requested. Local UUID: d92197fa2f034b33aa0bf998c41f637b. Requested UUID: 71167ec2051f4c1bb96b6a2cb54cc31c WARNING: Fetched info from 1 Tablet Servers, 1 weren't reachable The cluster doesn't have any matching tables == Errors: == error fetching info from tablet servers: Could not gather complete information from all tablet servers Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 --- M src/kudu/tools/ksck_remote.cc M src/kudu/tserver/tablet_service.cc M src/kudu/tserver/tserver.proto 3 files changed, 16 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/87/9787/2 -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 2 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley
[kudu-CR] KUDU-2364 Add extra check in ksck for tserver ID
Hello Will Berkeley, Dan Burkert, Adar Dembo, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/9787 to review the following change. Change subject: KUDU-2364 Add extra check in ksck for tserver ID .. KUDU-2364 Add extra check in ksck for tserver ID ksck did not validate tablet server ID when checking connectivity. Whenever the TabletServer was nuked and readded, ksck would report the connection was successful to the old TabletServer. Now it will report an error like below: Connected to the Master WARNING: Unable to connect to Tablet Server 71167ec2051f4c1bb96b6a2cb54cc31c (va1022.halxg.cloudera.com:7050): Invalid argument: ListTablets: Wrong destination UUID requested. Local UUID: d92197fa2f034b33aa0bf998c41f637b. Requested UUID: 71167ec2051f4c1bb96b6a2cb54cc31c WARNING: Fetched info from 1 Tablet Servers, 1 weren't reachable The cluster doesn't have any matching tables == Errors: == error fetching info from tablet servers: Could not gather complete information from all tablet servers Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 --- M src/kudu/tools/ksck_remote.cc M src/kudu/tserver/tablet_service.cc M src/kudu/tserver/tserver.proto 3 files changed, 13 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/87/9787/1 -- To view, visit http://gerrit.cloudera.org:8080/9787 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ia2c18ba7af8eaa6f5e4d7842f18754d2c1e32526 Gerrit-Change-Number: 9787 Gerrit-PatchSet: 1 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Will Berkeley