Dinesh Bhat has posted comments on this change. Change subject: [tools] Manual recovery tools (part 1) ......................................................................
Patch Set 9: (10 comments) http://gerrit.cloudera.org:8080/#/c/4834/6/src/kudu/tools/kudu-tool-test.cc File src/kudu/tools/kudu-tool-test.cc: > > I think having all the tests related to ./bin/kudu toolset in one file is I see, I didn't think about parallelization earlier, I was pointing out from source files grouping point of view earlier. Sure, let's do that in a follow-up patch, I will track this in a jira so that we won't miss this comment. Line 1149 > Not sure about macOS, but on the ExternalMiniCluster the proper thing to do it didn't look buggy after going through Restart/Start/Shutdown paths, it's just following a different Semantics than ExternalMiniCluster. Restart() here assumes that server should be running so that it does a 'Shotdown and Start'. Since we have manually shut it down earlier, Start() has brought up the tserver on same ports again. http://gerrit.cloudera.org:8080/#/c/4834/9/src/kudu/tools/kudu-tool-test.cc File src/kudu/tools/kudu-tool-test.cc: Line 944: TEST_F(ToolTest, TestRemoteReplicaCopy) { > If we're testing copying, should we write some data in this test? You could Yeah, I think that was the test I had earlier if you recall PS6. I resorted to keeping it simple, because the intent here is to test the functionality of the tool and not the actual tablet copy itself (I think Adar also had similar opinion here). Keeping that in mind, I thought it was sufficient to execute tablet copy tool and check whether or not we were able to bring the tablet or not. Line 994: // by attempting to copy tablets while we tombstone the tablet on destination server. > I don't think you ever elected a leader, so I don't know who would interfer Good point - honestly I didn't test this without shutting down the master or other tservers to begin with, I updated diffs to keep all of them running, and this also removed several unnecessary checks, thanks you! Line 1043: NO_FATALS(RunActionStdoutNone(Substitute("remote_replica copy $0 $1 $2 --force_copy", > Should --force be needed if it's tombstoned? Yeah, here. https://github.com/apache/kudu/blob/master/src/kudu/tserver/ts_tablet_manager.cc#L378 Was that unexpected ? Line 1048: // deleted_tablet_id is copied from source to destination. > nit: use active, not passive voice, i.e. Done Line 1050: ASSERT_OK(WaitUntilTabletInState(src_ts, deleted_tablet_id, > Why is this needed? We just did this on line 1039 That was for a different tablet_id, but in any case I removed these checks since we are never shuttong down the source tserver now. Line 1135: ASSERT_OK(ts->Start()); > Should be Restart() I think? But looking at the impl it seems that it just Start() is correct here, otherwise Restart() hits a check since it expects the server to be running, a slightly different semantics than we follow with ExternalMiniCluster. I will fix this separately if I find anything here, but for now jenkins/linux test runs are happy. http://gerrit.cloudera.org:8080/#/c/4834/6/src/kudu/tools/tool_action_remote_replica.cc File src/kudu/tools/tool_action_remote_replica.cc: Line 347: .Description("Copy a replica from one Kudu Tablet Server to another") > any thoughts on using "tablet replica" throughout this file instead of just If you see the latest patch, I updated this to just 'replica'. Since we use 'replica' in only one context under Kudu, tablet is probably redundant IMHO. http://gerrit.cloudera.org:8080/#/c/4834/9/src/kudu/tools/tool_action_remote_replica.cc File src/kudu/tools/tool_action_remote_replica.cc: Line 331: tserver::TabletServerErrorPB_Code(resp.error().code()))); > oops, typo in my original suggestion, should be TabletServerErrorPB::Code_N Done, I am just curious how did we generate Code_Name from the proto file ? I am not able to navigate the generator of that routine. -- To view, visit http://gerrit.cloudera.org:8080/4834 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I113a25e9b6c14f7c3814140917b61e35030b58d0 Gerrit-PatchSet: 9 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dinesh Bhat <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: David Ribeiro Alves <[email protected]> Gerrit-Reviewer: Dinesh Bhat <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy <[email protected]> Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-HasComments: Yes
