Dinesh Bhat has posted comments on this change.

Change subject: [tools] Manual recovery tools (part 1)
......................................................................


Patch Set 9:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/4834/6/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

> > I think having all the tests related to ./bin/kudu toolset in one file is
I see, I didn't think about parallelization earlier, I was pointing out from 
source files grouping point of view earlier. Sure, let's do that in a follow-up 
patch, I will track this in a jira so that we won't miss this comment.


Line 1149
> Not sure about macOS, but on the ExternalMiniCluster the proper thing to do
it didn't look buggy after going through Restart/Start/Shutdown paths, it's 
just following a different Semantics than ExternalMiniCluster. Restart() here 
assumes that server should be running so that it does a 'Shotdown and Start'. 
Since we have manually shut it down earlier, Start() has brought up the tserver 
on same ports again.


http://gerrit.cloudera.org:8080/#/c/4834/9/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

Line 944: TEST_F(ToolTest, TestRemoteReplicaCopy) {
> If we're testing copying, should we write some data in this test? You could
Yeah, I think that was the test I had earlier if you recall PS6. I resorted to 
keeping it simple, because the intent here is to test the functionality of the 
tool and not the actual tablet copy itself (I think Adar also had similar 
opinion here). Keeping that in mind, I thought it was sufficient to execute 
tablet copy tool and check whether or not we were able to bring the tablet or 
not.


Line 994:   // by attempting to copy tablets while we tombstone the tablet on 
destination server.
> I don't think you ever elected a leader, so I don't know who would interfer
Good point - honestly I didn't test this without shutting down the master or 
other tservers to begin with, I updated diffs to keep all of them running, and 
this also removed several unnecessary checks, thanks you!


Line 1043:   NO_FATALS(RunActionStdoutNone(Substitute("remote_replica copy $0 
$1 $2 --force_copy",
> Should --force be needed if it's tombstoned?
Yeah, here. 
https://github.com/apache/kudu/blob/master/src/kudu/tserver/ts_tablet_manager.cc#L378
Was that unexpected ?


Line 1048:   // deleted_tablet_id is copied from source to destination.
> nit: use active, not passive voice, i.e. 
Done


Line 1050:   ASSERT_OK(WaitUntilTabletInState(src_ts, deleted_tablet_id,
> Why is this needed? We just did this on line 1039
That was for a different tablet_id, but in any case I removed these checks 
since we are never shuttong down the source tserver now.


Line 1135:   ASSERT_OK(ts->Start());
> Should be Restart() I think? But looking at the impl it seems that it just 
Start() is correct here, otherwise Restart() hits a check since it expects the 
server to be running, a slightly different semantics than we follow with 
ExternalMiniCluster. I will fix this separately if I find anything here, but 
for now jenkins/linux test runs are happy.


http://gerrit.cloudera.org:8080/#/c/4834/6/src/kudu/tools/tool_action_remote_replica.cc
File src/kudu/tools/tool_action_remote_replica.cc:

Line 347:       .Description("Copy a replica from one Kudu Tablet Server to 
another")
> any thoughts on using "tablet replica" throughout this file instead of just
If you see the latest patch, I updated this to just 'replica'. Since we use 
'replica' in only one context under Kudu, tablet is probably redundant IMHO.


http://gerrit.cloudera.org:8080/#/c/4834/9/src/kudu/tools/tool_action_remote_replica.cc
File src/kudu/tools/tool_action_remote_replica.cc:

Line 331:                             
tserver::TabletServerErrorPB_Code(resp.error().code())));
> oops, typo in my original suggestion, should be TabletServerErrorPB::Code_N
Done, I am just curious how did we generate Code_Name from the proto file ? I 
am not able to navigate the generator of that routine.


-- 
To view, visit http://gerrit.cloudera.org:8080/4834
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I113a25e9b6c14f7c3814140917b61e35030b58d0
Gerrit-PatchSet: 9
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dinesh Bhat <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Dinesh Bhat <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-HasComments: Yes

Reply via email to