Mike Percy has posted comments on this change. Change subject: [tools] Manual recovery tools (part 1) ......................................................................
Patch Set 6: (16 comments) http://gerrit.cloudera.org:8080/#/c/4834/6/src/kudu/integration-tests/cluster_itest_util.h File src/kudu/integration-tests/cluster_itest_util.h: Line 267: const shared_ptr<master::MasterServiceProxy>& master_proxy, > error: expected ')' [clang-diagnostic-error] std::shared_ptr Line 268: const string& tablet_id, std::string here and below http://gerrit.cloudera.org:8080/#/c/4834/6/src/kudu/tools/kudu-tool-test.cc File src/kudu/tools/kudu-tool-test.cc: Is it time to start breaking these tool tests out into separate tests? This file is getting quite large. Line 950: // Force a change_config within 3 seconds of follower unavailability. Why 3 seconds? Seems like it could make the test a bit unstable, no? Line 1069: Won't the below stuff race with the master trying to delete them because they aren't supposed to be there? Line 1126: ts->Shutdown(); Try running the tool before shutdown to ensure that it fails? Line 1149: } I think also restarting the TS and ensuring that the tablet doesn't exist anymore would also be useful http://gerrit.cloudera.org:8080/#/c/4834/6/src/kudu/tools/tool_action_local_replica.cc File src/kudu/tools/tool_action_local_replica.cc: Line 267: // This action cleans up metadata/data/WAL for a tablet on this node when I think the information in this comment should instead be moved to the tool help. Line 736: .Description("Delete Kudu replica from the local filesystem") s/replica/tablet replica/ here and elsewhere in this file's help text Line 743: .Description("Operate on local Kudu replicas via the local filesystem") tablet replicas http://gerrit.cloudera.org:8080/#/c/4834/6/src/kudu/tools/tool_action_remote_replica.cc File src/kudu/tools/tool_action_remote_replica.cc: Line 299: RETURN_NOT_OK(GetServerStatus(dst_address, TabletServer::kDefaultPort, Is there no way to override the port in these tools? Here and below. Seems like a problem Line 315: // This is also applicable for replicas which are tombstoned. What does "applicable" mean here? Line 322: RETURN_NOT_OK(HostPortToPB(src_host_port, req.mutable_copy_peer_addr())); Why all these conversions? How about: *req.mutable_copy_peer_addr() = src_status.bound_rpc_addresses(0); Line 331: return Status::RemoteError(resp.ShortDebugString()); I think something like this would be better: return StatusFromPB(resp.error().status()).CloneAndPrepend(Substitute("Remote server returned error code $0", TabletServerErrorPB_Code(resp.error().code()))); Maybe there is a helper function somewhere for that or we could create one. http://gerrit.cloudera.org:8080/#/c/4834/6/src/kudu/tserver/ts_tablet_manager.cc File src/kudu/tserver/ts_tablet_manager.cc: Line 563: DCHECK_NOTNULL(fs_manager); nit: DCHECK_NOTNULL macro returns its argument so consider using DCHECK(fs_manager != nullptr) instead to avoid a compiler warning when not using the argument Line 564: return "T " + tablet_id + " P " + fs_manager->uuid() + ": "; nit: consider using strings::Substitute here to cut down on memory allocations (I know it was already like that) -- To view, visit http://gerrit.cloudera.org:8080/4834 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I113a25e9b6c14f7c3814140917b61e35030b58d0 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dinesh Bhat <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: David Ribeiro Alves <[email protected]> Gerrit-Reviewer: Dinesh Bhat <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy <[email protected]> Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-HasComments: Yes
