Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17528 )

Change subject: [master] AddMaster and copy sys catalog automatically
......................................................................


Patch Set 4:

(27 comments)

http://gerrit.cloudera.org:8080/#/c/17528/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17528/4//COMMIT_MSG@33
PS4, Line 33:  $ mkdir master-1
            :  $ mkdir master-1/log
nit: this could be done in one command:

  mkdir -p master-1/log


http://gerrit.cloudera.org:8080/#/c/17528/4//COMMIT_MSG@35
PS4, Line 35: 7050
nit: maybe 8767 is a better choice here to be more consistent with the port 
numbering scheme in start_kudu.sh?


http://gerrit.cloudera.org:8080/#/c/17528/4//COMMIT_MSG@35
PS4, Line 35:
nit: if showing the shell prompt for command lines above, maybe it makes sense 
to do so for this line as well?


http://gerrit.cloudera.org:8080/#/c/17528/4//COMMIT_MSG@35
PS4, Line 35: 7050
nit: would it be better to switch to port number 8766 to be aligned with the 
start_kudu.sh?  BTW, 7050 is also the default port for tablet server's RPC 
endpoint (but start_kudu.sh doesn't use the default port numbers anyways).


http://gerrit.cloudera.org:8080/#/c/17528/4/src/kudu/master/dynamic_multi_master-test.cc
File src/kudu/master/dynamic_multi_master-test.cc:

http://gerrit.cloudera.org:8080/#/c/17528/4/src/kudu/master/dynamic_multi_master-test.cc@123
PS4, Line 123:
nit: the indent is off


http://gerrit.cloudera.org:8080/#/c/17528/4/src/kudu/master/dynamic_multi_master-test.cc@218
PS4, Line 218:     if (master_hps) {
             :       
master_hps->emplace_back(HostPortFromPB(master.registration().rpc_addresses(0)));
             :     }
nit: not exactly a part of this changelist, but consider using a local 
vector<HostPort> and swap with master_hps only in the end before returning 
Status::OK() (feel free to ignore, though)


http://gerrit.cloudera.org:8080/#/c/17528/4/src/kudu/master/dynamic_multi_master-test.cc@226
PS4, Line 226: asynchronous request
nit: the LeaderStepDown request is actually sent synchronously, and this method 
checks for the status IIUC


http://gerrit.cloudera.org:8080/#/c/17528/4/src/kudu/master/dynamic_multi_master-test.cc@251
PS4, Line 251: LeaderStepDown request is asynchronous
nit: the LeaderStepDown request is sent synchronously and starts the process of 
leadership transfer, but the process of leadership transfer takes time


http://gerrit.cloudera.org:8080/#/c/17528/4/src/kudu/master/dynamic_multi_master-test.cc@1414
PS4, Line 1414: &
nit: why to keep this 64-bit reference over 32-bit args.orig_num_masters when 
it used only once in this scope?


http://gerrit.cloudera.org:8080/#/c/17528/4/src/kudu/master/dynamic_multi_master-test.cc@1421
PS4, Line 1421: 10
nit: I recall 10 seconds sometimes wasn't enough in case of TSAN builds -- does 
it seem to be enough here?


http://gerrit.cloudera.org:8080/#/c/17528/4/src/kudu/master/dynamic_multi_master-test.cc@1466
PS4, Line 1466:   SleepFor(MonoDelta::FromSeconds(3));
              :   
ASSERT_OK(VerifyVoterMastersForCluster(cluster_->num_masters(), nullptr, 
cluster_.get()));
Would be nice to add a comment why this is necessary given 
VerifyVoterMastersForCluster() successfully completed just above in the 
ASSERT_EVENTUALLY clause.


http://gerrit.cloudera.org:8080/#/c/17528/4/src/kudu/master/dynamic_multi_master-test.cc@1473
PS4, Line 1473: cluster_->num_masters()
nit: these do actually include the current leader master, don't they?  If so, 
does it make sense to refine the comment above?


http://gerrit.cloudera.org:8080/#/c/17528/4/src/kudu/master/dynamic_multi_master-test.cc@1477
PS4, Line 1477: an
nit: and


http://gerrit.cloudera.org:8080/#/c/17528/4/src/kudu/master/master.proto
File src/kudu/master/master.proto:

http://gerrit.cloudera.org:8080/#/c/17528/4/src/kudu/master/master.proto@1143
PS4, Line 1143: AuthorizeClientOrServiceUser
It would be great to clarify why this becomes necessary: probably, the commit 
description is the best place for that?

Also, it looks a bit off compared with the coarse-grained authz requirements 
for the RemoveMaster() RPC below.  Should the latter be updated as well?


http://gerrit.cloudera.org:8080/#/c/17528/4/src/kudu/master/master_runner.cc
File src/kudu/master/master_runner.cc:

http://gerrit.cloudera.org:8080/#/c/17528/4/src/kudu/master/master_runner.cc@367
PS4, Line 367:     
RETURN_NOT_OK(VerifyMastersGetHostPorts(opts.master_addresses(),
             :                                             local_uuid, 
server->messenger(),
             :                                             &leader_hp, 
&local_hp,
             :                                             &try_again, 
&needs_add));
I'm concerned how this is going to play out if it's a restart of 2 out of 3 
masters, where one master is unavailable.  In that situation, my expectation 
it's possible to start 2 masters and have the cluster in operable state, so the 
3rd failed master could be started (or replaced) afterwards.  Does it work the 
as expected?  Anyways, I guess it would be great to add a test scenario to make 
sure that's the case with these changes.


http://gerrit.cloudera.org:8080/#/c/17528/4/src/kudu/master/master_runner.cc@374
PS4, Line 374:       SleepFor(MonoDelta::FromSeconds(5));
Is it worth adding a LOG about the retry or there will be enough information 
output into the log to understand why it takes some time to start up a process?


http://gerrit.cloudera.org:8080/#/c/17528/4/src/kudu/master/master_runner.cc@401
PS4, Line 401: MonoDelta::FromSeconds(30)
Should this be controlled by a new run-time flag?


http://gerrit.cloudera.org:8080/#/c/17528/4/src/kudu/master/master_runner.cc@413
PS4, Line 413:     server.reset();
nit: would be nice to add a comment to explain why explicit reset here is 
necessary given the smart pointer is reset later on at line 418


http://gerrit.cloudera.org:8080/#/c/17528/4/src/kudu/mini-cluster/external_mini_cluster.h
File src/kudu/mini-cluster/external_mini_cluster.h:

http://gerrit.cloudera.org:8080/#/c/17528/4/src/kudu/mini-cluster/external_mini_cluster.h@283
PS4, Line 283:   Status AddMaster(const std::vector<std::string>& extra_flags = 
{});
Do you think it makes sense to add generic test coverage for this newly added 
functionality (number of masters, result flags, etc.) into 
external_mini_cluster-test.cc beyond the related test scenarios in 
dynamic_multi_master-test.cc?


http://gerrit.cloudera.org:8080/#/c/17528/4/src/kudu/mini-cluster/external_mini_cluster.h@501
PS4, Line 501: input
nit: drop?


http://gerrit.cloudera.org:8080/#/c/17528/4/src/kudu/mini-cluster/external_mini_cluster.h@503
PS4, Line 503: Callers are expected
nit: does it make sense to add a blurb about the necessity to reserve a port 
for a new master?


http://gerrit.cloudera.org:8080/#/c/17528/4/src/kudu/mini-cluster/external_mini_cluster.cc
File src/kudu/mini-cluster/external_mini_cluster.cc:

http://gerrit.cloudera.org:8080/#/c/17528/4/src/kudu/mini-cluster/external_mini_cluster.cc@663
PS4, Line 663:   vector<string> master_addrs_list;
Is it used at all in this scope?


http://gerrit.cloudera.org:8080/#/c/17528/4/src/kudu/mini-cluster/external_mini_cluster.cc@664
PS4, Line 664:   for (const auto& master : masters_) {
             :     master_rpc_addrs.emplace_back(master->bound_rpc_hostport());
             :   }
Maybe, re-use ExternalMiniCluster::master_rpc_addrs() instead?


http://gerrit.cloudera.org:8080/#/c/17528/4/src/kudu/mini-cluster/external_mini_cluster.cc@679
PS4, Line 679: Unable
nit: make it lowercase to match with error message at line 668?


http://gerrit.cloudera.org:8080/#/c/17528/4/src/kudu/mini-cluster/external_mini_cluster.cc@679
PS4, Line 679: Master
ditto


http://gerrit.cloudera.org:8080/#/c/17528/4/src/kudu/mini-cluster/external_mini_cluster.cc@680
PS4, Line 680:   ++opts_.num_masters;
nit: maybe, move this down to be along with updating the 'masters_' member 
field?


http://gerrit.cloudera.org:8080/#/c/17528/4/src/kudu/mini-cluster/external_mini_cluster.cc@686
PS4, Line 686: Substitute("--master_addresses=$0", new_master_addrs_list));
Would be nice to add a comment to clarify how --master_addresses is set/updated 
for the master just added.



--
To view, visit http://gerrit.cloudera.org:8080/17528
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic7635723f30317a45799ad7b9c9b725eafbd735b
Gerrit-Change-Number: 17528
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong <[email protected]>
Gerrit-Reviewer: Abhishek Chennaka <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Bankim Bhavsar <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Wed, 07 Jul 2021 18:44:55 +0000
Gerrit-HasComments: Yes

Reply via email to