Bankim Bhavsar has posted comments on this change. ( http://gerrit.cloudera.org:8080/16321 )
Change subject: [master] KUDU-2181 Raft change config request for adding a master ...................................................................... Patch Set 3: (8 comments) http://gerrit.cloudera.org:8080/#/c/16321/2/src/kudu/consensus/raft_consensus.cc File src/kudu/consensus/raft_consensus.cc: http://gerrit.cloudera.org:8080/#/c/16321/2/src/kudu/consensus/raft_consensus.cc@2470 PS2, Line 2470: for (const auto& peer : cmeta_->ActiveConfig().peers()) { > nit: maybe call this 'raft_config' or something to reflect what its type is Done http://gerrit.cloudera.org:8080/#/c/16321/2/src/kudu/master/catalog_manager.h File src/kudu/master/catalog_manager.h: http://gerrit.cloudera.org:8080/#/c/16321/2/src/kudu/master/catalog_manager.h@781 PS2, Line 781: talogManager's role and member type in a consen > nit: are there DCHECK()s we can add to enforce this? I simply copied the implementation and comment from the function "Role" above. Implementation checks whether catalog is running and returns UNKNOWN_ROLE/MEMBER_TYPE if not initialized. So updated the comments instead. http://gerrit.cloudera.org:8080/#/c/16321/2/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: http://gerrit.cloudera.org:8080/#/c/16321/2/src/kudu/master/catalog_manager.cc@5564 PS2, Line 5564: auto comp_promise_ptr = s > If this is not OK, how should we expect users to catch such errors? Is ther ChangeConfig() implementation in Raft is inherently async, so earlier thought was that clients can check with ListMasters call. See the comment in master.proto. But I see your point about informing clients about the final outcome. Adding/removing masters will be rare and I think it's better to make the call synchronous with some timeout instead as I don't expect clients/tool to do other async work while master addition/removal is in progress and clients would know the final status. So updated the code accordingly with 10 sec timeout which looks reasonable for such an operation though I haven't dived into the RPC implementation to determine whether that's okay. I looked around a bit but couldn't find existing examples of RPCs wherein the server returns the response to the client in an async fashion. http://gerrit.cloudera.org:8080/#/c/16321/2/src/kudu/master/dynamic_multi_master-test.cc File src/kudu/master/dynamic_multi_master-test.cc: http://gerrit.cloudera.org:8080/#/c/16321/2/src/kudu/master/dynamic_multi_master-test.cc@148 PS2, Line 148: const string new_master_id = Substitute("master-$0", num_masters_); : new_master_opts.wal_dir = cluster_->GetWalPath(new_master_id); : new_master_opts.data_dirs = cluster_->GetDataPaths(new_master_id); : new_master_opts.log_dir = cluster_->GetLogPath(new_master_id); : new_master_opts.rpc_bind_address = reserved_hp_; : new_master_opts.extra_flags = { : "--master_addresses=" + JoinStrings(master_addresses, ","), : "--rpc_reuseport=true", > Could we just copy some of the opts from an existing daemon? Seems then we' Done http://gerrit.cloudera.org:8080/#/c/16321/2/src/kudu/master/master.h File src/kudu/master/master.h: http://gerrit.cloudera.org:8080/#/c/16321/2/src/kudu/master/master.h@89 PS2, Line 89: MasterOptions* > nit: consider returning a pointer instead for mutable accessors. That's the Done http://gerrit.cloudera.org:8080/#/c/16321/2/src/kudu/master/master_options.h File src/kudu/master/master_options.h: http://gerrit.cloudera.org:8080/#/c/16321/2/src/kudu/master/master_options.h@57 PS2, Line 57: : // Allows setting/overwriting list of masters. > A static lock is incredibly surprising for things that aren't designed to b Done http://gerrit.cloudera.org:8080/#/c/16321/2/src/kudu/master/master_service.cc File src/kudu/master/master_service.cc: http://gerrit.cloudera.org:8080/#/c/16321/2/src/kudu/master/master_service.cc@243 PS2, Line 243: if (!FLAGS_master_support_change_config) { : rpc->RespondFailure(Status::NotSupported("Adding master is not supported")); : return; : } > Just FYI this can also be done via MasterServiceImpl::SupportsFeature() Thanks for the pointer. Added the feature in MasterFeatures as it helps with forward compatibility (newer clients communicating with older master server). I think this check is still needed. What if the client doesn't set the required feature and the feature is not yet complete and hence disabled in the server? http://gerrit.cloudera.org:8080/#/c/16321/2/src/kudu/master/master_service.cc@255 PS2, Line 255: ERRO > nit: users should probably be concerned about this, so this should probably Done -- To view, visit http://gerrit.cloudera.org:8080/16321 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0ac7e6e55220bcb01cad0fa386daaf656258088c Gerrit-Change-Number: 16321 Gerrit-PatchSet: 3 Gerrit-Owner: Bankim Bhavsar <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Bankim Bhavsar <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Sat, 12 Sep 2020 00:34:37 +0000 Gerrit-HasComments: Yes
