Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/11251 )
Change subject: KUDU-2245 Graceful leadership transfer ...................................................................... Patch Set 13: (13 comments) I don't see a place where we lazily choose the new leader UUID if it's not specified in the graceful stepdown call. I think we should allow the leader to choose who gets to take over if it's not specified. http://gerrit.cloudera.org:8080/#/c/11251/13//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11251/13//COMMIT_MSG@16 PS13, Line 16: beings begins http://gerrit.cloudera.org:8080/#/c/11251/13//COMMIT_MSG@42 PS13, Line 42: Still WIP Still still WIP? http://gerrit.cloudera.org:8080/#/c/11251/13/src/kudu/consensus/consensus_peers.cc File src/kudu/consensus/consensus_peers.cc: http://gerrit.cloudera.org:8080/#/c/11251/13/src/kudu/consensus/consensus_peers.cc@501 PS13, Line 501: const rpc::ResponseCallback& /*callback*/) { : controller->set_timeout(MonoDelta::FromMilliseconds(FLAGS_consensus_rpc_timeout_ms)); : consensus_proxy_->RunLeaderElection(*request, response, controller); we are mixing async and sync calls here but ignoring the output from the sync call -- seems wrong http://gerrit.cloudera.org:8080/#/c/11251/13/src/kudu/consensus/consensus_queue.h File src/kudu/consensus/consensus_queue.h: http://gerrit.cloudera.org:8080/#/c/11251/13/src/kudu/consensus/consensus_queue.h@556 PS13, Line 556: TODO(unknown) TODO(todd) http://gerrit.cloudera.org:8080/#/c/11251/13/src/kudu/consensus/consensus_queue.cc File src/kudu/consensus/consensus_queue.cc: http://gerrit.cloudera.org:8080/#/c/11251/13/src/kudu/consensus/consensus_queue.cc@1071 PS13, Line 1071: DVLOG how about just VLOG http://gerrit.cloudera.org:8080/#/c/11251/13/src/kudu/consensus/consensus_queue.cc@1451 PS13, Line 1451: WARN_NOT_OK(raft_pool_observers_token_->SubmitClosure( add: DCHECK(queue_lock_.is_locked()); http://gerrit.cloudera.org:8080/#/c/11251/13/src/kudu/consensus/raft_consensus.cc File src/kudu/consensus/raft_consensus.cc: http://gerrit.cloudera.org:8080/#/c/11251/13/src/kudu/consensus/raft_consensus.cc@598 PS13, Line 598: new_leader_uuid.get() nit: it's more idiomatic to use *new_leader_uuid with optionals vs. get(), here and below http://gerrit.cloudera.org:8080/#/c/11251/13/src/kudu/consensus/raft_consensus.cc@601 PS13, Line 601: return Status::OK(); Is there a use case where we want to support this? Otherwise it seems like an illegal request and should be rejected. http://gerrit.cloudera.org:8080/#/c/11251/13/src/kudu/consensus/raft_consensus.cc@608 PS13, Line 608: resp->mutable_error()->set_code(TabletServerErrorPB::UNKNOWN_ERROR); : StatusToPB(Status::InvalidArgument(msg), resp->mutable_error()->mutable_status()); : // We return OK so that the tablet service won't overwrite the error code. : return Status::OK(); why not just return Status::InvalidArgument in this case and let TabletService take care of the rest? http://gerrit.cloudera.org:8080/#/c/11251/13/src/kudu/consensus/raft_consensus.cc@614 PS13, Line 614: BeginLeaderTransferPeriodUnlocked(new_leader_uuid); Is there something that prevents us from clobbering an existing leadership transfer period? http://gerrit.cloudera.org:8080/#/c/11251/13/src/kudu/consensus/raft_consensus.cc@622 PS13, Line 622: transfer_period_timer_->Snooze() What is the use case for this? http://gerrit.cloudera.org:8080/#/c/11251/13/src/kudu/consensus/raft_consensus.cc@631 PS13, Line 631: leader_transfer_in_progress_.Store(true, kMemOrderAcquire); how about: if (!leader_transfer_in_progresss_.compare_exchange_strong(false, true)) { return Status::IllegalArgument(Substitute("leadership transfer for $0 already in progress", tablet_id_)); } http://gerrit.cloudera.org:8080/#/c/11251/13/src/kudu/tools/kudu-admin-test.cc File src/kudu/tools/kudu-admin-test.cc: http://gerrit.cloudera.org:8080/#/c/11251/13/src/kudu/tools/kudu-admin-test.cc@1448 PS13, Line 1448: TEST_F(AdminCliTest, TestSimultaneousLeaderTransferAndAbruptStepdown) { This test could use an explanation -- I'm a little confused about the purpose of this test and the expected end result. -- To view, visit http://gerrit.cloudera.org:8080/11251 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic97343af9eb349556424c999799ed5e2941f0083 Gerrit-Change-Number: 11251 Gerrit-PatchSet: 13 Gerrit-Owner: Will Berkeley <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Fengling Wang <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy <[email protected]> Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Will Berkeley <[email protected]> Gerrit-Comment-Date: Fri, 12 Oct 2018 21:07:54 +0000 Gerrit-HasComments: Yes
