Mike Percy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11251 )

Change subject: KUDU-2245 Graceful leadership transfer
......................................................................


Patch Set 13:

(13 comments)

I don't see a place where we lazily choose the new leader UUID if it's not 
specified in the graceful stepdown call. I think we should allow the leader to 
choose who gets to take over if it's not specified.

http://gerrit.cloudera.org:8080/#/c/11251/13//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11251/13//COMMIT_MSG@16
PS13, Line 16: beings
begins


http://gerrit.cloudera.org:8080/#/c/11251/13//COMMIT_MSG@42
PS13, Line 42: Still WIP
Still still WIP?


http://gerrit.cloudera.org:8080/#/c/11251/13/src/kudu/consensus/consensus_peers.cc
File src/kudu/consensus/consensus_peers.cc:

http://gerrit.cloudera.org:8080/#/c/11251/13/src/kudu/consensus/consensus_peers.cc@501
PS13, Line 501:                                  const rpc::ResponseCallback& 
/*callback*/) {
              :   
controller->set_timeout(MonoDelta::FromMilliseconds(FLAGS_consensus_rpc_timeout_ms));
              :   consensus_proxy_->RunLeaderElection(*request, response, 
controller);
we are mixing async and sync calls here but ignoring the output from the sync 
call -- seems wrong


http://gerrit.cloudera.org:8080/#/c/11251/13/src/kudu/consensus/consensus_queue.h
File src/kudu/consensus/consensus_queue.h:

http://gerrit.cloudera.org:8080/#/c/11251/13/src/kudu/consensus/consensus_queue.h@556
PS13, Line 556: TODO(unknown)
TODO(todd)


http://gerrit.cloudera.org:8080/#/c/11251/13/src/kudu/consensus/consensus_queue.cc
File src/kudu/consensus/consensus_queue.cc:

http://gerrit.cloudera.org:8080/#/c/11251/13/src/kudu/consensus/consensus_queue.cc@1071
PS13, Line 1071: DVLOG
how about just VLOG


http://gerrit.cloudera.org:8080/#/c/11251/13/src/kudu/consensus/consensus_queue.cc@1451
PS13, Line 1451:   WARN_NOT_OK(raft_pool_observers_token_->SubmitClosure(
add: DCHECK(queue_lock_.is_locked());


http://gerrit.cloudera.org:8080/#/c/11251/13/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

http://gerrit.cloudera.org:8080/#/c/11251/13/src/kudu/consensus/raft_consensus.cc@598
PS13, Line 598: new_leader_uuid.get()
nit: it's more idiomatic to use *new_leader_uuid with optionals vs. get(), here 
and below


http://gerrit.cloudera.org:8080/#/c/11251/13/src/kudu/consensus/raft_consensus.cc@601
PS13, Line 601:       return Status::OK();
Is there a use case where we want to support this? Otherwise it seems like an 
illegal request and should be rejected.


http://gerrit.cloudera.org:8080/#/c/11251/13/src/kudu/consensus/raft_consensus.cc@608
PS13, Line 608:       
resp->mutable_error()->set_code(TabletServerErrorPB::UNKNOWN_ERROR);
              :       StatusToPB(Status::InvalidArgument(msg), 
resp->mutable_error()->mutable_status());
              :       // We return OK so that the tablet service won't 
overwrite the error code.
              :       return Status::OK();
why not just return Status::InvalidArgument in this case and let TabletService 
take care of the rest?


http://gerrit.cloudera.org:8080/#/c/11251/13/src/kudu/consensus/raft_consensus.cc@614
PS13, Line 614:   BeginLeaderTransferPeriodUnlocked(new_leader_uuid);
Is there something that prevents us from clobbering an existing leadership 
transfer period?


http://gerrit.cloudera.org:8080/#/c/11251/13/src/kudu/consensus/raft_consensus.cc@622
PS13, Line 622: transfer_period_timer_->Snooze()
What is the use case for this?


http://gerrit.cloudera.org:8080/#/c/11251/13/src/kudu/consensus/raft_consensus.cc@631
PS13, Line 631:   leader_transfer_in_progress_.Store(true, kMemOrderAcquire);
how about:

  if (!leader_transfer_in_progresss_.compare_exchange_strong(false, true)) {
    return Status::IllegalArgument(Substitute("leadership transfer for $0 
already in progress", tablet_id_));
  }


http://gerrit.cloudera.org:8080/#/c/11251/13/src/kudu/tools/kudu-admin-test.cc
File src/kudu/tools/kudu-admin-test.cc:

http://gerrit.cloudera.org:8080/#/c/11251/13/src/kudu/tools/kudu-admin-test.cc@1448
PS13, Line 1448: TEST_F(AdminCliTest, 
TestSimultaneousLeaderTransferAndAbruptStepdown) {
This test could use an explanation -- I'm a little confused about the purpose 
of this test and the expected end result.



--
To view, visit http://gerrit.cloudera.org:8080/11251
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic97343af9eb349556424c999799ed5e2941f0083
Gerrit-Change-Number: 11251
Gerrit-PatchSet: 13
Gerrit-Owner: Will Berkeley <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Fengling Wang <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley <[email protected]>
Gerrit-Comment-Date: Fri, 12 Oct 2018 21:07:54 +0000
Gerrit-HasComments: Yes

Reply via email to