Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/8138 )
Change subject: WIP [consensus] adding/removing NON_VOTER members ...................................................................... Patch Set 6: (8 comments) http://gerrit.cloudera.org:8080/#/c/8138/6/src/kudu/consensus/leader_election-test.cc File src/kudu/consensus/leader_election-test.cc: http://gerrit.cloudera.org:8080/#/c/8138/6/src/kudu/consensus/leader_election-test.cc@164 PS6, Line 164: ASSERT_FALSE(voter_uuids_.empty()); > Because of this change, we need to either add NO_FATALS() to all callsites Good point. Indeed, CHECK() would be best since this assertion is to protect from programmer's errors. http://gerrit.cloudera.org:8080/#/c/8138/6/src/kudu/consensus/leader_election.cc File src/kudu/consensus/leader_election.cc: http://gerrit.cloudera.org:8080/#/c/8138/6/src/kudu/consensus/leader_election.cc@167 PS6, Line 167: LOG_WITH_PREFIX(WARNING) << Substitute( > how about a CHECK? I thought about that, but decided that would be too much: if due do some bug some NON_VOTER replica sends a request to others, all others replicas will crash. I think it's better to add DCHECK() in at the point where RequestVotePB is being sent out -- I'll try to do that. That way only the rogue non-voter replica will crash in such condition, and only for DEBUG builds. http://gerrit.cloudera.org:8080/#/c/8138/6/src/kudu/consensus/quorum_util-test.cc File src/kudu/consensus/quorum_util-test.cc: http://gerrit.cloudera.org:8080/#/c/8138/6/src/kudu/consensus/quorum_util-test.cc@34 PS6, Line 34: SetPeerInfo > perhaps rename this to AddNewPeer() Done http://gerrit.cloudera.org:8080/#/c/8138/6/src/kudu/consensus/quorum_util-test.cc@37 PS6, Line 37: boost::optional<RaftPeerPB::MemberType> type = boost::none > I think it's better to require a type instead of make it optional since the I just wanted to use it for the case when member_type() is not set. OK, I'll just write a separate piece of code for that. http://gerrit.cloudera.org:8080/#/c/8138/6/src/kudu/consensus/quorum_util-test.cc@41 PS6, Line 41: !! > We don't need !! in this context because boost::optional<> implements opera Last time I checked I didn't see the operation you mentioned. However, optional has the following operator: bool operator!() const Anyway, this is no longer necessary. http://gerrit.cloudera.org:8080/#/c/8138/6/src/kudu/consensus/raft_consensus.cc File src/kudu/consensus/raft_consensus.cc: http://gerrit.cloudera.org:8080/#/c/8138/6/src/kudu/consensus/raft_consensus.cc@410 PS6, Line 410: DisableFailureDetector(); > Hmm. I think it would be best to be more proactive about enabling / disabli I'm not sure about that. This place seemed a better candidate since this is how it's done for NON_PARTICIPANT role (see below) and also this method is called from ConsensusServiceImpl::RunLeaderElection(). Also, this method seems to be the lowest level where election-related code is initiated, and handling non-voter case here looks the safest way of doing so, not missing anything else from upper level. So, placing it here get the job done in a status-quo way and also allows not to duplicate any code in ConsensusServiceImpl::RunLeaderElection(). http://gerrit.cloudera.org:8080/#/c/8138/6/src/kudu/integration-tests/raft_consensus-itest.cc File src/kudu/integration-tests/raft_consensus-itest.cc: http://gerrit.cloudera.org:8080/#/c/8138/6/src/kudu/integration-tests/raft_consensus-itest.cc@3287 PS6, Line 3287: TEST_F(RaftConsensusITest, AddNonVoterReplica) { > If the purpose of the test is simply to add a non-voter and then see if we I tried to do that without writing data, but then --tablet_copy_download_file_inject_latency_ms=1000 flag didn't have proper effect for some reason (that latency is necessary to reliably spot tablet copy session start and complete). And yes, this test does more than just ensuring the tablet copy sessions starts -- the idea is to make sure the tablet is available and writing data to it does not produce any errors. http://gerrit.cloudera.org:8080/#/c/8138/6/src/kudu/integration-tests/raft_consensus-itest.cc@3408 PS6, Line 3408: TEST_F(RaftConsensusITest, AddThenRemoveNonVoterReplica) { > Similar comment as above Yep, the comment is outdated -- thanks for pointing at that. I'll update it. -- To view, visit http://gerrit.cloudera.org:8080/8138 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2662d45ad9bb6a4bf325d4202c2ee619ffad02b7 Gerrit-Change-Number: 8138 Gerrit-PatchSet: 6 Gerrit-Owner: Alexey Serbin <aser...@cloudera.com> Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com> Gerrit-Reviewer: David Ribeiro Alves <davidral...@gmail.com> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy <mpe...@apache.org> Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon <t...@apache.org> Gerrit-Comment-Date: Wed, 04 Oct 2017 23:19:21 +0000 Gerrit-HasComments: Yes