Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8138 )

Change subject: WIP [consensus] adding/removing NON_VOTER members
......................................................................


Patch Set 6:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/8138/6/src/kudu/consensus/leader_election-test.cc
File src/kudu/consensus/leader_election-test.cc:

http://gerrit.cloudera.org:8080/#/c/8138/6/src/kudu/consensus/leader_election-test.cc@164
PS6, Line 164:   ASSERT_FALSE(voter_uuids_.empty());
> Because of this change, we need to either add NO_FATALS() to all callsites
Good point.  Indeed, CHECK() would be best since this assertion is to protect 
from programmer's errors.


http://gerrit.cloudera.org:8080/#/c/8138/6/src/kudu/consensus/leader_election.cc
File src/kudu/consensus/leader_election.cc:

http://gerrit.cloudera.org:8080/#/c/8138/6/src/kudu/consensus/leader_election.cc@167
PS6, Line 167:         LOG_WITH_PREFIX(WARNING) << Substitute(
> how about a CHECK?
I thought about that, but decided that would be too much: if due do some bug 
some NON_VOTER replica sends a request to others, all others replicas will 
crash.

I think it's better to add DCHECK() in at the point where RequestVotePB is 
being sent out -- I'll try to do that.  That way only the rogue non-voter 
replica will crash in such condition, and only for DEBUG builds.


http://gerrit.cloudera.org:8080/#/c/8138/6/src/kudu/consensus/quorum_util-test.cc
File src/kudu/consensus/quorum_util-test.cc:

http://gerrit.cloudera.org:8080/#/c/8138/6/src/kudu/consensus/quorum_util-test.cc@34
PS6, Line 34: SetPeerInfo
> perhaps rename this to AddNewPeer()
Done


http://gerrit.cloudera.org:8080/#/c/8138/6/src/kudu/consensus/quorum_util-test.cc@37
PS6, Line 37: boost::optional<RaftPeerPB::MemberType> type = boost::none
> I think it's better to require a type instead of make it optional since the
I just wanted to use it for the case when member_type() is not set.  OK, I'll 
just write a separate piece of code for that.


http://gerrit.cloudera.org:8080/#/c/8138/6/src/kudu/consensus/quorum_util-test.cc@41
PS6, Line 41: !!
> We don't need !! in this context because boost::optional<> implements opera
Last time I checked I didn't see the operation you mentioned.  However, 
optional has the following operator:

bool operator!() const

Anyway, this is no longer necessary.


http://gerrit.cloudera.org:8080/#/c/8138/6/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

http://gerrit.cloudera.org:8080/#/c/8138/6/src/kudu/consensus/raft_consensus.cc@410
PS6, Line 410:       DisableFailureDetector();
> Hmm. I think it would be best to be more proactive about enabling / disabli
I'm not sure about that.

This place seemed a better candidate since this is how it's done for 
NON_PARTICIPANT role (see below) and also this method is called from  
ConsensusServiceImpl::RunLeaderElection(). Also, this method seems to be the 
lowest level where election-related code is initiated, and handling non-voter 
case here looks the safest way of doing so, not missing anything else from 
upper level.

So, placing it here get the job done in a status-quo way and also allows not to 
duplicate any code in ConsensusServiceImpl::RunLeaderElection().


http://gerrit.cloudera.org:8080/#/c/8138/6/src/kudu/integration-tests/raft_consensus-itest.cc
File src/kudu/integration-tests/raft_consensus-itest.cc:

http://gerrit.cloudera.org:8080/#/c/8138/6/src/kudu/integration-tests/raft_consensus-itest.cc@3287
PS6, Line 3287: TEST_F(RaftConsensusITest, AddNonVoterReplica) {
> If the purpose of the test is simply to add a non-voter and then see if we
I tried to do that without writing data, but then 
--tablet_copy_download_file_inject_latency_ms=1000 flag didn't have proper 
effect for some reason (that latency is necessary to reliably spot tablet copy 
session start and complete).

And yes, this test does more than just ensuring the tablet copy sessions starts 
-- the idea is to make sure the tablet is available and writing data to it does 
not produce any errors.


http://gerrit.cloudera.org:8080/#/c/8138/6/src/kudu/integration-tests/raft_consensus-itest.cc@3408
PS6, Line 3408: TEST_F(RaftConsensusITest, AddThenRemoveNonVoterReplica) {
> Similar comment as above
Yep, the comment is outdated -- thanks for pointing at that.  I'll update it.



--
To view, visit http://gerrit.cloudera.org:8080/8138
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2662d45ad9bb6a4bf325d4202c2ee619ffad02b7
Gerrit-Change-Number: 8138
Gerrit-PatchSet: 6
Gerrit-Owner: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <davidral...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Comment-Date: Wed, 04 Oct 2017 23:19:21 +0000
Gerrit-HasComments: Yes

Reply via email to