[kudu-CR] [quorum util] update criteria for non-voter replica eviction
Hello Tidy Bot, Mike Percy, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8679 to look at the new patch set (#4). Change subject: [quorum_util] update criteria for non-voter replica eviction .. [quorum_util] update criteria for non-voter replica eviction Updated the non-voter replica eviction criteria to address a scenario where newly added non-voter replicas fail one after another before the original failed voter replica is evicted. This is to make sure the replacement process does not end up with placing failed non-voter replicas at every available tablet server, deadlocking the replacement process. Added corresponding unit tests to cover the updated behavior of the consensus::CanEvictReplica() utility function. Change-Id: I2062c2963fface43062b3154bf5ffa61b4fa684f --- M src/kudu/consensus/quorum_util-test.cc M src/kudu/consensus/quorum_util.cc M src/kudu/integration-tests/raft_consensus_nonvoter-itest.cc 3 files changed, 263 insertions(+), 76 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/79/8679/4 -- To view, visit http://gerrit.cloudera.org:8080/8679 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2062c2963fface43062b3154bf5ffa61b4fa684f Gerrit-Change-Number: 8679 Gerrit-PatchSet: 4 Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot
[kudu-CR] [quorum util] update criteria for non-voter replica eviction
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/8679 ) Change subject: [quorum_util] update criteria for non-voter replica eviction .. Patch Set 3: (17 comments) http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util-test.cc File src/kudu/consensus/quorum_util-test.cc: http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util-test.cc@64 PS3, Line 64: FAIL > LOG(FATAL) so we don't have to put NO_FATALS() around the SetOverallHealth( It's a test, so FAIL() is enough here. And no need to put NO_FATALS() all over the place neither. http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util-test.cc@100 PS3, Line 100: auto* peers = config->mutable_peers(); : bool found_peer = false; : for (auto& peer : *peers) { : if (peer.permanent_uuid() == peer_uuid) { : found_peer = true; : peer.set_member_type(V); : //peer.mutable_attrs()->clear_promote(); : peer.mutable_attrs()->set_promote(false); : break; : } : } : if (!found_peer) { : FAIL() << peer_uuid << ": peer is not in the config"; : } > we can write this in fewer lines using GetRaftConfigMember(): Good point, but I don't want to use ASSERT_OK() here: if the specified peer is not in the config, that's a programming error. http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util-test.cc@116 PS3, Line 116: static void RemovePeer(RaftConfigPB* config, const string& peer_uuid) { > we can just use bool RemoveFromRaftConfig(RaftConfigPB* config, const strin Done http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util-test.cc@135 PS3, Line 135: if (!s.ok()) { : FAIL() > why not just ASSERT_OK() ? Because I don't think it's worth adding NO_FATALS() around every invocation of SetPeerHealth(): if the specified peer is not the part of the config, that must be programming error. Also, I don't want to use CHECK() here because I want the test to perform proper clean-up and there is no use of crashdump in that case. http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util-test.cc@530 PS3, Line 530: for (char replace_health : kHealthStatuses) { > this test has nothing to do with non-voters, right? Right: as other sub-scenarios, it works with voter replicas. I just used consolidated the testcases below, removing trying to remove some duplicated code. http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util-test.cc@609 PS3, Line 609: . > if we have enough healthy voters to commit the config change. Done http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util-test.cc@661 PS3, Line 661: ASSERT_TRUE > This one confuses me. Shouldn't this be ASSERT_FALSE because we don't have This has changed once the criteria for the eviction updated. See https://gerrit.cloudera.org/#/c/8786/ http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util-test.cc@727 PS3, Line 727: SetPeerHealth > nit: NO_FATALS() here and below I use FAIL() in the implementation of SetPeerHealth() specifically for the purpose of not adding NO_FATALS() all over the place. If SetPeerHealth() fails, that's a programming error. http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util-test.cc@731 PS3, Line 731: Adding a non-voter for replacement > nit: This comment could be a little clearer written as: Add a non-voter to Done http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util-test.cc@742 PS3, Line 742: PromotePeer > nit: NO_FATALS() here and below The same story as for SetPeerHealth(). http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util-test.cc@753 PS3, Line 753: SetPeerHealth(, "D", '-'); > should we also test the (D, '?') case before it goes fully FAILED? OK, we can do that. But that's not how it's going to be in current implementation for replica health monitoring: it goes from '+' to '-' in the case of a crash, right? http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util-test.cc@754 PS3, Line 754: ASSERT_FALSE(CanEvictReplica(config, "A", kReplicationFactor)); > nit: add comment: // We cannot evict because we don't have enough healthy v Done http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util-test.cc@757 PS3, Line 757: EXPECT_FALSE(CanEvictReplica(config, "A", kReplicationFactor)); : EXPECT_TRUE(IsUnderReplicated(config, kReplicationFactor)); > this duplicates the assertions above Good catch -- that's left from the conflict resolution. http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util-test.cc@775 PS3, Line 775: // The processs
[kudu-CR] [quorum util] update criteria for non-voter replica eviction
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/8679 ) Change subject: [quorum_util] update criteria for non-voter replica eviction .. Patch Set 3: (17 comments) http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util-test.cc File src/kudu/consensus/quorum_util-test.cc: http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util-test.cc@64 PS3, Line 64: FAIL LOG(FATAL) so we don't have to put NO_FATALS() around the SetOverallHealth() calls http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util-test.cc@100 PS3, Line 100: auto* peers = config->mutable_peers(); : bool found_peer = false; : for (auto& peer : *peers) { : if (peer.permanent_uuid() == peer_uuid) { : found_peer = true; : peer.set_member_type(V); : //peer.mutable_attrs()->clear_promote(); : peer.mutable_attrs()->set_promote(false); : break; : } : } : if (!found_peer) { : FAIL() << peer_uuid << ": peer is not in the config"; : } we can write this in fewer lines using GetRaftConfigMember(): RaftPeerPB* peer; ASSERT_OK(GetRaftConfigMember(config->mutable_peers(), peer_uuid, peer)); peer->set_member_type(V); peer->mutable_attrs()->set_promote(false); http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util-test.cc@116 PS3, Line 116: static void RemovePeer(RaftConfigPB* config, const string& peer_uuid) { we can just use bool RemoveFromRaftConfig(RaftConfigPB* config, const string& uuid); http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util-test.cc@135 PS3, Line 135: if (!s.ok()) { : FAIL() why not just ASSERT_OK() ? http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util-test.cc@530 PS3, Line 530: for (char replace_health : kHealthStatuses) { this test has nothing to do with non-voters, right? http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util-test.cc@609 PS3, Line 609: . if we have enough healthy voters to commit the config change. http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util-test.cc@661 PS3, Line 661: ASSERT_TRUE This one confuses me. Shouldn't this be ASSERT_FALSE because we don't have enough healthy voters to commit the config change? we only have 1/2 healthy voters here, while MajoritySize(2) == 2. http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util-test.cc@727 PS3, Line 727: SetPeerHealth nit: NO_FATALS() here and below http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util-test.cc@731 PS3, Line 731: Adding a non-voter for replacement nit: This comment could be a little clearer written as: Add a non-voter to replace B. http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util-test.cc@742 PS3, Line 742: PromotePeer nit: NO_FATALS() here and below http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util-test.cc@753 PS3, Line 753: SetPeerHealth(, "D", '-'); should we also test the (D, '?') case before it goes fully FAILED? http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util-test.cc@754 PS3, Line 754: ASSERT_FALSE(CanEvictReplica(config, "A", kReplicationFactor)); nit: add comment: // We cannot evict because we don't have enough healthy voters to commit an eviction config change. http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util-test.cc@757 PS3, Line 757: EXPECT_FALSE(CanEvictReplica(config, "A", kReplicationFactor)); : EXPECT_TRUE(IsUnderReplicated(config, kReplicationFactor)); this duplicates the assertions above http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util-test.cc@775 PS3, Line 775: // The processs converges: 3 voter replicas, all are healthy. nit: it's a little early to say this in this comment. This state would occur after this next operation at the end of the test. http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util-test.cc@779 PS3, Line 779: EXPECT_FALSE(IsUnderReplicated(config, kReplicationFactor)); perhaps also assert that we cannot evict any nodes here, either. http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util.cc File src/kudu/consensus/quorum_util.cc: http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util.cc@531 PS3, Line 531: not to evict to not evict http://gerrit.cloudera.org:8080/#/c/8679/3/src/kudu/consensus/quorum_util.cc@544 PS3, Line 544: // * A voter replica may be evicted only if the number of voter replicas in : // good health without the REPLACE attribute is greater or equal to the : // specified
[kudu-CR] [quorum util] update criteria for non-voter replica eviction
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/8679 ) Change subject: [quorum_util] update criteria for non-voter replica eviction .. Patch Set 2: (20 comments) > (17 comments) > > I think we should consider splitting this into two patches. The > non-voter changes as outlined in the commit message are important > and I think non-controversial. > > I am not sure about the voter eviction changes because a HEALTHY > health report does not currently guarantee that a replica is not > lagging, at least a little bit. I think we need to be more > conservative about evicting voters right now and only evict a voter > if we have # replication factory HEALTHY voters. Done. http://gerrit.cloudera.org:8080/#/c/8679/2/src/kudu/consensus/quorum_util-test.cc File src/kudu/consensus/quorum_util-test.cc: http://gerrit.cloudera.org:8080/#/c/8679/2/src/kudu/consensus/quorum_util-test.cc@41 PS2, Line 41: constexpr auto U = RaftPeerPB::UNKNOWN_MEMBER_TYPE; > warning: invalid case style for global constant 'U' [readability-identifier Done http://gerrit.cloudera.org:8080/#/c/8679/2/src/kudu/consensus/quorum_util-test.cc@41 PS2, Line 41: constexpr auto U = RaftPeerPB::UNKNOWN_MEMBER_TYPE; > how about // NOLINT here and below? Done http://gerrit.cloudera.org:8080/#/c/8679/2/src/kudu/consensus/quorum_util-test.cc@42 PS2, Line 42: constexpr auto V = RaftPeerPB::VOTER; > warning: invalid case style for global constant 'V' [readability-identifier Done http://gerrit.cloudera.org:8080/#/c/8679/2/src/kudu/consensus/quorum_util-test.cc@45 PS2, Line 45: const auto health_statuses = { '?', '-', '+' }; > kHeatlhStatuses? Done http://gerrit.cloudera.org:8080/#/c/8679/2/src/kudu/consensus/quorum_util-test.cc@45 PS2, Line 45: const auto health_statuses = { '?', '-', '+' }; > warning: invalid case style for global constant 'health_statuses' [readabil Done http://gerrit.cloudera.org:8080/#/c/8679/2/src/kudu/consensus/quorum_util-test.cc@734 PS2, Line 734: EXPECT_EQ("C", to_evict) > how about: EXPECT(to_evict == "C" || to_evict == "D") << to_evict; ? Done http://gerrit.cloudera.org:8080/#/c/8679/2/src/kudu/consensus/quorum_util-test.cc@745 PS2, Line 745: 3 > magic number, can you pull this out into a constant? Such as const auto kRe Done http://gerrit.cloudera.org:8080/#/c/8679/2/src/kudu/consensus/quorum_util-test.cc@773 PS2, Line 773: prior B is evicted > prior to B getting evicted Done http://gerrit.cloudera.org:8080/#/c/8679/2/src/kudu/consensus/quorum_util-test.cc@774 PS2, Line 774: the failed voter > who? That was about replica D. http://gerrit.cloudera.org:8080/#/c/8679/2/src/kudu/consensus/quorum_util-test.cc@779 PS2, Line 779: ASSERT_EQ("B", to_evict); > Why is "B" the correct choice instead of "D" here? aren't they equally good You are right -- that's just an implementation detail. Both replicas are the candidates for eviction at this point. Yep, that way of asserting is better. Good point. http://gerrit.cloudera.org:8080/#/c/8679/2/src/kudu/consensus/quorum_util.cc File src/kudu/consensus/quorum_util.cc: http://gerrit.cloudera.org:8080/#/c/8679/2/src/kudu/consensus/quorum_util.cc@531 PS2, Line 531: which > that Done http://gerrit.cloudera.org:8080/#/c/8679/2/src/kudu/consensus/quorum_util.cc@532 PS2, Line 532: : > , while Done http://gerrit.cloudera.org:8080/#/c/8679/2/src/kudu/consensus/quorum_util.cc@533 PS2, Line 533: which > whose Done http://gerrit.cloudera.org:8080/#/c/8679/2/src/kudu/consensus/quorum_util.cc@536 PS2, Line 536: in 'bad' > with FAILED Done http://gerrit.cloudera.org:8080/#/c/8679/2/src/kudu/consensus/quorum_util.cc@537 PS2, Line 537: REPLACE > 'replace' Done http://gerrit.cloudera.org:8080/#/c/8679/2/src/kudu/consensus/quorum_util.cc@539 PS2, Line 539: . > , while replacing failed non-voters with healthy non-voters as aggressively Done http://gerrit.cloudera.org:8080/#/c/8679/2/src/kudu/consensus/quorum_util.cc@540 PS2, Line 540: when attempting an eviction, it's better to be certain about the ability : // to carry out the operation. > we want to be sure that an eviction can succeed before initiating it. Done http://gerrit.cloudera.org:8080/#/c/8679/2/src/kudu/consensus/quorum_util.cc@545 PS2, Line 545: greater > greater than Done http://gerrit.cloudera.org:8080/#/c/8679/2/src/kudu/consensus/quorum_util.cc@547 PS2, Line 547: bad > failed Done http://gerrit.cloudera.org:8080/#/c/8679/2/src/kudu/consensus/quorum_util.cc@557 PS2, Line 557: (num_voters_healthy >= replication_factor && num_voters_total > replication_factor) || : (num_voters_total > replication_factor && !voter_failed.empty() && :num_voters_healthy >= MajoritySize(num_voters_total - 1)); > nit: consider hoisting out one of the conditions, just for readability: Done -- To view, visit http://gerrit.cloudera.org:8080/8679 To
[kudu-CR] [quorum util] update criteria for non-voter replica eviction
Hello Tidy Bot, Mike Percy, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8679 to look at the new patch set (#3). Change subject: [quorum_util] update criteria for non-voter replica eviction .. [quorum_util] update criteria for non-voter replica eviction Updated the non-voter replica eviction criteria to address a scenario where newly added non-voter replicas fail one after another before the original failed voter replica is evicted. This is to make sure the replacement process does not end up with placing failed non-voter replicas at every available tablet server, deadlocking the replacement process. Added corresponding unit tests to cover the updated behavior of the consensus::CanEvictReplica() utility function. Change-Id: I2062c2963fface43062b3154bf5ffa61b4fa684f --- M src/kudu/consensus/quorum_util-test.cc M src/kudu/consensus/quorum_util.cc M src/kudu/integration-tests/raft_consensus_nonvoter-itest.cc 3 files changed, 272 insertions(+), 76 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/79/8679/3 -- To view, visit http://gerrit.cloudera.org:8080/8679 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2062c2963fface43062b3154bf5ffa61b4fa684f Gerrit-Change-Number: 8679 Gerrit-PatchSet: 3 Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot