Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/17585 )
Change subject: [consensus] KUDU-2302: don't crash if new leader can't resolve peer ...................................................................... Patch Set 5: Code-Review+1 (2 comments) Overall looks good, but it would be great to add a test scenario which would exercise one other realistic scenario when the names many other tablet servers in the cluster are still resolvable, so the system replaces the failed peer with a new one for the time of the DNS resolver's malfunction for one particular kudu-tserver node. http://gerrit.cloudera.org:8080/#/c/17585/4/src/kudu/integration-tests/raft_consensus_election-itest.cc File src/kudu/integration-tests/raft_consensus_election-itest.cc: http://gerrit.cloudera.org:8080/#/c/17585/4/src/kudu/integration-tests/raft_consensus_election-itest.cc@167 PS4, Line 167: TEST_F(RaftConsensusElectionITest, TestNewLeaderCantResolvePeers) { I'm curious: in the scope of this test, is it important or not to have connections already established between Raft peers? Will anything change if adding --rpc_reopen_outbound_connections=true in this scenario? http://gerrit.cloudera.org:8080/#/c/17585/4/src/kudu/integration-tests/raft_consensus_election-itest.cc@247 PS4, Line 247: // Once we stop failing DNS resolution, the peer should become healthy. : for (const auto& ts : tablet_servers_) { : ASSERT_OK(cluster_->SetFlag(cluster_->tablet_server_by_uuid(ts.first), : "fail_dns_resolution", "false")); : } : NO_FATALS(check_for_failed_peer(/*expect_failed*/false)); In addition to this happy-ending, could you add a scenario when there is an extra tablet servers to spawn a replacement replica? I guess the expected result would be having the peer with failed DNS name replaced, and it's important to make sure everything works as expected in that case. -- To view, visit http://gerrit.cloudera.org:8080/17585 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I22d870ecc526fa47b97f6856c3b023bc1ec029c7 Gerrit-Change-Number: 17585 Gerrit-PatchSet: 5 Gerrit-Owner: Andrew Wong <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Bankim Bhavsar <[email protected]> Gerrit-Reviewer: Grant Henke <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Fri, 11 Jun 2021 19:52:17 +0000 Gerrit-HasComments: Yes
