Alexey Serbin has uploaded this change for review. ( http://gerrit.cloudera.org:8080/22390 )


Change subject: KUDU-3641 fix flaky TestNewLeaderCantResolvePeers
......................................................................

KUDU-3641 fix flaky TestNewLeaderCantResolvePeers

I noticed that RaftConsensusElectionITest.TestNewLeaderCantResolvePeers
scenario was failing from time to time in pre-commit tests, and the same
issue was also exposed by the flaky tests dashboard [1].

The scenario would usually succeed because in most cases the system
catalog was able to establish a tablet replica at the newly added tablet
server even before LeaderStepDown() had been called.  Since the UUIDs
of the new and the old leader were the same for the LeaderStepDown()
invocation, the implementation was using the short-circuited path
(i.e. doing nothing) instead of starting an actual election round.
The scenario would fail if the tablet replica hadn't yet been placed
at the newly added server by the time the test checked for its presence
using ListRunningTabletIds().

The fix is trivial: use StartElection() instead of LeaderStepDown().
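
As a toy illustration only, the difference between the two calls can be
modeled as below.  This is not Kudu code: ToyReplica, StepDownTo() and the
fields are hypothetical names, and the model assumes nothing beyond what is
described above (a step-down that names the current leader as the successor
does nothing, while an explicit election request always starts a round).

```cpp
#include <string>

// Toy model, not Kudu code: every name below is hypothetical.
struct ToyReplica {
  std::string leader_uuid;
  int elections_started = 0;

  // Models a step-down request naming a preferred successor. If the
  // successor is already the leader, the request is a no-op.
  void StepDownTo(const std::string& new_leader_uuid) {
    if (new_leader_uuid == leader_uuid) {
      return;  // short-circuited path: no election round is started
    }
    ++elections_started;
    leader_uuid = new_leader_uuid;
  }

  // Models an explicit election request: a round is always started,
  // regardless of who the current leader is.
  void StartElection(const std::string& candidate_uuid) {
    ++elections_started;
    leader_uuid = candidate_uuid;
  }
};
```

In this model, calling StepDownTo() with the current leader's UUID leaves
elections_started unchanged, whereas StartElection() always increments it,
which is the property the test scenario depends on.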

To verify that this patch fixes the issue, I ran the following command
against DEBUG bits built with and without the patch on the same machine.
Without the patch, the scenario would fail once in ~150 runs.
With the patch, there hasn't been a single failure.

  ./bin/raft_consensus_election-itest \
    --gtest_filter='*TestNewLeaderCantResolvePeers' \
    --stress_cpu_threads=24 \
    --gtest_repeat=1000

This is a follow-up to f9647149a49ddb87ea0ecf069eab3b5ec0217136.

[1] http://dist-test.cloudera.org:8080/test_drilldown?test_name=raft_consensus_election-itest

Change-Id: I9f724fee15eec74c068ce0aecfd4544f99a46866
Reviewed-on: http://gerrit.cloudera.org:8080/22389
Tested-by: Kudu Jenkins
Reviewed-by: Yifan Zhang <[email protected]>
(cherry picked from commit 6c77ec8752dce6c8253c980c71a25859a3b63f67)
---
M src/kudu/integration-tests/raft_consensus_election-itest.cc
1 file changed, 2 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/90/22390/1
--
To view, visit http://gerrit.cloudera.org:8080/22390
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.18.x
Gerrit-MessageType: newchange
Gerrit-Change-Id: I9f724fee15eec74c068ce0aecfd4544f99a46866
Gerrit-Change-Number: 22390
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <[email protected]>
