David Ribeiro Alves has submitted this change and it was merged.

Change subject: [flaky tests] Fix "Already present" failures on 
raft_consensus-itest
......................................................................


[flaky tests] Fix "Already present" failures on raft_consensus-itest

On the flaky tests dashboard, TestSlowLeader fails with:

"Check failed: e->status().ok() Unexpected status: Already present: key already present"

This happens because it's possible for TestWorkload to generate
identical random numbers on different threads, even though we use
a multiplicative linear congruential PRNG that is supposed to
generate all unique numbers within a single period of the PRNG.
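
One way such collisions can arise is sketched below, assuming each
thread owns its own generator seeded differently (an assumption; the
message above does not spell out the seeding). std::minstd_rand0 is
used as a stand-in for the multiplicative LCG, and the helper names
are hypothetical. A single stream never repeats within its period,
but two streams walk the same cycle from different offsets, so their
outputs can overlap.

```cpp
#include <cstdint>
#include <random>
#include <unordered_set>

// True if one generator emits `n` distinct values in a row.
bool SingleStreamUnique(uint32_t seed, int n) {
  std::minstd_rand0 gen(seed);
  std::unordered_set<uint32_t> seen;
  for (int i = 0; i < n; ++i) {
    if (!seen.insert(static_cast<uint32_t>(gen())).second) return false;
  }
  return true;
}

// The k-th value (k >= 1) drawn from a generator seeded with `seed`.
uint32_t NthDraw(uint32_t seed, int k) {
  std::minstd_rand0 gen(seed);
  uint32_t v = 0;
  for (int i = 0; i < k; ++i) v = static_cast<uint32_t>(gen());
  return v;
}

// True if two independently seeded generators share at least one value
// within their first `n` draws each.
bool StreamsCollide(uint32_t seed_a, uint32_t seed_b, int n) {
  std::minstd_rand0 a(seed_a);
  std::minstd_rand0 b(seed_b);
  std::unordered_set<uint32_t> seen;
  for (int i = 0; i < n; ++i) seen.insert(static_cast<uint32_t>(a()));
  for (int i = 0; i < n; ++i) {
    if (seen.count(static_cast<uint32_t>(b())) > 0) return true;
  }
  return false;
}
```

For example, seeding a second generator with NthDraw(1, 5) puts it on
the same cycle as a generator seeded with 1, only a few steps ahead,
so StreamsCollide reports an overlap almost immediately.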

This patch changes TestWorkload to use a ThreadSafeRandom. We
could also change the key type to int64 and do something like
int64 key = (static_cast<int64>(r.Next32()) << 32) | thread_index,
but changing the type of the key is very invasive, as a bunch of
tests depend on it.
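
A minimal sketch of the shared-generator idea: one PRNG guarded by a
lock and used by every writer, so all keys come from a single
sequence. This is not Kudu's ThreadSafeRandom; LockedRandom, DrawKeys
and AllUnique are hypothetical names, and std::minstd_rand0 with
std::mutex are stand-ins chosen for illustration.

```cpp
#include <cstdint>
#include <mutex>
#include <random>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a thread-safe PRNG wrapper.
class LockedRandom {
 public:
  explicit LockedRandom(uint32_t seed) : gen_(seed) {}

  // Advances the shared generator under the lock, so concurrent callers
  // each observe a distinct step of the same sequence.
  uint32_t Next() {
    std::lock_guard<std::mutex> l(lock_);
    return static_cast<uint32_t>(gen_());
  }

 private:
  std::mutex lock_;
  std::minstd_rand0 gen_;
};

// Draws `n` keys from the shared generator.
std::vector<uint32_t> DrawKeys(LockedRandom* rng, int n) {
  std::vector<uint32_t> keys;
  keys.reserve(n);
  for (int i = 0; i < n; ++i) keys.push_back(rng->Next());
  return keys;
}

// True if no key appears twice.
bool AllUnique(const std::vector<uint32_t>& keys) {
  std::unordered_set<uint32_t> seen(keys.begin(), keys.end());
  return seen.size() == keys.size();
}
```

In a workload, each writer thread would hold a pointer to the same
LockedRandom instance and call Next() for every row key, at the cost
of some lock contention between writers.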

This also increases the timeout of the snapshot scan in
TestReplicaBehaviorViaRPC, which would spuriously fail, and
increases the time we wait in TestCommitIndexFarBehindAfterLeaderElection,
which would also cause spurious failures.

Results of running this on dist-test:
http://dist-test.cloudera.org//job?job_id=david.alves.1480656349.3518

To be fair, I've only seen the "Already present" failure outside
of the flaky dashboard once; it's probably too rare for 1000 loops
to confirm the fix. However, it's apparently difficult to mimic the
exact same conditions as the flaky dashboard tests: running
raft_consensus-itest with the stress option makes it incredibly
flaky, with different failures than the ones seen on the dashboard.
Not running it with stress makes it pass the large majority of
the time.

Change-Id: I35faf53cb9bb8585ec1c01d038b1cd64a0bb533e
Reviewed-on: http://gerrit.cloudera.org:8080/5319
Reviewed-by: David Ribeiro Alves <dral...@apache.org>
Tested-by: Kudu Jenkins
---
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/test_workload.cc
M src/kudu/integration-tests/test_workload.h
3 files changed, 23 insertions(+), 12 deletions(-)

Approvals:
  David Ribeiro Alves: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/5319
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I35faf53cb9bb8585ec1c01d038b1cd64a0bb533e
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: David Ribeiro Alves <dral...@apache.org>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <dral...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
