David Ribeiro Alves has uploaded a new change for review.
Change subject: [flaky tests] Fix "Already present" failures on
[flaky tests] Fix "Already present" failures on raft_consensus-itest
In the flaky tests dashboard, TestSlowLeader fails with:
"Check failed: e->status().ok() Unexpected status: Already present: key already
This happens because it's possible for TestWorkload to generate
identical random numbers on different threads, even though we use
a multiplicative linear congruential PRNG that is supposed to
generate all unique numbers within a single period of the PRNG.
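
One plausible mechanism, sketched under the assumption that a single generator
instance is shared by the writer threads without synchronization (this message
does not spell that out): each step of a multiplicative LCG is a pure function
of the current state, so two threads that read the same state before either
writes it back return the same value. The MINSTD parameters and the LockedLcg
class below are illustrative only and are not Kudu's ThreadSafeRandom.

```cpp
#include <cstdint>
#include <mutex>

// Illustrative MINSTD parameters; the generator TestWorkload uses may differ.
constexpr uint64_t kMultiplier = 16807;
constexpr uint64_t kModulus = 2147483647;  // 2^31 - 1

// One multiplicative LCG step: a pure function of the current state, so two
// callers that read the same state compute the same "random" value.
constexpr uint32_t LcgStep(uint32_t state) {
  return static_cast<uint32_t>((kMultiplier * state) % kModulus);
}

// Hypothetical wrapper (not Kudu's ThreadSafeRandom): the read-compute-write
// sequence runs under a mutex, so no two callers can observe the same state.
class LockedLcg {
 public:
  // The seed must be in [1, kModulus - 1]; a seed of 0 would stay at 0.
  explicit LockedLcg(uint32_t seed) : state_(seed) {}

  uint32_t Next() {
    std::lock_guard<std::mutex> guard(lock_);
    state_ = LcgStep(state_);
    return state_;
  }

 private:
  std::mutex lock_;
  uint32_t state_;
};
```

With the lock in place, every caller advances the shared state exactly once,
so the "all unique within one period" property holds across threads too.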
This patch changes TestWorkload to use a ThreadSafeRandom. We
could also change the key type to int64 and do something like
int64 key = static_cast<uint64>(r.Next32()) << 32 | thread_index;
however, changing the type of the key is very invasive, as a bunch
of tests depend on it.
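
For reference, a minimal sketch of the int64 key alternative that this patch
does not take. MakeUniqueKey is a hypothetical helper, not something in
TestWorkload; it assumes thread indexes fit in 32 bits. Putting the thread
index in the low half means two threads can never produce the same key, even
if they draw the same random value.

```cpp
#include <cstdint>

// Hypothetical helper, not part of TestWorkload: packs a 32-bit random value
// (high half) and the writer thread's index (low half) into one 64-bit key.
constexpr int64_t MakeUniqueKey(uint32_t random_bits, uint32_t thread_index) {
  // Widen to 64 bits before shifting: shifting a 32-bit value by 32 is
  // undefined behavior. The final conversion to int64_t may yield a negative
  // key when the top random bit is set, which is still a distinct key.
  return static_cast<int64_t>(
      (static_cast<uint64_t>(random_bits) << 32) | thread_index);
}
```

For example, MakeUniqueKey(7, 1) and MakeUniqueKey(7, 2) differ even though
both threads drew the same random value 7.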
This also increases the timeout of the snapshot scan in
TestReplicaBehaviorViaRPC, which would spuriously fail, and
increases the time we wait in TestCommitIndexFarBehindAfterLeaderElection,
which would also cause spurious failures.
Results of running this on dist-test:
To be fair, I've only seen the "Already present" failure
outside of the flaky dashboard once; it's probably rarer than
1000 loops would allow us to assert. However, it's apparently difficult
to mimic the exact conditions of the flaky dashboard tests:
running raft_consensus-itest with the stress option makes it incredibly
flaky, with different failures than the ones seen on the dashboard.
Not running it with stress makes it pass the large majority of
3 files changed, 23 insertions(+), 12 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/19/5319/1
To view, visit http://gerrit.cloudera.org:8080/5319
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Owner: David Ribeiro Alves <dral...@apache.org>