Andrew Wong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15113 )


Change subject: KUDU-3046: deflake TabletServerQuiescingITest
......................................................................

KUDU-3046: deflake TabletServerQuiescingITest

The test was flaky for a number of reasons including:
- Slowness in TSAN mode along with a low Raft timeout meant workloads
  would fail to even create tablets.
  - Addressed this by increasing the heartbeat interval in TSAN mode.
- Not hitting the exact number of scanners when running the tool because
  of a TOCTOU race between checking the number of scanners and running
  the tool.
  - Addressed this by reducing the number of read threads and thus
    reducing the degrees of freedom with which the tool can run (either
    0 scanners or 1 scanner).
- TestAbruptStepdownWhileAllQuiescing failed because the test would step
  down a leader without the guarantee that it was the latest leader, so
  a leader could still exist.
  - Addressed this by stepping down on all tablet servers just to be
    sure.

There appears to be another source of flakiness that is less specific
to this test, but this patch dropped the failure rate from 4/100 to
9/2000 (all due to a TSAN issue in the TestWorkload that I'm still
getting to the bottom of).

Change-Id: I3f9ef531062c4b66648840e04962070768fbad5d
---
M src/kudu/integration-tests/tablet_server_quiescing-itest.cc
1 file changed, 26 insertions(+), 10 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/13/15113/1
--
To view, visit http://gerrit.cloudera.org:8080/15113
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I3f9ef531062c4b66648840e04962070768fbad5d
Gerrit-Change-Number: 15113
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
