Hello Mike Percy, Andrew Wong, Grant Henke,

I'd like you to do a code review. Please visit

    http://gerrit.cloudera.org:8080/12917

to review the following change.


Change subject: build: adapt new Java flaky test infrastructure to existing controls
......................................................................

build: adapt new Java flaky test infrastructure to existing controls

Now that Java tests are reporting success/failure, we can use the existing
flaky test controls to drive them. As a refresher, the C++ tests rely on
these environment variables:
- RUN_FLAKY_ONLY: whether to run just flaky tests or all tests
- KUDU_FLAKY_TEST_ATTEMPTS: number of attempts for flaky tests
- KUDU_FLAKY_TEST_LIST: path to list of flaky tests, one on each line
- KUDU_RETRY_ALL_FAILED_TESTS: whether to retry all failed tests or just the
                               ones in the flaky test list

The algorithm is roughly:
  if RUN_FLAKY_ONLY or KUDU_FLAKY_TEST_ATTEMPTS > 1:
    populate KUDU_FLAKY_TEST_LIST from test result server

  if RUN_FLAKY_ONLY:
    testset = tests listed in KUDU_FLAKY_TEST_LIST
  else:
    testset = all tests

  for t in testset:
    if KUDU_RETRY_ALL_FAILED_TESTS or (KUDU_FLAKY_TEST_LIST and
                                       t in KUDU_FLAKY_TEST_LIST):
      num_attempts = KUDU_FLAKY_TEST_ATTEMPTS (or 1 if unset)
    else:
      num_attempts = 1

    run t up to num_attempts times
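
For illustration only, the per-test part of the algorithm above could be
sketched in Python as follows. This is not the code in build-and-test.sh,
run-test.sh, or dist_test.py: the function name plan_attempts is
hypothetical, the flaky list is assumed to be already loaded (one test name
per entry), and the boolean variables are assumed to be enabled by the
value "1".

```python
def plan_attempts(all_tests, flaky_tests, env):
    """Map each test to run onto its maximum number of attempts.

    all_tests:   every known test name
    flaky_tests: contents of the flaky test list (may be empty or None)
    env:         mapping of environment variables, e.g. os.environ
    """
    # Assumption: "1" enables a boolean control; the real scripts may differ.
    run_flaky_only = env.get("RUN_FLAKY_ONLY") == "1"
    retry_all = env.get("KUDU_RETRY_ALL_FAILED_TESTS") == "1"
    # An unset or empty KUDU_FLAKY_TEST_ATTEMPTS means a single attempt.
    max_attempts = int(env.get("KUDU_FLAKY_TEST_ATTEMPTS") or 1)

    flaky_list = list(flaky_tests or [])
    flaky = set(flaky_list)

    # RUN_FLAKY_ONLY narrows the test set to the flaky list.
    testset = flaky_list if run_flaky_only else list(all_tests)

    plan = {}
    for t in testset:
        if retry_all or t in flaky:
            plan[t] = max_attempts
        else:
            plan[t] = 1
    return plan
```

For example, with KUDU_FLAKY_TEST_ATTEMPTS=3 and only "tablet-test" in the
flaky list, "tablet-test" gets up to three attempts and every other test
gets one.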

You can see it at work in build-and-test.sh/run-test.sh. You can also see it
in dist_test.py though, notably, it doesn't care about RUN_FLAKY_ONLY because
we never used that particular combination (presumably the list of flaky
tests is short enough that it wouldn't benefit from distributed testing).

This patch attempts to mirror these exact semantics for Java tests. Here are
the interesting changes:
- In RetryRule, rerunFailingTestsCount is gone. The behavior is driven by
  the aforementioned environment variables instead.
- In build-and-test.sh, if RUN_FLAKY_ONLY is set, parse the flaky test list
  into a series of --tests gradle command line arguments.
- In dist_test.py, opt into the C++ flaky test handling (which reflects the
  above algorithm). There are also some small changes to flaky handling to
  accommodate Java's per-method flaky test tracking.
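
As a rough sketch of the build-and-test.sh change, turning the flaky test
list into --tests arguments amounts to the following. It is written in
Python for illustration, while the patch itself does this in shell. The
function name gradle_tests_args and the sample test names are hypothetical,
and the list is assumed to hold one test name per line.

```python
def gradle_tests_args(flaky_list_text):
    """Build a gradle argument list with one --tests filter per listed test."""
    args = []
    for line in flaky_list_text.splitlines():
        name = line.strip()
        if name:  # skip blank lines
            args.extend(["--tests", name])
    return args


# Hypothetical list contents: one Java test and one C++ test.
sample = "org.apache.kudu.client.TestExample\ntablet-test\n"
print(" ".join(gradle_tests_args(sample)))
```

The C++ entry is passed through unfiltered here, relying on the point made
in the note below that gradle ignores tests that are not relevant to it.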

Note: all of this assumes that there's no overlap between the names of any
C++ or Java tests, which is currently true as all C++ tests have names like
"tablet-test" or "master_cert_authority-itest" while all Java tests are
prefixed with "org.apache.kudu...". If this were to change, we'd need to
properly "namespace" the test results in the reporting infrastructure and
fetch the flaky test lists separately for C++ and Java tests. For now
there's just one flaky test list, and both ctest and gradle are OK with
being asked to run irrelevant tests (they'll just be ignored).

Change-Id: Ia89598d7eeb5ab642ab4ebb7aa583adcce770eae
---
M build-support/dist_test.py
M build-support/jenkins/build-and-test.sh
M java/gradle/tests.gradle
M java/kudu-test-utils/src/main/java/org/apache/kudu/test/junit/RetryRule.java
4 files changed, 139 insertions(+), 31 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/17/12917/1
-- 
To view, visit http://gerrit.cloudera.org:8080/12917
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia89598d7eeb5ab642ab4ebb7aa583adcce770eae
Gerrit-Change-Number: 12917
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Mike Percy <[email protected]>
