Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11374 )

Change subject: build: retry failed tests by default
......................................................................


Patch Set 3:

(3 comments)

> Ah, actually pre-commit uses build-and-test too, just with enabled dist-test. 
> So changing the default value in build-and-test.sh does affect the pre-commit 
> case.

We have to choose between two options: 1) the status quo, wherein only tests 
from the flaky test list can be retried, or 2) this patch, wherein all tests 
can be retried. Neither are particularly good for the reasons you've outlined: 
#1 means we don't let ourselves retry tests that fail once per week, and #2 
means we can potentially introduce new flaky tests without noticing at 
precommit time.

If you're open to modifying test_result_server.py, we can explore other 
options. For example, we could use the flaky database to figure out if a 
particular test is "new" (i.e. has no runs) and not retry it, but retry the 
rest.

http://gerrit.cloudera.org:8080/#/c/11374/3/build-support/dist_test.py
File build-support/dist_test.py:

http://gerrit.cloudera.org:8080/#/c/11374/3/build-support/dist_test.py@59
PS3, Line 59: FLAKY_TEST_RETRIES = 
int(os.environ.get('KUDU_FLAKY_TEST_ATTEMPTS', 1)) - 1
Might want to rename/recomment this so it reflects the fact that _any_ failure 
is retried, not just failures in tests that are in the (now no longer used) 
flaky test list.

Changing KUDU_FLAKY_TEST_ATTEMPTS probably isn't worth it though (since it's 
something of an interface that various things may depend on).


http://gerrit.cloudera.org:8080/#/c/11374/3/build-support/dist_test.py@395
PS3, Line 395:                "max_retries": max_retries
Can just put FLAKY_TEST_RETRIES directly in here.


http://gerrit.cloudera.org:8080/#/c/11374/3/build-support/run-test.sh
File build-support/run-test.sh:

http://gerrit.cloudera.org:8080/#/c/11374/3/build-support/run-test.sh@71
PS3, Line 71: TEST_EXECUTION_ATTEMPTS=${KUDU_FLAKY_TEST_ATTEMPTS:-3}
dist-test uses run-test.sh, so won't this cause any dist-test triggered test to 
be retried? I thought you didn't want that.



--
To view, visit http://gerrit.cloudera.org:8080/11374
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6fc069527d6f2553a4e04e14c821090f826cbc0a
Gerrit-Change-Number: 11374
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Comment-Date: Sat, 01 Sep 2018 22:48:55 +0000
Gerrit-HasComments: Yes

Reply via email to