IMPALA-4868: Fix flaky TestRequestPoolService.testUpdatingConfigs

Occasionally due to timing, testUpdatingConfigs() fails in
jenkins jobs. This can be reproduced by manually changing
the sleep times in the test. The fix is to attempt checking
the results several times, sleeping briefly in between
attempts.
Testing: Manually changed the sleep times to simulate failure
and success cases.

Change-Id: Id94b59039363368d21ebb01cec18ae82d1390546
Reviewed-on: http://gerrit.cloudera.org:8080/5876
Reviewed-by: Tim Armstrong <[email protected]>
Tested-by: Impala Public Jenkins

Project: http://git-wip-us.apache.org/repos/asf/incubator-impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-impala/commit/3b36e939
Tree: http://git-wip-us.apache.org/repos/asf/incubator-impala/tree/3b36e939
Diff: http://git-wip-us.apache.org/repos/asf/incubator-impala/diff/3b36e939

Branch: refs/heads/master
Commit: 3b36e939c0b18d79be079da79881865e83e4bdb2
Parents: 59cdf6b
Author: Matthew Jacobs <[email protected]>
Authored: Thu Feb 2 13:28:28 2017 -0800
Committer: Impala Public Jenkins <[email protected]>
Committed: Fri Feb 3 03:40:06 2017 +0000

----------------------------------------------------------------------
 .../impala/util/TestRequestPoolService.java     | 23 +++++++++++++++-----
 1 file changed, 17 insertions(+), 6 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3b36e939/fe/src/test/java/org/apache/impala/util/TestRequestPoolService.java
----------------------------------------------------------------------
diff --git a/fe/src/test/java/org/apache/impala/util/TestRequestPoolService.java b/fe/src/test/java/org/apache/impala/util/TestRequestPoolService.java
index f0887ef..4bb5d0c 100644
--- a/fe/src/test/java/org/apache/impala/util/TestRequestPoolService.java
+++ b/fe/src/test/java/org/apache/impala/util/TestRequestPoolService.java
@@ -198,12 +198,23 @@ public class TestRequestPoolService {
     Thread.sleep(1000L);
     Files.copy(getClasspathFile(ALLOCATION_FILE_MODIFIED), allocationConfFile_);
     Files.copy(getClasspathFile(LLAMA_CONFIG_FILE_MODIFIED), llamaConfFile_);
-    // Wait at least 1 second more than the time it will take for the
-    // AllocationFileLoaderService to update the file. The FileWatchService does not
-    // have that additional wait time, so it will be updated within 'CHECK_INTERVAL_MS'
-    Thread.sleep(1000L + CHECK_INTERVAL_MS +
-        AllocationFileLoaderService.ALLOC_RELOAD_WAIT_MS);
-    checkModifiedConfigResults();
+
+    // Need to wait for the YARN AllocationFileLoaderService (for the
+    // allocationConfFile_) as well as the FileWatchService (for the llamaConfFile_). If
+    // the system is busy this may take even longer, so we need to try a few times.
+    Thread.sleep(CHECK_INTERVAL_MS + AllocationFileLoaderService.ALLOC_RELOAD_WAIT_MS);
+
+    int numAttempts = 10;
+    while (true) {
+      try {
+        checkModifiedConfigResults();
+        break;
+      } catch (AssertionError e) {
+        if (numAttempts == 0) throw e;
+        --numAttempts;
+        Thread.sleep(1000L);
+      }
+    }
   }
 
   @Test
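
For illustration only, the retry-on-AssertionError pattern from the hunk above could be factored into a small standalone helper. This is a sketch and not part of the commit: the class name RetryAssertion, the method retryUntilPasses, and the parameters maxRetries and sleepMs are all hypothetical. With maxRetries = 10 it makes at most 11 calls (one initial attempt plus ten retries), which matches the loop in the diff.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of the Impala code base.
public class RetryAssertion {
  /**
   * Runs 'check' until it stops throwing AssertionError or until
   * 'maxRetries' retries have been used up, sleeping 'sleepMs'
   * milliseconds between attempts. Returns the number of attempts made.
   * The last AssertionError is rethrown when the retries are exhausted.
   */
  static int retryUntilPasses(Runnable check, int maxRetries, long sleepMs)
      throws InterruptedException {
    int attempts = 0;
    while (true) {
      ++attempts;
      try {
        check.run();
        return attempts;
      } catch (AssertionError e) {
        // One initial attempt plus 'maxRetries' retries, then give up.
        if (attempts > maxRetries) throw e;
        Thread.sleep(sleepMs);
      }
    }
  }

  public static void main(String[] args) throws InterruptedException {
    // Simulated check that fails twice before the "config" is visible.
    AtomicInteger calls = new AtomicInteger();
    int attempts = retryUntilPasses(() -> {
      if (calls.incrementAndGet() < 3) {
        throw new AssertionError("config not updated yet");
      }
    }, 10, 10L);
    System.out.println("passed after " + attempts + " attempts");
  }
}
```

Only AssertionError is caught, so any other exception thrown by the check still fails immediately, as in the test itself.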
