On 27/10/2021 15.22, Tom Rini wrote:
> On Wed, Oct 27, 2021 at 12:43:38PM +0800, Bin Meng wrote:
>> Hi Simon,
>>
>> gitlab reported the following test error below:
>>
>> =================================== FAILURES
>> ===================================
>> __________________________ test_ut[ut_dm_rtc_set_get]
>> __________________________
>> test/py/tests/test_ut.py:43: in test_ut
>> assert output.endswith('Failures: 0')
>> E AssertionError: assert False
>> E + where False = <built-in method endswith of str object at
>> 0x7f3bb792dcb0>('Failures: 0')
>> E + where <built-in method endswith of str object at 0x7f3bb792dcb0> =
>> 'Test: dm_test_rtc_set_get: rtc.c\r\r\nexpected: 27/10/2021
>> 03:38:15\r\r\nactual: 27/10/2021 03:38:14\r\r\ntest/dm/rtc...w, &cmp,
>> 1): Expected 0x0 (0), got 0xffffffea (-22)\r\r\nTest:
>> dm_test_rtc_set_get: rtc.c (flat tree)\r\r\nFailures: 1'.endswith
>> ----------------------------- Captured stdout call
>> -----------------------------
>> =>
>>
>> See https://source.denx.de/u-boot/custodians/u-boot-x86/-/jobs/341905
>>
>> But the same branch same commit, azure test results passed:
>> https://dev.azure.com/bmeng/GitHub/_build/results?buildId=460&view=results
>>
>> It looks like the error is an off-by-one where actual time is 1 second
>> behind the expected time?
>>
>> expected: 27/10/2021 03:38:15
>> actual: 27/10/2021 03:38:14
>>
>> Is this a known issue?
>
> Yes, which is why the test checks for a certain amount of "fuzz" around
> the return value.

You said the same thing about dm_test_rtc_reset() in
https://lore.kernel.org/u-boot/20210831124441.GC858@bill-the-cat/ , but
I can't find anything about any fuzz in the code. Could you point out
where you think that's implemented? In both cases, the expected and
actual values were just 1 from each other, and I fail to see how any
fuzz value could be smaller than that.

> I've wondered about if we need to increase that value
> slightly sometimes, or just live with hitting the re-run failed jobs
> button on whatever CI system was a bit too slow sometimes.

It has nothing to do with a CI being slow, it's plain and simple buggy
test code AFAICT. It's essentially "assert(time(NULL) == time(NULL))".
If a call to time() takes 1us, do this a million times and it will on
average fail once. Obviously, a loaded system increases the chance of
being preempted between the two calls and hence effectively increases
the delta and proportionally the probability of hitting this.

Rasmus
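
For illustration, a minimal Python sketch of the race described above.
This is not U-Boot code: read_seconds(), mismatch_count() and
within_fuzz() are hypothetical names, and the clock is modelled as a
plain microsecond counter truncated to whole seconds. It counts, over
every microsecond offset within one second, how often two reads taken a
given gap apart land in different seconds, and shows what a
tolerance-based comparison could look like, assuming a fuzz of 1 second
is acceptable.

```python
US_PER_SECOND = 1_000_000


def read_seconds(now_us):
    """Model of a seconds-resolution clock read at time now_us (microseconds)."""
    return now_us // US_PER_SECOND


def mismatch_count(call_gap_us):
    """Count start offsets within one second where two reads, taken
    call_gap_us apart, return different second values."""
    return sum(
        1
        for start_us in range(US_PER_SECOND)
        if read_seconds(start_us) != read_seconds(start_us + call_gap_us)
    )


def within_fuzz(expected, actual, fuzz=1):
    """Hypothetical tolerant comparison: accept values at most fuzz seconds apart."""
    return abs(expected - actual) <= fuzz


if __name__ == "__main__":
    # Gap of 1us between the two reads vs. a 1000us gap (e.g. after preemption).
    print(mismatch_count(1), "of", US_PER_SECOND)
    print(mismatch_count(1000), "of", US_PER_SECOND)
    # The reported failure (…:15 expected, …:14 actual) passes with fuzz=1.
    print(within_fuzz(15, 14))
```

With a 1us gap exactly one offset in a million straddles a second
boundary; with a 1000us gap it is 1000 in a million, so the failure
probability grows in proportion to the delay between the two reads.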