Re: an off-by-one error in dm_test_rtc_set_get()?

2021-11-24 Thread Simon Glass
Hi,

I filed

https://source.denx.de/u-boot/u-boot/-/issues/4

Regards,
Simon


On Thu, 28 Oct 2021 at 05:58, Tom Rini  wrote:
>
> On Thu, Oct 28, 2021 at 09:22:30AM +0200, Rasmus Villemoes wrote:
> > On 27/10/2021 15.22, Tom Rini wrote:
> > > On Wed, Oct 27, 2021 at 12:43:38PM +0800, Bin Meng wrote:
> > >> Hi Simon,
> > >>
> > >> gitlab reported the following test error below:
> > >>
> > >> === FAILURES 
> > >> ===
> > >> __ test_ut[ut_dm_rtc_set_get] 
> > >> __
> > >> test/py/tests/test_ut.py:43: in test_ut
> > >> assert output.endswith('Failures: 0')
> > >> E AssertionError: assert False
> > >> E + where False =  > >> 0x7f3bb792dcb0>('Failures: 0')
> > >> E + where  =
> > >> 'Test: dm_test_rtc_set_get: rtc.c\r\r\nexpected: 27/10/2021
> > >> 03:38:15\r\r\nactual: 27/10/2021 03:38:14\r\r\ntest/dm/rtc...w, &cmp,
> > >> 1): Expected 0x0 (0), got 0xffea (-22)\r\r\nTest:
> > >> dm_test_rtc_set_get: rtc.c (flat tree)\r\r\nFailures: 1'.endswith
> > >> - Captured stdout call 
> > >> -
> > >> =>
> > >>
> > >> See https://source.denx.de/u-boot/custodians/u-boot-x86/-/jobs/341905
> > >>
> > >> But the same branch same commit, azure test results passed:
> > >> https://dev.azure.com/bmeng/GitHub/_build/results?buildId=460&view=results
> > >>
> > >> It looks like the error is an off-by-one where actual time is 1 second
> > >> behind the expected time?
> > >>
> > >> expected: 27/10/2021 03:38:15
> > >> actual: 27/10/2021 03:38:14
> > >>
> > >> Is this a known issue?
> > >
> > > Yes, which is why the test checks for a certain amount of "fuzz" around
> > > the return value.
> >
> > You said the same thing about dm_test_rtc_reset() in
> > https://lore.kernel.org/u-boot/20210831124441.GC858@bill-the-cat/ , but
> > I can't find anything about any fuzz in the code. Could you point out
> > where you think that's implemented? In both cases, the expected and
> > actual values were just 1 from each other, and I fail to see how any
> > fuzz value could be smaller than that.
> >
> >  I've wondered about if we need to increase that value
> > > slightly sometimes, or just live with hitting the re-run failed jobs
> > > button on whatever CI system was a bit too slow sometimes.
> >
> > It has nothing to do with a CI being slow, it's plain and simple buggy
> > test code AFAICT. It's essentially "assert(time(NULL) == time(NULL))".
> > If a call to time() takes 1us, do this a million times and it will on
> > average fail once. Obviously, a loaded system increases the chance of
> > being preempted between the two calls and hence effectively increases
> > the delta and proportionally the probability of hitting this.
>
> You're right, I confused this with the sleep test, which does have a bit
> of fuzz to it.
>
> --
> Tom


Re: an off-by-one error in dm_test_rtc_set_get()?

2021-10-28 Thread Tom Rini
On Thu, Oct 28, 2021 at 09:22:30AM +0200, Rasmus Villemoes wrote:
> On 27/10/2021 15.22, Tom Rini wrote:
> > On Wed, Oct 27, 2021 at 12:43:38PM +0800, Bin Meng wrote:
> >> Hi Simon,
> >>
> >> gitlab reported the following test error below:
> >>
> >> === FAILURES 
> >> ===
> >> __ test_ut[ut_dm_rtc_set_get] 
> >> __
> >> test/py/tests/test_ut.py:43: in test_ut
> >> assert output.endswith('Failures: 0')
> >> E AssertionError: assert False
> >> E + where False =  >> 0x7f3bb792dcb0>('Failures: 0')
> >> E + where  =
> >> 'Test: dm_test_rtc_set_get: rtc.c\r\r\nexpected: 27/10/2021
> >> 03:38:15\r\r\nactual: 27/10/2021 03:38:14\r\r\ntest/dm/rtc...w, &cmp,
> >> 1): Expected 0x0 (0), got 0xffea (-22)\r\r\nTest:
> >> dm_test_rtc_set_get: rtc.c (flat tree)\r\r\nFailures: 1'.endswith
> >> - Captured stdout call 
> >> -
> >> =>
> >>
> >> See https://source.denx.de/u-boot/custodians/u-boot-x86/-/jobs/341905
> >>
> >> But the same branch same commit, azure test results passed:
> >> https://dev.azure.com/bmeng/GitHub/_build/results?buildId=460&view=results
> >>
> >> It looks like the error is an off-by-one where actual time is 1 second
> >> behind the expected time?
> >>
> >> expected: 27/10/2021 03:38:15
> >> actual: 27/10/2021 03:38:14
> >>
> >> Is this a known issue?
> > 
> > Yes, which is why the test checks for a certain amount of "fuzz" around
> > the return value. 
> 
> You said the same thing about dm_test_rtc_reset() in
> https://lore.kernel.org/u-boot/20210831124441.GC858@bill-the-cat/ , but
> I can't find anything about any fuzz in the code. Could you point out
> where you think that's implemented? In both cases, the expected and
> actual values were just 1 from each other, and I fail to see how any
> fuzz value could be smaller than that.
> 
>  I've wondered about if we need to increase that value
> > slightly sometimes, or just live with hitting the re-run failed jobs
> > button on whatever CI system was a bit too slow sometimes.
> 
> It has nothing to do with a CI being slow, it's plain and simple buggy
> test code AFAICT. It's essentially "assert(time(NULL) == time(NULL))".
> If a call to time() takes 1us, do this a million times and it will on
> average fail once. Obviously, a loaded system increases the chance of
> being preempted between the two calls and hence effectively increases
> the delta and proportionally the probability of hitting this.

You're right, I confused this with the sleep test, which does have a bit
of fuzz to it.

-- 
Tom


signature.asc
Description: PGP signature


Re: an off-by-one error in dm_test_rtc_set_get()?

2021-10-28 Thread Rasmus Villemoes
On 27/10/2021 15.22, Tom Rini wrote:
> On Wed, Oct 27, 2021 at 12:43:38PM +0800, Bin Meng wrote:
>> Hi Simon,
>>
>> gitlab reported the following test error below:
>>
>> === FAILURES 
>> ===
>> __ test_ut[ut_dm_rtc_set_get] 
>> __
>> test/py/tests/test_ut.py:43: in test_ut
>> assert output.endswith('Failures: 0')
>> E AssertionError: assert False
>> E + where False = > 0x7f3bb792dcb0>('Failures: 0')
>> E + where  =
>> 'Test: dm_test_rtc_set_get: rtc.c\r\r\nexpected: 27/10/2021
>> 03:38:15\r\r\nactual: 27/10/2021 03:38:14\r\r\ntest/dm/rtc...w, &cmp,
>> 1): Expected 0x0 (0), got 0xffea (-22)\r\r\nTest:
>> dm_test_rtc_set_get: rtc.c (flat tree)\r\r\nFailures: 1'.endswith
>> - Captured stdout call 
>> -
>> =>
>>
>> See https://source.denx.de/u-boot/custodians/u-boot-x86/-/jobs/341905
>>
>> But the same branch same commit, azure test results passed:
>> https://dev.azure.com/bmeng/GitHub/_build/results?buildId=460&view=results
>>
>> It looks like the error is an off-by-one where actual time is 1 second
>> behind the expected time?
>>
>> expected: 27/10/2021 03:38:15
>> actual: 27/10/2021 03:38:14
>>
>> Is this a known issue?
> 
> Yes, which is why the test checks for a certain amount of "fuzz" around
> the return value. 

You said the same thing about dm_test_rtc_reset() in
https://lore.kernel.org/u-boot/20210831124441.GC858@bill-the-cat/ , but
I can't find anything about any fuzz in the code. Could you point out
where you think that's implemented? In both cases, the expected and
actual values were just 1 from each other, and I fail to see how any
fuzz value could be smaller than that.

 I've wondered about if we need to increase that value
> slightly sometimes, or just live with hitting the re-run failed jobs
> button on whatever CI system was a bit too slow sometimes.

It has nothing to do with a CI being slow, it's plain and simple buggy
test code AFAICT. It's essentially "assert(time(NULL) == time(NULL))".
If a call to time() takes 1us, do this a million times and it will on
average fail once. Obviously, a loaded system increases the chance of
being preempted between the two calls and hence effectively increases
the delta and proportionally the probability of hitting this.

Rasmus


Re: an off-by-one error in dm_test_rtc_set_get()?

2021-10-27 Thread Bin Meng
On Wed, Oct 27, 2021 at 9:22 PM Tom Rini  wrote:
>
> On Wed, Oct 27, 2021 at 12:43:38PM +0800, Bin Meng wrote:
> > Hi Simon,
> >
> > gitlab reported the following test error below:
> >
> > === FAILURES 
> > ===
> > __ test_ut[ut_dm_rtc_set_get] 
> > __
> > test/py/tests/test_ut.py:43: in test_ut
> > assert output.endswith('Failures: 0')
> > E AssertionError: assert False
> > E + where False =  > 0x7f3bb792dcb0>('Failures: 0')
> > E + where  =
> > 'Test: dm_test_rtc_set_get: rtc.c\r\r\nexpected: 27/10/2021
> > 03:38:15\r\r\nactual: 27/10/2021 03:38:14\r\r\ntest/dm/rtc...w, &cmp,
> > 1): Expected 0x0 (0), got 0xffea (-22)\r\r\nTest:
> > dm_test_rtc_set_get: rtc.c (flat tree)\r\r\nFailures: 1'.endswith
> > - Captured stdout call 
> > -
> > =>
> >
> > See https://source.denx.de/u-boot/custodians/u-boot-x86/-/jobs/341905
> >
> > But the same branch same commit, azure test results passed:
> > https://dev.azure.com/bmeng/GitHub/_build/results?buildId=460&view=results
> >
> > It looks like the error is an off-by-one where actual time is 1 second
> > behind the expected time?
> >
> > expected: 27/10/2021 03:38:15
> > actual: 27/10/2021 03:38:14
> >
> > Is this a known issue?
>
> Yes, which is why the test checks for a certain amount of "fuzz" around
> the return value.  I've wondered about if we need to increase that value
> slightly sometimes, or just live with hitting the re-run failed jobs
> button on whatever CI system was a bit too slow sometimes.
>

Thanks Tom. So it is a known issue. It's just my first time seeing
this failure in CI :)

Regards,
Bin


Re: an off-by-one error in dm_test_rtc_set_get()?

2021-10-27 Thread Tom Rini
On Wed, Oct 27, 2021 at 12:43:38PM +0800, Bin Meng wrote:
> Hi Simon,
> 
> gitlab reported the following test error below:
> 
> === FAILURES 
> ===
> __ test_ut[ut_dm_rtc_set_get] 
> __
> test/py/tests/test_ut.py:43: in test_ut
> assert output.endswith('Failures: 0')
> E AssertionError: assert False
> E + where False =  0x7f3bb792dcb0>('Failures: 0')
> E + where  =
> 'Test: dm_test_rtc_set_get: rtc.c\r\r\nexpected: 27/10/2021
> 03:38:15\r\r\nactual: 27/10/2021 03:38:14\r\r\ntest/dm/rtc...w, &cmp,
> 1): Expected 0x0 (0), got 0xffea (-22)\r\r\nTest:
> dm_test_rtc_set_get: rtc.c (flat tree)\r\r\nFailures: 1'.endswith
> - Captured stdout call 
> -
> =>
> 
> See https://source.denx.de/u-boot/custodians/u-boot-x86/-/jobs/341905
> 
> But the same branch same commit, azure test results passed:
> https://dev.azure.com/bmeng/GitHub/_build/results?buildId=460&view=results
> 
> It looks like the error is an off-by-one where actual time is 1 second
> behind the expected time?
> 
> expected: 27/10/2021 03:38:15
> actual: 27/10/2021 03:38:14
> 
> Is this a known issue?

Yes, which is why the test checks for a certain amount of "fuzz" around
the return value.  I've wondered about if we need to increase that value
slightly sometimes, or just live with hitting the re-run failed jobs
button on whatever CI system was a bit too slow sometimes.

-- 
Tom


signature.asc
Description: PGP signature


an off-by-one error in dm_test_rtc_set_get()?

2021-10-26 Thread Bin Meng
Hi Simon,

gitlab reported the following test error below:

=== FAILURES ===
__ test_ut[ut_dm_rtc_set_get] __
test/py/tests/test_ut.py:43: in test_ut
assert output.endswith('Failures: 0')
E AssertionError: assert False
E + where False = ('Failures: 0')
E + where  =
'Test: dm_test_rtc_set_get: rtc.c\r\r\nexpected: 27/10/2021
03:38:15\r\r\nactual: 27/10/2021 03:38:14\r\r\ntest/dm/rtc...w, &cmp,
1): Expected 0x0 (0), got 0xffea (-22)\r\r\nTest:
dm_test_rtc_set_get: rtc.c (flat tree)\r\r\nFailures: 1'.endswith
- Captured stdout call -
=>

See https://source.denx.de/u-boot/custodians/u-boot-x86/-/jobs/341905

But the same branch same commit, azure test results passed:
https://dev.azure.com/bmeng/GitHub/_build/results?buildId=460&view=results

It looks like the error is an off-by-one where actual time is 1 second
behind the expected time?

expected: 27/10/2021 03:38:15
actual: 27/10/2021 03:38:14

Is this a known issue?

Regards,
Bin