On 29.02.16 23:39, Bill Fischofer wrote:


On Mon, Feb 29, 2016 at 11:49 AM, Ivan Khoronzhuk <[email protected]> wrote:

    For systems sensitive to time deviation, it would be better
    to have some time margin while testing scheduler time, so increase
    it to 3 jiffies.

    https://bugs.linaro.org/show_bug.cgi?id=2076

    Signed-off-by: Ivan Khoronzhuk <[email protected]>
    ---
      test/validation/scheduler/scheduler.c | 2 +-
      1 file changed, 1 insertion(+), 1 deletion(-)

    diff --git a/test/validation/scheduler/scheduler.c b/test/validation/scheduler/scheduler.c
    index dcf01c0..cb04209 100644
    --- a/test/validation/scheduler/scheduler.c
    +++ b/test/validation/scheduler/scheduler.c
    @@ -48,7 +48,7 @@
      #define CHAOS_NDX_TO_PTR(n) ((void *)(uintptr_t)n)
      #define CHAOS_WAIT_FAIL     (5 * ODP_TIME_SEC_IN_NS)

    -#define ODP_WAIT_TOLERANCE     (20 * ODP_TIME_MSEC_IN_NS)
    +#define ODP_WAIT_TOLERANCE     (60 * ODP_TIME_MSEC_IN_NS)


Do we know that this is a fix or is it just a guess at a "better" number?
It's based on the jiffy length. It should definitely pass on "normal" platforms 
without interference from the Linux scheduler.
Since we cannot predict the load on the system, it is better to have some margin; 
practice shows that 1 jiffy is not enough in some cases.


The original code used unconditional waits.
If the concern is simply to avoid the possibility of indefinite stalls then why 
try to cut things so close?
Nope. The intention here is not simply to catch indefinite stalls; the intention is 
to check that the scheduler's sense of time works within normal ranges.
It can differ greatly in the case of incorrect initialization or calculation, and 
this test is meant to catch that.
In fact, the test caught this bug. It is not a bug in the test; it is a bug in the 
linux-generic implementation, where the scheduler timeout is corrupted
by the Linux kernel scheduler, and it is nice to see that captured here. In the same 
way it is going to catch issues on "real" boards, where such a huge
deviation can only come from incorrect timings.

 We could agree that a wait of one minute is sufficient to say that something 
definitely isn't right, but do we care what sort of jitter we may see on a 
run-by-run basis here?
One minute is a huge amount of time. Here I only increased the tolerance from 20 ms 
per 5 seconds to 60 ms per 5 seconds. Isn't that a small error? It's about 1.2%.
But currently it is not based on a percentage; it is based on the slices into which 
the Linux kernel scheduler splits time. If you prefer, it could be bound to a 10% 
error, for instance.
I cannot test and predict this value; no one can predict which delays may occur on 
non-real-time systems.
If a platform cannot pass this test, it definitely should improve its timing. For 
instance, if an app decides to wait no more than 100 ms but the scheduler waits 
150 ms, is that normal?
Maybe it's normal for linux-generic, but 50 ms of wasted time is a big amount of 
time for normal cases.


A one-minute timeout would mean that tests would always get a result. 
Implementations that observe waits of that magnitude would clearly be in need 
of investigation, while others would still pass this functional validation. 
Other tests generate performance numbers, and if scheduling waits are 
unacceptably large, they'd be better covered in that context.


      /* Test global variables */
      typedef struct {
    --
    1.9.1

    _______________________________________________
    lng-odp mailing list
    [email protected]
    https://lists.linaro.org/mailman/listinfo/lng-odp



--
Regards,
Ivan Khoronzhuk