Hi Michal,

On Thu, Jul 03, 2025 at 05:58:48PM +0200, Michal Koutný wrote:
> On Thu, Jul 03, 2025 at 09:03:20PM +0900, Shashank Balaji 
> <[email protected]> wrote:
> > Current cpu.max tests (both the normal one and the nested one) are 
> > inaccurate.
> > 
> > They setup cpu.max with 1000 us quota and the default period (100,000 us).
> > A cpu hog is run for a duration of 1s as per wall clock time. This 
> > corresponds
> > to 10 periods, hence an expected usage of 10,000 us. We want the measured
> > usage (as per cpu.stat) to be close to 10,000 us.
> > 
> > Previously, this approximate equality test was done by
> > `!values_close(usage_usec, duration_usec, 95)`: if the absolute
> > difference between usage_usec and duration_usec is greater than 95% of
> > their sum, then we pass. This is problematic for two reasons:
> > 
> > 1. Semantics: When one sees `values_close` they expect the error
> >    percentage to be some small number, not 95. The intent behind using
> > `values_close` is lost by using a high error percent such as 95. The
> > intent it's actually going for is "values far".
> > 
> > 2. Bound too wide: The condition translates to the following expression:
> > 
> >     |usage_usec - duration_usec| > (usage_usec + duration_usec)*0.95
> > 
> >     0.05*duration_usec > 1.95*usage_usec (usage < duration)
> > 
> >     usage_usec < 0.0257*duration_usec = 25,641 us
> > 
> >    So, this condition passes as long as usage_usec is lower than 25,641
> > us, while all we want is for it to be close to 10,000 us.
> > 
> > Fix this by explicitly calcuating the expected usage duration based on the
> > configured quota, default period, and the duration, and compare usage_usec
> > and expected_usage_usec using values_close() with a 10% error margin.
> > 
> > Also, use snprintf to get the quota string to write to cpu.max instead of
> > hardcoding the quota, ensuring a single source of truth.
> > 
> > Signed-off-by: Shashank Balaji <[email protected]>
> > 
> > ---
> > 
> > Changes in v2:
> > - Incorporate Michal's suggestions:
> >     - Merge two patches into one
> >     - Generate the quota string from the variable instead of hardcoding it
> >     - Use values_close() instead of labs()
> >     - Explicitly calculate expected_usage_usec
> > - v1: 
> > https://lore.kernel.org/all/[email protected]/
> > ---
> >  tools/testing/selftests/cgroup/test_cpu.c | 63 ++++++++++++++++-------
> >  1 file changed, 43 insertions(+), 20 deletions(-)
> 
> 
> > -   user_usec = cg_read_key_long(cpucg, "cpu.stat", "user_usec");
> > -   if (user_usec <= 0)
> > +   if (usage_usec <= 0)
> >             goto cleanup;
> >  
> > -   if (user_usec >= expected_usage_usec)
> > -           goto cleanup;
> 
> I think this was a meaningful check. Not sure if dropped accidentally or
> on purpose w/out explanation.
> 
> After that's addressed, feel free to add
> Acked-by: Michal Koutný <[email protected]>

Sorry about that. I dropped it accidentally. This check should be okay,
right?

        if (usage_usec > expected_usage_usec)
                goto cleanup;

1. We don't need to separately check user_usec because it'll always be
less than user_usec, and usage_usec is what's directly affected by
throttling.
2. I changed the >= to > because, not that it'll ever happen, but we can
let usage_usec = expected_usage_usec pass. Afterall, it's called
"expected" for a reason.

Thanks

Shashank

Reply via email to