gem_eio: Adjust for long reset-resume cycle on Gen12+

Janusz Krzysztofik Wed, 01 Oct 2025 08:15:22 -0700

Hi Kamil,

Thanks for looking at this.


On Wednesday, 1 October 2025 15:36:00 CEST Kamil Konieczny wrote:
> Hi Janusz,
> On 2025-09-30 at 14:49:01 +0200, Janusz Krzysztofik wrote:
> 
> could you improve subject? Now it is a little confusing why
> there are two, very similar changes needed, see first and
> second subjects:
> 
> [i-g-t,1/3] tests/gem_eio: Adjust for long reset-resume cycle on Gen12+
> 
> [i-g-t,2/3] tests/gem_eio: Adjust for slow resume after reset on Gen12+
> 
> At first I do not know why two very similar changes are made with
> two different commits.

In my opinion these two patches address two separate issues.  The first issue 
-- not enough measurements for median calculation -- prevents the exercise 
from calculating and reporting results.  The second issue in turn sometimes 
applies not quite realistic expectations to those results.  In order to learn 
which platforms suffer from the second issue, you have to resolve the first 
one to see those results.  To avoid confusion, I can try to reword those 
commit messages so they don't look so similar.

> 
> See also below.
> 
> > Subtests that measure time of resume after engine reset require results
> > from at least 9 reset-resume cycles for reasonable calculation of a median
> > value to be compared against presumed limits.  On most of Gen12+
> > platforms, the limit of 5 seconds for collecting those results occurs too
> > short for executing 9 reset-resum cycles.
> > 
> > Raise the limit to 20 seconds, and break the loop as soon as 9 results are
> > collected.  Also, warn if less than 9 resets have been completed within
> > the limit instead of silently succeeding despite the check being skipped.
> > 
> > Signed-off-by: Janusz Krzysztofik <[email protected]>
> > ---
> >  tests/intel/gem_eio.c | 11 ++++++++---
> >  1 file changed, 8 insertions(+), 3 deletions(-)
> > 
> > diff --git a/tests/intel/gem_eio.c b/tests/intel/gem_eio.c
> > index b65b914faf..b6155c7fc4 100644
> > --- a/tests/intel/gem_eio.c
> > +++ b/tests/intel/gem_eio.c
> > @@ -409,8 +409,10 @@ static void check_wait_elapsed(const char *prefix, int 
> > fd, igt_stats_t *st)
> >              igt_stats_get_median(st)*1e-6,
> >              igt_stats_get_max(st)*1e-6);
> >  
> > -   if (st->n_values < 9)
> > -           return; /* too few for stable median */
> > +   if (igt_warn_on_f(st->n_values < 9,
> > +       "%d resets completed -- less than 9, too few for stable median\n",
> > +       st->n_values))
> > +           return;
> 
> imho this could be a separate change, as there was silence before.

OK, unless I kill that warning, taking into account the comment from Krzysztof 
Karaś.

> 
> >  
> >     /*
> >      * Older platforms need to reset the display (incl. modeset to off,
> > @@ -928,7 +930,7 @@ static void reset_stress(int fd, uint64_t ahnd, const 
> > intel_ctx_t *ctx0,
> >     gem_write(fd, obj.handle, 0, &bbe, sizeof(bbe));
> >  
> >     igt_stats_init(&stats);
> > -   igt_until_timeout(5) {
> > +   igt_until_timeout(20) {
> 
> Could you increase it only when needed?

Taking into account the other part of this change that breaks this loop as 
soon as enough measurements are collected, I don't think such complication is 
worth of effort, can you please explain why you think it is?

Thanks,
Janusz

> 
> Regards,
> Kamil
> 
> >             const intel_ctx_t *ctx = context_create_safe(fd);
> >             igt_spin_t *hang;
> >             unsigned int i;
> > @@ -977,6 +979,9 @@ static void reset_stress(int fd, uint64_t ahnd, const 
> > intel_ctx_t *ctx0,
> >             gem_sync(fd, obj.handle);
> >             igt_spin_free(fd, hang);
> >             intel_ctx_destroy(fd, ctx);
> > +
> > +           if (stats.n_values > 8)
> > +                   break;
> >     }
> >     check_wait_elapsed(name, fd, &stats);
> >     igt_stats_fini(&stats);
>

Re: [PATCH i-g-t 1/3] tests/gem_eio: Adjust for long reset-resume cycle on Gen12+

Reply via email to