Re: [PATCH i-g-t] tests/dmabuf: Avoid soft lockups on slow platforms

Janusz Krzysztofik Wed, 05 Nov 2025 02:15:24 -0800

Hi Krzysztof,

Thanks for taking a look.


On Wednesday, 5 November 2025 09:37:14 CET Krzysztof Karas wrote:
> Hi Janusz,
> 
> On 2025-11-04 at 12:20:24 +0100, Janusz Krzysztofik wrote:
> > Certain selftests, while basically correct, may fail on certain platforms.
> > E.g., igt@dmabuf@all-test@dma_fence_chain used to complete successfully,
> > but on slow machines it triggers soft lockup warnings which taint the
> > kernel.
> > 
> > Sometimes, like in the above mentioned case, it's not possible to fix a
> > root cause of the issue since it is not recognized as a bug.  To avoid
> > ever returning CI bug reports in such cases, allow selftests to be called
> > via user provided wrappers that take care of not triggering unavoidable
> > failures, e.g. by skipping specific selftests if some conditions are not
> > met, or watching their execution and acting upon certain conditions or
> > events.
> > 
> > With that in place, update the dmabuf test so it, as the first user of the
> > new feature, skips the dma_fence_chain selftest if a machine looks too
> > slow.  Since that's a hardware agnostic selftest, running it on a limited
> > subset of machines seems acceptable, especially when the soft lockups it
> > can trigger aren't recognized as bugs on the kernel side.
> > 
> > Signed-off-by: Janusz Krzysztofik <[email protected]>
> > ---
> >  
> 
> [...]
> 
> > +static unsigned int bogomips(void)
> > +{
> > +   unsigned int bogomips, ret = 0;
> > +   char *line = NULL;
> > +   size_t size = 0;
> > +   FILE *cpuinfo;
> > +
> > +   cpuinfo = fopen("/proc/cpuinfo", "r");
> > +   if (igt_debug_on(!cpuinfo))
> > +           return UINT_MAX;
> Could this result in running the test on a slower machine
> unintentionally? fopen() could fail, the bogomips() would return
> a large value and wrapper function would behave as if the test
> is running on a faster platform.

Yes, if opening /proc/cpuinfo fails then my approach could result in running 
the test on a machine that occurs too slow.  If that happens then an abort 
caused by soft lockup will be reported again, but this time with the fopen() 
failure also pointed out among debug messages.  But why could that failing 
fopen() happen in practice?

One possible case is an environment with no /proc/cpuinfo (non-Linux, e.g., 
FreeBSD).

> 
> > +
> > +   while (getline(&line, &size, cpuinfo) != -1) {
> It is unlikely, but getline() may fail with EINVAL or ENOMEM, so
> maybe while(getline(&line, &size, cpuinfo) > 0)?

No, getline returns -1 on any error, errno provides those extra error codes, 
if that's what you had on mind.

> 
> > +           char *colon;
> > +
> > +           if (strncmp(line, "bogomips", 8))
> > +                   continue;
> > +
> > +           colon = strchr(line, ':');
> > +           if (igt_debug_on(!colon))
> > +                   bogomips = 0;
> > +           else
> > +                   bogomips = atoi(colon + 1);
> > +
> > +           if (igt_debug_on(!bogomips))
> > +                   break;
> > +
> > +           ret += bogomips;
> > +   }
> > +   free(line);
> > +   fclose(cpuinfo);
> > +
> > +   return igt_debug_on(!bogomips) ? UINT_MAX : ret;
> Same as my first comment in this function: will this not cause
> sporadic runs on machines that are supposed to skip the test?

Yes, but if you think that's wrong then please first think of e.g. fopen() 
failing on a fast machine for some reason, and then suggest a behaviour that 
you think would be more correct in any of possible cases.  My thinking is that 
it's more important to execute the selftest and preserve coverage than fail on 
potential test issues that don't mean the selftest can't be executed.

> 
> > +}
> > +
> > +static int wrapper(const char *dynamic_name,
> The name could be more descriptive about what function does/is
> used for, so maybe "check_skip_wrapper" would better achieve
> that?

If that was a global function then sure, its name should be more descriptive, 
but here in this context I don't think that name may introduce any doubts.  
But anyway, I can try to propose a more descriptive name if v2 is needed.

Thanks,
Janusz

> 
> [...]
> 
>

Re: [PATCH i-g-t] tests/dmabuf: Avoid soft lockups on slow platforms

Reply via email to