Hi Christian,

On Tuesday, 16 September 2025 12:58:22 CEST Christian König wrote:
> 
> On 16.09.25 11:55, Janusz Krzysztofik wrote:
> > CI reports kernel soft lockups when running a wait_backward test case of
> > igt@dmabuf@all-tests@dma_fence_chain selftest on less powerful machines.
> > A kernel fix has been developed that has proven to resolve the issue, but
> > it hasn't been accepted upstream, with a recommendation for dropping that
> > test case as a "nonsense".
> > 
> > Before we decide to take that path, try to implement the problematic test
> > case in user space as an IGT subtest.  Since no kernel uAPIs have been
> > found that allow strict reimplementation of exact algorithm of the
> > problematic test case, where every link of a dma-fence chain is signaled
> > one by one from a loop running in kernel space, provide two approximate
> > variants, one that signals each fence with an individual system call, and
> > one that signals them all in one shot with one system call.
> 
> Those tests are unrealistic outside of the syncobj framework.
> 
> E.g. a test which exercises signaling each fence individually would require HW which would do that to happen in reality.

I've been trying to understand what you meant but I've failed.  If a user 
submits a number of DRM exec requests, each with an out fence, then HW will 
signal each of those fences when its corresponding request completes, won't 
it?

But anyway, some of those subtests, e.g. stress-enable-all-signal-all-forward 
or stress-enable-all-signal-all-backward, can trigger hard lockups.  Shouldn't 
we care?
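
For reference, a minimal sketch of how those subtest names map onto flag combinations in the patch quoted below. It assumes DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL is (1 << 0), as in the DRM uAPI header; the helper names stress_subtest_name(), stress_count_subtests() and stress_self_check() are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumption: DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL == (1 << 0), per the DRM uAPI header. */
#define STRESS_FLAGS_WAIT_ALL        (1u << 0)
#define STRESS_FLAGS_ENABLE_ALL      (STRESS_FLAGS_WAIT_ALL << 1)
#define STRESS_FLAGS_SIGNAL_ALL      (STRESS_FLAGS_ENABLE_ALL << 1)
#define STRESS_FLAGS_SIGNAL_BACKWARD (STRESS_FLAGS_SIGNAL_ALL << 1)
#define STRESS_FLAGS_SIGNAL_RANDOM   (STRESS_FLAGS_SIGNAL_BACKWARD << 1)

/* Hypothetical helper: builds a subtest name with the patch's format string. */
int stress_subtest_name(unsigned int flags, char *buf, size_t len)
{
	return snprintf(buf, len, "stress-%s-%s-signal%s-%s",
			(flags & STRESS_FLAGS_ENABLE_ALL) ? "enable" : "wait",
			(flags & (STRESS_FLAGS_WAIT_ALL | STRESS_FLAGS_ENABLE_ALL)) ? "all" : "last",
			(flags & STRESS_FLAGS_SIGNAL_ALL) ? "-all" : "",
			(flags & STRESS_FLAGS_SIGNAL_RANDOM) ? "random" :
			(flags & STRESS_FLAGS_SIGNAL_BACKWARD) ? "backward" : "forward");
}

/* Hypothetical helper: counts the combinations the patch's loop would register. */
unsigned int stress_count_subtests(void)
{
	const unsigned int end = STRESS_FLAGS_WAIT_ALL | STRESS_FLAGS_SIGNAL_ALL |
				 STRESS_FLAGS_ENABLE_ALL | STRESS_FLAGS_SIGNAL_RANDOM;
	unsigned int flags, count = 0;

	for (flags = 0; flags < end; flags++) {
		/* Same skip rule as the patch: enable-all and wait-all are exclusive. */
		if ((flags & STRESS_FLAGS_ENABLE_ALL) && (flags & STRESS_FLAGS_WAIT_ALL))
			continue;
		count++;
	}

	return count;
}

/* Hypothetical self-check for the two subtests named above. */
void stress_self_check(void)
{
	char name[64];

	stress_subtest_name(STRESS_FLAGS_ENABLE_ALL | STRESS_FLAGS_SIGNAL_ALL,
			    name, sizeof(name));
	assert(strcmp(name, "stress-enable-all-signal-all-forward") == 0);

	stress_subtest_name(STRESS_FLAGS_ENABLE_ALL | STRESS_FLAGS_SIGNAL_ALL |
			    STRESS_FLAGS_SIGNAL_BACKWARD, name, sizeof(name));
	assert(strcmp(name, "stress-enable-all-signal-all-backward") == 0);

	/* 23 loop iterations minus the 5 skipped enable+wait combinations. */
	assert(stress_count_subtests() == 18);
}
```

Under those assumptions the loop yields 18 combinations, which matches the 18 SUBTEST entries documented in the patch.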

Thanks,
Janusz

> 
> Regards,
> Christian.
> 
> > 
> > For more comprehensive testing, also implement the _forward and _random
> > scenarios from the original selftest, as well as simplified variants that
> > don't enable signaling on each link of the dma-fence chain, and yet others
> > that not only enable but also wait on every link of the chain.
> > 
> > Signed-off-by: Janusz Krzysztofik <janusz.krzyszto...@linux.intel.com>
> > ---
> >  tests/syncobj_timeline.c | 289 +++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 289 insertions(+)
> > 
> > diff --git a/tests/syncobj_timeline.c b/tests/syncobj_timeline.c
> > index a77896ec1d..80c5970687 100644
> > --- a/tests/syncobj_timeline.c
> > +++ b/tests/syncobj_timeline.c
> > @@ -427,6 +427,61 @@
> >   *
> >   * SUBTEST: wait-zero-handles
> >   * Description: Verifies that waiting on an empty list of syncobj handles is accepted
> > + *
> > + * SUBTEST: stress-wait-last-signal-forward
> > + * Description: Signals each fence of a large timeline while another thread is waiting on that timeline
> > + *
> > + * SUBTEST: stress-wait-last-signal-backward
> > + * Description: Signals each fence of a large timeline in reverse order while another thread is waiting on that timeline
> > + *
> > + * SUBTEST: stress-wait-last-signal-random
> > + * Description: Signals each fence of a large timeline in random order while another thread is waiting on that timeline
> > + *
> > + * SUBTEST: stress-wait-last-signal-all-forward
> > + * Description: Signals all fences of a large timeline while another thread is waiting on that timeline
> > + *
> > + * SUBTEST: stress-wait-last-signal-all-backward
> > + * Description: Signals all fences of a large reverse ordered timeline while another thread is waiting on that timeline
> > + *
> > + * SUBTEST: stress-wait-last-signal-all-random
> > + * Description: Signals all fences of a large randomly ordered timeline while another thread is waiting on that timeline
> > + *
> > + * SUBTEST: stress-enable-all-signal-forward
> > + * Description: Signals each fence of a large timeline with signaling enabled on each point while another thread is waiting on that timeline
> > + *
> > + * SUBTEST: stress-enable-all-signal-backward
> > + * Description: Signals each fence of a large timeline in reversed order with signaling enabled on each point while another thread is waiting on that timeline
> > + *
> > + * SUBTEST: stress-enable-all-signal-random
> > + * Description: Signals each fence of a large timeline in random order with signaling enabled on each point while another thread is waiting on that timeline
> > + *
> > + * SUBTEST: stress-enable-all-signal-all-forward
> > + * Description: Signals all fences of a large timeline with signaling enabled on each point while another thread is waiting on that timeline
> > + *
> > + * SUBTEST: stress-enable-all-signal-all-backward
> > + * Description: Signals all fences of a large reversed ordered timeline with signaling enabled on each point while another thread is waiting on that timeline
> > + *
> > + * SUBTEST: stress-enable-all-signal-all-random
> > + * Description: Signals all fences of a large randomly ordered timeline with signaling enabled on each point while another thread is waiting on that timeline
> > + *
> > + * SUBTEST: stress-wait-all-signal-forward
> > + * Description: Signals each fence of a large timeline while another thread is waiting on each point of that timeline
> > + *
> > + * SUBTEST: stress-wait-all-signal-backward
> > + * Description: Signals each fence of a large timeline in reversed order while another thread is waiting on each point of that timeline
> > + *
> > + * SUBTEST: stress-wait-all-signal-random
> > + * Description: Signals each fence of a large timeline in random order while another thread is waiting on each point of that timeline
> > + *
> > + * SUBTEST: stress-wait-all-signal-all-forward
> > + * Description: Signals all fences of a large timeline while another thread is waiting on each point of that timeline
> > + *
> > + * SUBTEST: stress-wait-all-signal-all-backward
> > + * Description: Signals all fences of a large reversed ordered timeline while another thread is waiting on each point of that timeline
> > + *
> > + * SUBTEST: stress-wait-all-signal-all-random
> > + * Description: Signals all fences of a large randomly ordered timeline while another thread is waiting on each point of that timeline
> > + *
> >   */
> >  
> >  IGT_TEST_DESCRIPTION("Tests for the drm timeline sync object API");
> > @@ -1675,6 +1730,217 @@ test_32bits_limit(int fd)
> >     close(timeline);
> >  }
> >  
> > +#define STRESS_FLAGS_WAIT_ALL              DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL
> > +#define STRESS_FLAGS_ENABLE_ALL            (STRESS_FLAGS_WAIT_ALL << 1)
> > +#define STRESS_FLAGS_SIGNAL_ALL            (STRESS_FLAGS_ENABLE_ALL << 1)
> > +#define STRESS_FLAGS_SIGNAL_BACKWARD       (STRESS_FLAGS_SIGNAL_ALL << 1)
> > +#define STRESS_FLAGS_SIGNAL_RANDOM (STRESS_FLAGS_SIGNAL_BACKWARD << 1)
> > +
> > +const char *stress_descriptions[] = {
> > +   /* stress-wait-last-signal-forward */
> > +   [0] =
> > +           "Signals each fence of a large timeline while another thread is waiting on that timeline",
> > +   /* stress-wait-last-signal-backward */
> > +   [STRESS_FLAGS_SIGNAL_BACKWARD] =
> > +           "Signals each fence of a large timeline in reverse order while another thread is waiting on that timeline",
> > +   /* stress-wait-last-signal-random */
> > +   [STRESS_FLAGS_SIGNAL_RANDOM] =
> > +           "Signals each fence of a large timeline in random order while another thread is waiting on that timeline",
> > +   /* stress-wait-last-signal-all-forward */
> > +   [STRESS_FLAGS_SIGNAL_ALL] =
> > +           "Signals all fences of a large timeline while another thread is waiting on that timeline",
> > +   /* stress-wait-last-signal-all-backward */
> > +   [STRESS_FLAGS_SIGNAL_ALL | STRESS_FLAGS_SIGNAL_BACKWARD] =
> > +           "Signals all fences of a large reverse ordered timeline while another thread is waiting on that timeline",
> > +   /* stress-wait-last-signal-all-random */
> > +   [STRESS_FLAGS_SIGNAL_ALL | STRESS_FLAGS_SIGNAL_RANDOM] =
> > +           "Signals all fences of a large randomly ordered timeline while another thread is waiting on that timeline",
> > +   /* stress-enable-all-signal-forward */
> > +   [STRESS_FLAGS_ENABLE_ALL] =
> > +           "Signals each fence of a large timeline with signaling enabled on each point while another thread is waiting on that timeline",
> > +   /* stress-enable-all-signal-backward */
> > +   [STRESS_FLAGS_ENABLE_ALL | STRESS_FLAGS_SIGNAL_BACKWARD] =
> > +           "Signals each fence of a large timeline in reversed order with signaling enabled on each point while another thread is waiting on that timeline",
> > +   /* stress-enable-all-signal-random */
> > +   [STRESS_FLAGS_ENABLE_ALL | STRESS_FLAGS_SIGNAL_RANDOM] =
> > +           "Signals each fence of a large timeline in random order with signaling enabled on each point while another thread is waiting on that timeline",
> > +   /* stress-enable-all-signal-all-forward */
> > +   [STRESS_FLAGS_ENABLE_ALL | STRESS_FLAGS_SIGNAL_ALL] =
> > +           "Signals all fences of a large timeline with signaling enabled on each point while another thread is waiting on that timeline",
> > +   /* stress-enable-all-signal-all-backward */
> > +   [STRESS_FLAGS_ENABLE_ALL | STRESS_FLAGS_SIGNAL_ALL | STRESS_FLAGS_SIGNAL_BACKWARD] =
> > +           "Signals all fences of a large reversed ordered timeline with signaling enabled on each point while another thread is waiting on that timeline",
> > +   /* stress-enable-all-signal-all-random */
> > +   [STRESS_FLAGS_ENABLE_ALL | STRESS_FLAGS_SIGNAL_ALL | STRESS_FLAGS_SIGNAL_RANDOM] =
> > +           "Signals all fences of a large randomly ordered timeline with signaling enabled on each point while another thread is waiting on that timeline",
> > +   /* stress-wait-all-signal-forward */
> > +   [STRESS_FLAGS_WAIT_ALL] =
> > +           "Signals each fence of a large timeline while another thread is waiting on each point of that timeline",
> > +   /* stress-wait-all-signal-backward */
> > +   [STRESS_FLAGS_WAIT_ALL | STRESS_FLAGS_SIGNAL_BACKWARD] =
> > +           "Signals each fence of a large timeline in reversed order while another thread is waiting on each point of that timeline",
> > +   /* stress-wait-all-signal-random */
> > +   [STRESS_FLAGS_WAIT_ALL | STRESS_FLAGS_SIGNAL_RANDOM] =
> > +           "Signals each fence of a large timeline in random order while another thread is waiting on each point of that timeline",
> > +   /* stress-wait-all-signal-all-forward */
> > +   [STRESS_FLAGS_WAIT_ALL | STRESS_FLAGS_SIGNAL_ALL] =
> > +           "Signals all fences of a large timeline while another thread is waiting on each point of that timeline",
> > +   /* stress-wait-all-signal-all-backward */
> > +   [STRESS_FLAGS_WAIT_ALL | STRESS_FLAGS_SIGNAL_ALL | STRESS_FLAGS_SIGNAL_BACKWARD] =
> > +           "Signals all fences of a large reversed ordered timeline while another thread is waiting on each point of that timeline",
> > +   /* stress-wait-all-signal-all-random */
> > +   [STRESS_FLAGS_WAIT_ALL | STRESS_FLAGS_SIGNAL_ALL | STRESS_FLAGS_SIGNAL_RANDOM] =
> > +           "Signals all fences of a large randomly ordered timeline while another thread is waiting on each point of that timeline",
> > +};
> > +
> > +#define TL_LENGTH 4096
> > +
> > +struct stress_timeline {
> > +   int fd;
> > +   int swsync;
> > +   uint32_t syncobj;
> > +   int tmp_fence;
> > +   uint32_t *syncobjs;
> > +   uint64_t *points;
> > +   unsigned int length;
> > +   unsigned int flags;
> > +   pthread_t thread;
> > +   int retval;
> > +};
> > +
> > +static void stress_init(int fd, struct stress_timeline **timeline, unsigned int flags)
> > +{
> > +   struct stress_timeline *tl;
> > +   uint64_t point;
> > +   int i;
> > +
> > +   tl = calloc(1, sizeof(*tl));
> > +   igt_assert(tl);
> > +   *timeline = tl;
> > +
> > +   tl->fd = fd;
> > +   tl->tmp_fence = -1;
> > +   tl->length = TL_LENGTH;
> > +   tl->flags = flags;
> > +
> > +   tl->swsync = sw_sync_timeline_create();
> > +   tl->syncobj = syncobj_create(fd, 0);
> > +
> > +   tl->syncobjs = calloc(tl->length, sizeof(*tl->syncobjs));
> > +   igt_assert(tl->syncobjs);
> > +
> > +   tl->points = calloc(tl->length, sizeof(*tl->points));
> > +   igt_assert(tl->points);
> > +
> > +   for (i = 0; i < tl->length; i++)
> > +           tl->points[i] = (flags & STRESS_FLAGS_SIGNAL_BACKWARD) ? tl->length - i : i + 1;
> > +   if (flags & STRESS_FLAGS_SIGNAL_RANDOM)
> > +           igt_permute_array(tl->points, tl->length, igt_exchange_int64);
> > +
> > +   for (i = 0; i < tl->length; i++) {
> > +           tl->tmp_fence = sw_sync_timeline_create_fence(tl->swsync, tl->points[i]);
> > +           tl->syncobjs[i] = syncobj_create(fd, 0);
> > +
> > +           syncobj_import_sync_file(fd, tl->syncobjs[i], tl->tmp_fence);
> > +           close(tl->tmp_fence);
> > +           tl->tmp_fence = -1;
> > +
> > +           syncobj_binary_to_timeline(fd, tl->syncobj, i + 1, tl->syncobjs[i]);
> > +           syncobj_destroy(fd, tl->syncobjs[i]);
> > +
> > +           tl->syncobjs[i] = tl->syncobj;
> > +           tl->points[i] = i + 1;
> > +   }
> > +
> > +   if (flags & STRESS_FLAGS_ENABLE_ALL)
> > +           igt_assert_eq(syncobj_timeline_wait_err(tl->fd, tl->syncobjs,
> > +                                                   tl->points, tl->length, 0,
> > +                                                   DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL), -ETIME);
> > +
> > +   syncobj_timeline_query(fd, &tl->syncobj, &point, 1);
> > +   igt_assert_eq(point, 0);
> > +}
> > +
> > +static void *stress_wait_syncobj_thread_func(void *data)
> > +{
> > +   struct stress_timeline *tl = data;
> > +   unsigned int count = (tl->flags & STRESS_FLAGS_WAIT_ALL) ? tl->length : 1;
> > +   uint64_t *points = &tl->points[tl->length - count];
> > +
> > +   tl->retval = -EINPROGRESS;
> > +
> > +   /* Wait for the timeline to be signaled */
> > +   tl->retval = syncobj_timeline_wait_err(tl->fd, tl->syncobjs, points, count,
> > +                                          gettime_ns() + 600 * NSECS_PER_SEC,
> > +                                          tl->flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL);
> > +
> > +   return &tl->retval;
> > +}
> > +
> > +static void test_stress_enable_wait_signal(int fd, struct stress_timeline **timeline,
> > +                                      unsigned int flags)
> > +{
> > +   struct stress_timeline *tl;
> > +   int64_t dt;
> > +   int i;
> > +
> > +   stress_init(fd, timeline, flags);
> > +   tl = *timeline;
> > +
> > +   tl->retval = 0;
> > +   igt_assert_eq(pthread_create(&tl->thread, NULL,
> > +                                stress_wait_syncobj_thread_func, tl), 0);
> > +   igt_assert_eq(sched_yield(), 0);
> > +   while (READ_ONCE(tl->retval) != -EINPROGRESS)
> > +           ;
> > +   igt_assert_eq(sched_yield(), 0);
> > +
> > +   dt = -gettime_ns();
> > +   if (flags & STRESS_FLAGS_SIGNAL_ALL)
> > +           sw_sync_timeline_inc(tl->swsync, tl->length);
> > +   else
> > +           for (i = 0; i < tl->length; i++)
> > +                   sw_sync_timeline_inc(tl->swsync, 1);
> > +   dt += gettime_ns();
> > +   igt_info("%s: %d signals in %ld ns\n", __func__, tl->length, dt);
> > +
> > +   igt_assert_eq(pthread_join(tl->thread, NULL), 0);
> > +   tl->thread = 0;
> > +   igt_assert_eq(tl->retval, 0);
> > +}
> > +
> > +static void stress_cleanup(struct stress_timeline *timeline)
> > +{
> > +   if (!timeline)
> > +           return;
> > +
> > +   if (timeline->thread)
> > +           igt_warn_on(pthread_join(timeline->thread, NULL));
> > +
> > +   if (timeline->points)
> > +           free(timeline->points);
> > +
> > +   if (timeline->syncobjs) {
> > +           int i;
> > +
> > +           for (i = 0; i < timeline->length; i++)
> > +                   if (timeline->syncobjs && timeline->syncobjs[i] != timeline->syncobj)
> > +                           syncobj_destroy(timeline->fd, timeline->syncobjs[i]);
> > +           free(timeline->syncobjs);
> > +   }
> > +
> > +   if (timeline->tmp_fence >= 0)
> > +           igt_warn_on(close(timeline->tmp_fence));
> > +
> > +   if (timeline->syncobj)
> > +           syncobj_destroy(timeline->fd, timeline->syncobj);
> > +
> > +   if (timeline->swsync >= 0)
> > +           igt_warn_on(close(timeline->swsync));
> > +
> > +   free(timeline);
> > +}
> > +
> >  static bool
> >  has_syncobj_timeline_wait(int fd)
> >  {
> > @@ -1934,6 +2200,29 @@ igt_main
> >     igt_subtest("32bits-limit")
> >             test_32bits_limit(fd);
> >  
> > +   for (unsigned int flags = 0;
> > +        flags < (STRESS_FLAGS_WAIT_ALL | STRESS_FLAGS_SIGNAL_ALL |
> > +                 STRESS_FLAGS_ENABLE_ALL | STRESS_FLAGS_SIGNAL_RANDOM);
> > +        flags++) {
> > +           struct stress_timeline *timeline = NULL;
> > +
> > +           if (flags & STRESS_FLAGS_ENABLE_ALL && flags & STRESS_FLAGS_WAIT_ALL)
> > +                   continue;
> > +
> > +           igt_describe(stress_descriptions[flags]);
> > +           igt_subtest_f("stress-%s-%s-signal%s-%s",
> > +                         (flags & STRESS_FLAGS_ENABLE_ALL) ? "enable" : "wait",
> > +                         (flags & (STRESS_FLAGS_WAIT_ALL | STRESS_FLAGS_ENABLE_ALL)) ? "all" :
> > +                                                                                       "last",
> > +                         (flags & STRESS_FLAGS_SIGNAL_ALL) ? "-all" : "",
> > +                         (flags & STRESS_FLAGS_SIGNAL_RANDOM) ? "random" :
> > +                         (flags & STRESS_FLAGS_SIGNAL_BACKWARD) ? "backward" : "forward")
> > +                   test_stress_enable_wait_signal(fd, &timeline, flags);
> > +
> > +           igt_fixture
> > +                   stress_cleanup(READ_ONCE(timeline));
> > +   }
> > +
> >     igt_fixture {
> >             drm_close_driver(fd);
> >     }
> 
> 



