The smem-oom subtest forks SMEM helper processes from a loop run in the
main process.  That loop is supposed to terminate only when the exit
handler of a previously forked LMEM eviction process signals its
completion.  However, since the subtest arranges OOM conditions, the LMEM
eviction process may get killed, in which case its completion is never
signaled.  When that happens, the subtest may keep respawning SMEM helpers
indefinitely and complete only when killed, e.g., by igt_runner on
per-test timeout expiration.

Instead of waiting for completion of the loop of the SMEM helpers, run
the loop in the background and wait for completion of the LMEM eviction
process.  Also, take care of signaling the SMEM helper processes about
LMEM eviction process completion in case it has been killed and hasn't had
a chance to do that itself.

This patch addresses the timeout results reported in the upstream issue
linked below.  Other failures (incomplete / dmesg-warn / crash) may need
additional patches, but let's fix those timeouts first to get a clearer
picture.

v2: Add a comment on why igt_waitchildren() has been replaced with its
    non-failing variant (Krzysztof),
  - in commit description, use "LMEM eviction process" wording in place of
    just "LMEM process" (Krzysztof),
  - insert "intel" component into the commit message tag (Kamil).

Link: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/5493
Reviewed-by: Krzysztof Karas <[email protected]>
Cc: Kamil Konieczny <[email protected]>
Signed-off-by: Janusz Krzysztofik <[email protected]>
---
 tests/intel/gem_lmem_swapping.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/tests/intel/gem_lmem_swapping.c b/tests/intel/gem_lmem_swapping.c
index 8e0dac42d8..6f494b5342 100644
--- a/tests/intel/gem_lmem_swapping.c
+++ b/tests/intel/gem_lmem_swapping.c
@@ -678,6 +678,7 @@ static void test_smem_oom(int i915,
        const unsigned int num_alloc = 1 + smem_size / (alloc >> 20);
        struct igt_helper_process smem_proc = {};
        unsigned int n;
+       int lmem_err;
 
        lmem_done = mmap(0, sizeof(*lmem_done), PROT_WRITE,
                         MAP_SHARED | MAP_ANON, -1, 0);
@@ -703,8 +704,8 @@ static void test_smem_oom(int i915,
        }
 
        /* smem memory hog process, respawn till the lmem process completes */
-       while (!READ_ONCE(*lmem_done)) {
-               igt_fork_helper(&smem_proc) {
+       igt_fork_helper(&smem_proc) {
+               while (!READ_ONCE(*lmem_done)) {
                        igt_fork(child, 1) {
                                for (int pass = 0; pass < num_alloc; pass++) {
                                        if (READ_ONCE(*lmem_done))
@@ -730,11 +731,19 @@ static void test_smem_oom(int i915,
                        for (n = 0; n < 2; n++)
                                wait(NULL);
                }
-               igt_wait_helper(&smem_proc);
        }
+
+       /* Reap exit status of the lmem process but don't fail before cleanup */
+       lmem_err = __igt_waitchildren();
+
+       /* Make sure SMEM helpers stop even when the LMEM process gets killed */
+       if (lmem_err)
+               (*lmem_done)++;
        munmap(lmem_done, sizeof(*lmem_done));
-       /* Reap exit status of the lmem process */
-       igt_waitchildren();
+
+       igt_wait_helper(&smem_proc);
+
+       igt_assert_eq(lmem_err, 0);
 }
 
 #define dynamic_lmem_subtest(reg, regs, subtest_name...) \
-- 
2.51.1
