On 03/12/2020 09:59, Chris Wilson wrote:
Race the execution and interrupt handlers along a context, while
closing it at a random time.

Signed-off-by: Chris Wilson <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
---
  tests/i915/gem_ctx_exec.c | 60 +++++++++++++++++++++++++++++++++++++++
  1 file changed, 60 insertions(+)

diff --git a/tests/i915/gem_ctx_exec.c b/tests/i915/gem_ctx_exec.c
index 194191def..18d5d1217 100644
--- a/tests/i915/gem_ctx_exec.c
+++ b/tests/i915/gem_ctx_exec.c
@@ -336,6 +336,63 @@ static void nohangcheck_hostile(int i915)
        close(i915);
  }
+static void close_race(int i915)
+{
+       const int ncpus = sysconf(_SC_NPROCESSORS_ONLN);
+       uint32_t *contexts;
+
+       contexts = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, -1, 0);
+       igt_assert(contexts != MAP_FAILED);
+
+       for (int child = 0; child < ncpus; child++)
+               contexts[child] = gem_context_clone_with_engines(i915, 0);
+
+       igt_fork(child, ncpus) {
+               igt_spin_t *spin;
+
+               spin = igt_spin_new(i915, .flags = IGT_SPIN_POLL_RUN);
+               igt_spin_end(spin);
+               gem_sync(i915, spin->handle);
+
+               while (!READ_ONCE(contexts[ncpus])) {
+                       int64_t timeout = 1;
+
+                       igt_spin_reset(spin);
+                       igt_assert(!igt_spin_has_started(spin));
+
+                       spin->execbuf.rsvd1 = READ_ONCE(contexts[child]);
+                       if (__gem_execbuf(i915, &spin->execbuf))
+                               continue;
+
+                       igt_assert(gem_bo_busy(i915, spin->handle));

I've seen this assert fail in the CI results - any idea how that can happen?

+                       gem_wait(i915, spin->handle, &timeout); /* prime irq */

Does this depend on implementation-specific behaviour, namely that we leave the irq enabled after the waiter has exited?
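To spell out my reading of the sequence (just a sketch, the comments are mine, not from the patch):

	int64_t timeout = 1; /* ~1ns, expires almost immediately */

	/* The short wait arms the breadcrumb/user interrupt... */
	gem_wait(i915, spin->handle, &timeout);

	/*
	 * ...and the test then relies on the irq staying enabled after
	 * this (expired) waiter has exited, so that the interrupt handler
	 * keeps firing while the context is closed underneath it.
	 */
	igt_spin_busywait_until_started(spin);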

+                       igt_spin_busywait_until_started(spin);
+
+                       igt_spin_end(spin);
+                       gem_sync(i915, spin->handle);
+               }
+
+               igt_spin_free(i915, spin);
+       }
+
+       igt_until_timeout(5) {
+               for (int child = 0; child < ncpus; child++) {
+                       gem_context_destroy(i915, contexts[child]);
+                       contexts[child] =
+                               gem_context_clone_with_engines(i915, 0);

Right, so this is a deliberate attempt to occasionally have a child use a closed context. Presumably, going by the CI results, it does manage to hit that window consistently, which surprises me a bit. A comment here would be good.
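Something along these lines, perhaps (just a sketch, the wording is mine):

	/*
	 * Replace the context out from under the children: a child may
	 * read the id after we have destroyed it, so its execbuf races
	 * against context closure and is allowed to fail (hence the
	 * __gem_execbuf() above rather than gem_execbuf()).
	 */
	gem_context_destroy(i915, contexts[child]);
	contexts[child] = gem_context_clone_with_engines(i915, 0);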

+               }
+               usleep(1000);

Maybe add some randomness here? Or even a random busy loop within the child loop? I haven't looked at the i915 patch yet, so I don't know where the race actually is.
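Even something simple might be enough to vary the window, e.g. (a sketch; plain rand() and the delay bounds are arbitrary, the usual IGT PRNG helper would do equally well):

	/* jitter the destroy/recreate cadence instead of a fixed 1ms */
	usleep(500 + (rand() % 1500));

A similar random delay, or a short random busy loop, inside the child's while loop would desynchronise the two sides further.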

+       }
+
+       contexts[ncpus] = 1;
+       igt_waitchildren();
+
+       for (int child = 0; child < ncpus; child++)
+               gem_context_destroy(i915, contexts[child]);
+
+       munmap(contexts, 4096);
+}
+
  igt_main
  {
        const uint32_t batch[2] = { 0, MI_BATCH_BUFFER_END };
@@ -380,6 +437,9 @@ igt_main
        igt_subtest("basic-nohangcheck")
                nohangcheck_hostile(fd);
+       igt_subtest("basic-close-race")
+               close_race(fd);
+
        igt_subtest("reset-pin-leak") {
                int i;

Regards,

Tvrtko