The i915_active selftests live_active_wait and live_active_retire
operate on an i915_active attached to a mock, empty request, created as
part of test setup. A fence is attached to this request to control when
the request is processed. The tests then wait for the completion of the
active with __i915_active_wait(), and the test is considered successful
if this results in setting a variable in the active callback.

However, the behavior of __i915_active_wait() is such that if the
refcount for the active is 0, the function is almost completely skipped;
waiting on a already completed active yields no effect. This includes a
subsequent call to the retire() function of the active (which is the
callback that the test is interested about, and which dictates whether
its successful or not). So, if the active is completed before the
aforementioned call to __i915_active_wait(), the test will fail.

Most of the test runs in a single thread, including creating the
request, creating the fence for it, signalling that fence, and calling
__i915_active_wait(). However, the request itself is handled
asynchronously. This creates a race condition where if the request is
completed after signalling the fence, but before waiting on its active,
the active callback will not be invoked, failing the test.

Defer signalling the request's fence, to ensure the main test thread
gets to call __i915_active_wait() before request completion.

v2:
- Clarify the need for a fix a little more (Krzysztof K., Janusz)

Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14808
Signed-off-by: Krzysztof Niemiec <[email protected]>
---
 drivers/gpu/drm/i915/selftests/i915_active.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/selftests/i915_active.c 
b/drivers/gpu/drm/i915/selftests/i915_active.c
index 0d89d70b9c36..a82a56c3eeb6 100644
--- a/drivers/gpu/drm/i915/selftests/i915_active.c
+++ b/drivers/gpu/drm/i915/selftests/i915_active.c
@@ -74,15 +74,25 @@ static struct live_active *__live_alloc(struct 
drm_i915_private *i915)
        return active;
 }
 
+static struct i915_sw_fence *submit;
+static struct delayed_work __live_submit_work;
+
+static void __live_submit_work_handler(struct work_struct *work)
+{
+       i915_sw_fence_commit(submit);
+       heap_fence_put(submit);
+}
+
 static struct live_active *
 __live_active_setup(struct drm_i915_private *i915)
 {
        struct intel_engine_cs *engine;
-       struct i915_sw_fence *submit;
        struct live_active *active;
        unsigned int count = 0;
        int err = 0;
 
+       INIT_DELAYED_WORK(&__live_submit_work, __live_submit_work_handler);
+
        active = __live_alloc(i915);
        if (!active)
                return ERR_PTR(-ENOMEM);
@@ -132,8 +142,7 @@ __live_active_setup(struct drm_i915_private *i915)
        }
 
 out:
-       i915_sw_fence_commit(submit);
-       heap_fence_put(submit);
+       schedule_delayed_work(&__live_submit_work, msecs_to_jiffies(500));
        if (err) {
                __live_put(active);
                active = ERR_PTR(err);
-- 
2.45.2

Reply via email to