On 08/11/2018 08:17, Chris Wilson wrote:
Ignore trying to shrink from i915 if we fail to acquire the struct_mutex
in the shrinker while performing direct-reclaim. The trade-off being
(much) lower latency for non-i915 clients at an increased risk of being
unable to obtain a page from direct-reclaim without hitting the
oom-notifier. The proviso being that we still keep trying to hard
obtain the lock for oom so that we can reap under heavy memory pressure.

Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursu...@intel.com>
---
  drivers/gpu/drm/i915/i915_gem_shrinker.c | 24 +++++++++++-------------
  1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c 
b/drivers/gpu/drm/i915/i915_gem_shrinker.c
index ea90d3a0d511..d461f458f4af 100644
--- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
@@ -36,7 +36,9 @@
  #include "i915_drv.h"
  #include "i915_trace.h"
-static bool shrinker_lock(struct drm_i915_private *i915, bool *unlock)
+static bool shrinker_lock(struct drm_i915_private *i915,
+                         unsigned int flags,
+                         bool *unlock)
  {
        switch (mutex_trylock_recursive(&i915->drm.struct_mutex)) {
        case MUTEX_TRYLOCK_RECURSIVE:
@@ -45,15 +47,11 @@ static bool shrinker_lock(struct drm_i915_private *i915, 
bool *unlock)
case MUTEX_TRYLOCK_FAILED:
                *unlock = false;
-               preempt_disable();
-               do {
-                       cpu_relax();
-                       if (mutex_trylock(&i915->drm.struct_mutex)) {
-                               *unlock = true;
-                               break;
-                       }
-               } while (!need_resched());
-               preempt_enable();
+               if (flags & I915_SHRINK_ACTIVE) {

So until I915_SHRINK_ACTIVE, which is the last ditch attempt to shrink in the normal case (direct reclaim?) or oom, we bail out on the first sign of struct mutex contention. Doesn't this make our shrinker much less effective at runtime and why is that OK?

Or in other words, for what use cases, tests or benchmark was the existing approach of busy looping a problem?

+                       mutex_lock_nested(&i915->drm.struct_mutex,
+                                         I915_MM_SHRINKER);

_nested is needed since abandoning trylock would otherwise cause lockdep issues? But is it really safe? I don't know.. how can it be? It is guaranteed to be a different process here otherwise the result wouldn't be MUTEX_TRYLOCK_FAILED.

Also, I915_MM_SHRINKER is so far only documented to apply to obj->mm.lock.

Regards,

Tvrtko

+                       *unlock = true;
+               }
                return *unlock;
case MUTEX_TRYLOCK_SUCCESS:
@@ -160,7 +158,7 @@ i915_gem_shrink(struct drm_i915_private *i915,
        unsigned long scanned = 0;
        bool unlock;
- if (!shrinker_lock(i915, &unlock))
+       if (!shrinker_lock(i915, flags, &unlock))
                return 0;
/*
@@ -357,7 +355,7 @@ i915_gem_shrinker_scan(struct shrinker *shrinker, struct 
shrink_control *sc)
sc->nr_scanned = 0; - if (!shrinker_lock(i915, &unlock))
+       if (!shrinker_lock(i915, 0, &unlock))
                return SHRINK_STOP;
freed = i915_gem_shrink(i915,
@@ -397,7 +395,7 @@ shrinker_lock_uninterruptible(struct drm_i915_private 
*i915, bool *unlock,
        do {
                if (i915_gem_wait_for_idle(i915,
                                           0, MAX_SCHEDULE_TIMEOUT) == 0 &&
-                   shrinker_lock(i915, unlock))
+                   shrinker_lock(i915, 0, unlock))
                        break;
schedule_timeout_killable(1);

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Reply via email to