On 06/09/2021 13:53, Tvrtko Ursulin wrote:

On 06/09/2021 13:30, Matthew Auld wrote:
On 06/09/2021 13:19, Tvrtko Ursulin wrote:

On 06/09/2021 10:17, Matthew Auld wrote:
Since the object might still be active here, the shrink_all will simply
ignore it, which blows up in the test, since the pages will still be
there. Currently THP is disabled which should result in the test being
skipped, but if we ever re-enable THP we might start seeing the failure.
Fix this by forcing I915_SHRINK_ACTIVE.

v2: Some machines in the shard runs don't seem to have any available
swap when running this test. Try to handle this.

Signed-off-by: Matthew Auld <matthew.a...@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursu...@intel.com>
Cc: Thomas Hellström <thomas.hellst...@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursu...@intel.com> #v1
---
  .../gpu/drm/i915/gem/selftests/huge_pages.c   | 31 ++++++++++++++-----
  1 file changed, 24 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
index a094f3ce1a90..46ea1997c114 100644
--- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
@@ -1519,6 +1519,7 @@ static int igt_shrink_thp(void *arg)
      struct i915_vma *vma;
      unsigned int flags = PIN_USER;
      unsigned int n;
+    bool should_swap;
      int err = 0;
      /*
@@ -1567,23 +1568,39 @@ static int igt_shrink_thp(void *arg)
              break;
      }
      i915_gem_context_unlock_engines(ctx);
+    /*
+     * Nuke everything *before* we unpin the pages so we can be reasonably
+     * sure that when later checking get_nr_swap_pages() that some random
+     * leftover object doesn't steal the remaining swap space.
+     */
+    i915_gem_shrink(NULL, i915, -1UL, NULL,
+            I915_SHRINK_BOUND |
+            I915_SHRINK_UNBOUND |
+            I915_SHRINK_ACTIVE);
      i915_vma_unpin(vma);
      if (err)
          goto out_put;
+
      /*
-     * Now that the pages are *unpinned* shrink-all should invoke
-     * shmem to truncate our pages.
+     * Now that the pages are *unpinned* shrinking should invoke
+     * shmem to truncate our pages, if we have available swap.
       */
-    i915_gem_shrink_all(i915);
-    if (i915_gem_object_has_pages(obj)) {
-        pr_err("shrink-all didn't truncate the pages\n");
+    should_swap = get_nr_swap_pages() > 0;
+    i915_gem_shrink(NULL, i915, -1UL, NULL,
+            I915_SHRINK_BOUND |
+            I915_SHRINK_UNBOUND |
+            I915_SHRINK_ACTIVE);
+    if (should_swap == i915_gem_object_has_pages(obj)) {

Hmm, is there any value in running the test if there is no swap (given that the objects used by the test are "willneed"), or could you simplify and just skip early?

Maybe. My thinking was that this adds some coverage if, say, the device is not configured with swap, i.e. assert that the pages don't magically disappear and that their contents still persist, etc.

Happy to make it skip instead though?

So that reduces it to a basic shrinker test in that case. Hm... do you know if we already have any non-THP-specific tests for that somewhere in selftests (I can't spot any), or just in IGT?

Just IGT I think, outside of some cases where we call gem_shrink in very specific places, which would be hard to do from an IGT.


If we indeed don't have it in selftests, then I guess the question is whether it is warranted to "hide" such a basic test in the THP "drawer", or whether adding a generic shrinker test should be considered instead. (And one could then follow with the question of whether a basic generic test should have a THP sub-test.)

The reason for the selftest vs IGT is mostly because userspace doesn't have any knowledge of the underlying pages, or whether THP is used. IIRC there was some issue with THP + our shmem backend in the past, so also adding some basic coverage for THP + i915-gem shrinker seemed reasonable. Even if we don't have swap space, I think it still makes some sense to call into gem_shrink with our target THP object.


It's hard to say where the boundary for selftests-vs-IGT coverage should be in this case. I mean, would it be warranted to add such a generic shrinker selftest? It is mostly testable from userspace, but the kernel can do a few more introspections and sanity checks, at the cost of growing the kernel code.

Regards,

Tvrtko



Regards,

Tvrtko

+        pr_err("unexpected pages mismatch, should_swap=%s\n",
+               yesno(should_swap));
          err = -EINVAL;
          goto out_put;
      }
-    if (obj->mm.page_sizes.sg || obj->mm.page_sizes.phys) {
-        pr_err("residual page-size bits left\n");
+    if (should_swap == (obj->mm.page_sizes.sg || obj->mm.page_sizes.phys)) {
+        pr_err("unexpected residual page-size bits, should_swap=%s\n",
+               yesno(should_swap));
          err = -EINVAL;
          goto out_put;
      }
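
For readers following the should_swap logic, the two checks in the hunk above can be modelled in plain user-space C. This is only a sketch under stated assumptions, not the selftest itself: check_after_shrink() and its boolean parameters are hypothetical stand-ins for get_nr_swap_pages() > 0, i915_gem_object_has_pages(obj) and the obj->mm.page_sizes.sg/phys test. It shows that each comparison flags an error exactly when the observed state matches should_swap, i.e. pages (or page-size bits) that survive although swap was available, or that vanished although it was not.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical user-space model of the post-shrink checks in the patch.
 *
 * should_swap:        swap space was available before shrinking
 * has_pages:          the object still has backing pages after shrinking
 * has_page_size_bits: sg or phys page-size bits are still set
 *
 * Returns 0 when the state is consistent, -EINVAL otherwise.
 */
static int check_after_shrink(bool should_swap, bool has_pages,
                              bool has_page_size_bits)
{
    /* With swap the pages must be gone; without swap they must remain. */
    if (should_swap == has_pages)
        return -EINVAL;

    /* The page-size bits must follow the same rule as the pages. */
    if (should_swap == has_page_size_bits)
        return -EINVAL;

    return 0;
}
```

In this model only two states pass: swap available with pages and page-size bits both cleared, and no swap with both still present.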
