amdgpu: fix lock cleanup during buffer creation

Nicolai Hähnle Thu, 16 Feb 2017 01:28:00 -0800

On 16.02.2017 02:00, Michel Dänzer wrote:

On 16/02/17 04:10 AM, Nicolai Hähnle wrote:

From: Nicolai Hähnle <[email protected]>


Open-code the initial ttm_bo_validate call, so that we can properly
unlock the reservation lock when it fails. Also, properly destruct
the reservation object when the first part of TTM BO initialization
fails.

Actual deadlocks caused by the missing unlock should have been fixed
by "drm/ttm: never add BO that failed to validate to the LRU list",
superseding the flawed fix in commit 38fc4856ad98 ("drm/amdgpu: fix
a potential deadlock in amdgpu_bo_create_restricted()").

This change fixes remaining recursive locking errors that can be seen
with lock debugging enabled, and avoids the error of freeing a locked
mutex.

v2: use ttm_bo_unref for error handling, and rebase on latest changes

Fixes: 12a852219583 ("drm/amdgpu: improve AMDGPU_GEM_CREATE_VRAM_CLEARED handling 
(v2)")
Signed-off-by: Nicolai Hähnle <[email protected]>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 28 ++++++++++++++++++++++------
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index c2e57f7..15944ea 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -404,15 +404,31 @@ int amdgpu_bo_create_restricted(struct amdgpu_device 
*adev,
        }

        initial_bytes_moved = atomic64_read(&adev->num_bytes_moved);
-       r = ttm_bo_init(&adev->mman.bdev, &bo->tbo, size, type,
-                       &bo->placement, page_align, !kernel, NULL,
-                       acc_size, sg, resv ? resv : &bo->tbo.ttm_resv,
-                       &amdgpu_ttm_bo_destroy);
+       r = ttm_bo_init_top(&adev->mman.bdev, &bo->tbo, size, type,
+                           page_align, NULL,
+                           acc_size, sg, resv ? resv : &bo->tbo.ttm_resv,
+                           &amdgpu_ttm_bo_destroy);
+
+       if (likely(r == 0))
+               r = ttm_bo_validate(&bo->tbo, &bo->placement, !kernel, false);
+
+       if (unlikely(r != 0)) {
+               struct ttm_buffer_object *tbo = &bo->tbo;
+
+               if (!resv)
+                       ww_mutex_unlock(&bo->tbo.ttm_resv.lock);
+               ttm_bo_unref(&tbo);
+               return r;
+       }


I think this would be clearer as

        if (unlikely(r != 0)) {
                struct ttm_buffer_object *tbo = &bo->tbo;

                if (!resv)
                        ww_mutex_unlock(&bo->tbo.ttm_resv.lock);
                ttm_bo_unref(&tbo);
                return r;
        }

        r = ttm_bo_validate(&bo->tbo, &bo->placement, !kernel, false);
        [...]

        amdgpu_cs_report_moved_bytes(adev,
                atomic64_read(&adev->num_bytes_moved) - initial_bytes_moved);

-       if (unlikely(r != 0))
-               return r;
+       if (!(bo->tbo.mem.placement & TTM_PL_FLAG_NO_EVICT)) {
+               spin_lock(&bo->tbo.glob->lru_lock);
+               ttm_bo_add_to_lru(&bo->tbo);
+               spin_unlock(&bo->tbo.glob->lru_lock);
+       }


It's a bit ugly to open-code this logic in driver code. Maybe add
another TTM helper which calls ttm_bo_validate and ttm_bo_add_to_lru?

On further reflection, maybe a better approach would be to have afunction ttm_bo_init_reserved, which returns with the reservation lockheld upon success. We can then just change amdgpu_bo_create_restrictedto ttm_bo_unreserve instead of ww_mutex_unlock, which does the add-to-LRU.

AFAICT, it doesn't make much sense to add the buffer to the LRU thatearly anyway, if amdgpu_fill_buffer is used.


Cheers,
Nicolai


_______________________________________________
amd-gfx mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH v2 3/3] drm/amdgpu: fix lock cleanup during buffer creation

Reply via email to