Re: [PATCH v2 4/9] drm/ttm: Allow continued swapout after -ENOSPC failure

2024-04-16 Thread Matthew Brost
On Tue, Apr 16, 2024 at 12:07:25PM +0200, Thomas Hellström wrote:
> The -ENOSPC failure from ttm_bo_swapout() meant that the lru_lock
> was dropped and simply restarting the iteration meant we'd likely
> hit the same error again on the same resource. Now that we can
> restart the iteration even if the lock was dropped, do that.
> 

It is not clear from the code that what this commit message describes
(-ENOSPC behaves like -EBUSY, except that the lru_lock has been dropped)
is true, since there are no comments saying so.

It does appear to be true after examining ttm_bo_swapout() closely.
Maybe out of scope for the series but would it be possible to add some
kernel doc to ttm_device_swapout stating this?
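
To make the contract concrete, here is a rough userspace sketch of how I
read it. Everything in it (mock_dev, mock_bo, mock_bo_swapout,
mock_device_swapout) is a hypothetical stand-in, not TTM code, and the
lock is modeled as a plain flag. It assumes -EBUSY returns with the lock
still held and every other return value has dropped it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the device; a flag models lru_lock. */
struct mock_dev {
	bool lru_locked;
};

/* Hypothetical stand-in for a buffer object with a canned result. */
struct mock_bo {
	int swapout_result;	/* 0, -EBUSY, -ENOSPC or another -errno */
	long num_pages;
};

/*
 * Assumed contract (my reading, not copied from ttm_bo_swapout()):
 * must be called with the lock held; -EBUSY returns with the lock
 * still held, any other return value has dropped it.
 */
static int mock_bo_swapout(struct mock_dev *dev, const struct mock_bo *bo)
{
	assert(dev->lru_locked);
	if (bo->swapout_result != -EBUSY)
		dev->lru_locked = false;
	return bo->swapout_result;
}

/*
 * Mirrors the control flow of the patched loop. Returns the number of
 * pages swapped out, 0 if nothing could be swapped, or a negative errno.
 * The lock is never held on return.
 */
static long mock_device_swapout(struct mock_dev *dev,
				const struct mock_bo *bos, size_t n)
{
	size_t i;

	dev->lru_locked = true;			/* models spin_lock() */
	for (i = 0; i < n; i++) {
		int ret = mock_bo_swapout(dev, &bos[i]);

		/* Couldn't swap out, and retained the lock */
		if (ret == -EBUSY)
			continue;
		/* Couldn't swap out and dropped the lock: retake it */
		if (ret == -ENOSPC) {
			dev->lru_locked = true;
			continue;
		}
		/* Lock already dropped: success or a fatal error */
		return ret ? ret : bos[i].num_pages;
	}
	dev->lru_locked = false;		/* models spin_unlock() */
	return 0;
}
```

The assert() in mock_bo_swapout() is the point of the sketch: it would
trip if the loop continued after -ENOSPC without retaking the lock.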

The patch itself makes sense to me.

Matt

> Cc: Christian König 
> Cc: Somalapuram Amaranath 
> Cc: 
> Signed-off-by: Thomas Hellström 
> ---
>  drivers/gpu/drm/ttm/ttm_device.c | 21 +++++++++++++--------
>  1 file changed, 13 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
> index e8a6a1dab669..4a030b4bc848 100644
> --- a/drivers/gpu/drm/ttm/ttm_device.c
> +++ b/drivers/gpu/drm/ttm/ttm_device.c
> @@ -168,15 +168,20 @@ int ttm_device_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
>  
>  			num_pages = PFN_UP(bo->base.size);
>  			ret = ttm_bo_swapout(bo, ctx, gfp_flags);
> -			/* ttm_bo_swapout has dropped the lru_lock */
> -			if (!ret) {
> -				ttm_resource_cursor_fini();
> -				return num_pages;
> -			}
> -			if (ret != -EBUSY) {
> -				ttm_resource_cursor_fini();
> -				return ret;
> +			/* Couldn't swap out, and retained the lru_lock */
> +			if (ret == -EBUSY)
> +				continue;
> +			/* Couldn't swap out and dropped the lru_lock */
> +			if (ret == -ENOSPC) {
> +				spin_lock(&bdev->lru_lock);
> +				continue;
>  			}
> +			/*
> +			 * Dropped the lock and either succeeded or
> +			 * hit an error that forces us to break.
> +			 */
> +			ttm_resource_cursor_fini();
> +			return ret ? ret : num_pages;
>  		}
>  	}
>  	ttm_resource_cursor_fini_locked();
> -- 
> 2.44.0
> 


[PATCH v2 4/9] drm/ttm: Allow continued swapout after -ENOSPC failure

2024-04-16 Thread Thomas Hellström
The -ENOSPC failure from ttm_bo_swapout() meant that the lru_lock
was dropped and simply restarting the iteration meant we'd likely
hit the same error again on the same resource. Now that we can
restart the iteration even if the lock was dropped, do that.

Cc: Christian König 
Cc: Somalapuram Amaranath 
Cc: 
Signed-off-by: Thomas Hellström 
---
 drivers/gpu/drm/ttm/ttm_device.c | 21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
index e8a6a1dab669..4a030b4bc848 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -168,15 +168,20 @@ int ttm_device_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
 
 			num_pages = PFN_UP(bo->base.size);
 			ret = ttm_bo_swapout(bo, ctx, gfp_flags);
-			/* ttm_bo_swapout has dropped the lru_lock */
-			if (!ret) {
-				ttm_resource_cursor_fini();
-				return num_pages;
-			}
-			if (ret != -EBUSY) {
-				ttm_resource_cursor_fini();
-				return ret;
+			/* Couldn't swap out, and retained the lru_lock */
+			if (ret == -EBUSY)
+				continue;
+			/* Couldn't swap out and dropped the lru_lock */
+			if (ret == -ENOSPC) {
+				spin_lock(&bdev->lru_lock);
+				continue;
 			}
+			/*
+			 * Dropped the lock and either succeeded or
+			 * hit an error that forces us to break.
+			 */
+			ttm_resource_cursor_fini();
+			return ret ? ret : num_pages;
 		}
 	}
 	ttm_resource_cursor_fini_locked();
-- 
2.44.0