Re: [Intel-gfx] [PATCH] drm/i915/ttm: Abort suspend on i915_ttm_backup failure

2022-09-01 Thread Das, Nirmoy



On 9/1/2022 5:57 PM, Andrzej Hajda wrote:

On 31.08.2022 18:18, Nirmoy Das wrote:

On system suspend when system memory is low then i915_gem_obj_copy_ttm()
could fail trying to backup a lmem obj. GEM_WARN_ON() is not enough,
suspend shouldn't continue if i915_ttm_backup() throws an error.

References: https://gitlab.freedesktop.org/drm/intel/-/issues/6529
Reviewed-by: Matthew Auld 
Suggested-by: Chris P Wilson 
Signed-off-by: Nirmoy Das 
---
  drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c | 7 ++-
  1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c 
b/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c

index 9aad84059d56..6f5d5c0909b4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c
@@ -79,7 +79,12 @@ static int i915_ttm_backup(struct 
i915_gem_apply_to_region *apply,

  goto out_no_populate;
    err = i915_gem_obj_copy_ttm(backup, obj, pm_apply->allow_gpu, 
false);

-    GEM_WARN_ON(err);
+    if (err) {
+    drm_err(&i915->drm,
+    "Unable to copy from device to system memory, err:%d\n",
+    err);


I wonder if %pe wouldn't be better here, up to you.



More readable err should be useful, resend with %pe.



Reviewed-by: Andrzej Hajda 



Thanks,

Nirmoy



Regards
Andrzej



+    goto out_no_populate;
+    }
  ttm_bo_wait_ctx(backup_bo, &ctx);
    obj->ttm.backup = backup;




Re: [Intel-gfx] [PATCH] drm/i915/ttm: Abort suspend on i915_ttm_backup failure

2022-09-01 Thread Andrzej Hajda

On 31.08.2022 18:18, Nirmoy Das wrote:

On system suspend when system memory is low then i915_gem_obj_copy_ttm()
could fail trying to backup a lmem obj. GEM_WARN_ON() is not enough,
suspend shouldn't continue if i915_ttm_backup() throws an error.

References: https://gitlab.freedesktop.org/drm/intel/-/issues/6529
Reviewed-by: Matthew Auld 
Suggested-by: Chris P Wilson 
Signed-off-by: Nirmoy Das 
---
  drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c | 7 ++-
  1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c 
b/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c
index 9aad84059d56..6f5d5c0909b4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c
@@ -79,7 +79,12 @@ static int i915_ttm_backup(struct i915_gem_apply_to_region 
*apply,
goto out_no_populate;
  
  	err = i915_gem_obj_copy_ttm(backup, obj, pm_apply->allow_gpu, false);

-   GEM_WARN_ON(err);
+   if (err) {
+   drm_err(&i915->drm,
+   "Unable to copy from device to system memory, err:%d\n",
+   err);


I wonder if %pe wouldn't be better here, up to you.

Reviewed-by: Andrzej Hajda 

Regards
Andrzej



+   goto out_no_populate;
+   }
ttm_bo_wait_ctx(backup_bo, &ctx);
  
  	obj->ttm.backup = backup;




[Intel-gfx] [PATCH] drm/i915/ttm: Abort suspend on i915_ttm_backup failure

2022-08-31 Thread Nirmoy Das
On system suspend when system memory is low then i915_gem_obj_copy_ttm()
could fail trying to backup a lmem obj. GEM_WARN_ON() is not enough,
suspend shouldn't continue if i915_ttm_backup() throws an error.

References: https://gitlab.freedesktop.org/drm/intel/-/issues/6529
Reviewed-by: Matthew Auld 
Suggested-by: Chris P Wilson 
Signed-off-by: Nirmoy Das 
---
 drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c 
b/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c
index 9aad84059d56..6f5d5c0909b4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c
@@ -79,7 +79,12 @@ static int i915_ttm_backup(struct i915_gem_apply_to_region 
*apply,
goto out_no_populate;
 
err = i915_gem_obj_copy_ttm(backup, obj, pm_apply->allow_gpu, false);
-   GEM_WARN_ON(err);
+   if (err) {
+   drm_err(&i915->drm,
+   "Unable to copy from device to system memory, err:%d\n",
+   err);
+   goto out_no_populate;
+   }
ttm_bo_wait_ctx(backup_bo, &ctx);
 
obj->ttm.backup = backup;
-- 
2.35.1



Re: [Intel-gfx] [PATCH] drm/i915/ttm: Abort suspend on i915_ttm_backup failure

2022-08-31 Thread Das, Nirmoy



On 8/31/2022 5:50 PM, Matthew Auld wrote:

On 29/08/2022 13:04, Nirmoy Das wrote:

On system suspend when system memory is low then i915_gem_obj_copy_ttm()
could fail trying to backup a lmem obj. GEM_WARN_ON() is not enough,
suspend shouldn't continue if i915_ttm_backup() throws an error.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6529


Does this fix it? Does CI not complain about the drm_err? Also do we 
know what the actual error was?



The error isn't reoccurring so the best guess is: large framebuffer copy 
took long time and  wait_for_suspend()


timed out. This needs more coverage from IGT and I am looking into 
that.  Let's ignore the "Closes" tag from this


patch till I come up a IGT test for this.


Nirmoy





Suggested-by: Chris P Wilson 
Signed-off-by: Nirmoy Das 


Passing the error along seems reasonable to me,
Reviewed-by: Matthew Auld 


---
  drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c | 7 ++-
  1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c 
b/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c

index 9aad84059d56..6f5d5c0909b4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c
@@ -79,7 +79,12 @@ static int i915_ttm_backup(struct 
i915_gem_apply_to_region *apply,

  goto out_no_populate;
    err = i915_gem_obj_copy_ttm(backup, obj, pm_apply->allow_gpu, 
false);

-    GEM_WARN_ON(err);
+    if (err) {
+    drm_err(&i915->drm,
+    "Unable to copy from device to system memory, err:%d\n",
+    err);
+    goto out_no_populate;
+    }
  ttm_bo_wait_ctx(backup_bo, &ctx);
    obj->ttm.backup = backup;


Re: [Intel-gfx] [PATCH] drm/i915/ttm: Abort suspend on i915_ttm_backup failure

2022-08-31 Thread Matthew Auld

On 29/08/2022 13:04, Nirmoy Das wrote:

On system suspend when system memory is low then i915_gem_obj_copy_ttm()
could fail trying to backup a lmem obj. GEM_WARN_ON() is not enough,
suspend shouldn't continue if i915_ttm_backup() throws an error.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6529


Does this fix it? Does CI not complain about the drm_err? Also do we 
know what the actual error was?



Suggested-by: Chris P Wilson 
Signed-off-by: Nirmoy Das 


Passing the error along seems reasonable to me,
Reviewed-by: Matthew Auld 


---
  drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c | 7 ++-
  1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c 
b/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c
index 9aad84059d56..6f5d5c0909b4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c
@@ -79,7 +79,12 @@ static int i915_ttm_backup(struct i915_gem_apply_to_region 
*apply,
goto out_no_populate;
  
  	err = i915_gem_obj_copy_ttm(backup, obj, pm_apply->allow_gpu, false);

-   GEM_WARN_ON(err);
+   if (err) {
+   drm_err(&i915->drm,
+   "Unable to copy from device to system memory, err:%d\n",
+   err);
+   goto out_no_populate;
+   }
ttm_bo_wait_ctx(backup_bo, &ctx);
  
  	obj->ttm.backup = backup;


[Intel-gfx] [PATCH] drm/i915/ttm: Abort suspend on i915_ttm_backup failure

2022-08-29 Thread Nirmoy Das
On system suspend when system memory is low then i915_gem_obj_copy_ttm()
could fail trying to backup a lmem obj. GEM_WARN_ON() is not enough,
suspend shouldn't continue if i915_ttm_backup() throws an error.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6529
Suggested-by: Chris P Wilson 
Signed-off-by: Nirmoy Das 
---
 drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c 
b/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c
index 9aad84059d56..6f5d5c0909b4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c
@@ -79,7 +79,12 @@ static int i915_ttm_backup(struct i915_gem_apply_to_region 
*apply,
goto out_no_populate;
 
err = i915_gem_obj_copy_ttm(backup, obj, pm_apply->allow_gpu, false);
-   GEM_WARN_ON(err);
+   if (err) {
+   drm_err(&i915->drm,
+   "Unable to copy from device to system memory, err:%d\n",
+   err);
+   goto out_no_populate;
+   }
ttm_bo_wait_ctx(backup_bo, &ctx);
 
obj->ttm.backup = backup;
-- 
2.35.1