Re: [Intel-gfx] [PATCH] drm/i915/ttm: Abort suspend on i915_ttm_backup failure
On 9/1/2022 5:57 PM, Andrzej Hajda wrote: On 31.08.2022 18:18, Nirmoy Das wrote: On system suspend when system memory is low then i915_gem_obj_copy_ttm() could fail trying to backup a lmem obj. GEM_WARN_ON() is not enough, suspend shouldn't continue if i915_ttm_backup() throws an error. References: https://gitlab.freedesktop.org/drm/intel/-/issues/6529 Reviewed-by: Matthew Auld Suggested-by: Chris P Wilson Signed-off-by: Nirmoy Das --- drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c index 9aad84059d56..6f5d5c0909b4 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c @@ -79,7 +79,12 @@ static int i915_ttm_backup(struct i915_gem_apply_to_region *apply, goto out_no_populate; err = i915_gem_obj_copy_ttm(backup, obj, pm_apply->allow_gpu, false); - GEM_WARN_ON(err); + if (err) { + drm_err(&i915->drm, + "Unable to copy from device to system memory, err:%d\n", + err); I wonder if %pe wouldn't be better here, up to you. More readable err should be useful, resend with %pe. Reviewed-by: Andrzej Hajda Thanks, Nirmoy Regards Andrzej + goto out_no_populate; + } ttm_bo_wait_ctx(backup_bo, &ctx); obj->ttm.backup = backup;
Re: [Intel-gfx] [PATCH] drm/i915/ttm: Abort suspend on i915_ttm_backup failure
On 31.08.2022 18:18, Nirmoy Das wrote: On system suspend when system memory is low then i915_gem_obj_copy_ttm() could fail trying to backup a lmem obj. GEM_WARN_ON() is not enough, suspend shouldn't continue if i915_ttm_backup() throws an error. References: https://gitlab.freedesktop.org/drm/intel/-/issues/6529 Reviewed-by: Matthew Auld Suggested-by: Chris P Wilson Signed-off-by: Nirmoy Das --- drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c index 9aad84059d56..6f5d5c0909b4 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c @@ -79,7 +79,12 @@ static int i915_ttm_backup(struct i915_gem_apply_to_region *apply, goto out_no_populate; err = i915_gem_obj_copy_ttm(backup, obj, pm_apply->allow_gpu, false); - GEM_WARN_ON(err); + if (err) { + drm_err(&i915->drm, + "Unable to copy from device to system memory, err:%d\n", + err); I wonder if %pe wouldn't be better here, up to you. Reviewed-by: Andrzej Hajda Regards Andrzej + goto out_no_populate; + } ttm_bo_wait_ctx(backup_bo, &ctx); obj->ttm.backup = backup;
[Intel-gfx] [PATCH] drm/i915/ttm: Abort suspend on i915_ttm_backup failure
On system suspend when system memory is low then i915_gem_obj_copy_ttm() could fail trying to backup a lmem obj. GEM_WARN_ON() is not enough, suspend shouldn't continue if i915_ttm_backup() throws an error. References: https://gitlab.freedesktop.org/drm/intel/-/issues/6529 Reviewed-by: Matthew Auld Suggested-by: Chris P Wilson Signed-off-by: Nirmoy Das --- drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c index 9aad84059d56..6f5d5c0909b4 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c @@ -79,7 +79,12 @@ static int i915_ttm_backup(struct i915_gem_apply_to_region *apply, goto out_no_populate; err = i915_gem_obj_copy_ttm(backup, obj, pm_apply->allow_gpu, false); - GEM_WARN_ON(err); + if (err) { + drm_err(&i915->drm, + "Unable to copy from device to system memory, err:%d\n", + err); + goto out_no_populate; + } ttm_bo_wait_ctx(backup_bo, &ctx); obj->ttm.backup = backup; -- 2.35.1
Re: [Intel-gfx] [PATCH] drm/i915/ttm: Abort suspend on i915_ttm_backup failure
On 8/31/2022 5:50 PM, Matthew Auld wrote: On 29/08/2022 13:04, Nirmoy Das wrote: On system suspend when system memory is low then i915_gem_obj_copy_ttm() could fail trying to backup a lmem obj. GEM_WARN_ON() is not enough, suspend shouldn't continue if i915_ttm_backup() throws an error. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6529 Does this fix it? Does CI not complain about the drm_err? Also do we know what the actual error was? The error isn't reoccurring so the best guess is: large framebuffer copy took long time and wait_for_suspend() timed out. This needs more coverage from IGT and I am looking into that. Let's ignore the "Closes" tag from this patch till I come up a IGT test for this. Nirmoy Suggested-by: Chris P Wilson Signed-off-by: Nirmoy Das Passing the error along seems reasonable to me, Reviewed-by: Matthew Auld --- drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c index 9aad84059d56..6f5d5c0909b4 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c @@ -79,7 +79,12 @@ static int i915_ttm_backup(struct i915_gem_apply_to_region *apply, goto out_no_populate; err = i915_gem_obj_copy_ttm(backup, obj, pm_apply->allow_gpu, false); - GEM_WARN_ON(err); + if (err) { + drm_err(&i915->drm, + "Unable to copy from device to system memory, err:%d\n", + err); + goto out_no_populate; + } ttm_bo_wait_ctx(backup_bo, &ctx); obj->ttm.backup = backup;
Re: [Intel-gfx] [PATCH] drm/i915/ttm: Abort suspend on i915_ttm_backup failure
On 29/08/2022 13:04, Nirmoy Das wrote: On system suspend when system memory is low then i915_gem_obj_copy_ttm() could fail trying to backup a lmem obj. GEM_WARN_ON() is not enough, suspend shouldn't continue if i915_ttm_backup() throws an error. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6529 Does this fix it? Does CI not complain about the drm_err? Also do we know what the actual error was? Suggested-by: Chris P Wilson Signed-off-by: Nirmoy Das Passing the error along seems reasonable to me, Reviewed-by: Matthew Auld --- drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c index 9aad84059d56..6f5d5c0909b4 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c @@ -79,7 +79,12 @@ static int i915_ttm_backup(struct i915_gem_apply_to_region *apply, goto out_no_populate; err = i915_gem_obj_copy_ttm(backup, obj, pm_apply->allow_gpu, false); - GEM_WARN_ON(err); + if (err) { + drm_err(&i915->drm, + "Unable to copy from device to system memory, err:%d\n", + err); + goto out_no_populate; + } ttm_bo_wait_ctx(backup_bo, &ctx); obj->ttm.backup = backup;
[Intel-gfx] [PATCH] drm/i915/ttm: Abort suspend on i915_ttm_backup failure
On system suspend when system memory is low then i915_gem_obj_copy_ttm() could fail trying to backup a lmem obj. GEM_WARN_ON() is not enough, suspend shouldn't continue if i915_ttm_backup() throws an error. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6529 Suggested-by: Chris P Wilson Signed-off-by: Nirmoy Das --- drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c index 9aad84059d56..6f5d5c0909b4 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c @@ -79,7 +79,12 @@ static int i915_ttm_backup(struct i915_gem_apply_to_region *apply, goto out_no_populate; err = i915_gem_obj_copy_ttm(backup, obj, pm_apply->allow_gpu, false); - GEM_WARN_ON(err); + if (err) { + drm_err(&i915->drm, + "Unable to copy from device to system memory, err:%d\n", + err); + goto out_no_populate; + } ttm_bo_wait_ctx(backup_bo, &ctx); obj->ttm.backup = backup; -- 2.35.1