The order of job->base.s_fence and preventing concurrent CS was guaranteed by
the VM PD being reserved, that is now no longer the case and we need a new lock
for protection.
I thought of this as well, but the UMD/app submits its jobs from one thread per context (if it submits jobs from multiple threads on one context, the submission order is clearly not important to the UMD/app). So even after we unreserve the PD's resv lock,
we still only have one thread doing the cs_ioctl on the *current ctx*.
So why doesn't that work now? Thanks!
BR Monk
-----Original Message-----
From: Christian König [mailto:[email protected]]
Sent: 11 September 2017 17:13
To: Liu, Monk <[email protected]>; [email protected]
Subject: Re: [PATCH] drm/amdgpu:resolv deadlock between reset and cs_ioctl
Am 11.09.2017 um 11:02 schrieb Monk Liu:
need to unreserve ttm bo before "cs_add_fence" and "entity_push_job"
otherwise there will be deadlock between "recover_vram_from_shadow"
and previous two routines on the ttm bo's resv lock.
Change-Id: I9c18b677e474ce5b45a9cc24da3fa255d77b3d44
Signed-off-by: Monk Liu <[email protected]>
Yeah, I was thinking about this as well. Problem is this won't work without
further changes.
---
drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 35 ++++++++++++++++++----------------
1 file changed, 19 insertions(+), 16 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index c1f1ae9..823ac12 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -736,26 +736,12 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
/**
* cs_parser_fini() - clean parser states
* @parser: parser structure holding parsing context.
- * @error: error number
- *
- * If error is set than unvalidate buffer, otherwise just free memory
- * used by parsing context.
**/
-static void amdgpu_cs_parser_fini(struct amdgpu_cs_parser *parser, int error, bool backoff)
+static void amdgpu_cs_parser_fini(struct amdgpu_cs_parser *parser)
{
struct amdgpu_fpriv *fpriv = parser->filp->driver_priv;
unsigned i;
- if (!error) {
- amdgpu_vm_move_pt_bos_in_lru(parser->adev, &fpriv->vm);
-
- ttm_eu_fence_buffer_objects(&parser->ticket,
- &parser->validated,
- parser->fence);
- } else if (backoff) {
- ttm_eu_backoff_reservation(&parser->ticket,
- &parser->validated);
- }
fence_put(parser->fence);
if (parser->ctx)
@@ -1075,6 +1061,16 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
job->owner = p->filp;
job->fence_ctx = entity->fence_context;
p->fence = fence_get(&job->base.s_fence->finished);
+
+ /* hook sched fence to all BOs' reservation in validated list
+ * and unreserve them.
+ *
+ * we unreserve at here is because otherwise
+ * there'll be deadlock between ctx_add_fence/sched_entity_push_job
+ * and gpu_reset routine's recover_bo_from_shadow on PD/PTEs' ttm bo lock
+ */
+ ttm_eu_fence_buffer_objects(&p->ticket, &p->validated, p->fence);
The order of job->base.s_fence and preventing concurrent CS was guaranteed by
the VM PD being reserved, that is now no longer the case and we need a new lock
for protection.
I suggest to add another lock to the context which we grab in
amdgpu_ctx_get() and release in amdgpu_ctx_put().
Regards,
Christian.
+
r = amdgpu_ctx_add_fence(p->ctx, ring, p->fence, &seq);
if (r) {
fence_put(p->fence);
@@ -1095,6 +1091,7 @@ int amdgpu_cs_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
{
struct amdgpu_device *adev = dev->dev_private;
union drm_amdgpu_cs *cs = data;
+ struct amdgpu_fpriv *fpriv;
struct amdgpu_cs_parser parser = {};
bool reserved_buffers = false;
int i, r;
@@ -1104,6 +1101,7 @@ int amdgpu_cs_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
parser.adev = adev;
parser.filp = filp;
+ fpriv = filp->driver_priv;
r = amdgpu_cs_parser_init(&parser, data);
if (r) {
@@ -1141,7 +1139,12 @@ int amdgpu_cs_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
r = amdgpu_cs_submit(&parser, cs);
out:
- amdgpu_cs_parser_fini(&parser, r, reserved_buffers);
+ if (!r)
+ amdgpu_vm_move_pt_bos_in_lru(parser.adev, &fpriv->vm);
+ else if (reserved_buffers)
+ ttm_eu_backoff_reservation(&parser.ticket, &parser.validated);
+
+ amdgpu_cs_parser_fini(&parser);
return r;
}
_______________________________________________
amd-gfx mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/amd-gfx