I made a patch as attached.
I'm not sure how to make patch as your proposal, Could you make a patch
for that if mine isn't enough?
-David
On 2019年04月24日 15:12, Christian König wrote:
how about just adding a wrapper for pin function as below?
I considered this as well and don't think it will work reliable.
We could use it as a band aid for this specific problem, but in
general we need to improve the handling in TTM to resolve those kind
of resource conflicts.
Regards,
Christian.
Am 23.04.19 um 17:09 schrieb Zhou, David(ChunMing):
>3. If we have a ticket we grab a reference to the first BO on the
LRU, drop the LRU lock and try to grab the reservation lock with the
ticket.
The BO on LRU is already locked by cs user, can it be dropped here by
DC user? and then DC user grab its lock with ticket, how does CS grab
it again?
If you think waiting in ttm has this risk, how about just adding a
wrapper for pin function as below?
amdgpu_get_pin_bo_timeout()
{
do {
amdgpo_bo_reserve();
r=amdgpu_bo_pin();
if(!r)
break;
amdgpu_bo_unreserve();
timeout--;
} while(timeout>0);
}
-------- Original Message --------
Subject: Re: [PATCH] ttm: wait mem space if user allow while gpu busy
From: Christian König
To: "Zhou, David(ChunMing)" ,"Koenig, Christian" ,"Liang, Prike"
,dri-devel@lists.freedesktop.org
CC:
Well that's not so easy of hand.
The basic problem here is that when you busy wait at this place you
can easily run into situations where application A busy waits for B
while B busy waits for A -> deadlock.
So what we need here is the deadlock detection logic of the ww_mutex.
To use this we at least need to do the following steps:
1. Reserve the BO in DC using a ww_mutex ticket (trivial).
2. If we then run into this EBUSY condition in TTM check if the BO we
need memory for (or rather the ww_mutex of its reservation object)
has a ticket assigned.
3. If we have a ticket we grab a reference to the first BO on the
LRU, drop the LRU lock and try to grab the reservation lock with the
ticket.
4. If getting the reservation lock with the ticket succeeded we check
if the BO is still the first one on the LRU in question (the BO could
have moved).
5. If the BO is still the first one on the LRU in question we try to
evict it as we would evict any other BO.
6. If any of the "If's" above fail we just back off and return -EBUSY.
Steps 2-5 are certainly not trivial, but doable as far as I can see.
Have fun :)
Christian.
Am 23.04.19 um 15:19 schrieb Zhou, David(ChunMing):
How about adding more condition ctx->resv inline to address your
concern? As well as don't wait from same user, shouldn't lead to
deadlock.
Otherwise, any other idea?
-------- Original Message --------
Subject: Re: [PATCH] ttm: wait mem space if user allow while gpu busy
From: Christian König
To: "Liang, Prike" ,"Zhou, David(ChunMing)"
,dri-devel@lists.freedesktop.org
CC:
Well that is certainly a NAK because it can lead to deadlock in the
memory management.
You can't just busy wait with all those locks held.
Regards,
Christian.
Am 23.04.19 um 03:45 schrieb Liang, Prike:
> Acked-by: Prike Liang <prike.li...@amd.com>
>
> Thanks,
> Prike
> -----Original Message-----
> From: Chunming Zhou <david1.z...@amd.com>
> Sent: Monday, April 22, 2019 6:39 PM
> To: dri-devel@lists.freedesktop.org
> Cc: Liang, Prike <prike.li...@amd.com>; Zhou, David(ChunMing)
<david1.z...@amd.com>
> Subject: [PATCH] ttm: wait mem space if user allow while gpu busy
>
> heavy gpu job could occupy memory long time, which could lead to
other user fail to get memory.
>
> Change-Id: I0b322d98cd76e5ac32b00462bbae8008d76c5e11
> Signed-off-by: Chunming Zhou <david1.z...@amd.com>
> ---
> drivers/gpu/drm/ttm/ttm_bo.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
b/drivers/gpu/drm/ttm/ttm_bo.c index 7c484729f9b2..6c596cc24bec 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -830,8 +830,10 @@ static int ttm_bo_mem_force_space(struct
ttm_buffer_object *bo,
> if (mem->mm_node)
> break;
> ret = ttm_mem_evict_first(bdev, mem_type, place, ctx);
> - if (unlikely(ret != 0))
> - return ret;
> + if (unlikely(ret != 0)) {
> + if (!ctx || ctx->no_wait_gpu || ret != -EBUSY)
> + return ret;
> + }
> } while (1);
> mem->mem_type = mem_type;
> return ttm_bo_add_move_fence(bo, man, mem);
> --
> 2.17.1
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
>From 184941165665d80ad1992179e7652d7e4d739b2e Mon Sep 17 00:00:00 2001
From: Chunming Zhou <david1.z...@amd.com>
Date: Wed, 24 Apr 2019 11:35:29 +0800
Subject: [PATCH] ttm: return EBUSY to user
EBUSY means there is no idle BO on LRU which can be evicted, the error
is meaningful to user.
Change-Id: I3857c4b99859c9d2f354cf2c486dbacb8520d0ed
Signed-off-by: Chunming Zhou <david1.z...@amd.com>
---
drivers/gpu/drm/ttm/ttm_bo.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 7c484729f9b2..dce90053f29e 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -1000,7 +1000,7 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
return -EINVAL;
}
- return (has_erestartsys) ? -ERESTARTSYS : -ENOMEM;
+ return ret == -EBUSY ? ret : (has_erestartsys) ? -ERESTARTSYS : -ENOMEM;
}
EXPORT_SYMBOL(ttm_bo_mem_space);
--
2.17.1
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel