RE: [PATCH] drm/scheduler: Partially revert "drm/scheduler: track GPU active time per entity"
[AMD Official Use Only - General]

Can we just add a kref for the entity? Or collect such job time usage somewhere else?

-----Original Message-----
From: Pan, Xinhui
Sent: Thursday, August 17, 2023 1:05 PM
To: amd-...@lists.freedesktop.org
Cc: Tuikov, Luben; airl...@gmail.com; dri-devel@lists.freedesktop.org; l.st...@pengutronix.de; Koenig, Christian; Pan, Xinhui
Subject: [PATCH] drm/scheduler: Partially revert "drm/scheduler: track GPU active time per entity"

This patch partially reverts commit df622729ddbf ("drm/scheduler: track GPU
active time per entity"), which touches the entity without holding any
reference. I noticed memory being overwritten from the GPU scheduler side.
The case looks like this:

A (drm_sched_main)            B (vm fini)
drm_sched_job_begin
                              drm_sched_entity_kill
                                // job in pending_list
                                wait_for_completion
complete_all
...                           ...
                              kfree entity
drm_sched_get_cleanup_job
  // fetch job from pending_list
access job->entity
  // memory overwritten

As long as we can NOT guarantee the entity is alive in this case, let's
revert it for now.

Signed-off-by: xinhui pan
---
 drivers/gpu/drm/scheduler/sched_main.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 602361c690c9..1b3f1a6a8514 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -907,12 +907,6 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
 	spin_unlock(&sched->job_list_lock);

-	if (job) {
-		job->entity->elapsed_ns += ktime_to_ns(
-			ktime_sub(job->s_fence->finished.timestamp,
-				  job->s_fence->scheduled.timestamp));
-	}
-
 	return job;
 }
--
2.34.1
Re: Re: [PATCH v4] drm: Optimise for continuous memory allocation
[AMD Official Use Only - General]

comments inline.

From: Koenig, Christian
Sent: November 29, 2022 20:07
To: Pan, Xinhui; amd-...@lists.freedesktop.org
Cc: dan...@ffwll.ch; matthew.a...@intel.com; dri-devel@lists.freedesktop.org; linux-ker...@vger.kernel.org; Paneer Selvam, Arunpravin; intel-...@lists.freedesktop.org
Subject: Re: Re: [PATCH v4] drm: Optimise for continuous memory allocation

Am 29.11.22 um 12:54 schrieb Pan, Xinhui:
> [AMD Official Use Only - General]
>
> comments inline.
>
> From: Koenig, Christian
> Sent: November 29, 2022 19:32
> To: Pan, Xinhui; amd-...@lists.freedesktop.org
> Cc: dan...@ffwll.ch; matthew.a...@intel.com; dri-devel@lists.freedesktop.org;
> linux-ker...@vger.kernel.org; Paneer Selvam, Arunpravin;
> intel-...@lists.freedesktop.org
> Subject: Re: [PATCH v4] drm: Optimise for continuous memory allocation
>
> Am 29.11.22 um 11:56 schrieb xinhui pan:
>> Currently drm-buddy does not have full knowledge of continuous memory.
>>
>> Let's consider the scenario below.
>> order 1:  L           R
>> order 0:  LL    LR    RL    RR
>> For an order 1 allocation, it can offer L or R or LR+RL.
>>
>> For now, we only implement the L or R case for continuous memory allocation.
>> So this patch aims to implement the rest of the cases.
>>
>> Adding a new member leaf_link which links all leaf blocks in ascending
>> order. Now we can find more than 2 sub-order blocks more easily.
>> Say, order 4 can be combined with corresponding order 4, 2+2, 1+2+1,
>> 0+1+2+0, 0+2+1+0.
>
> Well, that description is a bit confusing and doesn't make too much sense
> to me.
>
> When you have two adjacent free order 0 blocks then those should be
> automatically combined into an order 1. This is a fundamental property
> of the buddy allocator, otherwise the whole algorithm won't work.
>
> [xh] Sorry, the order above is not 4, it should be 3.
> Order 3 can be combined with corresponding order 3, 2+2, 1+2+1, 0+1+2+0,
> 0+2+1+0.
> The 0+1+2+0 case does not have two adjacent blocks of the same order 0; they
> are in different trees.
> It looks like below:
>
> order 3:  L3                      R3
> order 2:  L2    (R2)*             L2*
> order 1:  L1    (R1)    L1
> order 0:  L0    (R0)    (L0)
>
> R0 + R1 + R2 + L0, with () around them, combine to be order 3.
> R2 + L2, with * following them, combine to be order 3.
> etc.
>
> When you have the case of a free order 1 block with two adjacent free
> order 0 blocks then we have a fragmented address space. In this case the best
> approach is to fail the allocation and start to swap things out.
>
> [xh] Eviction is expensive.

No, it isn't. Eviction is part of the algorithm to clean this up.

When we can't find any free room then evicting and moving things back in
is the best we can do to de-fragment the address space. This is expected
behavior.

[xh] I believe eviction is the best approach to clean up memory. But as its
cost is not cheap, it should be the final step. As long as we can find any
room to satisfy the request, there is no need to trigger eviction.

Just a test in theory: two threads run in parallel, and the total memory is 128.

while true {
    alloc 32
    alloc 32
    free 32
    free 32
    alloc 64
    free 64
}

When thread 0 wants to alloc 64, the memory layout might be as below,
where (32) means allocated and _32_ means free.

case 1: (32) _32_ _32_ (32)
case 2: (32) _32_ (32) _32_
case 3: (32) (32) _64_
case 4: (32) _32_ _64_
case 5: _128_
case 6: (64) _64_

Without this patch, it would trigger eviction in case 1 and case 2.
With this patch, it would trigger eviction only in case 2.
Obviously, the two threads consume at most 128 memory in total at any time,
with no overcommit. The less eviction, the better.

Regards,
Christian.

> And if it still fails to find the continuous memory with this approach, then
> let's evict.
>
> So what exactly is the goal here?
>
> Regards,
> Christian.
>
>> Signed-off-by: xinhui pan
>> ---
>> change from v3:
>> reworked totally. adding leaf_link.
>>
>> change from v2:
>> search continuous block in nearby root if needed
>>
>> change from v1:
>> implement top-down continuous allocation
>> ---
>>  drivers/gpu/drm/drm_buddy.c | 108 +---
>>  include/drm/drm_buddy.h     |   1 +
>>  2 files changed, 102 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
>> index 11bb59399471..8edafb99b02c 100644
>> --- a/drivers/gpu/drm/drm_buddy.c
>> +++ b/d
Re: [PATCH v4] drm: Optimise for continuous memory allocation
[AMD Official Use Only - General]

comments inline.

From: Koenig, Christian
Sent: November 29, 2022 19:32
To: Pan, Xinhui; amd-...@lists.freedesktop.org
Cc: dan...@ffwll.ch; matthew.a...@intel.com; dri-devel@lists.freedesktop.org; linux-ker...@vger.kernel.org; Paneer Selvam, Arunpravin; intel-...@lists.freedesktop.org
Subject: Re: [PATCH v4] drm: Optimise for continuous memory allocation

Am 29.11.22 um 11:56 schrieb xinhui pan:
> Currently drm-buddy does not have full knowledge of continuous memory.
>
> Let's consider the scenario below.
> order 1:  L           R
> order 0:  LL    LR    RL    RR
> For an order 1 allocation, it can offer L or R or LR+RL.
>
> For now, we only implement the L or R case for continuous memory allocation.
> So this patch aims to implement the rest of the cases.
>
> Adding a new member leaf_link which links all leaf blocks in ascending
> order. Now we can find more than 2 sub-order blocks more easily.
> Say, order 4 can be combined with corresponding order 4, 2+2, 1+2+1,
> 0+1+2+0, 0+2+1+0.

Well, that description is a bit confusing and doesn't make too much sense
to me.

When you have two adjacent free order 0 blocks then those should be
automatically combined into an order 1. This is a fundamental property
of the buddy allocator, otherwise the whole algorithm won't work.

[xh] Sorry, the order above is not 4, it should be 3.
Order 3 can be combined with corresponding order 3, 2+2, 1+2+1, 0+1+2+0, 0+2+1+0.
The 0+1+2+0 case does not have two adjacent blocks of the same order 0; they
are in different trees. It looks like below:

order 3:  L3                      R3
order 2:  L2    (R2)*             L2*
order 1:  L1    (R1)    L1
order 0:  L0    (R0)    (L0)

R0 + R1 + R2 + L0, with () around them, combine to be order 3.
R2 + L2, with * following them, combine to be order 3.
etc.

When you have the case of a free order 1 block with two adjacent free
order 0 blocks then we have a fragmented address space. In this case the best
approach is to fail the allocation and start to swap things out.

[xh] Eviction is expensive.
And if it still fails to find the continuous memory with this approach, then
let's evict.

So what exactly is the goal here?

Regards,
Christian.

>
> Signed-off-by: xinhui pan
> ---
> change from v3:
> reworked totally. adding leaf_link.
>
> change from v2:
> search continuous block in nearby root if needed
>
> change from v1:
> implement top-down continuous allocation
> ---
>  drivers/gpu/drm/drm_buddy.c | 108 +---
>  include/drm/drm_buddy.h     |   1 +
>  2 files changed, 102 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
> index 11bb59399471..8edafb99b02c 100644
> --- a/drivers/gpu/drm/drm_buddy.c
> +++ b/drivers/gpu/drm/drm_buddy.c
> @@ -80,6 +80,7 @@ int drm_buddy_init(struct drm_buddy *mm, u64 size, u64 chunk_size)
>  {
>  	unsigned int i;
>  	u64 offset;
> +	LIST_HEAD(leaf);
>
>  	if (size < chunk_size)
>  		return -EINVAL;
> @@ -136,6 +137,7 @@ int drm_buddy_init(struct drm_buddy *mm, u64 size, u64 chunk_size)
>  		goto out_free_roots;
>
>  	mark_free(mm, root);
> +	list_add_tail(&root->leaf_link, &leaf);
>
>  	BUG_ON(i > mm->max_order);
>  	BUG_ON(drm_buddy_block_size(mm, root) < chunk_size);
> @@ -147,6 +149,7 @@ int drm_buddy_init(struct drm_buddy *mm, u64 size, u64 chunk_size)
>  		i++;
>  	} while (size);
>
> +	list_del(&leaf);
>  	return 0;
>
>  out_free_roots:
> @@ -205,6 +208,9 @@ static int split_block(struct drm_buddy *mm,
>  	mark_free(mm, block->left);
>  	mark_free(mm, block->right);
>
> +	list_add(&block->right->leaf_link, &block->leaf_link);
> +	list_add(&block->left->leaf_link, &block->leaf_link);
> +	list_del(&block->leaf_link);
>  	mark_split(block);
>
>  	return 0;
> @@ -256,6 +262,9 @@ static void __drm_buddy_free(struct drm_buddy *mm,
>  			break;
>
>  		list_del(&buddy->link);
> +		list_add(&parent->leaf_link, &block->leaf_link);
> +		list_del(&block->leaf_link);
> +		list_del(&buddy->leaf_link);
>
>  		drm_block_free(mm, block);
>  		drm_block_free(mm, buddy);
> @@ -386,6 +395,78 @@ alloc_range_bias(struct drm_buddy *mm,
>  	return ERR_PTR(err);
>  }
>
> +static struct drm_buddy_block *
> +find_continuous_blocks(struct
drm_buddy *mm,
> +		       int order,
> +
Re: [PATCH v4] drm: Optimise for continuous memory allocation
[AMD Official Use Only - General]

In one ROCm + gdm restart test, find_continuous_blocks() succeeds at a ratio
of 35%. The code coverage report is below.

  772   3998 :		if (order-- == min_order) {
  773    352 :			if (!(flags & DRM_BUDDY_RANGE_ALLOCATION) &&
  774    352 :			    min_order != 0 && pages == BIT(order + 1)) {
  775     79 :				block = find_continuous_blocks(mm,
  776        :							       order,
  777        :							       flags,
  778        :							       );
  779     79 :				if (block)
  780        :					break;
  781        :			}
  782    300 :			err = -ENOSPC;
  783    300 :			goto err_free;

thanks
xinhui

____________
From: Pan, Xinhui
Sent: November 29, 2022 18:56
To: amd-...@lists.freedesktop.org
Cc: dan...@ffwll.ch; matthew.a...@intel.com; Koenig, Christian; dri-devel@lists.freedesktop.org; linux-ker...@vger.kernel.org; Paneer Selvam, Arunpravin; intel-...@lists.freedesktop.org; Pan, Xinhui
Subject: [PATCH v4] drm: Optimise for continuous memory allocation

Currently drm-buddy does not have full knowledge of continuous memory.

Let's consider the scenario below.
order 1:  L           R
order 0:  LL    LR    RL    RR
For an order 1 allocation, it can offer L or R or LR+RL.

For now, we only implement the L or R case for continuous memory allocation.
So this patch aims to implement the rest of the cases.

Adding a new member leaf_link which links all leaf blocks in ascending
order. Now we can find more than 2 sub-order blocks more easily.
Say, order 4 can be combined with corresponding order 4, 2+2, 1+2+1,
0+1+2+0, 0+2+1+0.

Signed-off-by: xinhui pan
---
change from v3:
reworked totally. adding leaf_link.
change from v2:
search continuous block in nearby root if needed

change from v1:
implement top-down continuous allocation
---
 drivers/gpu/drm/drm_buddy.c | 108 +---
 include/drm/drm_buddy.h     |   1 +
 2 files changed, 102 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 11bb59399471..8edafb99b02c 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -80,6 +80,7 @@ int drm_buddy_init(struct drm_buddy *mm, u64 size, u64 chunk_size)
 {
 	unsigned int i;
 	u64 offset;
+	LIST_HEAD(leaf);

 	if (size < chunk_size)
 		return -EINVAL;
@@ -136,6 +137,7 @@ int drm_buddy_init(struct drm_buddy *mm, u64 size, u64 chunk_size)
 		goto out_free_roots;

 	mark_free(mm, root);
+	list_add_tail(&root->leaf_link, &leaf);

 	BUG_ON(i > mm->max_order);
 	BUG_ON(drm_buddy_block_size(mm, root) < chunk_size);
@@ -147,6 +149,7 @@ int drm_buddy_init(struct drm_buddy *mm, u64 size, u64 chunk_size)
 		i++;
 	} while (size);

+	list_del(&leaf);
 	return 0;

 out_free_roots:
@@ -205,6 +208,9 @@ static int split_block(struct drm_buddy *mm,
 	mark_free(mm, block->left);
 	mark_free(mm, block->right);

+	list_add(&block->right->leaf_link, &block->leaf_link);
+	list_add(&block->left->leaf_link, &block->leaf_link);
+	list_del(&block->leaf_link);
 	mark_split(block);

 	return 0;
@@ -256,6 +262,9 @@ static void __drm_buddy_free(struct drm_buddy *mm,
 			break;

 		list_del(&buddy->link);
+		list_add(&parent->leaf_link, &block->leaf_link);
+		list_del(&block->leaf_link);
+		list_del(&buddy->leaf_link);

 		drm_block_free(mm, block);
 		drm_block_free(mm, buddy);
@@ -386,6 +395,78 @@ alloc_range_bias(struct drm_buddy *mm,
 	return ERR_PTR(err);
 }

+static struct drm_buddy_block *
+find_continuous_blocks(struct drm_buddy *mm,
+		       int order,
+		       unsigned long flags,
+		       struct drm_buddy_block **rblock)
+{
+	struct list_head *head = &mm->free_list[order];
+	struct drm_buddy_block *free_block, *max_block = NULL, *end, *begin;
+	u64 pages = BIT(order + 1);
+	u64 cur_pages;
+
+	list_for_each_entry(free_block, head, link) {
+		if (max_block) {
+			if (!(flags & DRM_BUDDY_TOPDOWN_ALLOCATION))
+				break;
+
+			if (drm_buddy_block_offset(free_block) <
Re: [PATCH v3] drm: Optimise for continuous memory allocation
[AMD Official Use Only - General]

Hi Arun,
Thanks for your reply. Comments are inline.

From: Paneer Selvam, Arunpravin
Sent: November 29, 2022 1:09
To: Pan, Xinhui; amd-...@lists.freedesktop.org
Cc: linux-ker...@vger.kernel.org; dri-devel@lists.freedesktop.org; matthew.a...@intel.com; dan...@ffwll.ch; Koenig, Christian
Subject: Re: [PATCH v3] drm: Optimise for continuous memory allocation

Hi Xinhui,

On 11/28/2022 12:04 PM, xinhui pan wrote:
> Currently drm-buddy does not have full knowledge of continuous memory.
>
> Let's consider the scenario below.
> order 1:  L           R
> order 0:  LL    LR    RL    RR
> For an order 1 allocation, it can offer L or R or LR+RL.
>
> For now, we only implement the L or R case for continuous memory allocation.
> So this patch aims to implement the LR+RL case.
>
> Signed-off-by: xinhui pan
> ---
> change from v2:
> search continuous block in nearby root if needed
>
> change from v1:
> implement top-down continuous allocation
> ---
>  drivers/gpu/drm/drm_buddy.c | 78 +
>  1 file changed, 71 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
> index 11bb59399471..ff58eb3136d2 100644
> --- a/drivers/gpu/drm/drm_buddy.c
> +++ b/drivers/gpu/drm/drm_buddy.c
> @@ -386,6 +386,58 @@ alloc_range_bias(struct drm_buddy *mm,
>  	return ERR_PTR(err);
>  }
>
> +static struct drm_buddy_block *
> +find_continuous_blocks(struct drm_buddy *mm,
> +		       int order,
> +		       unsigned long flags,
> +		       struct drm_buddy_block **rn)
> +{
> +	struct list_head *head = &mm->free_list[order];
> +	struct drm_buddy_block *node, *parent, *free_node, *max_node = NULL;

NIT: We usually name the variable as *block or ***_block for drm buddy, and
we have *node or ***_node for the drm mm manager.

[xh] Oh, yes. The code naming is important. Will fix it.
> +	int i;
> +
> +	list_for_each_entry(free_node, head, link) {
> +		if (max_node) {
> +			if (!(flags & DRM_BUDDY_TOPDOWN_ALLOCATION))
> +				break;
> +
> +			if (drm_buddy_block_offset(free_node) <
> +			    drm_buddy_block_offset(max_node))
> +				continue;
> +		}
> +
> +		parent = free_node;
> +		do {
> +			node = parent;
> +			parent = parent->parent;
> +		} while (parent && parent->right == node);
> +
> +		if (!parent) {
> +			for (i = 0; i < mm->n_roots - 1; i++)
> +				if (mm->roots[i] == node)
> +					break;
> +			if (i == mm->n_roots - 1)
> +				continue;
> +			node = mm->roots[i + 1];
> +		} else {
> +			node = parent->right;
> +		}
> +
> +		while (drm_buddy_block_is_split(node))
> +			node = node->left;
> +
> +		if (drm_buddy_block_is_free(node) &&
> +		    drm_buddy_block_order(node) == order) {
> +			*rn = node;
> +			max_node = free_node;
> +			BUG_ON(drm_buddy_block_offset(node) !=
> +			       drm_buddy_block_offset(max_node) +
> +			       drm_buddy_block_size(mm, max_node));
> +		}
> +	}
> +	return max_node;
> +}
> +
>  static struct drm_buddy_block *
>  get_maxblock(struct list_head *head)
>  {
> @@ -637,7 +689,7 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
>  			   struct list_head *blocks,
>  			   unsigned long flags)
>  {
> -	struct drm_buddy_block *block = NULL;
> +	struct drm_buddy_block *block = NULL, *rblock = NULL;
>  	unsigned int min_order, order;
>  	unsigned long pages;
>  	LIST_HEAD(allocated);
> @@ -689,17 +741,29 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
>  			break;
>
>  		if (order-- == min_order) {
> +			if (!(flags & DRM_BUDDY_RANGE_ALLOCATION) &&
> +			    min_order != 0 && pages == BIT(order + 1)) {
> +				block = find_continuous_blocks(mm,
> +							       order,
> +							       flags,
> +
RE: [PATCH] drm/ttm: fix ttm_bo_swapout
[AMD Official Use Only]

Looks good to me.

Thanks

-----Original Message-----
From: Christian König
Sent: Thursday, December 2, 2021 6:38 PM
To: Pan, Xinhui; dri-devel@lists.freedesktop.org
Subject: [PATCH] drm/ttm: fix ttm_bo_swapout

Commit 7120a447c7fe ("drm/ttm: Double check mem_type of BO while eviction")
made the ttm_bo_evict_swapout_allowable() function actually check the
placement, but we always used a dummy placement in ttm_bo_swapout.

Fix this by using the real placement instead.

Signed-off-by: Christian König
Fixes: 7120a447c7fe ("drm/ttm: Double check mem_type of BO while eviction")
---
 drivers/gpu/drm/ttm/ttm_bo.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 739f11c0109c..047adc42d9a0 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -1103,7 +1103,7 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx,
 	 * as an indication that we're about to swap out.
 	 */
 	memset(&place, 0, sizeof(place));
-	place.mem_type = TTM_PL_SYSTEM;
+	place.mem_type = bo->resource->mem_type;
 	if (!ttm_bo_evict_swapout_allowable(bo, ctx, &place, &locked, NULL))
 		return -EBUSY;
@@ -1135,6 +1135,7 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx,
 		struct ttm_place hop;

 		memset(&hop, 0, sizeof(hop));
+		place.mem_type = TTM_PL_SYSTEM;
 		ret = ttm_resource_alloc(bo, &place, &evict_mem);
 		if (unlikely(ret))
 			goto out;
--
2.25.1
Re: Re: Re: [PATCH] drm/ttm: Put BO in its memory manager's lru list
[AMD Official Use Only]

Yes, a double check is needed. How about the change below?
As long as we detect such a mismatch, it indicates another eviction is
ongoing. Returning false here is reasonable.

@@ -1335,6 +1336,8 @@ static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
 	struct dma_fence *f;
 	int i;

+	if (bo->resource->mem_type != place->mem_type)
+		return false;
 	/* Swapout? */
 	if (bo->resource->mem_type == TTM_PL_SYSTEM)
 		return true;

From: Koenig, Christian
Sent: November 9, 2021 21:18
To: Pan, Xinhui; amd-...@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Subject: Re: Re: Re: [PATCH] drm/ttm: Put BO in its memory manager's lru list

Exactly, that's the reason why we should have the double check in TTM I've
mentioned in the other mail.

Christian.

Am 09.11.21 um 14:16 schrieb Pan, Xinhui:
> [AMD Official Use Only]
>
> Actually this patch does not totally fix the mismatch of the lru list with
> mem_type, as mem_type is changed in ->move() and the lru list is changed
> after that.
>
> During this small period, another eviction could still happen and evict this
> mismatched BO (say, its lru list is on the vram domain while its mem_type is
> system) from sysMem to sysMem.
>
> From: Pan, Xinhui
> Sent: November 9, 2021 21:05
> To: Koenig, Christian; amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Subject: Re: Re: [PATCH] drm/ttm: Put BO in its memory manager's lru list
>
> Yes, a stable tag is needed. Vulkan guys say 5.14 hit this issue too.
>
> I think that amdgpu_bo_move() does not support copy from sysMem to sysMem
> correctly.
> Maybe something like below is needed.
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index c83ef42ca702..aa63ae7ddf1e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -485,7 +485,8 @@ static int amdgpu_bo_move(struct ttm_buffer_object *bo, bool evict,
>  	}
>  	if (old_mem->mem_type == TTM_PL_SYSTEM &&
>  	    (new_mem->mem_type == TTM_PL_TT ||
> -	     new_mem->mem_type == AMDGPU_PL_PREEMPT)) {
> +	     new_mem->mem_type == AMDGPU_PL_PREEMPT ||
> +	     new_mem->mem_type == TTM_PL_SYSTEM)) {
>  		ttm_bo_move_null(bo, new_mem);
>  		goto out;
>  	}
>
> Otherwise, amdgpu_move_blit() is called to do the system memory copy, which
> uses a wrong address.
>
> 206	/* Map only what can't be accessed directly */
> 207	if (!tmz && mem->start != AMDGPU_BO_INVALID_OFFSET) {
> 208		*addr = amdgpu_ttm_domain_start(adev, mem->mem_type) +
> 209			mm_cur->start;
> 210		return 0;
> 211	}
>
> At line 208, *addr is zero. So when amdgpu_copy_buffer submits a job with
> such an addr, a page fault happens.
>
> From: Koenig, Christian
> Sent: November 9, 2021 20:35
> To: Pan, Xinhui; amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Subject: Re: Re: [PATCH] drm/ttm: Put BO in its memory manager's lru list
>
> Mhm, I'm not sure what the rationale behind that is.
>
> Not moving the BO would make things less efficient, but should never
> cause a crash.
>
> Maybe we should add a CC: stable tag and push it to -fixes instead?
>
> Christian.
>
> Am 09.11.21 um 13:28 schrieb Pan, Xinhui:
>> [AMD Official Use Only]
>>
>> I hit a vulkan cts test hang with navi23.
>>
>> dmesg says gmc page fault with address 0x0, 0x1000, 0x2000.
>> And some debug log also says amdgpu copies one BO from system domain to
>> system domain, which is really weird.
>>
>> From: Koenig, Christian
>> Sent: November 9, 2021 20:20
>> To: Pan, Xinhui; amd-...@lists.freedesktop.org
>> Cc: dri-devel@lists.freedesktop.org
>> Subject: Re: [PATCH] drm/ttm: Put BO in its memory manager's lru list
>>
>> Am 09.11.21 um 12:19 schrieb xinhui pan:
>>> After we move a BO to a new memory region, we should put it on
>>> the new memory manager's lru list regardless of whether we unlock the resv
>>> or not.
>>>
>>> Signed-off-by: xinhui pan
>> Interesting find, did you trigger that somehow or did you just stumble
>> over it by reading the code?
>>
>> Patch is Reviewed-by: Christian König, I will
>> pick that up for drm-misc-next.
>>
>> Thanks,
>> Christian.
>>
>>> ---
>>>  drivers/gpu/drm/ttm/ttm_bo.c | 2 ++
>>>  1 file changed, 2
Re: Re: [PATCH] drm/ttm: Put BO in its memory manager's lru list
[AMD Official Use Only]

Actually this patch does not totally fix the mismatch of the lru list with
mem_type, as mem_type is changed in ->move() and the lru list is changed
after that.

During this small period, another eviction could still happen and evict this
mismatched BO (say, its lru list is on the vram domain while its mem_type is
system) from sysMem to sysMem.

From: Pan, Xinhui
Sent: November 9, 2021 21:05
To: Koenig, Christian; amd-...@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Subject: Re: Re: [PATCH] drm/ttm: Put BO in its memory manager's lru list

Yes, a stable tag is needed. Vulkan guys say 5.14 hit this issue too.

I think that amdgpu_bo_move() does not support copy from sysMem to sysMem
correctly.
Maybe something like below is needed.

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index c83ef42ca702..aa63ae7ddf1e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -485,7 +485,8 @@ static int amdgpu_bo_move(struct ttm_buffer_object *bo, bool evict,
 	}
 	if (old_mem->mem_type == TTM_PL_SYSTEM &&
 	    (new_mem->mem_type == TTM_PL_TT ||
-	     new_mem->mem_type == AMDGPU_PL_PREEMPT)) {
+	     new_mem->mem_type == AMDGPU_PL_PREEMPT ||
+	     new_mem->mem_type == TTM_PL_SYSTEM)) {
 		ttm_bo_move_null(bo, new_mem);
 		goto out;
 	}

Otherwise, amdgpu_move_blit() is called to do the system memory copy, which
uses a wrong address.

206	/* Map only what can't be accessed directly */
207	if (!tmz && mem->start != AMDGPU_BO_INVALID_OFFSET) {
208		*addr = amdgpu_ttm_domain_start(adev, mem->mem_type) +
209			mm_cur->start;
210		return 0;
211	}

At line 208, *addr is zero. So when amdgpu_copy_buffer submits a job with
such an addr, a page fault happens.

From: Koenig, Christian
Sent: November 9, 2021 20:35
To: Pan, Xinhui; amd-...@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Subject: Re: Re: [PATCH] drm/ttm: Put BO in its memory manager's lru list

Mhm, I'm not sure what the rationale behind that is.
Not moving the BO would make things less efficient, but should never
cause a crash.

Maybe we should add a CC: stable tag and push it to -fixes instead?

Christian.

Am 09.11.21 um 13:28 schrieb Pan, Xinhui:
> [AMD Official Use Only]
>
> I hit a vulkan cts test hang with navi23.
>
> dmesg says gmc page fault with address 0x0, 0x1000, 0x2000.
> And some debug log also says amdgpu copies one BO from system domain to
> system domain, which is really weird.
>
> From: Koenig, Christian
> Sent: November 9, 2021 20:20
> To: Pan, Xinhui; amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Subject: Re: [PATCH] drm/ttm: Put BO in its memory manager's lru list
>
> Am 09.11.21 um 12:19 schrieb xinhui pan:
>> After we move a BO to a new memory region, we should put it on
>> the new memory manager's lru list regardless of whether we unlock the resv
>> or not.
>>
>> Signed-off-by: xinhui pan
> Interesting find, did you trigger that somehow or did you just stumble
> over it by reading the code?
>
> Patch is Reviewed-by: Christian König, I will
> pick that up for drm-misc-next.
>
> Thanks,
> Christian.
>
>> ---
>>  drivers/gpu/drm/ttm/ttm_bo.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>> index f1367107925b..e307004f0b28 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>> @@ -701,6 +701,8 @@ int ttm_mem_evict_first(struct ttm_device *bdev,
>>  	ret = ttm_bo_evict(bo, ctx);
>>  	if (locked)
>>  		ttm_bo_unreserve(bo);
>> +	else
>> +		ttm_bo_move_to_lru_tail_unlocked(bo);
>>
>>  	ttm_bo_put(bo);
>>  	return ret;
Re: Re: [PATCH] drm/ttm: Put BO in its memory manager's lru list
[AMD Official Use Only]

Yes, a stable tag is needed. Vulkan guys say 5.14 hit this issue too.

I think that amdgpu_bo_move() does not support copy from sysMem to sysMem
correctly.
Maybe something like below is needed.

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index c83ef42ca702..aa63ae7ddf1e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -485,7 +485,8 @@ static int amdgpu_bo_move(struct ttm_buffer_object *bo, bool evict,
 	}
 	if (old_mem->mem_type == TTM_PL_SYSTEM &&
 	    (new_mem->mem_type == TTM_PL_TT ||
-	     new_mem->mem_type == AMDGPU_PL_PREEMPT)) {
+	     new_mem->mem_type == AMDGPU_PL_PREEMPT ||
+	     new_mem->mem_type == TTM_PL_SYSTEM)) {
 		ttm_bo_move_null(bo, new_mem);
 		goto out;
 	}

Otherwise, amdgpu_move_blit() is called to do the system memory copy, which
uses a wrong address.

206	/* Map only what can't be accessed directly */
207	if (!tmz && mem->start != AMDGPU_BO_INVALID_OFFSET) {
208		*addr = amdgpu_ttm_domain_start(adev, mem->mem_type) +
209			mm_cur->start;
210		return 0;
211	}

At line 208, *addr is zero. So when amdgpu_copy_buffer submits a job with
such an addr, a page fault happens.

From: Koenig, Christian
Sent: November 9, 2021 20:35
To: Pan, Xinhui; amd-...@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Subject: Re: Re: [PATCH] drm/ttm: Put BO in its memory manager's lru list

Mhm, I'm not sure what the rationale behind that is.

Not moving the BO would make things less efficient, but should never
cause a crash.

Maybe we should add a CC: stable tag and push it to -fixes instead?

Christian.

Am 09.11.21 um 13:28 schrieb Pan, Xinhui:
> [AMD Official Use Only]
>
> I hit a vulkan cts test hang with navi23.
>
> dmesg says gmc page fault with address 0x0, 0x1000, 0x2000.
> And some debug log also says amdgpu copies one BO from system domain to
> system domain, which is really weird.
>
> From: Koenig, Christian
> Sent: November 9, 2021 20:20
> To: Pan, Xinhui; amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Subject: Re: [PATCH] drm/ttm: Put BO in its memory manager's lru list
>
> Am 09.11.21 um 12:19 schrieb xinhui pan:
>> After we move a BO to a new memory region, we should put it on
>> the new memory manager's lru list regardless of whether we unlock the resv
>> or not.
>>
>> Signed-off-by: xinhui pan
> Interesting find, did you trigger that somehow or did you just stumble
> over it by reading the code?
>
> Patch is Reviewed-by: Christian König, I will
> pick that up for drm-misc-next.
>
> Thanks,
> Christian.
>
>> ---
>>  drivers/gpu/drm/ttm/ttm_bo.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>> index f1367107925b..e307004f0b28 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>> @@ -701,6 +701,8 @@ int ttm_mem_evict_first(struct ttm_device *bdev,
>>  	ret = ttm_bo_evict(bo, ctx);
>>  	if (locked)
>>  		ttm_bo_unreserve(bo);
>> +	else
>> +		ttm_bo_move_to_lru_tail_unlocked(bo);
>>
>>  	ttm_bo_put(bo);
>>  	return ret;
Re: [PATCH] drm/ttm: Put BO in its memory manager's lru list
[AMD Official Use Only]

I hit a vulkan cts test hang with navi23.

dmesg says gmc page fault with address 0x0, 0x1000, 0x2000.
And some debug log also says amdgpu copies one BO from system domain to
system domain, which is really weird.

From: Koenig, Christian
Sent: November 9, 2021 20:20
To: Pan, Xinhui; amd-...@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Subject: Re: [PATCH] drm/ttm: Put BO in its memory manager's lru list

Am 09.11.21 um 12:19 schrieb xinhui pan:
> After we move a BO to a new memory region, we should put it on
> the new memory manager's lru list regardless of whether we unlock the resv
> or not.
>
> Signed-off-by: xinhui pan

Interesting find, did you trigger that somehow or did you just stumble
over it by reading the code?

Patch is Reviewed-by: Christian König, I will
pick that up for drm-misc-next.

Thanks,
Christian.

> ---
>  drivers/gpu/drm/ttm/ttm_bo.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index f1367107925b..e307004f0b28 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -701,6 +701,8 @@ int ttm_mem_evict_first(struct ttm_device *bdev,
>  	ret = ttm_bo_evict(bo, ctx);
>  	if (locked)
>  		ttm_bo_unreserve(bo);
> +	else
> +		ttm_bo_move_to_lru_tail_unlocked(bo);
>
>  	ttm_bo_put(bo);
>  	return ret;
Re: [RFC PATCH] drm/ttm: Try to check if new ttm man out of bounds during compile
[AMD Official Use Only]

ttm_range_man_init/fini are exported. Someone else might use them via
find_symbol. I just want to not break things.
Developers usually compile the whole kernel, so I add a checked version of
ttm_range_man_init/fini via the wrappers.

From: Christian König
Sent: September 13, 2021 14:35
To: Pan, Xinhui; amd-...@lists.freedesktop.org
Cc: Koenig, Christian; dan...@ffwll.ch; dri-devel@lists.freedesktop.org; Chen, Guchun
Subject: Re: [RFC PATCH] drm/ttm: Try to check if new ttm man out of bounds during compile

Am 13.09.21 um 05:36 schrieb xinhui pan:
> Let TTM know if a vendor sets a new ttm manager out of bounds by adding
> build_bug_on.

I really like the part in the inline functions, but the wrappers around
ttm_range_man_init/fini look a bit awkward off hand.

Christian.

>
> Signed-off-by: xinhui pan
> ---
>  drivers/gpu/drm/ttm/ttm_range_manager.c |  2 ++
>  include/drm/ttm/ttm_device.h            |  3 +++
>  include/drm/ttm/ttm_range_manager.h     | 10 ++
>  3 files changed, 15 insertions(+)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_range_manager.c b/drivers/gpu/drm/ttm/ttm_range_manager.c
> index 03395386e8a7..47e304719b88 100644
> --- a/drivers/gpu/drm/ttm/ttm_range_manager.c
> +++ b/drivers/gpu/drm/ttm/ttm_range_manager.c
> @@ -127,6 +127,8 @@ static const struct ttm_resource_manager_func ttm_range_manager_func = {
>  	.debug = ttm_range_man_debug
>  };
>
> +#undef ttm_range_man_init
> +#undef ttm_range_man_fini
>  /**
>   * ttm_range_man_init
>   *
> diff --git a/include/drm/ttm/ttm_device.h b/include/drm/ttm/ttm_device.h
> index 07d722950d5b..6f23724f5a06 100644
> --- a/include/drm/ttm/ttm_device.h
> +++ b/include/drm/ttm/ttm_device.h
> @@ -285,12 +285,15 @@ int ttm_device_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
>  static inline struct ttm_resource_manager *
>  ttm_manager_type(struct ttm_device *bdev, int mem_type)
>  {
> +	BUILD_BUG_ON(__builtin_constant_p(mem_type)
> +		     && mem_type >= TTM_NUM_MEM_TYPES);
>  	return bdev->man_drv[mem_type];
>  }
>
>  static inline
void ttm_set_driver_manager(struct ttm_device *bdev, int type, > struct ttm_resource_manager *manager) > { > + BUILD_BUG_ON(__builtin_constant_p(type) && type >= TTM_NUM_MEM_TYPES); > bdev->man_drv[type] = manager; > } > > diff --git a/include/drm/ttm/ttm_range_manager.h > b/include/drm/ttm/ttm_range_manager.h > index 22b6fa42ac20..9250ade54e2c 100644 > --- a/include/drm/ttm/ttm_range_manager.h > +++ b/include/drm/ttm/ttm_range_manager.h > @@ -38,5 +38,15 @@ int ttm_range_man_init(struct ttm_device *bdev, > unsigned long p_size); > int ttm_range_man_fini(struct ttm_device *bdev, > unsigned type); > +#define ttm_range_man_init(bdev, type, use_tt, size) ({ \ > + BUILD_BUG_ON(__builtin_constant_p(type) \ > + && type >= TTM_NUM_MEM_TYPES); \ > + ttm_range_man_init(bdev, type, use_tt, size); \ > +}) > +#define ttm_range_man_fini(bdev, type) ({\ > + BUILD_BUG_ON(__builtin_constant_p(type) \ > + && type >= TTM_NUM_MEM_TYPES); \ > + ttm_range_man_fini(bdev, type); \ > +}) > > #endif
Re: [PATCH] drm/ttm: add a BUG_ON in ttm_set_driver_manager when exceeding array bounds
[AMD Official Use Only] Looks good to me. But maybe BUILD_BUG_ON works too, and it is more reasonable for detecting such wrong usage. From: Chen, Guchun Sent: Friday, September 10, 2021 12:30:14 PM To: amd-...@lists.freedesktop.org ; dri-devel@lists.freedesktop.org ; Koenig, Christian ; Pan, Xinhui ; Deucher, Alexander Cc: Chen, Guchun ; Shi, Leslie Subject: [PATCH] drm/ttm: add a BUG_ON in ttm_set_driver_manager when exceeding array bounds Vendors will define their own memory types on top of TTM_PL_PRIV, but call ttm_set_driver_manager directly without checking the mem_type value when setting up the memory manager. So add such a check to catch the case when the index exceeds the array bounds. Signed-off-by: Leslie Shi Signed-off-by: Guchun Chen --- include/drm/ttm/ttm_device.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/drm/ttm/ttm_device.h b/include/drm/ttm/ttm_device.h index 7a0f561c57ee..24ad76ca8022 100644 --- a/include/drm/ttm/ttm_device.h +++ b/include/drm/ttm/ttm_device.h @@ -308,6 +308,7 @@ ttm_manager_type(struct ttm_device *bdev, int mem_type) static inline void ttm_set_driver_manager(struct ttm_device *bdev, int type, struct ttm_resource_manager *manager) { + BUG_ON(type >= TTM_NUM_MEM_TYPES); bdev->man_drv[type] = manager; } -- 2.17.1
RE: [PATCH v2 1/2] drm/ttm: Fix a deadlock if the target BO is not idle during swap
[AMD Official Use Only] It is the internal staging drm-next. -Original Message- From: Koenig, Christian Sent: September 6, 2021 19:26 To: Pan, Xinhui ; amd-...@lists.freedesktop.org Cc: Deucher, Alexander ; che...@uniontech.com; dri-devel@lists.freedesktop.org Subject: Re: [PATCH v2 1/2] drm/ttm: Fix a deadlock if the target BO is not idle during swap Which branch is this patch based on? Please rebase on top of drm-misc-fixes and resend. Thanks, Christian. On 06.09.21 03:12, xinhui pan wrote: > The ret value might be -EBUSY, in which case the caller assumes the lru > lock is still held when it actually is NOT. So return -ENOSPC instead. > Otherwise we hit list corruption. > > ttm_bo_cleanup_refs might fail too if the BO is not idle. If we return 0, > the caller (ttm_tt_populate -> ttm_global_swapout -> ttm_device_swapout) > will be stuck as we actually did not free any BO memory. This usually > happens when the fence is not signaled for a long time. > > Signed-off-by: xinhui pan > Reviewed-by: Christian König > --- > drivers/gpu/drm/ttm/ttm_bo.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/ttm/ttm_bo.c > b/drivers/gpu/drm/ttm/ttm_bo.c index 1fedd0eb67ba..f1367107925b 100644 > --- a/drivers/gpu/drm/ttm/ttm_bo.c > +++ b/drivers/gpu/drm/ttm/ttm_bo.c > @@ -1159,9 +1159,9 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, struct > ttm_operation_ctx *ctx, > } > > if (bo->deleted) { > - ttm_bo_cleanup_refs(bo, false, false, locked); > + ret = ttm_bo_cleanup_refs(bo, false, false, locked); > ttm_bo_put(bo); > - return 0; > + return ret == -EBUSY ? -ENOSPC : ret; > } > > ttm_bo_move_to_pinned(bo); > @@ -1215,7 +1215,7 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, struct > ttm_operation_ctx *ctx, > if (locked) > dma_resv_unlock(bo->base.resv); > ttm_bo_put(bo); > - return ret; > + return ret == -EBUSY ? -ENOSPC : ret; > } > > void ttm_bo_tt_destroy(struct ttm_buffer_object *bo)
Re: [PATCH v2 0/2] Fix a hang during memory pressure test
> On September 6, 2021 17:04, Christian König wrote: > > > > On 06.09.21 03:12, xinhui pan wrote: >> A long time ago, someone reported that the system got hung during a memory test. >> In recent days, I have been trying to track down and understand the potential >> deadlock in the ttm/amdgpu code. >> >> This patchset aims to fix the deadlock during ttm populate. >> >> TTM has a parameter called pages_limit; when allocated GTT memory >> reaches this limit, swapout is triggered. As ttm_bo_swapout does >> not return the correct retval, populate might get stuck. >> >> The UVD ib test uses GTT, which might be insufficient. So a gpu recovery >> would hang if populate hangs. > > Ah, now I understand what you are trying to do. > > Problem is that won't work either. Allocating VRAM can easily land you inside > the same deadlock. > > We need to avoid the allocation altogether for this to work correctly. Looks like we need to reserve some pages at sw init. > >> >> I have made one drm test which allocates two GTT BOs, submits gfx copy >> commands and frees these BOs without waiting for the fence. What's more, these >> gfx copy commands will cause a gfx ring hang. So gpu recovery would be >> triggered. > > Mhm, that should never be possible. It is perfectly valid for an application > to terminate without waiting for the GFX submission to be completed. The gfx ring hangs because the command is illegal. The packet is COMMAND [30:21] | BYTE_COUNT [20:0]; I use 0xFF << 20 to hang the ring on purpose. > > Going to push patch #1 to drm-misc-fixes or drm-misc-next-fixes in a moment. > > Thanks, > Christian. > >> >> Now here is one possible deadlock case. >> gpu_recovery >> -> stop drm scheduler >> -> asic reset >>-> ib test >> -> tt populate (uvd ib test) >> -> ttm_bo_swapout (BO A) // this always fails as the fence of >> BO A would not be signaled by the scheduler or HW. Hit deadlock. >> >> I paste the drm test patch below. 
>> #modprobe ttm pages_limit=65536 >> #amdgpu_test -s 1 -t 4 >> --- >> tests/amdgpu/basic_tests.c | 32 ++-- >> 1 file changed, 14 insertions(+), 18 deletions(-) >> >> diff --git a/tests/amdgpu/basic_tests.c b/tests/amdgpu/basic_tests.c >> index dbf02fee..f85ed340 100644 >> --- a/tests/amdgpu/basic_tests.c >> +++ b/tests/amdgpu/basic_tests.c >> @@ -65,13 +65,16 @@ static void amdgpu_direct_gma_test(void); >> static void amdgpu_command_submission_write_linear_helper(unsigned ip_type); >> static void amdgpu_command_submission_const_fill_helper(unsigned ip_type); >> static void amdgpu_command_submission_copy_linear_helper(unsigned ip_type); >> -static void amdgpu_test_exec_cs_helper(amdgpu_context_handle context_handle, >> +static void _amdgpu_test_exec_cs_helper(amdgpu_context_handle >> context_handle, >> unsigned ip_type, >> int instance, int pm4_dw, uint32_t >> *pm4_src, >> int res_cnt, amdgpu_bo_handle *resources, >> struct amdgpu_cs_ib_info *ib_info, >> - struct amdgpu_cs_request *ibs_request); >> + struct amdgpu_cs_request *ibs_request, >> int sync, int repeat); >> +#define amdgpu_test_exec_cs_helper(...) 
\ >> +_amdgpu_test_exec_cs_helper(__VA_ARGS__, 1, 1) >> + >> CU_TestInfo basic_tests[] = { >> { "Query Info Test", amdgpu_query_info_test }, >> { "Userptr Test", amdgpu_userptr_test }, >> @@ -1341,12 +1344,12 @@ static void amdgpu_command_submission_compute(void) >> * pm4_src, resources, ib_info, and ibs_request >> * submit command stream described in ibs_request and wait for this IB >> accomplished >> */ >> -static void amdgpu_test_exec_cs_helper(amdgpu_context_handle context_handle, >> +static void _amdgpu_test_exec_cs_helper(amdgpu_context_handle >> context_handle, >> unsigned ip_type, >> int instance, int pm4_dw, uint32_t >> *pm4_src, >> int res_cnt, amdgpu_bo_handle *resources, >> struct amdgpu_cs_ib_info *ib_info, >> - struct amdgpu_cs_request *ibs_request) >> + struct amdgpu_cs_request *ibs_request, >> int sync, int repeat) >> { >> int r; >> uint32_t expired; >> @@ -1395,12 +1398,15 @@ static void >> amdgpu_test_exec_cs_helper(amdgpu_context_handle context_handle, >> CU_ASSERT_NOT_EQUAL(ibs_request, NULL); >> /* submit CS */ >> -r = amdgpu_cs_submit(context_handle, 0, ibs_request, 1); >> +while (repeat--) >> +r = amdgpu_cs_submit(context_handle, 0, ibs_request, 1); >> CU_ASSERT_EQUAL(r, 0); >> r = amdgpu_bo_list_destroy(ibs_request->resources); >> CU_ASSERT_EQUAL(r, 0); >> + if (!sync) >> +return; >> fence_status.ip_type = ip_type; >>
[PATCH v2 2/2] drm/amdgpu: Use VRAM domain in UVD IB test
[AMD Official Use Only] Like vce/vcn, visible VRAM is OK for the ib test. However, commit a11d9ff3ebe0 ("drm/amdgpu: use GTT for uvd_get_create/destory_msg") says VRAM is not mapped correctly on his platform, which is likely arm64. So let's change back to using VRAM on the x86_64 platform. Signed-off-by: xinhui pan --- drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c index d451c359606a..e4b75f33ccc8 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c @@ -1178,7 +1178,11 @@ int amdgpu_uvd_get_create_msg(struct amdgpu_ring *ring, uint32_t handle, int r, i; r = amdgpu_bo_create_reserved(adev, 1024, PAGE_SIZE, +#ifdef CONFIG_X86_64 + AMDGPU_GEM_DOMAIN_VRAM, +#else AMDGPU_GEM_DOMAIN_GTT, +#endif &bo, NULL, (void **)&msg); if (r) return r; @@ -1210,7 +1214,11 @@ int amdgpu_uvd_get_destroy_msg(struct amdgpu_ring *ring, uint32_t handle, int r, i; r = amdgpu_bo_create_reserved(adev, 1024, PAGE_SIZE, +#ifdef CONFIG_X86_64 + AMDGPU_GEM_DOMAIN_VRAM, +#else AMDGPU_GEM_DOMAIN_GTT, +#endif &bo, NULL, (void **)&msg); if (r) return r; -- 2.25.1
[PATCH v2 1/2] drm/ttm: Fix a deadlock if the target BO is not idle during swap
[AMD Official Use Only] The ret value might be -EBUSY, in which case the caller assumes the lru lock is still held when it actually is NOT. So return -ENOSPC instead. Otherwise we hit list corruption. ttm_bo_cleanup_refs might fail too if the BO is not idle. If we return 0, the caller (ttm_tt_populate -> ttm_global_swapout -> ttm_device_swapout) will be stuck as we actually did not free any BO memory. This usually happens when the fence is not signaled for a long time. Signed-off-by: xinhui pan Reviewed-by: Christian König --- drivers/gpu/drm/ttm/ttm_bo.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index 1fedd0eb67ba..f1367107925b 100644 --- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -1159,9 +1159,9 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx, } if (bo->deleted) { - ttm_bo_cleanup_refs(bo, false, false, locked); + ret = ttm_bo_cleanup_refs(bo, false, false, locked); ttm_bo_put(bo); - return 0; + return ret == -EBUSY ? -ENOSPC : ret; } ttm_bo_move_to_pinned(bo); @@ -1215,7 +1215,7 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx, if (locked) dma_resv_unlock(bo->base.resv); ttm_bo_put(bo); - return ret; + return ret == -EBUSY ? -ENOSPC : ret; } void ttm_bo_tt_destroy(struct ttm_buffer_object *bo) -- 2.25.1
[PATCH v2 0/2] Fix a hang during memory pressure test
[AMD Official Use Only] A long time ago, someone reported that the system got hung during a memory test. In recent days, I have been trying to track down and understand the potential deadlock in the ttm/amdgpu code. This patchset aims to fix the deadlock during ttm populate. TTM has a parameter called pages_limit; when allocated GTT memory reaches this limit, swapout is triggered. As ttm_bo_swapout does not return the correct retval, populate might get stuck. The UVD ib test uses GTT, which might be insufficient. So a gpu recovery would hang if populate hangs. I have made one drm test which allocates two GTT BOs, submits gfx copy commands and frees these BOs without waiting for the fence. What's more, these gfx copy commands will cause a gfx ring hang. So gpu recovery would be triggered. Now here is one possible deadlock case. gpu_recovery -> stop drm scheduler -> asic reset -> ib test -> tt populate (uvd ib test) -> ttm_bo_swapout (BO A) // this always fails as the fence of BO A would not be signaled by the scheduler or HW. Hit deadlock. I paste the drm test patch below. 
#modprobe ttm pages_limit=65536 #amdgpu_test -s 1 -t 4 --- tests/amdgpu/basic_tests.c | 32 ++-- 1 file changed, 14 insertions(+), 18 deletions(-) diff --git a/tests/amdgpu/basic_tests.c b/tests/amdgpu/basic_tests.c index dbf02fee..f85ed340 100644 --- a/tests/amdgpu/basic_tests.c +++ b/tests/amdgpu/basic_tests.c @@ -65,13 +65,16 @@ static void amdgpu_direct_gma_test(void); static void amdgpu_command_submission_write_linear_helper(unsigned ip_type); static void amdgpu_command_submission_const_fill_helper(unsigned ip_type); static void amdgpu_command_submission_copy_linear_helper(unsigned ip_type); -static void amdgpu_test_exec_cs_helper(amdgpu_context_handle context_handle, +static void _amdgpu_test_exec_cs_helper(amdgpu_context_handle context_handle, unsigned ip_type, int instance, int pm4_dw, uint32_t *pm4_src, int res_cnt, amdgpu_bo_handle *resources, struct amdgpu_cs_ib_info *ib_info, - struct amdgpu_cs_request *ibs_request); + struct amdgpu_cs_request *ibs_request, int sync, int repeat); +#define amdgpu_test_exec_cs_helper(...) 
\ + _amdgpu_test_exec_cs_helper(__VA_ARGS__, 1, 1) + CU_TestInfo basic_tests[] = { { "Query Info Test", amdgpu_query_info_test }, { "Userptr Test", amdgpu_userptr_test }, @@ -1341,12 +1344,12 @@ static void amdgpu_command_submission_compute(void) * pm4_src, resources, ib_info, and ibs_request * submit command stream described in ibs_request and wait for this IB accomplished */ -static void amdgpu_test_exec_cs_helper(amdgpu_context_handle context_handle, +static void _amdgpu_test_exec_cs_helper(amdgpu_context_handle context_handle, unsigned ip_type, int instance, int pm4_dw, uint32_t *pm4_src, int res_cnt, amdgpu_bo_handle *resources, struct amdgpu_cs_ib_info *ib_info, - struct amdgpu_cs_request *ibs_request) + struct amdgpu_cs_request *ibs_request, int sync, int repeat) { int r; uint32_t expired; @@ -1395,12 +1398,15 @@ static void amdgpu_test_exec_cs_helper(amdgpu_context_handle context_handle, CU_ASSERT_NOT_EQUAL(ibs_request, NULL); /* submit CS */ - r = amdgpu_cs_submit(context_handle, 0, ibs_request, 1); + while (repeat--) + r = amdgpu_cs_submit(context_handle, 0, ibs_request, 1); CU_ASSERT_EQUAL(r, 0); r = amdgpu_bo_list_destroy(ibs_request->resources); CU_ASSERT_EQUAL(r, 0); + if (!sync) + return; fence_status.ip_type = ip_type; fence_status.ip_instance = 0; fence_status.ring = ibs_request->ring; @@ -1667,7 +1673,7 @@ static void amdgpu_command_submission_sdma_const_fill(void) static void amdgpu_command_submission_copy_linear_helper(unsigned ip_type) { - const int sdma_write_length = 1024; + const int sdma_write_length = (255) << 20; const int pm4_dw = 256; amdgpu_context_handle context_handle; amdgpu_bo_handle bo1, bo2; @@ -1715,8 +1721,6 @@ static void amdgpu_command_submission_copy_linear_helper(unsigned ip_type) _va_handle); CU_ASSERT_EQUAL(r, 0); - /* set bo1 */ - memset((void*)bo1_cpu, 0xaa, sdma_write_length); /* allocate UC bo2 for sDMA use */ r = amdgpu_bo_alloc_and_map(device_handle, @@ -1727,8 +1731,6
Re: [PATCH 2/2] drm/amdgpu: Use VRAM domain in UVD IB test
On 2021/9/3 15:04, "Koenig, Christian" wrote: On 03.09.21 08:49, Pan, Xinhui wrote: > Like vce/vcn does, visible VRAM is OK for the ib test. > And at the ib test stage, VRAM is sufficient. NAK, that won't work for older hw generations (e.g. SI, maybe CIK as well) where the IBs must be in a specific GTT hardware window. Christian. I don't have older HW on hand to test. But the uvd code says the following. Looks like IBs should be in a specific VRAM range [0 - 256MB]? if (!ring->adev->uvd.address_64_bit) { struct ttm_operation_ctx ctx = { true, false }; amdgpu_bo_placement_from_domain(bo, AMDGPU_GEM_DOMAIN_VRAM); amdgpu_uvd_force_into_uvd_segment(bo); r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx); if (r) goto err; } > > Signed-off-by: xinhui pan > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c > index d451c359606a..1c099b79d12c 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c > @@ -1178,7 +1178,7 @@ int amdgpu_uvd_get_create_msg(struct amdgpu_ring *ring, uint32_t handle, > int r, i; > > r = amdgpu_bo_create_reserved(adev, 1024, PAGE_SIZE, > - AMDGPU_GEM_DOMAIN_GTT, > + AMDGPU_GEM_DOMAIN_VRAM, > &bo, NULL, (void **)&msg); > if (r) > return r; > @@ -1210,7 +1210,7 @@ int amdgpu_uvd_get_destroy_msg(struct amdgpu_ring *ring, uint32_t handle, > int r, i; > > r = amdgpu_bo_create_reserved(adev, 1024, PAGE_SIZE, > - AMDGPU_GEM_DOMAIN_GTT, > + AMDGPU_GEM_DOMAIN_VRAM, > &bo, NULL, (void **)&msg); > if (r) > return r;
[PATCH 2/2] drm/amdgpu: Use VRAM domain in UVD IB test
Like vce/vcn does, visible VRAM is OK for the ib test. And at the ib test stage, VRAM is sufficient. Signed-off-by: xinhui pan --- drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c index d451c359606a..1c099b79d12c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c @@ -1178,7 +1178,7 @@ int amdgpu_uvd_get_create_msg(struct amdgpu_ring *ring, uint32_t handle, int r, i; r = amdgpu_bo_create_reserved(adev, 1024, PAGE_SIZE, - AMDGPU_GEM_DOMAIN_GTT, + AMDGPU_GEM_DOMAIN_VRAM, &bo, NULL, (void **)&msg); if (r) return r; @@ -1210,7 +1210,7 @@ int amdgpu_uvd_get_destroy_msg(struct amdgpu_ring *ring, uint32_t handle, int r, i; r = amdgpu_bo_create_reserved(adev, 1024, PAGE_SIZE, - AMDGPU_GEM_DOMAIN_GTT, + AMDGPU_GEM_DOMAIN_VRAM, &bo, NULL, (void **)&msg); if (r) return r; -- 2.25.1
[PATCH 1/2] drm/ttm: Fix a deadlock if the target BO is not idle during swap
The ret value might be -EBUSY, in which case the caller assumes the lru lock is still held when it actually is NOT. So return -ENOSPC instead. Otherwise we hit list corruption. ttm_bo_cleanup_refs might fail too if the BO is not idle. If we return 0, the caller (ttm_tt_populate -> ttm_global_swapout -> ttm_device_swapout) will be stuck as we actually did not free any BO memory. This usually happens when the fence is not signaled for a long time. Signed-off-by: xinhui pan --- drivers/gpu/drm/ttm/ttm_bo.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index 1fedd0eb67ba..f1367107925b 100644 --- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -1159,9 +1159,9 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx, } if (bo->deleted) { - ttm_bo_cleanup_refs(bo, false, false, locked); + ret = ttm_bo_cleanup_refs(bo, false, false, locked); ttm_bo_put(bo); - return 0; + return ret == -EBUSY ? -ENOSPC : ret; } ttm_bo_move_to_pinned(bo); @@ -1215,7 +1215,7 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx, if (locked) dma_resv_unlock(bo->base.resv); ttm_bo_put(bo); - return ret; + return ret == -EBUSY ? -ENOSPC : ret; } void ttm_bo_tt_destroy(struct ttm_buffer_object *bo) -- 2.25.1
[PATCH 0/2] Fix a hang during memory pressure test
A long time ago, someone reported that the system got hung during a memory test. In recent days, I have been trying to track down and understand the potential deadlock in the ttm/amdgpu code. This patchset aims to fix the deadlock during ttm populate. TTM has a parameter called pages_limit; when allocated GTT memory reaches this limit, swapout is triggered. As ttm_bo_swapout does not return the correct retval, populate might get stuck. The UVD ib test uses GTT, which might be insufficient. So a gpu recovery would hang if populate hangs. I have made one drm test which allocates two GTT BOs, submits gfx copy commands and frees these BOs without waiting for the fence. What's more, these gfx copy commands will cause a gfx ring hang. So gpu recovery would be triggered. Now here is one possible deadlock case. gpu_recovery -> stop drm scheduler -> asic reset -> ib test -> tt populate (uvd ib test) -> ttm_bo_swapout (BO A) // this always fails as the fence of BO A would not be signaled by the scheduler or HW. Hit deadlock. I paste the drm test patch below. 
Try it with modprobe ttm pages_limit=65536 --- tests/amdgpu/basic_tests.c | 32 ++-- 1 file changed, 14 insertions(+), 18 deletions(-) diff --git a/tests/amdgpu/basic_tests.c b/tests/amdgpu/basic_tests.c index dbf02fee..f85ed340 100644 --- a/tests/amdgpu/basic_tests.c +++ b/tests/amdgpu/basic_tests.c @@ -65,13 +65,16 @@ static void amdgpu_direct_gma_test(void); static void amdgpu_command_submission_write_linear_helper(unsigned ip_type); static void amdgpu_command_submission_const_fill_helper(unsigned ip_type); static void amdgpu_command_submission_copy_linear_helper(unsigned ip_type); -static void amdgpu_test_exec_cs_helper(amdgpu_context_handle context_handle, +static void _amdgpu_test_exec_cs_helper(amdgpu_context_handle context_handle, unsigned ip_type, int instance, int pm4_dw, uint32_t *pm4_src, int res_cnt, amdgpu_bo_handle *resources, struct amdgpu_cs_ib_info *ib_info, - struct amdgpu_cs_request *ibs_request); + struct amdgpu_cs_request *ibs_request, int sync, int repeat); +#define amdgpu_test_exec_cs_helper(...) 
\ + _amdgpu_test_exec_cs_helper(__VA_ARGS__, 1, 1) + CU_TestInfo basic_tests[] = { { "Query Info Test", amdgpu_query_info_test }, { "Userptr Test", amdgpu_userptr_test }, @@ -1341,12 +1344,12 @@ static void amdgpu_command_submission_compute(void) * pm4_src, resources, ib_info, and ibs_request * submit command stream described in ibs_request and wait for this IB accomplished */ -static void amdgpu_test_exec_cs_helper(amdgpu_context_handle context_handle, +static void _amdgpu_test_exec_cs_helper(amdgpu_context_handle context_handle, unsigned ip_type, int instance, int pm4_dw, uint32_t *pm4_src, int res_cnt, amdgpu_bo_handle *resources, struct amdgpu_cs_ib_info *ib_info, - struct amdgpu_cs_request *ibs_request) + struct amdgpu_cs_request *ibs_request, int sync, int repeat) { int r; uint32_t expired; @@ -1395,12 +1398,15 @@ static void amdgpu_test_exec_cs_helper(amdgpu_context_handle context_handle, CU_ASSERT_NOT_EQUAL(ibs_request, NULL); /* submit CS */ - r = amdgpu_cs_submit(context_handle, 0, ibs_request, 1); + while (repeat--) + r = amdgpu_cs_submit(context_handle, 0, ibs_request, 1); CU_ASSERT_EQUAL(r, 0); r = amdgpu_bo_list_destroy(ibs_request->resources); CU_ASSERT_EQUAL(r, 0); + if (!sync) + return; fence_status.ip_type = ip_type; fence_status.ip_instance = 0; fence_status.ring = ibs_request->ring; @@ -1667,7 +1673,7 @@ static void amdgpu_command_submission_sdma_const_fill(void) static void amdgpu_command_submission_copy_linear_helper(unsigned ip_type) { - const int sdma_write_length = 1024; + const int sdma_write_length = (255) << 20; const int pm4_dw = 256; amdgpu_context_handle context_handle; amdgpu_bo_handle bo1, bo2; @@ -1715,8 +1721,6 @@ static void amdgpu_command_submission_copy_linear_helper(unsigned ip_type) _va_handle); CU_ASSERT_EQUAL(r, 0); - /* set bo1 */ - memset((void*)bo1_cpu, 0xaa, sdma_write_length); /* allocate UC bo2 for sDMA use */ r = amdgpu_bo_alloc_and_map(device_handle, @@ -1727,8 +1731,6 @@ static void
Re: [RFC PATCH] drm/ttm: Do page counting after populate callback succeed
> On June 15, 2021 20:01, Christian König wrote: > > On 15.06.21 13:57, xinhui pan wrote: >> Amdgpu sets the SG flag in the populate callback, so TTM still counts pages of SG >> BOs. > > It's probably better to fix this instead. E.g. why does amdgpu modify the SG > flag during populate and not during initial creation? That doesn't seem to > make sense. Fair enough. Let me have a try. No idea why we set the SG flag in populate years ago. > > Christian. > >> One easy way to fix this is to count pages after the populate callback. >> >> We hit one issue where amdgpu allocates many SG BOs, but TTM tries to swap >> again and again even though swapout does not swap SG BOs at all. >> >> Signed-off-by: xinhui pan >> --- >> drivers/gpu/drm/ttm/ttm_tt.c | 32 +--- >> 1 file changed, 13 insertions(+), 19 deletions(-) >> >> diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c >> index a1a25410ec74..4fa0a8cd71c0 100644 >> --- a/drivers/gpu/drm/ttm/ttm_tt.c >> +++ b/drivers/gpu/drm/ttm/ttm_tt.c >> @@ -317,13 +317,6 @@ int ttm_tt_populate(struct ttm_device *bdev, >> if (ttm_tt_is_populated(ttm)) >> return 0; >> - if (!(ttm->page_flags & TTM_PAGE_FLAG_SG)) { >> -atomic_long_add(ttm->num_pages, &ttm_pages_allocated); >> -if (bdev->pool.use_dma32) >> -atomic_long_add(ttm->num_pages, >> -&ttm_dma32_pages_allocated); >> -} >> - >> while (atomic_long_read(&ttm_pages_allocated) > ttm_pages_limit || >> atomic_long_read(&ttm_dma32_pages_allocated) > >> ttm_dma32_pages_limit) { >> @@ -342,6 +335,13 @@ int ttm_tt_populate(struct ttm_device *bdev, >> if (ret) >> goto error; >> + if (!(ttm->page_flags & TTM_PAGE_FLAG_SG)) { >> +atomic_long_add(ttm->num_pages, &ttm_pages_allocated); >> +if (bdev->pool.use_dma32) >> +atomic_long_add(ttm->num_pages, >> +&ttm_dma32_pages_allocated); >> +} >> + >> ttm_tt_add_mapping(bdev, ttm); >> ttm->page_flags |= TTM_PAGE_FLAG_PRIV_POPULATED; >> if (unlikely(ttm->page_flags & TTM_PAGE_FLAG_SWAPPED)) { >> @@ -355,12 +355,6 @@ int ttm_tt_populate(struct ttm_device *bdev, >> return 0; >>error: >> -if (!(ttm->page_flags & TTM_PAGE_FLAG_SG)) { >> -atomic_long_sub(ttm->num_pages, &ttm_pages_allocated); >> -if (bdev->pool.use_dma32) >> -atomic_long_sub(ttm->num_pages, >> -&ttm_dma32_pages_allocated); >> -} >> return ret; >> } >> EXPORT_SYMBOL(ttm_tt_populate); >> @@ -384,12 +378,6 @@ void ttm_tt_unpopulate(struct ttm_device *bdev, struct >> ttm_tt *ttm) >> if (!ttm_tt_is_populated(ttm)) >> return; >> - ttm_tt_clear_mapping(ttm); >> -if (bdev->funcs->ttm_tt_unpopulate) >> -bdev->funcs->ttm_tt_unpopulate(bdev, ttm); >> -else >> -ttm_pool_free(&bdev->pool, ttm); >> - >> if (!(ttm->page_flags & TTM_PAGE_FLAG_SG)) { >> atomic_long_sub(ttm->num_pages, &ttm_pages_allocated); >> if (bdev->pool.use_dma32) >> @@ -397,6 +385,12 @@ void ttm_tt_unpopulate(struct ttm_device *bdev, struct >> ttm_tt *ttm) >> &ttm_dma32_pages_allocated); >> } >> + ttm_tt_clear_mapping(ttm); >> +if (bdev->funcs->ttm_tt_unpopulate) >> +bdev->funcs->ttm_tt_unpopulate(bdev, ttm); >> +else >> +ttm_pool_free(&bdev->pool, ttm); >> + >> ttm->page_flags &= ~TTM_PAGE_FLAG_PRIV_POPULATED; >> } >> >
Re: [PATCH] drm/ttm: fix deref of bo->ttm without holding the lock v2
[AMD Official Use Only] Looks good to me. From: Christian König Sent: Monday, June 7, 2021 8:52:21 PM To: dri-devel@lists.freedesktop.org ; Pan, Xinhui ; Das, Nirmoy ; Huang, Ray Cc: Thomas Hellström Subject: Re: [PATCH] drm/ttm: fix deref of bo->ttm without holding the lock v2 On 28.05.21 15:06, Thomas Hellström wrote: > > On 5/28/21 3:00 PM, Christian König wrote: >> We need to grab the resv lock first before doing that check. >> >> v2 (chk): simplify the change for -fixes >> >> Signed-off-by: Christian König >> Signed-off-by: Thomas Hellström > > Hmm, OK, but this doesn't fix the swapped-bo-not-on-lru and > unpopulating from swap_notify issues. Are you planning a follow up > patch for those? As discussed in a separate thread this needs to be applied as needed when the DG1 branch is merged. Xinhui, Nirmoy, Rui, can anybody give me an rb/ab so that I can push this? Thanks, Christian. > > Thanks, > > Thomas > >> --- >> drivers/gpu/drm/ttm/ttm_bo.c | 5 - >> drivers/gpu/drm/ttm/ttm_device.c | 8 +--- >> 2 files changed, 5 insertions(+), 8 deletions(-) >> >> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c >> index cfd0b9292397..ebcffe794adb 100644 >> --- a/drivers/gpu/drm/ttm/ttm_bo.c >> +++ b/drivers/gpu/drm/ttm/ttm_bo.c >> @@ -1172,7 +1172,10 @@ int ttm_bo_swapout(struct ttm_buffer_object >> *bo, struct ttm_operation_ctx *ctx, >> if (!ttm_bo_evict_swapout_allowable(bo, ctx, &locked, NULL)) >> return -EBUSY; >> -if (!ttm_bo_get_unless_zero(bo)) { >> +if (!bo->ttm || !ttm_tt_is_populated(bo->ttm) || >> +bo->ttm->page_flags & TTM_PAGE_FLAG_SG || >> +bo->ttm->page_flags & TTM_PAGE_FLAG_SWAPPED || >> +!ttm_bo_get_unless_zero(bo)) { >> if (locked) >> dma_resv_unlock(bo->base.resv); >> return -EBUSY; >> diff --git a/drivers/gpu/drm/ttm/ttm_device.c >> b/drivers/gpu/drm/ttm/ttm_device.c >> index a1dcf7d55c90..3d9c62b93e29 100644 >> --- a/drivers/gpu/drm/ttm/ttm_device.c >> +++ b/drivers/gpu/drm/ttm/ttm_device.c >> @@ -143,14 +143,8 @@ int 
ttm_device_swapout(struct ttm_device *bdev, >> struct ttm_operation_ctx *ctx, >> for (j = 0; j < TTM_MAX_BO_PRIORITY; ++j) { >> list_for_each_entry(bo, >lru[j], lru) { >> -uint32_t num_pages; >> +uint32_t num_pages = PFN_UP(bo->base.size); >> -if (!bo->ttm || !ttm_tt_is_populated(bo->ttm) || >> -bo->ttm->page_flags & TTM_PAGE_FLAG_SG || >> -bo->ttm->page_flags & TTM_PAGE_FLAG_SWAPPED) >> -continue; >> - >> -num_pages = bo->ttm->num_pages; >> ret = ttm_bo_swapout(bo, ctx, gfp_flags); >> /* ttm_bo_swapout has dropped the lru_lock */ >> if (!ret)
Re: Re: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin
[AMD Official Use Only] I just sent out the patch below yesterday. Swapping an unpopulated bo is useless indeed. [RFC PATCH 2/2] drm/ttm: skip swapout when ttm has no backend page. From: Christian König Sent: May 20, 2021 14:39 To: Pan, Xinhui; Kuehling, Felix; amd-...@lists.freedesktop.org Cc: Deucher, Alexander; dan...@ffwll.ch; Koenig, Christian; dri-devel@lists.freedesktop.org Subject: Re: Re: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin > The swapout function creates one swap storage which is filled with zeros, and sets > ttm->page_flags to TTM_PAGE_FLAG_SWAPPED. But because ttm has no backend > pages this time, no real data is swapped out to this swap storage. That's the fundamental problem. A TT object which isn't populated shouldn't be considered for swapout nor eviction in the first place. I'm going to take a look later today. Christian. On 20.05.21 04:55, Pan, Xinhui wrote: > [AMD Official Use Only] > > The swapout function creates one swap storage which is filled with zeros, and sets > ttm->page_flags to TTM_PAGE_FLAG_SWAPPED. But because ttm has no backend > pages this time, no real data is swapped out to this swap storage. > > The swapin function is called during populate as TTM_PAGE_FLAG_SWAPPED is set. > Now here is the problem: we swap data into ttm backend memory from the swap > storage. That just causes the memory to be overwritten. > > > From: Christian König > Sent: May 19, 2021 18:01 > To: Pan, Xinhui; Kuehling, Felix; amd-...@lists.freedesktop.org > Cc: Deucher, Alexander; dan...@ffwll.ch; Koenig, Christian; > dri-devel@lists.freedesktop.org > Subject: Re: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout > and swapin > > I'm scratching my head how that is even possible. > > See when a BO is created in the system domain it is just an empty hull, > e.g. without backing store and allocated pages. > > So the swapout function will just ignore it. > > Christian. 
> > On 19.05.21 07:07, Pan, Xinhui wrote: >> [AMD Official Use Only] >> >> I have reverted Chris' patch and still hit this failure. >> Just see two lines in Chris' patch: any BO in the CPU domain would be swapped out >> first. That is why we hit this issue frequently now. But the bug has been there >> for a long time. >> >> - for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) { >> - list_for_each_entry(bo, >swap_lru[i], swap) { >> [snip] >> + for (i = TTM_PL_SYSTEM; i < TTM_NUM_MEM_TYPES; ++i) { >> + for (j = 0; j < TTM_MAX_BO_PRIORITY; ++j) { >> >> >> >> From: Pan, Xinhui >> Sent: May 19, 2021 12:09 >> To: Kuehling, Felix; amd-...@lists.freedesktop.org >> Cc: Deucher, Alexander; Koenig, Christian; dri-devel@lists.freedesktop.org; >> dan...@ffwll.ch >> Subject: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and >> swapin >> >> Yes, we really don't swap out SG BOs. >> The problem is that before we validate a userptr BO, we create this BO in >> the CPU domain by default. So this BO has a chance to be swapped out. >> >> We set the flag TTM_PAGE_FLAG_SG on the userptr BO in populate(), which is too late. >> I have not tried to revert Chris' patch as I think it doesn't help. Or I can >> have a try later. >> >> >> From: Kuehling, Felix >> Sent: May 19, 2021 11:29 >> To: Pan, Xinhui; amd-...@lists.freedesktop.org >> Cc: Deucher, Alexander; Koenig, Christian; dri-devel@lists.freedesktop.org; >> dan...@ffwll.ch >> Subject: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and >> swapin >> >> Swapping SG BOs makes no sense, because TTM doesn't own the pages of >> this type of BO. >> >> Last I checked, userptr BOs (and other SG BOs) were protected from >> swapout by the fact that they would not be added to the swap-LRU. But it >> looks like Christian just removed the swap-LRU. 
I guess this broke that >> protection: >> >> commit 2cb51d22d70b18eaf339abf9758bf0b7608da65c >> Author: Christian König >> Date: Tue Oct 6 16:30:09 2020 +0200 >> >>drm/ttm: remove swap LRU v3 >> >>Instead evict round robin from each devices SYSTEM and TT domain. >> >>v2: reorder num_pages access reported by Dan's script >>v3: fix rebase fallout, num_pages should be 32bit >> >>Signed-off-by: Christian König >>Tested-by: Nirmoy Das >>Reviewed-by: Huang Rui >>Reviewed-by: Matthew Auld >>
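The protection Felix describes above — SG/userptr BOs simply never becoming swapout candidates — can be sketched as a small eligibility predicate. This is a hedged user-space model, not TTM code: `fake_ttm`, the flag names, and `swapout_candidate()` are hypothetical stand-ins for the real `ttm_tt` state discussed in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flags mirroring the TTM page flags discussed above. */
#define PAGE_FLAG_SG      (1u << 0) /* pages owned by userspace (userptr/SG BO) */
#define PAGE_FLAG_SWAPPED (1u << 1) /* content already lives in swap storage */

struct fake_ttm {
    uint32_t page_flags;
    bool populated; /* backing pages actually allocated */
};

/* A BO is a sane swapout candidate only if TTM owns populated pages
 * that are not already swapped out. SG BOs must be skipped: TTM does
 * not own their pages, so "swapping" them would only create a
 * zero-filled swap storage that a later swapin copies back over the
 * real data -- the corruption described in this thread. */
static bool swapout_candidate(const struct fake_ttm *ttm)
{
    if (ttm->page_flags & PAGE_FLAG_SG)
        return false;
    if (ttm->page_flags & PAGE_FLAG_SWAPPED)
        return false;
    return ttm->populated;
}
```

With the swap-LRU gone, any per-domain eviction walk would need an equivalent filter to preserve the old guarantee.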
Re: Re: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin
Re: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin
[AMD Official Use Only] I am not sure if we can create a ttm_bo_type_sg bo for userptr. But I have another idea now. we can use flag AMDGPU_AMDKFD_CREATE_USERPTR_BO to create the userptr bo. 发件人: Kuehling, Felix 发送时间: 2021年5月19日 23:11 收件人: Christian König; Pan, Xinhui; amd-...@lists.freedesktop.org 抄送: Deucher, Alexander; dan...@ffwll.ch; Koenig, Christian; dri-devel@lists.freedesktop.org 主题: Re: 回复: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin Looks like we're creating the userptr BO as ttm_bo_type_device. I guess we should be using ttm_bo_type_sg? BTW, amdgpu_gem_userptr_ioctl also uses ttm_bo_type_device. Regards, Felix Am 2021-05-19 um 6:01 a.m. schrieb Christian König: > I'm scratching my head how that is even possible. > > See when a BO is created in the system domain it is just an empty > hull, e.g. without backing store and allocated pages. > > So the swapout function will just ignore it. > > Christian. > > Am 19.05.21 um 07:07 schrieb Pan, Xinhui: >> [AMD Official Use Only] >> >> I have reverted Chris' patch, still hit this failure. >> Just see two lines in Chris' patch. Any BO in cpu domian would be >> swapout first. That is why we hit this issue frequently now. But the >> bug is there long time ago. >> >> - for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) { >> - list_for_each_entry(bo, >swap_lru[i], swap) { >> [snip] >> + for (i = TTM_PL_SYSTEM; i < TTM_NUM_MEM_TYPES; ++i) { >> + for (j = 0; j < TTM_MAX_BO_PRIORITY; ++j) { >> >> >> >> 发件人: Pan, Xinhui >> 发送时间: 2021年5月19日 12:09 >> 收件人: Kuehling, Felix; amd-...@lists.freedesktop.org >> 抄送: Deucher, Alexander; Koenig, Christian; >> dri-devel@lists.freedesktop.org; dan...@ffwll.ch >> 主题: 回复: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to >> swapout and swapin >> >> yes, we really dont swapout SG BOs. >> The problems is that before we validate a userptr BO, we create this >> BO in CPU domain by default. So this BO has chance to swapout. 
>> >> we set flag TTM_PAGE_FLAG_SG on userptr BO in popluate() which is too >> late. >> I have not try to revert Chris' patch as I think it desnt help. Or I >> can have a try later. >> >> >> 发件人: Kuehling, Felix >> 发送时间: 2021年5月19日 11:29 >> 收件人: Pan, Xinhui; amd-...@lists.freedesktop.org >> 抄送: Deucher, Alexander; Koenig, Christian; >> dri-devel@lists.freedesktop.org; dan...@ffwll.ch >> 主题: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to >> swapout and swapin >> >> Swapping SG BOs makes no sense, because TTM doesn't own the pages of >> this type of BO. >> >> Last I checked, userptr BOs (and other SG BOs) were protected from >> swapout by the fact that they would not be added to the swap-LRU. But it >> looks like Christian just removed the swap-LRU. I guess this broke that >> protection: >> >> commit 2cb51d22d70b18eaf339abf9758bf0b7608da65c >> Author: Christian König >> Date: Tue Oct 6 16:30:09 2020 +0200 >> >> drm/ttm: remove swap LRU v3 >> >> Instead evict round robin from each devices SYSTEM and TT domain. >> >> v2: reorder num_pages access reported by Dan's script >> v3: fix rebase fallout, num_pages should be 32bit >> >> Signed-off-by: Christian König >> Tested-by: Nirmoy Das >> Reviewed-by: Huang Rui >> Reviewed-by: Matthew Auld >> Link: https://patchwork.freedesktop.org/patch/424009/ >> >> Regards, >> Felix >> >> >> On 2021-05-18 10:28 p.m., xinhui pan wrote: >>> cpu 1 cpu 2 >>> kfd alloc BO A(userptr) alloc BO B(GTT) >>> ->init -> validate -> init -> >>> validate -> populate >>> init_user_pages-> swapout BO A >>> //hit ttm pages limit >>>-> get_user_pages (fill up ttm->pages) >>> -> validate -> populate >>> -> swapin BO A // Now hit the BUG >>> >>> We know that get_user_pages may race with swapout on same BO. >>> Threre are some issues I have met. >>> 1) memory corruption. >>> This is because we do a swap before memory is setup. ttm_tt_swapout() >>> just create a swap_storage with its content being 0x0. So when we setup >>> memory a
Re: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin
[AMD Official Use Only] The swapout function creates one swap storage which is filled with zeros, and sets ttm->page_flags to TTM_PAGE_FLAG_SWAPPED. Because the ttm has no backend pages at this time, no real data is swapped out to this swap storage. The swapin function is called during populate since TTM_PAGE_FLAG_SWAPPED is set. Now here is the problem: we swap data into the ttm backend memory from the swap storage. That just causes the memory to be overwritten. From: Christian König Sent: May 19, 2021 18:01 To: Pan, Xinhui; Kuehling, Felix; amd-...@lists.freedesktop.org Cc: Deucher, Alexander; dan...@ffwll.ch; Koenig, Christian; dri-devel@lists.freedesktop.org Subject: Re: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin I'm scratching my head how that is even possible. See, when a BO is created in the system domain it is just an empty hull, e.g. without backing store and allocated pages. So the swapout function will just ignore it. Christian. On 19.05.21 at 07:07, Pan, Xinhui wrote: > [AMD Official Use Only] > > I have reverted Chris' patch and still hit this failure. > Just see the two lines in Chris' patch: any BO in the CPU domain would be swapped out > first. That is why we hit this issue frequently now. But the bug has been there for a long > time. > > - for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) { > - list_for_each_entry(bo, >swap_lru[i], swap) { > [snip] > + for (i = TTM_PL_SYSTEM; i < TTM_NUM_MEM_TYPES; ++i) { > + for (j = 0; j < TTM_MAX_BO_PRIORITY; ++j) { > > > > From: Pan, Xinhui > Sent: May 19, 2021 12:09 > To: Kuehling, Felix; amd-...@lists.freedesktop.org > Cc: Deucher, Alexander; Koenig, Christian; dri-devel@lists.freedesktop.org; > dan...@ffwll.ch > Subject: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and > swapin > > Yes, we really don't swap out SG BOs. > The problem is that before we validate a userptr BO, we create this BO in the > CPU domain by default. So this BO has a chance to be swapped out. 
> > we set flag TTM_PAGE_FLAG_SG on userptr BO in popluate() which is too late. > I have not try to revert Chris' patch as I think it desnt help. Or I can have > a try later. > > > 发件人: Kuehling, Felix > 发送时间: 2021年5月19日 11:29 > 收件人: Pan, Xinhui; amd-...@lists.freedesktop.org > 抄送: Deucher, Alexander; Koenig, Christian; dri-devel@lists.freedesktop.org; > dan...@ffwll.ch > 主题: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and > swapin > > Swapping SG BOs makes no sense, because TTM doesn't own the pages of > this type of BO. > > Last I checked, userptr BOs (and other SG BOs) were protected from > swapout by the fact that they would not be added to the swap-LRU. But it > looks like Christian just removed the swap-LRU. I guess this broke that > protection: > > commit 2cb51d22d70b18eaf339abf9758bf0b7608da65c > Author: Christian König > Date: Tue Oct 6 16:30:09 2020 +0200 > > drm/ttm: remove swap LRU v3 > > Instead evict round robin from each devices SYSTEM and TT domain. 
> > v2: reorder num_pages access reported by Dan's script > v3: fix rebase fallout, num_pages should be 32bit > > Signed-off-by: Christian König > Tested-by: Nirmoy Das > Reviewed-by: Huang Rui > Reviewed-by: Matthew Auld > Link: > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchwork.freedesktop.org%2Fpatch%2F424009%2Fdata=04%7C01%7CXinhui.Pan%40amd.com%7Cb4422d8b3e4947cd243c08d91aad14c3%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637570152942496679%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=K3%2FnFpN56y8L49UuYRM6SqefVFLnqIwpDAtWpS1XvnQ%3Dreserved=0 > > Regards, > Felix > > > On 2021-05-18 10:28 p.m., xinhui pan wrote: >> cpu 1 cpu 2 >> kfd alloc BO A(userptr) alloc BO B(GTT) >> ->init -> validate -> init -> validate >> -> populate >> init_user_pages-> swapout BO A //hit ttm >> pages limit >>-> get_user_pages (fill up ttm->pages) >> -> validate -> populate >> -> swapin BO A // Now hit the BUG >> >> We know that get_user_pages may race with swapout on same BO. >> Threre are some issues I have met. >> 1) memory corruption. >> This is because we do a swap before memory is setup. ttm_tt_swapout() >> just create a swap_storage with its content being 0x0. So when we setup >> memory after
Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin
[AMD Official Use Only] I have reverted Chris' patch, still hit this failure. Just see two lines in Chris' patch. Any BO in cpu domian would be swapout first. That is why we hit this issue frequently now. But the bug is there long time ago. - for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) { - list_for_each_entry(bo, >swap_lru[i], swap) { [snip] + for (i = TTM_PL_SYSTEM; i < TTM_NUM_MEM_TYPES; ++i) { + for (j = 0; j < TTM_MAX_BO_PRIORITY; ++j) { ________ 发件人: Pan, Xinhui 发送时间: 2021年5月19日 12:09 收件人: Kuehling, Felix; amd-...@lists.freedesktop.org 抄送: Deucher, Alexander; Koenig, Christian; dri-devel@lists.freedesktop.org; dan...@ffwll.ch 主题: 回复: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin yes, we really dont swapout SG BOs. The problems is that before we validate a userptr BO, we create this BO in CPU domain by default. So this BO has chance to swapout. we set flag TTM_PAGE_FLAG_SG on userptr BO in popluate() which is too late. I have not try to revert Chris' patch as I think it desnt help. Or I can have a try later. 发件人: Kuehling, Felix 发送时间: 2021年5月19日 11:29 收件人: Pan, Xinhui; amd-...@lists.freedesktop.org 抄送: Deucher, Alexander; Koenig, Christian; dri-devel@lists.freedesktop.org; dan...@ffwll.ch 主题: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin Swapping SG BOs makes no sense, because TTM doesn't own the pages of this type of BO. Last I checked, userptr BOs (and other SG BOs) were protected from swapout by the fact that they would not be added to the swap-LRU. But it looks like Christian just removed the swap-LRU. I guess this broke that protection: commit 2cb51d22d70b18eaf339abf9758bf0b7608da65c Author: Christian König Date: Tue Oct 6 16:30:09 2020 +0200 drm/ttm: remove swap LRU v3 Instead evict round robin from each devices SYSTEM and TT domain. 
v2: reorder num_pages access reported by Dan's script v3: fix rebase fallout, num_pages should be 32bit Signed-off-by: Christian König Tested-by: Nirmoy Das Reviewed-by: Huang Rui Reviewed-by: Matthew Auld Link: https://patchwork.freedesktop.org/patch/424009/ Regards, Felix On 2021-05-18 10:28 p.m., xinhui pan wrote: > cpu 1 cpu 2 > kfd alloc BO A(userptr) alloc BO B(GTT) > ->init -> validate -> init -> validate -> > populate > init_user_pages-> swapout BO A //hit ttm > pages limit > -> get_user_pages (fill up ttm->pages) >-> validate -> populate >-> swapin BO A // Now hit the BUG > > We know that get_user_pages may race with swapout on same BO. > Threre are some issues I have met. > 1) memory corruption. > This is because we do a swap before memory is setup. ttm_tt_swapout() > just create a swap_storage with its content being 0x0. So when we setup > memory after the swapout. The following swapin makes the memory > corrupted. > > 2) panic > When swapout happes with get_user_pages, they touch ttm->pages without > anylock. It causes memory corruption too. But I hit page fault mostly. 
> > Signed-off-by: xinhui pan > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 16 +++- > 1 file changed, 15 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c > index 928e8d57cd08..42460e4480f8 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c > @@ -835,6 +835,7 @@ static int init_user_pages(struct kgd_mem *mem, uint64_t > user_addr) > struct amdkfd_process_info *process_info = mem->process_info; > struct amdgpu_bo *bo = mem->bo; > struct ttm_operation_ctx ctx = { true, false }; > + struct page **pages; > int ret = 0; > > mutex_lock(_info->lock); > @@ -852,7 +853,13 @@ static int init_user_pages(struct kgd_mem *mem, uint64_t > user_addr) > goto out; > } > > - ret = amdgpu_ttm_tt_get_user_pages(bo, bo->tbo.ttm->pages); > + pages = kvmalloc_array(bo->tbo.ttm->num_pages, > + sizeof(struct page *), > + GFP_KERNEL | __GFP_ZERO); > + if (!pages) > + goto unregister_out; > + > + ret = amdgpu_ttm_tt_get_user_pages(bo, pages); > if (ret) { > pr_err("%s: Failed to get user pages: %d\n", __func__, ret); > goto unregister_out; > @@ -863,6 +870,12 @@ static int init_user_pages(struct kgd_
Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin
[AMD Official Use Only] Yes, we really don't swap out SG BOs. The problem is that before we validate a userptr BO, we create this BO in the CPU domain by default. So this BO has a chance to be swapped out. We set the flag TTM_PAGE_FLAG_SG on the userptr BO in populate(), which is too late. I have not tried to revert Chris' patch as I think it doesn't help. Or I can have a try later. From: Kuehling, Felix Sent: May 19, 2021 11:29 To: Pan, Xinhui; amd-...@lists.freedesktop.org Cc: Deucher, Alexander; Koenig, Christian; dri-devel@lists.freedesktop.org; dan...@ffwll.ch Subject: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin Swapping SG BOs makes no sense, because TTM doesn't own the pages of this type of BO. Last I checked, userptr BOs (and other SG BOs) were protected from swapout by the fact that they would not be added to the swap-LRU. But it looks like Christian just removed the swap-LRU. I guess this broke that protection: commit 2cb51d22d70b18eaf339abf9758bf0b7608da65c Author: Christian König Date: Tue Oct 6 16:30:09 2020 +0200 drm/ttm: remove swap LRU v3 Instead evict round robin from each devices SYSTEM and TT domain. v2: reorder num_pages access reported by Dan's script v3: fix rebase fallout, num_pages should be 32bit Signed-off-by: Christian König Tested-by: Nirmoy Das Reviewed-by: Huang Rui Reviewed-by: Matthew Auld Link: https://patchwork.freedesktop.org/patch/424009/ Regards, Felix On 2021-05-18 10:28 p.m., xinhui pan wrote: > cpu 1 cpu 2 > kfd alloc BO A(userptr) alloc BO B(GTT) > ->init -> validate -> init -> validate -> > populate > init_user_pages-> swapout BO A //hit ttm > pages limit > -> get_user_pages (fill up ttm->pages) >-> validate -> populate >-> swapin BO A // Now hit the BUG > > We know that get_user_pages may race with swapout on the same BO. > There are some issues I have met. > 1) memory corruption. > This is because we do a swap before memory is set up. ttm_tt_swapout() > just creates a swap_storage with its content being 0x0. 
So when we setup > memory after the swapout. The following swapin makes the memory > corrupted. > > 2) panic > When swapout happes with get_user_pages, they touch ttm->pages without > anylock. It causes memory corruption too. But I hit page fault mostly. > > Signed-off-by: xinhui pan > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 16 +++- > 1 file changed, 15 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c > index 928e8d57cd08..42460e4480f8 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c > @@ -835,6 +835,7 @@ static int init_user_pages(struct kgd_mem *mem, uint64_t > user_addr) > struct amdkfd_process_info *process_info = mem->process_info; > struct amdgpu_bo *bo = mem->bo; > struct ttm_operation_ctx ctx = { true, false }; > + struct page **pages; > int ret = 0; > > mutex_lock(_info->lock); > @@ -852,7 +853,13 @@ static int init_user_pages(struct kgd_mem *mem, uint64_t > user_addr) > goto out; > } > > - ret = amdgpu_ttm_tt_get_user_pages(bo, bo->tbo.ttm->pages); > + pages = kvmalloc_array(bo->tbo.ttm->num_pages, > + sizeof(struct page *), > + GFP_KERNEL | __GFP_ZERO); > + if (!pages) > + goto unregister_out; > + > + ret = amdgpu_ttm_tt_get_user_pages(bo, pages); > if (ret) { > pr_err("%s: Failed to get user pages: %d\n", __func__, ret); > goto unregister_out; > @@ -863,6 +870,12 @@ static int init_user_pages(struct kgd_mem *mem, uint64_t > user_addr) > pr_err("%s: Failed to reserve BO\n", __func__); > goto release_out; > } > + > + WARN_ON_ONCE(bo->tbo.ttm->page_flags & TTM_PAGE_FLAG_SWAPPED); > + > + memcpy(bo->tbo.ttm->pages, > + pages, > + sizeof(struct page*) * bo->tbo.ttm->num_pages); > amdgpu_bo_placement_from_domain(bo, mem->domain); > ret = ttm_bo_validate(>tbo, >placement, ); > if (ret) > @@ -872,6 +885,7 @@ static int init_user_pages(struct kgd_mem *mem, uint64_t > user_addr) 
> release_out: > amdgpu_ttm_tt_get_user_pages_done(bo->tbo.ttm); > unregister_out: > + kvfree(pages); > if (ret) > amdgpu_mn_unregister(bo); > out:
Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin
[AMD Official Use Only] To observe the issue. I made one kfdtest case for debug. It just alloc a userptr memory and detect if memory is corrupted. I can hit this failure in 2 minutes. :( diff --git a/tests/kfdtest/src/KFDMemoryTest.cpp b/tests/kfdtest/src/KFDMemoryTest.cpp index 70c8033..a72f53f 100644 --- a/tests/kfdtest/src/KFDMemoryTest.cpp +++ b/tests/kfdtest/src/KFDMemoryTest.cpp @@ -584,6 +584,32 @@ TEST_F(KFDMemoryTest, ZeroMemorySizeAlloc) { TEST_END } +TEST_F(KFDMemoryTest, swap) { +TEST_START(TESTPROFILE_RUNALL) + +unsigned int size = 128<<20; +unsigned int*tmp = (unsigned int *)mmap(0, + size, + PROT_READ | PROT_WRITE, + MAP_ANONYMOUS | MAP_PRIVATE, + -1, + 0); +EXPECT_NE(tmp, MAP_FAILED); + +LOG() << "pls run this with KFDMemoryTest.LargestSysBufferTest" << std::endl; +do { + memset(tmp, 0xcc, size); + + HsaMemoryBuffer buf(tmp, size); + sleep(1); + EXPECT_EQ(tmp[0], 0x); +} while (true); + +munmap(tmp, size); + +TEST_END +} + // Basic test for hsaKmtAllocMemory TEST_F(KFDMemoryTest, MemoryAlloc) { TEST_START(TESTPROFILE_RUNALL) -- 2.25.1 ____________ 发件人: Pan, Xinhui 发送时间: 2021年5月19日 10:28 收件人: amd-...@lists.freedesktop.org 抄送: Kuehling, Felix; Deucher, Alexander; Koenig, Christian; dri-devel@lists.freedesktop.org; dan...@ffwll.ch; Pan, Xinhui 主题: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin cpu 1 cpu 2 kfd alloc BO A(userptr) alloc BO B(GTT) ->init -> validate -> init -> validate -> populate init_user_pages -> swapout BO A //hit ttm pages limit -> get_user_pages (fill up ttm->pages) -> validate -> populate -> swapin BO A // Now hit the BUG We know that get_user_pages may race with swapout on same BO. Threre are some issues I have met. 1) memory corruption. This is because we do a swap before memory is setup. ttm_tt_swapout() just create a swap_storage with its content being 0x0. So when we setup memory after the swapout. The following swapin makes the memory corrupted. 
2) panic When swapout happes with get_user_pages, they touch ttm->pages without anylock. It causes memory corruption too. But I hit page fault mostly. Signed-off-by: xinhui pan --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 16 +++- 1 file changed, 15 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c index 928e8d57cd08..42460e4480f8 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c @@ -835,6 +835,7 @@ static int init_user_pages(struct kgd_mem *mem, uint64_t user_addr) struct amdkfd_process_info *process_info = mem->process_info; struct amdgpu_bo *bo = mem->bo; struct ttm_operation_ctx ctx = { true, false }; + struct page **pages; int ret = 0; mutex_lock(_info->lock); @@ -852,7 +853,13 @@ static int init_user_pages(struct kgd_mem *mem, uint64_t user_addr) goto out; } - ret = amdgpu_ttm_tt_get_user_pages(bo, bo->tbo.ttm->pages); + pages = kvmalloc_array(bo->tbo.ttm->num_pages, + sizeof(struct page *), + GFP_KERNEL | __GFP_ZERO); + if (!pages) + goto unregister_out; + + ret = amdgpu_ttm_tt_get_user_pages(bo, pages); if (ret) { pr_err("%s: Failed to get user pages: %d\n", __func__, ret); goto unregister_out; @@ -863,6 +870,12 @@ static int init_user_pages(struct kgd_mem *mem, uint64_t user_addr) pr_err("%s: Failed to reserve BO\n", __func__); goto release_out; } + + WARN_ON_ONCE(bo->tbo.ttm->page_flags & TTM_PAGE_FLAG_SWAPPED); + + memcpy(bo->tbo.ttm->pages, + pages, + sizeof(struct page*) * bo->tbo.ttm->num_pages); amdgpu_bo_placement_from_domain(bo, mem->domain); ret = ttm_bo_validate(>tbo, >placement, ); if (ret) @@ -872,6 +885,7 @@ static int init_user_pages(struct kgd_mem *mem, uint64_t user_addr) release_out: amdgpu_ttm_tt_get_user_pages_done(bo->tbo.ttm); unregister_out: + kvfree(pages); if (ret) amdgpu_mn_unregister(bo); out: -- 2.25.1
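The core idea of the RFC patch above is to stop `get_user_pages` from filling the shared `ttm->pages` array while a racing swapout/swapin can also touch it: the pages are gathered into a private scratch array and only copied into the BO once validation is safely done. A minimal user-space sketch of that staging pattern, with `get_user_pages_into()` as a hypothetical stand-in for `amdgpu_ttm_tt_get_user_pages()`:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the pinning helper: fills n "page pointers". */
static int get_user_pages_into(long *dst, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (long)(i + 1); /* pretend these are pinned page pointers */
    return 0;
}

/* Fill a private scratch array first and commit it to the shared
 * pages array in one memcpy at the end, so a concurrent reader
 * (swapout/swapin in the kernel case) never observes a half-filled
 * array being written behind its back. Mirrors the kvmalloc_array +
 * memcpy + kvfree flow in the patch. */
static int init_pages_staged(long *shared_pages, size_t n)
{
    long *scratch = calloc(n, sizeof(*scratch));
    int ret;

    if (!scratch)
        return -1;
    ret = get_user_pages_into(scratch, n);
    if (!ret)
        memcpy(shared_pages, scratch, n * sizeof(*scratch));
    free(scratch);
    return ret;
}
```

Note this narrows the race window rather than locking it out entirely, which matches the RFC framing of the patch.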
Re: [PATCH] drm/amdgpu: fix PM reference leak in amdgpu_debugfs_gfxoff_read()
[AMD Official Use Only] Thanks, Kuai. But the code below matches the other code blocks in this file: r = pm_runtime_get_sync(dev->dev); if (r < 0) { pm_runtime_put_autosuspend(dev->dev); return r; } From: Yu Kuai Sent: May 17, 2021 16:16 To: Deucher, Alexander; Koenig, Christian; Pan, Xinhui; airl...@linux.ie; dan...@ffwll.ch Cc: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-ker...@vger.kernel.org; yuku...@huawei.com; yi.zh...@huawei.com Subject: [PATCH] drm/amdgpu: fix PM reference leak in amdgpu_debugfs_gfxoff_read() pm_runtime_get_sync will increment the PM usage counter even when it fails. Forgetting the corresponding put operation results in a reference leak here. Fix it by replacing it with pm_runtime_resume_and_get to keep the usage counter balanced. Reported-by: Hulk Robot Signed-off-by: Yu Kuai --- drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c index bcaf271b39bf..eb7f9d20dad7 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c @@ -1058,7 +1058,7 @@ static ssize_t amdgpu_debugfs_gfxoff_read(struct file *f, char __user *buf, if (size & 0x3 || *pos & 0x3) return -EINVAL; - r = pm_runtime_get_sync(adev_to_drm(adev)->dev); + r = pm_runtime_resume_and_get(adev_to_drm(adev)->dev); if (r < 0) return r; -- 2.25.4
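The difference the patch exploits can be shown with a toy model of the runtime-PM usage counter: `pm_runtime_get_sync()` bumps the counter even on failure (so the caller must put in the error path), while `pm_runtime_resume_and_get()` drops the reference itself on failure. The `fake_*` names below are hypothetical stand-ins, not the real kernel API.

```c
#include <assert.h>

/* Toy model of the runtime-PM usage counter discussed above. */
static int usage_count;
static int resume_should_fail;

static int fake_pm_runtime_get_sync(void)
{
    usage_count++;                       /* incremented even on failure */
    return resume_should_fail ? -1 : 0;
}

static void fake_pm_runtime_put(void)
{
    usage_count--;
}

/* resume_and_get drops the reference itself when resume fails, so
 * callers can simply "return r" in the error path without leaking. */
static int fake_pm_runtime_resume_and_get(void)
{
    int ret = fake_pm_runtime_get_sync();
    if (ret < 0)
        fake_pm_runtime_put();
    return ret;
}
```

Xinhui's counterpoint above is that the surrounding file consistently uses the get_sync + explicit put_autosuspend pattern, which is equally balanced; the two styles differ in who owns the error-path put, not in correctness.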
Re: [PATCH 0/3] Use implicit kref infra
> 2020年9月2日 22:50,Tuikov, Luben 写道: > > On 2020-09-02 00:43, Pan, Xinhui wrote: >> >> >>> 2020年9月2日 11:46,Tuikov, Luben 写道: >>> >>> On 2020-09-01 21:42, Pan, Xinhui wrote: >>>> If you take a look at the below function, you should not use driver's >>>> release to free adev. As dev is embedded in adev. >>> >>> Do you mean "look at the function below", using "below" as an adverb? >>> "below" is not an adjective. >>> >>> I know dev is embedded in adev--I did that patchset. >>> >>>> >>>> 809 static void drm_dev_release(struct kref *ref) >>>> 810 { >>>> 811 struct drm_device *dev = container_of(ref, struct drm_device, >>>> ref); >>>> 812 >>>> 813 if (dev->driver->release) >>>> 814 dev->driver->release(dev); >>>> 815 >>>> 816 drm_managed_release(dev); >>>> 817 >>>> 818 kfree(dev->managed.final_kfree); >>>> 819 } >>> >>> That's simple--this comes from change c6603c740e0e3 >>> and it should be reverted. Simple as that. >>> >>> The version before this change was absolutely correct: >>> >>> static void drm_dev_release(struct kref *ref) >>> { >>> if (dev->driver->release) >>> dev->driver->release(dev); >>> else >>> drm_dev_fini(dev); >>> } >>> >>> Meaning, "the kref is now 0"--> if the driver >>> has a release, call it, else use our own. >>> But note that nothing can be assumed after this point, >>> about the existence of "dev". >>> >>> It is exactly because struct drm_device is statically >>> embedded into a container, struct amdgpu_device, >>> that this change above should be reverted. >>> >>> This is very similar to how fops has open/release >>> but no close. That is, the "release" is called >>> only when the last kref is released, i.e. when >>> kref goes from non-zero to zero. >>> >>> This uses the kref infrastructure which has been >>> around for about 20 years in the Linux kernel. >>> >>> I suggest reading the comments >>> in drm_dev.c mostly, "DOC: driver instance overview" >>> starting at line 240 onwards. This is right above >>> drm_put_dev(). 
There is actually an example of a driver >>> in the comment. Also the comment to drm_dev_init(). >>> >>> Now, take a look at this: >>> >>> /** >>> * drm_dev_put - Drop reference of a DRM device >>> * @dev: device to drop reference of or NULL >>> * >>> * This decreases the ref-count of @dev by one. The device is destroyed if >>> the >>> * ref-count drops to zero. >>> */ >>> void drm_dev_put(struct drm_device *dev) >>> { >>> if (dev) >>> kref_put(>ref, drm_dev_release); >>> } >>> EXPORT_SYMBOL(drm_dev_put); >>> >>> Two things: >>> >>> 1. It is us, who kzalloc the amdgpu device, which contains >>> the drm_device (you'll see this discussed in the reading >>> material I pointed to above). We do this because we're >>> probing the PCI device whether we'll work it it or not. >>> >> >> that is true. > > Of course it's true--good morning! > >> My understanding of the drm core code is like something below. > > Let me stop you right there--just read the documentation I pointed > to you at. > >> struct B { >> strcut A >> } >> we initialize A firstly and initialize B in the end. But destroy B firstly >> and destory A in the end. > > No! > B, which is the amdgpu_device struct "exists" before A, which is the DRM > struct. > This is why DRM recommends to _embed_ it into the driver's own device struct, > as the documentation I pointed you to at. > I think you are misleading me here. A pci dev as you said below can act as many roles, a drm dev, a tty dev, etc. say, struct B{ struct A; struct TTY; struct printer; ... } but TTY or other members has nothing to do with our discussion. B of course exists before A. but the code logic is not that. code below is really rare in drm world. create_B() { init B members return create_A() } So usually B have more work to do after it initialize A. then code should
Re: [PATCH 0/3] Use implicit kref infra
> 2020年9月2日 11:46,Tuikov, Luben 写道: > > On 2020-09-01 21:42, Pan, Xinhui wrote: >> If you take a look at the below function, you should not use driver's >> release to free adev. As dev is embedded in adev. > > Do you mean "look at the function below", using "below" as an adverb? > "below" is not an adjective. > > I know dev is embedded in adev--I did that patchset. > >> >> 809 static void drm_dev_release(struct kref *ref) >> 810 { >> 811 struct drm_device *dev = container_of(ref, struct drm_device, >> ref); >> 812 >> 813 if (dev->driver->release) >> 814 dev->driver->release(dev); >> 815 >> 816 drm_managed_release(dev); >> 817 >> 818 kfree(dev->managed.final_kfree); >> 819 } > > That's simple--this comes from change c6603c740e0e3 > and it should be reverted. Simple as that. > > The version before this change was absolutely correct: > > static void drm_dev_release(struct kref *ref) > { > if (dev->driver->release) > dev->driver->release(dev); > else > drm_dev_fini(dev); > } > > Meaning, "the kref is now 0"--> if the driver > has a release, call it, else use our own. > But note that nothing can be assumed after this point, > about the existence of "dev". > > It is exactly because struct drm_device is statically > embedded into a container, struct amdgpu_device, > that this change above should be reverted. > > This is very similar to how fops has open/release > but no close. That is, the "release" is called > only when the last kref is released, i.e. when > kref goes from non-zero to zero. > > This uses the kref infrastructure which has been > around for about 20 years in the Linux kernel. > > I suggest reading the comments > in drm_dev.c mostly, "DOC: driver instance overview" > starting at line 240 onwards. This is right above > drm_put_dev(). There is actually an example of a driver > in the comment. Also the comment to drm_dev_init(). 
> > Now, take a look at this: > > /** > * drm_dev_put - Drop reference of a DRM device > * @dev: device to drop reference of or NULL > * > * This decreases the ref-count of @dev by one. The device is destroyed if the > * ref-count drops to zero. > */ > void drm_dev_put(struct drm_device *dev) > { >if (dev) >kref_put(>ref, drm_dev_release); > } > EXPORT_SYMBOL(drm_dev_put); > > Two things: > > 1. It is us, who kzalloc the amdgpu device, which contains > the drm_device (you'll see this discussed in the reading > material I pointed to above). We do this because we're > probing the PCI device whether we'll work it it or not. > that is true. My understanding of the drm core code is like something below. struct B { strcut A } we initialize A firstly and initialize B in the end. But destroy B firstly and destory A in the end. But yes, practice is more complex. if B has nothing to be destroyed. we can destory A directly, otherwise destroy B firstly. in this case, we can do something below in our release() //some cleanup work of B drm_dev_fini(dev);//destroy A kfree(adev) > 2. Using the kref infrastructure, when the ref goes to 0, > drm_dev_release is called. And here's the KEY: > Because WE allocated the container, we should free it--after the release > method is called, DRM cannot assume anything about the drm > device or the container. The "release" method is final. > > We allocate, we free. And we free only when the ref goes to 0. > > DRM can, in due time, "free" itself of the DRM device and stop > having knowledge of it--that's fine, but as long as the ref > is not 0, the amdgpu device and thus the contained DRM device, > cannot be freed. 
> >> >> You have to make another change something like >> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c >> index 13068fdf4331..2aabd2b4c63b 100644 >> --- a/drivers/gpu/drm/drm_drv.c >> +++ b/drivers/gpu/drm/drm_drv.c >> @@ -815,7 +815,8 @@ static void drm_dev_release(struct kref *ref) >> >>drm_managed_release(dev); >> >> - kfree(dev->managed.final_kfree); >> + if (dev->driver->final_release) >> + dev->driver->final_release(dev); >> } > > No. What's this? > There is no such thing as "final" release, nor is there a "partial" release. > When the kref goes to 0, the device disappears. Simple. > If someone is using it, they should kref-g
Re: [PATCH 0/3] Use implicit kref infra
If you take a look at the function below, you should not use the driver's release to free adev, as dev is embedded in adev. 809 static void drm_dev_release(struct kref *ref) 810 { 811 struct drm_device *dev = container_of(ref, struct drm_device, ref); 812 813 if (dev->driver->release) 814 dev->driver->release(dev); 815 816 drm_managed_release(dev); 817 818 kfree(dev->managed.final_kfree); 819 } You would have to make another change, something like diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c index 13068fdf4331..2aabd2b4c63b 100644 --- a/drivers/gpu/drm/drm_drv.c +++ b/drivers/gpu/drm/drm_drv.c @@ -815,7 +815,8 @@ static void drm_dev_release(struct kref *ref) drm_managed_release(dev); - kfree(dev->managed.final_kfree); + if (dev->driver->final_release) + dev->driver->final_release(dev); } And in the final_release callback we free the dev. But that is a little complex now, so I prefer still using final_kfree. Of course we can do some cleanup work in the driver's release callback -- BUT no kfree. -----Original Message----- From: "Tuikov, Luben" Date: Wednesday, September 2, 2020 09:07 To: "amd-...@lists.freedesktop.org" , "dri-devel@lists.freedesktop.org" Cc: "Deucher, Alexander" , Daniel Vetter , "Pan, Xinhui" , "Tuikov, Luben" Subject: [PATCH 0/3] Use implicit kref infra Use the implicit kref infrastructure to free the container struct amdgpu_device, container of struct drm_device. First, in drm_dev_register(), do not indiscriminately warn when a DRM driver hasn't opted for managed.final_kfree, but instead check if the driver has provided its own "release" function callback in the DRM driver structure. If that is the case, no warning. Remove drmm_add_final_kfree(). We take care of that in the kref "release" callback when all refs are down to 0, via drm_dev_put(), i.e. the free is implicit. Remove a superfluous NULL check, since the DRM device to be suspended always exists, so long as the underlying PCI and DRM devices exist. 
Luben Tuikov (3): drm: No warn for drivers who provide release drm/amdgpu: Remove drmm final free drm/amdgpu: Remove superfluous NULL check drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 2 -- drivers/gpu/drm/drm_drv.c | 3 ++- 3 files changed, 2 insertions(+), 6 deletions(-) -- 2.28.0.394.ge197136389 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] drm/ttm: Schedule out if possible in bo delayed delete worker
> On 2020-04-09 at 22:59, Koenig, Christian wrote:
>
>> Why do we break out of the loop when there are pending bos to be released?
>
> We do this anyway if we can't acquire the necessary locks. Freeing already deleted BOs is just very lazy background work.

That is true. Eviction will reclaim the BO resource too.

>> So this patch does not break anything, I think.
>
> Oh, the patch will certainly work. I'm just not sure if it's the ideal behavior.
>
>> https://elixir.bootlin.com/linux/latest/source/mm/slab.c#L4026
>>
>> This is another example of the usage of cond_resched.
>
> Yes, and that is also a good example of what I mean here:
>
>> if (!mutex_trylock(&slab_mutex))
>> 	/* Give up. Setup the next iteration. */
>> 	goto out;
>
> If the function can't acquire the lock immediately it gives up and waits for the next iteration.
>
> I think it would be better if we do this in TTM as well if we spend too much time cleaning up old BOs.

Fair enough.

> On the other hand you are right that cond_resched() has the advantage that we could spend more time on cleaning up old BOs if there is nothing else for the CPU to do.
>
> Regards,
> Christian.
>
> On 09.04.20 at 16:24, Pan, Xinhui wrote:
>> https://elixir.bootlin.com/linux/latest/source/mm/slab.c#L4026
>>
>> This is another example of the usage of cond_resched.
>> From: Pan, Xinhui
>> Sent: Thursday, April 9, 2020 10:11:08 PM
>> To: Lucas Stach ; amd-...@lists.freedesktop.org ; Koenig, Christian
>> Cc: dri-devel@lists.freedesktop.org
>> Subject: Re: [PATCH] drm/ttm: Schedule out if possible in bo delayed delete worker
>>
>> I think it doesn't matter if the work item schedules out. Even if we did not schedule out, the workqueue itself will schedule out later.
>> So this patch does not break anything, I think.
>> From: Pan, Xinhui
>> Sent: Thursday, April 9, 2020 10:07:09 PM
>> To: Lucas Stach ; amd-...@lists.freedesktop.org ; Koenig, Christian
>> Cc: dri-devel@lists.freedesktop.org
>> Subject: Re: [PATCH] drm/ttm: Schedule out if possible in bo delayed delete worker
>>
>> Why do we break out of the loop when there are pending bos to be released?
>>
>> And I just checked process_one_work. Right after the work item callback is called, the workqueue itself will call cond_resched. So I think
>> From: Koenig, Christian
>> Sent: Thursday, April 9, 2020 9:38:24 PM
>> To: Lucas Stach ; Pan, Xinhui ; amd-...@lists.freedesktop.org
>> Cc: dri-devel@lists.freedesktop.org
>> Subject: Re: [PATCH] drm/ttm: Schedule out if possible in bo delayed delete worker
>>
>> On 09.04.20 at 15:25, Lucas Stach wrote:
>> > On Thursday, 2020-04-09 at 14:35 +0200, Christian König wrote:
>> >> On 09.04.20 at 03:31, xinhui pan wrote:
>> >>> The delayed delete list is per device, which might be very huge. And in
>> >>> a heavy workload test, the list might never become empty. That will
>> >>> trigger RCU stall warnings or softlockups in non-preemptible kernels.
>> >>> Let's schedule out if possible in that case.
>> >> Mhm, I'm not sure if that is actually allowed. This is called from a
>> >> work item and those are not really supposed to be scheduled away.
>> > Huh? Work items can schedule out just fine, otherwise they would be
>> > horribly broken when it comes to sleeping locks.
>>
>> Let me refine the sentence: Work items are not really supposed to be scheduled away purposely. E.g. you shouldn't call schedule() or cond_resched() like in the case here.
>>
>> Getting scheduled away because we wait for a lock is of course perfectly fine.
>>
>> > The workqueue code
>> > even has measures to keep the workqueues at the expected concurrency
>> > level by starting other workitems when one of them goes to sleep.
>> Yeah, and exactly that's what I would say we should avoid here :)
>>
>> In other words work items can be scheduled away, but they should not be if not really necessary (e.g. waiting for a lock).
>>
>> Otherwise, as you said, new threads for work item processing are started up, and I don't think we want that.
>>
>> Just returning from the work item and waiting for the next cycle is most likely the better option.
>>
>> Regards,
>> Christian.
>>
>> >
>> > Regards,
>> > Lucas
Re: [PATCH] drm/ttm: Schedule out if possible in bo delayed delete worker
https://elixir.bootlin.com/linux/latest/source/mm/slab.c#L4026

This is another example of the usage of cond_resched.

From: Pan, Xinhui
Sent: Thursday, April 9, 2020 10:11:08 PM
To: Lucas Stach ; amd-...@lists.freedesktop.org ; Koenig, Christian
Cc: dri-devel@lists.freedesktop.org
Subject: Re: [PATCH] drm/ttm: Schedule out if possible in bo delayed delete worker

I think it doesn't matter if the work item schedules out. Even if we did not schedule out, the workqueue itself will schedule out later.
So this patch does not break anything, I think.

From: Pan, Xinhui
Sent: Thursday, April 9, 2020 10:07:09 PM
To: Lucas Stach ; amd-...@lists.freedesktop.org ; Koenig, Christian
Cc: dri-devel@lists.freedesktop.org
Subject: Re: [PATCH] drm/ttm: Schedule out if possible in bo delayed delete worker

Why do we break out of the loop when there are pending bos to be released?

And I just checked process_one_work. Right after the work item callback is called, the workqueue itself will call cond_resched. So I think

From: Koenig, Christian
Sent: Thursday, April 9, 2020 9:38:24 PM
To: Lucas Stach ; Pan, Xinhui ; amd-...@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Subject: Re: [PATCH] drm/ttm: Schedule out if possible in bo delayed delete worker

On 09.04.20 at 15:25, Lucas Stach wrote:
> On Thursday, 2020-04-09 at 14:35 +0200, Christian König wrote:
>> On 09.04.20 at 03:31, xinhui pan wrote:
>>> The delayed delete list is per device, which might be very huge. And in
>>> a heavy workload test, the list might never become empty. That will
>>> trigger RCU stall warnings or softlockups in non-preemptible kernels.
>>> Let's schedule out if possible in that case.
>> Mhm, I'm not sure if that is actually allowed. This is called from a
>> work item and those are not really supposed to be scheduled away.
> Huh? Work items can schedule out just fine, otherwise they would be
> horribly broken when it comes to sleeping locks.
Let me refine the sentence: Work items are not really supposed to be scheduled away purposely. E.g. you shouldn't call schedule() or cond_resched() like in the case here.

Getting scheduled away because we wait for a lock is of course perfectly fine.

> The workqueue code
> even has measures to keep the workqueues at the expected concurrency
> level by starting other workitems when one of them goes to sleep.

Yeah, and exactly that's what I would say we should avoid here :)

In other words work items can be scheduled away, but they should not be if not really necessary (e.g. waiting for a lock).

Otherwise, as you said, new threads for work item processing are started up, and I don't think we want that.

Just returning from the work item and waiting for the next cycle is most likely the better option.

Regards,
Christian.

>
> Regards,
> Lucas
>
>> Maybe rather change the while into while (!list_empty(&bdev->ddestroy) && !should_reschedule(0)).
>>
>> Christian.
>>
>>> Signed-off-by: xinhui pan
>>> ---
>>>  drivers/gpu/drm/ttm/ttm_bo.c | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>>> index 9e07c3f75156..b8d853cab33b 100644
>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>> @@ -541,6 +541,7 @@ static bool ttm_bo_delayed_delete(struct ttm_bo_device *bdev, bool remove_all)
>>>  		}
>>>
>>>  		ttm_bo_put(bo);
>>> +		cond_resched();
>>>  		spin_lock(&glob->lru_lock);
>>>  	}
>>>  	list_splice_tail(&removed, &bdev->ddestroy);
Re: [PATCH] drm/ttm: Schedule out if possible in bo delayed delete worker
I think it doesn't matter if the work item schedules out. Even if we did not schedule out, the workqueue itself will schedule out later.
So this patch does not break anything, I think.

From: Pan, Xinhui
Sent: Thursday, April 9, 2020 10:07:09 PM
To: Lucas Stach ; amd-...@lists.freedesktop.org ; Koenig, Christian
Cc: dri-devel@lists.freedesktop.org
Subject: Re: [PATCH] drm/ttm: Schedule out if possible in bo delayed delete worker

Why do we break out of the loop when there are pending bos to be released?

And I just checked process_one_work. Right after the work item callback is called, the workqueue itself will call cond_resched. So I think

From: Koenig, Christian
Sent: Thursday, April 9, 2020 9:38:24 PM
To: Lucas Stach ; Pan, Xinhui ; amd-...@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Subject: Re: [PATCH] drm/ttm: Schedule out if possible in bo delayed delete worker

On 09.04.20 at 15:25, Lucas Stach wrote:
> On Thursday, 2020-04-09 at 14:35 +0200, Christian König wrote:
>> On 09.04.20 at 03:31, xinhui pan wrote:
>>> The delayed delete list is per device, which might be very huge. And in
>>> a heavy workload test, the list might never become empty. That will
>>> trigger RCU stall warnings or softlockups in non-preemptible kernels.
>>> Let's schedule out if possible in that case.
>> Mhm, I'm not sure if that is actually allowed. This is called from a
>> work item and those are not really supposed to be scheduled away.
> Huh? Work items can schedule out just fine, otherwise they would be
> horribly broken when it comes to sleeping locks.

Let me refine the sentence: Work items are not really supposed to be scheduled away purposely. E.g. you shouldn't call schedule() or cond_resched() like in the case here.

Getting scheduled away because we wait for a lock is of course perfectly fine.

> The workqueue code
> even has measures to keep the workqueues at the expected concurrency
> level by starting other workitems when one of them goes to sleep.
Yeah, and exactly that's what I would say we should avoid here :)

In other words work items can be scheduled away, but they should not be if not really necessary (e.g. waiting for a lock).

Otherwise, as you said, new threads for work item processing are started up, and I don't think we want that.

Just returning from the work item and waiting for the next cycle is most likely the better option.

Regards,
Christian.

>
> Regards,
> Lucas
>
>> Maybe rather change the while into while (!list_empty(&bdev->ddestroy) && !should_reschedule(0)).
>>
>> Christian.
>>
>>> Signed-off-by: xinhui pan
>>> ---
>>>  drivers/gpu/drm/ttm/ttm_bo.c | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>>> index 9e07c3f75156..b8d853cab33b 100644
>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>> @@ -541,6 +541,7 @@ static bool ttm_bo_delayed_delete(struct ttm_bo_device *bdev, bool remove_all)
>>>  		}
>>>
>>>  		ttm_bo_put(bo);
>>> +		cond_resched();
>>>  		spin_lock(&glob->lru_lock);
>>>  	}
>>>  	list_splice_tail(&removed, &bdev->ddestroy);
Re: [PATCH] drm/ttm: Schedule out if possible in bo delayed delete worker
[AMD Official Use Only - Internal Distribution Only]

Why do we break out of the loop when there are pending bos to be released?

And I just checked process_one_work. Right after the work item callback is called, the workqueue itself will call cond_resched. So I think

From: Koenig, Christian
Sent: Thursday, April 9, 2020 9:38:24 PM
To: Lucas Stach ; Pan, Xinhui ; amd-...@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Subject: Re: [PATCH] drm/ttm: Schedule out if possible in bo delayed delete worker

On 09.04.20 at 15:25, Lucas Stach wrote:
> On Thursday, 2020-04-09 at 14:35 +0200, Christian König wrote:
>> On 09.04.20 at 03:31, xinhui pan wrote:
>>> The delayed delete list is per device, which might be very huge. And in
>>> a heavy workload test, the list might never become empty. That will
>>> trigger RCU stall warnings or softlockups in non-preemptible kernels.
>>> Let's schedule out if possible in that case.
>> Mhm, I'm not sure if that is actually allowed. This is called from a
>> work item and those are not really supposed to be scheduled away.
> Huh? Work items can schedule out just fine, otherwise they would be
> horribly broken when it comes to sleeping locks.

Let me refine the sentence: Work items are not really supposed to be scheduled away purposely. E.g. you shouldn't call schedule() or cond_resched() like in the case here.

Getting scheduled away because we wait for a lock is of course perfectly fine.

> The workqueue code
> even has measures to keep the workqueues at the expected concurrency
> level by starting other workitems when one of them goes to sleep.

Yeah, and exactly that's what I would say we should avoid here :)

In other words work items can be scheduled away, but they should not be if not really necessary (e.g. waiting for a lock).

Otherwise, as you said, new threads for work item processing are started up, and I don't think we want that.

Just returning from the work item and waiting for the next cycle is most likely the better option.
Regards,
Christian.

>
> Regards,
> Lucas
>
>> Maybe rather change the while into while (!list_empty(&bdev->ddestroy) && !should_reschedule(0)).
>>
>> Christian.
>>
>>> Signed-off-by: xinhui pan
>>> ---
>>>  drivers/gpu/drm/ttm/ttm_bo.c | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>>> index 9e07c3f75156..b8d853cab33b 100644
>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>> @@ -541,6 +541,7 @@ static bool ttm_bo_delayed_delete(struct ttm_bo_device *bdev, bool remove_all)
>>>  		}
>>>
>>>  		ttm_bo_put(bo);
>>> +		cond_resched();
>>>  		spin_lock(&glob->lru_lock);
>>>  	}
>>>  	list_splice_tail(&removed, &bdev->ddestroy);
Re: [PATCH] dma-buf: Fix missing excl fence waiting
> On 2020-02-28 at 13:45, panxinhui wrote:
>
>> On 2020-02-26 at 03:11, Koenig, Christian wrote:
>>
>> On 25.02.20 at 18:23, Daniel Vetter wrote:
>>> On Sun, Feb 23, 2020 at 01:04:15PM +0100, Christian König wrote:
>>>> On 23.02.20 at 12:56, Pan, Xinhui wrote:
>>>>> If the shared fence list is not empty, even when we want to test all fences, the excl fence is ignored.
>>>>> That is obviously wrong, so fix it.
>>>> Yeah that is a known issue and I completely agree with you, but others disagree.
>>>>
>>>> See the shared fences are meant to depend on the exclusive fence. So all shared fences must finish only after the exclusive one has finished as well.
>>>>
>>>> The problem now is that for error handling this isn't necessarily true. In other words when a shared fence completes with an error it is perfectly possible that it does this before the exclusive fence is finished.
>>>>
>>>> I'm trying to convince Daniel that this is a problem for years :)
>>> I thought the consensus is that reasonable gpu schedulers and gpu reset code should try to make really, really sure it only completes stuff in sequence? That's at least my take away from the syncobj timeline discussion, where you convinced me we shouldn't just crash
>>>
>>> I think as long as your scheduler is competent and your gpu reset tries to limit damage (i.e. kill offending ctx terminally, mark everything else that didn't complete for re-running) we should end up with everything completing in sequence. I guess if you do kill a lot more stuff, then you'd have to push these through your scheduler as dummy jobs, i.e. they still wait for their dependencies, but then all they do is set the dma_fence error and complete it. Maybe something the common scheduler could do.
>>
>> Yes, that's exactly how we currently implement it. But I still think that this is not necessarily the best approach :)
>>
>> Anyway Xinhui's problem turned out to be deeper.
We somehow add an old stale >> fence to the dma_resv object sometimes and that can result in quite a bunch >> of problems. >> >> I'm currently trying to hunt down what's going wrong here in more detail. > > got some backtrace below. > > add excl fence: > > <4>[ 1203.904748] ttm_bo_pipeline_move+0x74/0x368 [ttm] > <4>[ 1203.904809] amdgpu_move_blit.isra.8+0xf4/0x108 [amdgpu] > <4>[ 1203.904870] amdgpu_bo_move+0x88/0x208 [amdgpu] > <4>[ 1203.904881] ttm_bo_handle_move_mem+0x250/0x498 [ttm] > <4>[ 1203.904888] ttm_bo_evict+0x12c/0x1c8 [ttm] > <4>[ 1203.904895] ttm_mem_evict_first+0x1d0/0x2c8 [ttm] > <4>[ 1203.904903] ttm_bo_mem_space+0x2f4/0x498 [ttm] > <4>[ 1203.904913] ttm_bo_validate+0xdc/0x168 [ttm] > <4>[ 1203.904975] amdgpu_cs_bo_validate+0xb0/0x230 [amdgpu] > <4>[ 1203.905038] amdgpu_cs_validate+0x60/0x2b8 [amdgpu] > <4>[ 1203.905099] amdgpu_cs_list_validate+0xb8/0x1a8 [amdgpu] > <4>[ 1203.905161] amdgpu_cs_ioctl+0x12b0/0x1598 [amdgpu] > <4>[ 1203.905186] drm_ioctl_kernel+0x94/0x118 [drm] > <4>[ 1203.905210] drm_ioctl+0x1f0/0x438 [drm] > <4>[ 1203.905271] amdgpu_drm_ioctl+0x58/0x90 [amdgpu] > <4>[ 1203.905275] do_vfs_ioctl+0xc4/0x8c0 > <4>[ 1203.905279] ksys_ioctl+0x8c/0xa0 > > add shared fence: > > <4>[ 1203.905349] amdgpu_bo_fence+0x6c/0x80 [amdgpu] > <4>[ 1203.905410] amdgpu_gem_object_close+0x194/0x1d0 [amdgpu] > <4>[ 1203.905435] drm_gem_object_release_handle+0x3c/0x98 [drm] > <4>[ 1203.905438] idr_for_each+0x70/0x128 > <4>[ 1203.905463] drm_gem_release+0x30/0x48 [drm] > <4>[ 1203.905486] drm_file_free.part.0+0x258/0x2f0 [drm] > <4>[ 1203.905511] drm_release+0x9c/0xe0 [drm] > <4>[ 1203.905514] __fput+0xac/0x218 > <4>[ 1203.905518] fput+0x20/0x30 > <4>[ 1203.905521] task_work_run+0xb8/0xf0 > <4>[ 1203.905523] do_exit+0x398/0xaf8 > <4>[ 1203.905525] do_group_exit+0x3c/0xd0 > <4>[ 1203.905527] get_signal+0xec/0x740 > <4>[ 1203.905529] do_signal+0x88/0x288 > <4>[ 1203.905531] do_notify_resume+0xd8/0x130 > <4
Re: [PATCH] dma-buf: Fix missing excl fence waiting
> On 2020-02-26 at 03:11, Koenig, Christian wrote:
>
> On 25.02.20 at 18:23, Daniel Vetter wrote:
>> On Sun, Feb 23, 2020 at 01:04:15PM +0100, Christian König wrote:
>>> On 23.02.20 at 12:56, Pan, Xinhui wrote:
>>>> If the shared fence list is not empty, even when we want to test all fences, the excl fence is ignored.
>>>> That is obviously wrong, so fix it.
>>> Yeah that is a known issue and I completely agree with you, but others disagree.
>>>
>>> See the shared fences are meant to depend on the exclusive fence. So all shared fences must finish only after the exclusive one has finished as well.
>>>
>>> The problem now is that for error handling this isn't necessarily true. In other words when a shared fence completes with an error it is perfectly possible that it does this before the exclusive fence is finished.
>>>
>>> I'm trying to convince Daniel that this is a problem for years :)
>> I thought the consensus is that reasonable gpu schedulers and gpu reset code should try to make really, really sure it only completes stuff in sequence? That's at least my take away from the syncobj timeline discussion, where you convinced me we shouldn't just crash
>>
>> I think as long as your scheduler is competent and your gpu reset tries to limit damage (i.e. kill offending ctx terminally, mark everything else that didn't complete for re-running) we should end up with everything completing in sequence. I guess if you do kill a lot more stuff, then you'd have to push these through your scheduler as dummy jobs, i.e. they still wait for their dependencies, but then all they do is set the dma_fence error and complete it. Maybe something the common scheduler could do.
>
> Yes, that's exactly how we currently implement it. But I still think that this is not necessarily the best approach :)
>
> Anyway Xinhui's problem turned out to be deeper.
> We somehow add an old stale fence to the dma_resv object sometimes and that can result in quite a bunch of problems.
>
> I'm currently trying to hunt down what's going wrong here in more detail.

Got some backtraces below.

Add excl fence:

<4>[ 1203.904748] ttm_bo_pipeline_move+0x74/0x368 [ttm]
<4>[ 1203.904809] amdgpu_move_blit.isra.8+0xf4/0x108 [amdgpu]
<4>[ 1203.904870] amdgpu_bo_move+0x88/0x208 [amdgpu]
<4>[ 1203.904881] ttm_bo_handle_move_mem+0x250/0x498 [ttm]
<4>[ 1203.904888] ttm_bo_evict+0x12c/0x1c8 [ttm]
<4>[ 1203.904895] ttm_mem_evict_first+0x1d0/0x2c8 [ttm]
<4>[ 1203.904903] ttm_bo_mem_space+0x2f4/0x498 [ttm]
<4>[ 1203.904913] ttm_bo_validate+0xdc/0x168 [ttm]
<4>[ 1203.904975] amdgpu_cs_bo_validate+0xb0/0x230 [amdgpu]
<4>[ 1203.905038] amdgpu_cs_validate+0x60/0x2b8 [amdgpu]
<4>[ 1203.905099] amdgpu_cs_list_validate+0xb8/0x1a8 [amdgpu]
<4>[ 1203.905161] amdgpu_cs_ioctl+0x12b0/0x1598 [amdgpu]
<4>[ 1203.905186] drm_ioctl_kernel+0x94/0x118 [drm]
<4>[ 1203.905210] drm_ioctl+0x1f0/0x438 [drm]
<4>[ 1203.905271] amdgpu_drm_ioctl+0x58/0x90 [amdgpu]
<4>[ 1203.905275] do_vfs_ioctl+0xc4/0x8c0
<4>[ 1203.905279] ksys_ioctl+0x8c/0xa0

Add shared fence:

<4>[ 1203.905349] amdgpu_bo_fence+0x6c/0x80 [amdgpu]
<4>[ 1203.905410] amdgpu_gem_object_close+0x194/0x1d0 [amdgpu]
<4>[ 1203.905435] drm_gem_object_release_handle+0x3c/0x98 [drm]
<4>[ 1203.905438] idr_for_each+0x70/0x128
<4>[ 1203.905463] drm_gem_release+0x30/0x48 [drm]
<4>[ 1203.905486] drm_file_free.part.0+0x258/0x2f0 [drm]
<4>[ 1203.905511] drm_release+0x9c/0xe0 [drm]
<4>[ 1203.905514] __fput+0xac/0x218
<4>[ 1203.905518] fput+0x20/0x30
<4>[ 1203.905521] task_work_run+0xb8/0xf0
<4>[ 1203.905523] do_exit+0x398/0xaf8
<4>[ 1203.905525] do_group_exit+0x3c/0xd0
<4>[ 1203.905527] get_signal+0xec/0x740
<4>[ 1203.905529] do_signal+0x88/0x288
<4>[ 1203.905531] do_notify_resume+0xd8/0x130
<4>[ 1203.905533] work_pending+0x8/0x10

We are using kernel 4.19.104.
The problem is that eviction of PT/PD submits one job and adds an excl fence to bo->resv. And if the application gets killed, amdgpu_gem_object_close will try to clear the PT/PD and submits one job. I took a look at the code: it syncs root.base.bo->resv and adds the fence to the bo as shared. So the fence used to clear the PT/PD does not actually sync on bo->resv. amdgpu_vm_bo_update_mapping
Re: [PATCH] dma-buf: Fix missing excl fence waiting
> On 2020-02-23 at 20:04, Koenig, Christian wrote:
>
> On 23.02.20 at 12:56, Pan, Xinhui wrote:
>> If the shared fence list is not empty, even when we want to test all fences, the excl fence is ignored.
>> That is obviously wrong, so fix it.
>
> Yeah that is a known issue and I completely agree with you, but others disagree.
>
> See the shared fences are meant to depend on the exclusive fence. So all shared fences must finish only after the exclusive one has finished as well.

Fair enough.

> The problem now is that for error handling this isn't necessarily true. In other words when a shared fence completes with an error it is perfectly possible that it does this before the exclusive fence is finished.
>
> I'm trying to convince Daniel that this is a problem for years :)

I have met problems: eviction races with bo release, and system memory gets overwritten by sDMA. The kernel is 4.19, a stable one, LOL.

amdgpu adds an excl fence to the bo to move system memory, which is done by the drm scheduler. After sDMA finishes the moving job, the memory might have already been released, as dma_resv_test_signaled_rcu did not check the excl fence.

Our local customer reported this issue. I spent 4 days on it. Sigh.

thanks
xinhui

> Regards,
> Christian.
>
>> Signed-off-by: xinhui pan
>> ---
>>  drivers/dma-buf/dma-resv.c | 9 +++++----
>>  1 file changed, 5 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
>> index 4264e64788c4..44dc64c547c6 100644
>> --- a/drivers/dma-buf/dma-resv.c
>> +++ b/drivers/dma-buf/dma-resv.c
>> @@ -632,14 +632,14 @@ static inline int dma_resv_test_signaled_single(struct dma_fence *passed_fence)
>>   */
>>  bool dma_resv_test_signaled_rcu(struct dma_resv *obj, bool test_all)
>>  {
>> -	unsigned seq, shared_count;
>> +	unsigned int seq, shared_count, left;
>>  	int ret;
>>
>>  	rcu_read_lock();
>>  retry:
>>  	ret = true;
>>  	shared_count = 0;
>> -	seq = read_seqcount_begin(&obj->seq);
>> +	left = seq = read_seqcount_begin(&obj->seq);
>>
>>  	if (test_all) {
>>  		unsigned i;
>> @@ -647,7 +647,7 @@ bool dma_resv_test_signaled_rcu(struct dma_resv *obj, bool test_all)
>>  		struct dma_resv_list *fobj = rcu_dereference(obj->fence);
>>
>>  		if (fobj)
>> -			shared_count = fobj->shared_count;
>> +			left = shared_count = fobj->shared_count;
>>
>>  		for (i = 0; i < shared_count; ++i) {
>>  			struct dma_fence *fence = rcu_dereference(fobj->shared[i]);
>> @@ -657,13 +657,14 @@ bool dma_resv_test_signaled_rcu(struct dma_resv *obj, bool test_all)
>>  				goto retry;
>>  			else if (!ret)
>>  				break;
>> +			left--;
>>  		}
>>
>>  		if (read_seqcount_retry(&obj->seq, seq))
>>  			goto retry;
>>  	}
>>
>> -	if (!shared_count) {
>> +	if (!left) {
>>  		struct dma_fence *fence_excl = rcu_dereference(obj->fence_excl);
>>
>>  		if (fence_excl) {
>> --
>> 2.21.0 (Apple Git-122)
[PATCH V2] dma-buf: Fix missing excl fence waiting
If the shared fence list is not empty, even when we want to test all fences, the excl fence is ignored. That is obviously wrong, so fix it.

Signed-off-by: xinhui pan
---
change from v1: init left correctly
---
 drivers/dma-buf/dma-resv.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 4264e64788c4..82e4b4f63bef 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -632,13 +632,13 @@ static inline int dma_resv_test_signaled_single(struct dma_fence *passed_fence)
  */
 bool dma_resv_test_signaled_rcu(struct dma_resv *obj, bool test_all)
 {
-	unsigned seq, shared_count;
+	unsigned int seq, shared_count, left;
 	int ret;
 
 	rcu_read_lock();
 retry:
 	ret = true;
-	shared_count = 0;
+	left = shared_count = 0;
 	seq = read_seqcount_begin(&obj->seq);
 
 	if (test_all) {
@@ -647,7 +647,7 @@ bool dma_resv_test_signaled_rcu(struct dma_resv *obj, bool test_all)
 		struct dma_resv_list *fobj = rcu_dereference(obj->fence);
 
 		if (fobj)
-			shared_count = fobj->shared_count;
+			left = shared_count = fobj->shared_count;
 
 		for (i = 0; i < shared_count; ++i) {
 			struct dma_fence *fence = rcu_dereference(fobj->shared[i]);
@@ -657,13 +657,14 @@ bool dma_resv_test_signaled_rcu(struct dma_resv *obj, bool test_all)
 				goto retry;
 			else if (!ret)
 				break;
+			left--;
 		}
 
 		if (read_seqcount_retry(&obj->seq, seq))
 			goto retry;
 	}
 
-	if (!shared_count) {
+	if (!left) {
 		struct dma_fence *fence_excl = rcu_dereference(obj->fence_excl);
 
 		if (fence_excl) {
-- 
2.21.0 (Apple Git-122)
Re: [PATCH] dma-buf: Fix missing excl fence waiting
Sorry, I pasted a wrong patch. Will send out v2.

> On 2020-02-23 at 19:56, Pan, Xinhui wrote:
>
> If the shared fence list is not empty, even when we want to test all fences, the excl fence is ignored.
> That is obviously wrong, so fix it.
>
> Signed-off-by: xinhui pan
> ---
>  drivers/dma-buf/dma-resv.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
> index 4264e64788c4..44dc64c547c6 100644
> --- a/drivers/dma-buf/dma-resv.c
> +++ b/drivers/dma-buf/dma-resv.c
> @@ -632,14 +632,14 @@ static inline int dma_resv_test_signaled_single(struct dma_fence *passed_fence)
>   */
>  bool dma_resv_test_signaled_rcu(struct dma_resv *obj, bool test_all)
>  {
> -	unsigned seq, shared_count;
> +	unsigned int seq, shared_count, left;
>  	int ret;
>
>  	rcu_read_lock();
>  retry:
>  	ret = true;
>  	shared_count = 0;
> -	seq = read_seqcount_begin(&obj->seq);
> +	left = seq = read_seqcount_begin(&obj->seq);
>
>  	if (test_all) {
>  		unsigned i;
> @@ -647,7 +647,7 @@ bool dma_resv_test_signaled_rcu(struct dma_resv *obj, bool test_all)
>  		struct dma_resv_list *fobj = rcu_dereference(obj->fence);
>
>  		if (fobj)
> -			shared_count = fobj->shared_count;
> +			left = shared_count = fobj->shared_count;
>
>  		for (i = 0; i < shared_count; ++i) {
>  			struct dma_fence *fence = rcu_dereference(fobj->shared[i]);
> @@ -657,13 +657,14 @@ bool dma_resv_test_signaled_rcu(struct dma_resv *obj, bool test_all)
>  				goto retry;
>  			else if (!ret)
>  				break;
> +			left--;
>  		}
>
>  		if (read_seqcount_retry(&obj->seq, seq))
>  			goto retry;
>  	}
>
> -	if (!shared_count) {
> +	if (!left) {
>  		struct dma_fence *fence_excl = rcu_dereference(obj->fence_excl);
>
>  		if (fence_excl) {
> --
> 2.21.0 (Apple Git-122)
[PATCH] dma-buf: Fix missing excl fence waiting
If the shared fence list is not empty, even when we want to test all fences, the excl fence is ignored. That is obviously wrong, so fix it.

Signed-off-by: xinhui pan
---
 drivers/dma-buf/dma-resv.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 4264e64788c4..44dc64c547c6 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -632,14 +632,14 @@ static inline int dma_resv_test_signaled_single(struct dma_fence *passed_fence)
  */
 bool dma_resv_test_signaled_rcu(struct dma_resv *obj, bool test_all)
 {
-	unsigned seq, shared_count;
+	unsigned int seq, shared_count, left;
 	int ret;
 
 	rcu_read_lock();
 retry:
 	ret = true;
 	shared_count = 0;
-	seq = read_seqcount_begin(&obj->seq);
+	left = seq = read_seqcount_begin(&obj->seq);
 
 	if (test_all) {
 		unsigned i;
@@ -647,7 +647,7 @@ bool dma_resv_test_signaled_rcu(struct dma_resv *obj, bool test_all)
 		struct dma_resv_list *fobj = rcu_dereference(obj->fence);
 
 		if (fobj)
-			shared_count = fobj->shared_count;
+			left = shared_count = fobj->shared_count;
 
 		for (i = 0; i < shared_count; ++i) {
 			struct dma_fence *fence = rcu_dereference(fobj->shared[i]);
@@ -657,13 +657,14 @@ bool dma_resv_test_signaled_rcu(struct dma_resv *obj, bool test_all)
 				goto retry;
 			else if (!ret)
 				break;
+			left--;
 		}
 
 		if (read_seqcount_retry(&obj->seq, seq))
 			goto retry;
 	}
 
-	if (!shared_count) {
+	if (!left) {
 		struct dma_fence *fence_excl = rcu_dereference(obj->fence_excl);
 
 		if (fence_excl) {
-- 
2.21.0 (Apple Git-122)
Re: [PATCH] drm/ttm: replace dma_resv object on deleted BOs v3
> 2020年2月13日 18:01,Koenig, Christian 写道: > > Am 13.02.20 um 05:11 schrieb Pan, Xinhui: >> >> >>> 2020年2月12日 19:53,Christian König 写道: >>> >>> Am 12.02.20 um 07:23 schrieb Pan, Xinhui: >>>>> 2020年2月11日 23:43,Christian König 写道: >>>>> >>>>> When non-imported BOs are resurrected for delayed delete we replace >>>>> the dma_resv object to allow for easy reclaiming of the resources. >>>>> >>>>> v2: move that to ttm_bo_individualize_resv >>>>> v3: add a comment to explain what's going on >>>>> >>>>> Signed-off-by: Christian König >>>>> Reviewed-by: xinhui pan >>>>> --- >>>>> drivers/gpu/drm/ttm/ttm_bo.c | 14 +- >>>>> 1 file changed, 13 insertions(+), 1 deletion(-) >>>>> >>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c >>>>> index bfc42a9e4fb4..8174603d390f 100644 >>>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c >>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c >>>>> @@ -393,6 +393,18 @@ static int ttm_bo_individualize_resv(struct >>>>> ttm_buffer_object *bo) >>>>> >>>>> r = dma_resv_copy_fences(>base._resv, bo->base.resv); >>>>> dma_resv_unlock(>base._resv); >>>>> + if (r) >>>>> + return r; >>>>> + >>>>> + if (bo->type != ttm_bo_type_sg) { >>>>> + /* This works because the BO is about to be destroyed and nobody >>>>> + * reference it any more. The only tricky case is the trylock on >>>>> + * the resv object while holding the lru_lock. >>>>> + */ >>>>> + spin_lock(_bo_glob.lru_lock); >>>>> + bo->base.resv = >base._resv; >>>>> + spin_unlock(_bo_glob.lru_lock); >>>>> + } >>>>> >>>> how about something like that. >>>> the basic idea is to do the bo cleanup work in bo release first and avoid >>>> any race with evict. >>>> As in bo dieing progress, evict also just do bo cleanup work. >>>> >>>> If bo is busy, neither bo_release nor evict can do cleanupwork on it. >>>> For the bo release case, we just add bo back to lru list. >>>> So we can clean it up both in workqueue and shrinker as the past way did. 
>>>> >>>> @@ -405,8 +405,9 @@ static int ttm_bo_individualize_resv(struct >>>> ttm_buffer_object *bo) >>>> if (bo->type != ttm_bo_type_sg) { >>>> spin_lock(_bo_glob.lru_lock); >>>> - bo->base.resv = >base._resv; >>>> + ttm_bo_del_from_lru(bo); >>>> spin_unlock(_bo_glob.lru_lock); >>>> + bo->base.resv = >base._resv; >>>> } >>>> return r; >>>> @@ -606,10 +607,9 @@ static void ttm_bo_release(struct kref *kref) >>>> * shrinkers, now that they are queued for >>>> * destruction. >>>> */ >>>> - if (bo->mem.placement & TTM_PL_FLAG_NO_EVICT) { >>>> + if (bo->mem.placement & TTM_PL_FLAG_NO_EVICT) >>>> bo->mem.placement &= ~TTM_PL_FLAG_NO_EVICT; >>>> - ttm_bo_move_to_lru_tail(bo, NULL); >>>> - } >>>> + ttm_bo_add_mem_to_lru(bo, >mem); >>>> kref_init(>kref); >>>> list_add_tail(>ddestroy, >ddestroy); >>> Yeah, thought about that as well. But this has the major drawback that the >>> deleted BO moves to the end of the LRU, which is something we don't want. >> well, as the bo is busy, looks like it needs time to being idle. putting it >> to tail seems fair. > > No, see BOs should move to the tail of the LRU whenever they are used. > Freeing up a BO is basically the opposite of using it. > > So what would happen on the next memory contention is that the MM would evict > BOs which are still used and only after come to the delete BO which could > have been removed long ago. > >>> I think the real solution to this problem is to go a completely different >>> way and remove the delayed delete feature from TTM altogether. Instead this >>> should be part of some DRM domain handler component. >>> >> yes, completely agree. As long as we can shrink bos when OOM, the w
Re: [PATCH] drm/ttm: replace dma_resv object on deleted BOs v3
> 2020年2月12日 19:53,Christian König 写道: > > Am 12.02.20 um 07:23 schrieb Pan, Xinhui: >> >>> 2020年2月11日 23:43,Christian König 写道: >>> >>> When non-imported BOs are resurrected for delayed delete we replace >>> the dma_resv object to allow for easy reclaiming of the resources. >>> >>> v2: move that to ttm_bo_individualize_resv >>> v3: add a comment to explain what's going on >>> >>> Signed-off-by: Christian König >>> Reviewed-by: xinhui pan >>> --- >>> drivers/gpu/drm/ttm/ttm_bo.c | 14 +- >>> 1 file changed, 13 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c >>> index bfc42a9e4fb4..8174603d390f 100644 >>> --- a/drivers/gpu/drm/ttm/ttm_bo.c >>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c >>> @@ -393,6 +393,18 @@ static int ttm_bo_individualize_resv(struct >>> ttm_buffer_object *bo) >>> >>> r = dma_resv_copy_fences(>base._resv, bo->base.resv); >>> dma_resv_unlock(>base._resv); >>> + if (r) >>> + return r; >>> + >>> + if (bo->type != ttm_bo_type_sg) { >>> + /* This works because the BO is about to be destroyed and nobody >>> +* reference it any more. The only tricky case is the trylock on >>> +* the resv object while holding the lru_lock. >>> +*/ >>> + spin_lock(_bo_glob.lru_lock); >>> + bo->base.resv = >base._resv; >>> + spin_unlock(_bo_glob.lru_lock); >>> + } >>> >> how about something like that. >> the basic idea is to do the bo cleanup work in bo release first and avoid >> any race with evict. >> As in bo dieing progress, evict also just do bo cleanup work. >> >> If bo is busy, neither bo_release nor evict can do cleanupwork on it. For >> the bo release case, we just add bo back to lru list. >> So we can clean it up both in workqueue and shrinker as the past way did. 
>> >> @@ -405,8 +405,9 @@ static int ttm_bo_individualize_resv(struct >> ttm_buffer_object *bo) >> if (bo->type != ttm_bo_type_sg) { >> spin_lock(_bo_glob.lru_lock); >> - bo->base.resv = >base._resv; >> + ttm_bo_del_from_lru(bo); >> spin_unlock(_bo_glob.lru_lock); >> + bo->base.resv = >base._resv; >> } >> return r; >> @@ -606,10 +607,9 @@ static void ttm_bo_release(struct kref *kref) >> * shrinkers, now that they are queued for >> * destruction. >> */ >> - if (bo->mem.placement & TTM_PL_FLAG_NO_EVICT) { >> + if (bo->mem.placement & TTM_PL_FLAG_NO_EVICT) >> bo->mem.placement &= ~TTM_PL_FLAG_NO_EVICT; >> - ttm_bo_move_to_lru_tail(bo, NULL); >> - } >> + ttm_bo_add_mem_to_lru(bo, >mem); >> kref_init(>kref); >> list_add_tail(>ddestroy, >ddestroy); > > Yeah, thought about that as well. But this has the major drawback that the > deleted BO moves to the end of the LRU, which is something we don't want. well, as the bo is busy, looks like it needs time to being idle. putting it to tail seems fair. > I think the real solution to this problem is to go a completely different way > and remove the delayed delete feature from TTM altogether. Instead this > should be part of some DRM domain handler component. > yes, completely agree. As long as we can shrink bos when OOM, the workqueue is not necessary, The workqueue does not help in a heavy workload case. Pls see my patches below, I remove the workqueue, and what’s more, we can clearup the bo without lru lock hold. That would reduce the lock contention. I run kfdtest and got a good performance result. > In other words it should not matter if a BO is evicted, moved or freed. > Whenever a piece of memory becomes available again we keep around a fence > which marks the end of using this piece of memory. > > When then somebody asks for new memory we work through the LRU and test if > using a certain piece of memory makes sense or not. 
If we find that a BO > needs to be evicted for this we return a reference to the BO in question to > the upper level handling. > > If we find that we can do the allocation but only with recently freed up > memory we gather the fences and say you can only use the newly allocated > memory after waiting for those. > > HEY! Wait a second! Did I just outline what a potential
Re: [PATCH] drm/ttm: replace dma_resv object on deleted BOs v3
> On Feb 11, 2020, at 23:43, Christian König wrote: > > When non-imported BOs are resurrected for delayed delete we replace > the dma_resv object to allow for easy reclaiming of the resources. > > v2: move that to ttm_bo_individualize_resv > v3: add a comment to explain what's going on > > Signed-off-by: Christian König > Reviewed-by: xinhui pan > --- > drivers/gpu/drm/ttm/ttm_bo.c | 14 +++++++++++++- > 1 file changed, 13 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c > index bfc42a9e4fb4..8174603d390f 100644 > --- a/drivers/gpu/drm/ttm/ttm_bo.c > +++ b/drivers/gpu/drm/ttm/ttm_bo.c > @@ -393,6 +393,18 @@ static int ttm_bo_individualize_resv(struct ttm_buffer_object *bo) > > r = dma_resv_copy_fences(&bo->base._resv, bo->base.resv); > dma_resv_unlock(&bo->base._resv); > + if (r) > + return r; > + > + if (bo->type != ttm_bo_type_sg) { > + /* This works because the BO is about to be destroyed and nobody > + * reference it any more. The only tricky case is the trylock on > + * the resv object while holding the lru_lock. > + */ > + spin_lock(&ttm_bo_glob.lru_lock); > + bo->base.resv = &bo->base._resv; > + spin_unlock(&ttm_bo_glob.lru_lock); > + } > > How about something like this? The basic idea is to do the BO cleanup work in bo_release first and avoid any race with evict, since in the BO dying process evict also just does BO cleanup work. If the BO is busy, neither bo_release nor evict can do cleanup work on it. For the bo_release case, we just add the BO back to the LRU list, so we can clean it up both in the workqueue and in the shrinker, the way we did in the past. @@ -405,8 +405,9 @@ static int ttm_bo_individualize_resv(struct ttm_buffer_object *bo) if (bo->type != ttm_bo_type_sg) { spin_lock(&ttm_bo_glob.lru_lock); - bo->base.resv = &bo->base._resv; + ttm_bo_del_from_lru(bo); spin_unlock(&ttm_bo_glob.lru_lock); + bo->base.resv = &bo->base._resv; } return r; @@ -606,10 +607,9 @@ static void ttm_bo_release(struct kref *kref) * shrinkers, now that they are queued for * destruction. */ - if (bo->mem.placement & TTM_PL_FLAG_NO_EVICT) { + if (bo->mem.placement & TTM_PL_FLAG_NO_EVICT) bo->mem.placement &= ~TTM_PL_FLAG_NO_EVICT; - ttm_bo_move_to_lru_tail(bo, NULL); - } + ttm_bo_add_mem_to_lru(bo, &bo->mem); kref_init(&bo->kref); list_add_tail(&bo->ddestroy, &bdev->ddestroy); thanks xinhui > return r; > } > @@ -724,7 +736,7 @@ static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo, > > if (bo->base.resv == ctx->resv) { > dma_resv_assert_held(bo->base.resv); > - if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT || bo->deleted) > + if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT) > ret = true; > *locked = false; > if (busy) > -- > 2.17.1 > > ___ > amd-gfx mailing list > amd-...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 5/6] drm/ttm: replace dma_resv object on deleted BOs v2
> On Feb 11, 2020, at 22:14, Daniel Vetter wrote: > > On Mon, Feb 10, 2020 at 04:09:06PM +0100, Christian König wrote: >> When non-imported BOs are resurrected for delayed delete we replace >> the dma_resv object to allow for easy reclaiming of the resources. >> >> v2: move that to ttm_bo_individualize_resv >> >> Signed-off-by: Christian König >> --- >> drivers/gpu/drm/ttm/ttm_bo.c | 10 +++++++++- >> 1 file changed, 9 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c >> index d0624685f5d2..4d161038de98 100644 >> --- a/drivers/gpu/drm/ttm/ttm_bo.c >> +++ b/drivers/gpu/drm/ttm/ttm_bo.c >> @@ -393,6 +393,14 @@ static int ttm_bo_individualize_resv(struct ttm_buffer_object *bo) >> >> r = dma_resv_copy_fences(&bo->base._resv, bo->base.resv); >> dma_resv_unlock(&bo->base._resv); >> + if (r) >> + return r; >> + >> + if (bo->type != ttm_bo_type_sg) { >> + spin_lock(&ttm_bo_glob.lru_lock); >> + bo->base.resv = &bo->base._resv; > > Having the dma_resv pointer be protected by the lru_lock for ttm internal > stuff, but invariant everywhere else is really confusing. Not sure that's I think this is readers vs. writers. To keep any internal function from using the old resv, taking an existing spinlock is acceptable. Maybe RCU is better? That would need a lot of effort. Anyway, TTM sucks; we have done a lot of work on it to make it run better on modern systems. > a great idea, I've just chased some ttm code around freaking out about > that.
> -Daniel > >> + spin_unlock(&ttm_bo_glob.lru_lock); >> + } >> >> return r; >> } >> @@ -720,7 +728,7 @@ static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo, >> >> if (bo->base.resv == ctx->resv) { >> dma_resv_assert_held(bo->base.resv); >> - if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT || bo->deleted) >> + if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT) >> ret = true; >> *locked = false; >> if (busy) >> -- >> 2.17.1 >> >> ___ >> dri-devel mailing list >> dri-devel@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/dri-devel > > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch
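The readers-vs-writers question above — a resv pointer that readers dereference locklessly while writers swap it under the lru_lock — is essentially pointer publication. If RCU were used as suggested, the swap would look roughly like this userspace model, where C11 release/acquire atomics stand in for `rcu_assign_pointer()`/`rcu_dereference()` (the `toy_*` types and names are illustrative, not TTM's real structures):

```c
#include <stdatomic.h>

struct toy_resv { int dummy; };

struct toy_bo {
    /* Readers load this without holding the LRU lock; release/acquire
     * ordering plays the role of rcu_assign_pointer()/rcu_dereference(). */
    _Atomic(struct toy_resv *) resv;
    struct toy_resv individualized;  /* stand-in for base._resv */
};

/* Publish the private resv object: any reader that observes the new
 * pointer also observes the fully initialized object behind it. */
static void toy_individualize(struct toy_bo *bo)
{
    atomic_store_explicit(&bo->resv, &bo->individualized,
                          memory_order_release);
}

static struct toy_resv *toy_read_resv(struct toy_bo *bo)
{
    return atomic_load_explicit(&bo->resv, memory_order_acquire);
}
```

The missing piece that makes real RCU "a lot of effort" is grace-period tracking: knowing when the last reader of the old pointer is gone, which the spinlock scheme sidesteps by serializing on the lru_lock.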
Re: [PATCH] drm/ttm: rework BO delayed delete. v2
Reviewed-by: xinhui pan > 2020年2月11日 21:43,Christian König 写道: > > This patch reworks the whole delayed deletion of BOs which aren't idle. > > Instead of having two counters for the BO structure we resurrect the BO > when we find that a deleted BO is not idle yet. > > This has many advantages, especially that we don't need to > increment/decrement the BOs reference counter any more when it > moves on the LRUs. > > v2: remove duplicate ttm_tt_destroy, fix holde lock for LRU move > > Signed-off-by: Christian König > --- > drivers/gpu/drm/ttm/ttm_bo.c | 217 +- > drivers/gpu/drm/ttm/ttm_bo_util.c | 1 - > include/drm/ttm/ttm_bo_api.h | 11 +- > 3 files changed, 97 insertions(+), 132 deletions(-) > > diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c > index 1fbc36f05d89..bfc42a9e4fb4 100644 > --- a/drivers/gpu/drm/ttm/ttm_bo.c > +++ b/drivers/gpu/drm/ttm/ttm_bo.c > @@ -145,26 +145,6 @@ static inline uint32_t ttm_bo_type_flags(unsigned type) > return 1 << (type); > } > > -static void ttm_bo_release_list(struct kref *list_kref) > -{ > - struct ttm_buffer_object *bo = > - container_of(list_kref, struct ttm_buffer_object, list_kref); > - size_t acc_size = bo->acc_size; > - > - BUG_ON(kref_read(>list_kref)); > - BUG_ON(kref_read(>kref)); > - BUG_ON(bo->mem.mm_node != NULL); > - BUG_ON(!list_empty(>lru)); > - BUG_ON(!list_empty(>ddestroy)); > - ttm_tt_destroy(bo->ttm); > - atomic_dec(_bo_glob.bo_count); > - dma_fence_put(bo->moving); > - if (!ttm_bo_uses_embedded_gem_object(bo)) > - dma_resv_fini(>base._resv); > - bo->destroy(bo); > - ttm_mem_global_free(_mem_glob, acc_size); > -} > - > static void ttm_bo_add_mem_to_lru(struct ttm_buffer_object *bo, > struct ttm_mem_reg *mem) > { > @@ -181,21 +161,14 @@ static void ttm_bo_add_mem_to_lru(struct > ttm_buffer_object *bo, > > man = >man[mem->mem_type]; > list_add_tail(>lru, >lru[bo->priority]); > - kref_get(>list_kref); > > if (!(man->flags & TTM_MEMTYPE_FLAG_FIXED) && bo->ttm && > !(bo->ttm->page_flags & 
(TTM_PAGE_FLAG_SG | >TTM_PAGE_FLAG_SWAPPED))) { > list_add_tail(>swap, _bo_glob.swap_lru[bo->priority]); > - kref_get(>list_kref); > } > } > > -static void ttm_bo_ref_bug(struct kref *list_kref) > -{ > - BUG(); > -} > - > static void ttm_bo_del_from_lru(struct ttm_buffer_object *bo) > { > struct ttm_bo_device *bdev = bo->bdev; > @@ -203,12 +176,10 @@ static void ttm_bo_del_from_lru(struct > ttm_buffer_object *bo) > > if (!list_empty(>swap)) { > list_del_init(>swap); > - kref_put(>list_kref, ttm_bo_ref_bug); > notify = true; > } > if (!list_empty(>lru)) { > list_del_init(>lru); > - kref_put(>list_kref, ttm_bo_ref_bug); > notify = true; > } > > @@ -421,8 +392,7 @@ static int ttm_bo_individualize_resv(struct > ttm_buffer_object *bo) > BUG_ON(!dma_resv_trylock(>base._resv)); > > r = dma_resv_copy_fences(>base._resv, bo->base.resv); > - if (r) > - dma_resv_unlock(>base._resv); > + dma_resv_unlock(>base._resv); > > return r; > } > @@ -447,68 +417,10 @@ static void ttm_bo_flush_all_fences(struct > ttm_buffer_object *bo) > } > } > > -static void ttm_bo_cleanup_refs_or_queue(struct ttm_buffer_object *bo) > -{ > - struct ttm_bo_device *bdev = bo->bdev; > - int ret; > - > - ret = ttm_bo_individualize_resv(bo); > - if (ret) { > - /* Last resort, if we fail to allocate memory for the > - * fences block for the BO to become idle > - */ > - dma_resv_wait_timeout_rcu(bo->base.resv, true, false, > - 30 * HZ); > - spin_lock(_bo_glob.lru_lock); > - goto error; > - } > - > - spin_lock(_bo_glob.lru_lock); > - ret = dma_resv_trylock(bo->base.resv) ? 
0 : -EBUSY; > - if (!ret) { > - if (dma_resv_test_signaled_rcu(>base._resv, true)) { > - ttm_bo_del_from_lru(bo); > - spin_unlock(_bo_glob.lru_lock); > - if (bo->base.resv != >base._resv) > - dma_resv_unlock(>base._resv); > - > - ttm_bo_cleanup_memtype_use(bo); > - dma_resv_unlock(bo->base.resv); > - return; > - } > - > - ttm_bo_flush_all_fences(bo); > - > - /* > - * Make NO_EVICT bos immediately available to > - * shrinkers, now that they are queued for > - * destruction. > - */ > - if (bo->mem.placement & TTM_PL_FLAG_NO_EVICT) { > - bo->mem.placement &=
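The core of the rework above is collapsing two counters into one `kref` and letting eviction grab a BO from the LRU only if it is not already dying, which is what `kref_get_unless_zero()` provides in-kernel. A userspace model of that refcounting rule, using C11 atomics (the `toy_*` names are illustrative):

```c
#include <stdatomic.h>
#include <stdbool.h>

struct toy_bo { atomic_uint kref; };

/* Model of kref_get_unless_zero(): take a reference only if the
 * count is still non-zero, i.e. the BO is not being destroyed. */
static bool toy_get_unless_zero(struct toy_bo *bo)
{
    unsigned int old = atomic_load(&bo->kref);
    do {
        if (old == 0)
            return false;  /* BO already on its way to destruction */
    } while (!atomic_compare_exchange_weak(&bo->kref, &old, old + 1));
    return true;
}

/* Model of kref_put(): returns true when the caller dropped
 * the last reference and must run the release path. */
static bool toy_put(struct toy_bo *bo)
{
    return atomic_fetch_sub(&bo->kref, 1) == 1;
}
```

This is why LRU moves get cheaper: the list no longer holds its own reference, and a racing evictor simply fails `toy_get_unless_zero()` on a BO whose last reference has already been dropped.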
Re: Cleanup TTMs delayed delete handling
[AMD Official Use Only - Internal Distribution Only] For patch 1/2/3/5/6 Reviewed-by: xinhui pan From: Christian König Sent: Monday, February 10, 2020 11:09:01 PM To: Pan, Xinhui ; amd-...@lists.freedesktop.org ; dri-devel@lists.freedesktop.org Subject: Cleanup TTMs delayed delete handling This series of patches cleans up TTM's delayed delete handling. The core of the new handling is that we now only have a single reference counter instead of two and use kref_get_unless_zero() to grab BOs from the LRU during eviction. This reduces the overhead of LRU moves and allows us to properly individualize the BO's reservation object during deletion, to allow adding BOs for clearing memory, unmapping page tables etc. Please review and comment, Christian.
Re: [PATCH 4/6] drm/ttm: rework BO delayed delete.
comments inline > 2020年2月11日 13:26,Pan, Xinhui 写道: > > comments inline. > [xh] > > >> 2020年2月10日 23:09,Christian König 写道: >> >> This patch reworks the whole delayed deletion of BOs which aren't idle. >> >> Instead of having two counters for the BO structure we resurrect the BO >> when we find that a deleted BO is not idle yet. >> >> This has many advantages, especially that we don't need to >> increment/decrement the BOs reference counter any more when it >> moves on the LRUs. >> >> Signed-off-by: Christian König >> --- >> drivers/gpu/drm/ttm/ttm_bo.c | 217 +- >> drivers/gpu/drm/ttm/ttm_bo_util.c | 1 - >> include/drm/ttm/ttm_bo_api.h | 11 +- >> 3 files changed, 97 insertions(+), 132 deletions(-) >> >> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c >> index e12fc2c2d165..d0624685f5d2 100644 >> --- a/drivers/gpu/drm/ttm/ttm_bo.c >> +++ b/drivers/gpu/drm/ttm/ttm_bo.c >> @@ -145,26 +145,6 @@ static inline uint32_t ttm_bo_type_flags(unsigned type) >> return 1 << (type); >> } >> >> -static void ttm_bo_release_list(struct kref *list_kref) >> -{ >> -struct ttm_buffer_object *bo = >> -container_of(list_kref, struct ttm_buffer_object, list_kref); >> -size_t acc_size = bo->acc_size; >> - >> -BUG_ON(kref_read(>list_kref)); >> -BUG_ON(kref_read(>kref)); >> -BUG_ON(bo->mem.mm_node != NULL); >> -BUG_ON(!list_empty(>lru)); >> -BUG_ON(!list_empty(>ddestroy)); >> -ttm_tt_destroy(bo->ttm); >> -atomic_dec(_bo_glob.bo_count); >> -dma_fence_put(bo->moving); >> -if (!ttm_bo_uses_embedded_gem_object(bo)) >> -dma_resv_fini(>base._resv); >> -bo->destroy(bo); >> -ttm_mem_global_free(_mem_glob, acc_size); >> -} >> - >> static void ttm_bo_add_mem_to_lru(struct ttm_buffer_object *bo, >>struct ttm_mem_reg *mem) >> { >> @@ -181,21 +161,14 @@ static void ttm_bo_add_mem_to_lru(struct >> ttm_buffer_object *bo, >> >> man = >man[mem->mem_type]; >> list_add_tail(>lru, >lru[bo->priority]); >> -kref_get(>list_kref); >> >> if (!(man->flags & TTM_MEMTYPE_FLAG_FIXED) && bo->ttm && 
>> !(bo->ttm->page_flags & (TTM_PAGE_FLAG_SG | >> TTM_PAGE_FLAG_SWAPPED))) { >> list_add_tail(>swap, _bo_glob.swap_lru[bo->priority]); >> -kref_get(>list_kref); >> } >> } >> >> -static void ttm_bo_ref_bug(struct kref *list_kref) >> -{ >> -BUG(); >> -} >> - >> static void ttm_bo_del_from_lru(struct ttm_buffer_object *bo) >> { >> struct ttm_bo_device *bdev = bo->bdev; >> @@ -203,12 +176,10 @@ static void ttm_bo_del_from_lru(struct >> ttm_buffer_object *bo) >> >> if (!list_empty(>swap)) { >> list_del_init(>swap); >> -kref_put(>list_kref, ttm_bo_ref_bug); >> notify = true; >> } >> if (!list_empty(>lru)) { >> list_del_init(>lru); >> -kref_put(>list_kref, ttm_bo_ref_bug); >> notify = true; >> } >> >> @@ -421,8 +392,7 @@ static int ttm_bo_individualize_resv(struct >> ttm_buffer_object *bo) >> BUG_ON(!dma_resv_trylock(>base._resv)); >> >> r = dma_resv_copy_fences(>base._resv, bo->base.resv); >> -if (r) >> -dma_resv_unlock(>base._resv); >> +dma_resv_unlock(>base._resv); >> >> return r; >> } >> @@ -449,68 +419,10 @@ static void ttm_bo_flush_all_fences(struct >> ttm_buffer_object *bo) >> rcu_read_unlock(); >> } >> >> -static void ttm_bo_cleanup_refs_or_queue(struct ttm_buffer_object *bo) >> -{ >> -struct ttm_bo_device *bdev = bo->bdev; >> -int ret; >> - >> -ret = ttm_bo_individualize_resv(bo); >> -if (ret) { >> -/* Last resort, if we fail to allocate memory for the >> - * fences block for the BO to become idle >> - */ >> -dma_resv_wait_timeout_rcu(bo->base.resv, true, false, >> -30 * HZ); >> -spin_lock(_bo_glob.lru_lock); >> -goto error; >> -} >>
Re: [PATCH 4/6] drm/ttm: rework BO delayed delete.
comments inline. [xh] > 2020年2月10日 23:09,Christian König 写道: > > This patch reworks the whole delayed deletion of BOs which aren't idle. > > Instead of having two counters for the BO structure we resurrect the BO > when we find that a deleted BO is not idle yet. > > This has many advantages, especially that we don't need to > increment/decrement the BOs reference counter any more when it > moves on the LRUs. > > Signed-off-by: Christian König > --- > drivers/gpu/drm/ttm/ttm_bo.c | 217 +- > drivers/gpu/drm/ttm/ttm_bo_util.c | 1 - > include/drm/ttm/ttm_bo_api.h | 11 +- > 3 files changed, 97 insertions(+), 132 deletions(-) > > diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c > index e12fc2c2d165..d0624685f5d2 100644 > --- a/drivers/gpu/drm/ttm/ttm_bo.c > +++ b/drivers/gpu/drm/ttm/ttm_bo.c > @@ -145,26 +145,6 @@ static inline uint32_t ttm_bo_type_flags(unsigned type) > return 1 << (type); > } > > -static void ttm_bo_release_list(struct kref *list_kref) > -{ > - struct ttm_buffer_object *bo = > - container_of(list_kref, struct ttm_buffer_object, list_kref); > - size_t acc_size = bo->acc_size; > - > - BUG_ON(kref_read(>list_kref)); > - BUG_ON(kref_read(>kref)); > - BUG_ON(bo->mem.mm_node != NULL); > - BUG_ON(!list_empty(>lru)); > - BUG_ON(!list_empty(>ddestroy)); > - ttm_tt_destroy(bo->ttm); > - atomic_dec(_bo_glob.bo_count); > - dma_fence_put(bo->moving); > - if (!ttm_bo_uses_embedded_gem_object(bo)) > - dma_resv_fini(>base._resv); > - bo->destroy(bo); > - ttm_mem_global_free(_mem_glob, acc_size); > -} > - > static void ttm_bo_add_mem_to_lru(struct ttm_buffer_object *bo, > struct ttm_mem_reg *mem) > { > @@ -181,21 +161,14 @@ static void ttm_bo_add_mem_to_lru(struct > ttm_buffer_object *bo, > > man = >man[mem->mem_type]; > list_add_tail(>lru, >lru[bo->priority]); > - kref_get(>list_kref); > > if (!(man->flags & TTM_MEMTYPE_FLAG_FIXED) && bo->ttm && > !(bo->ttm->page_flags & (TTM_PAGE_FLAG_SG | >TTM_PAGE_FLAG_SWAPPED))) { > list_add_tail(>swap, 
_bo_glob.swap_lru[bo->priority]); > - kref_get(>list_kref); > } > } > > -static void ttm_bo_ref_bug(struct kref *list_kref) > -{ > - BUG(); > -} > - > static void ttm_bo_del_from_lru(struct ttm_buffer_object *bo) > { > struct ttm_bo_device *bdev = bo->bdev; > @@ -203,12 +176,10 @@ static void ttm_bo_del_from_lru(struct > ttm_buffer_object *bo) > > if (!list_empty(>swap)) { > list_del_init(>swap); > - kref_put(>list_kref, ttm_bo_ref_bug); > notify = true; > } > if (!list_empty(>lru)) { > list_del_init(>lru); > - kref_put(>list_kref, ttm_bo_ref_bug); > notify = true; > } > > @@ -421,8 +392,7 @@ static int ttm_bo_individualize_resv(struct > ttm_buffer_object *bo) > BUG_ON(!dma_resv_trylock(>base._resv)); > > r = dma_resv_copy_fences(>base._resv, bo->base.resv); > - if (r) > - dma_resv_unlock(>base._resv); > + dma_resv_unlock(>base._resv); > > return r; > } > @@ -449,68 +419,10 @@ static void ttm_bo_flush_all_fences(struct > ttm_buffer_object *bo) > rcu_read_unlock(); > } > > -static void ttm_bo_cleanup_refs_or_queue(struct ttm_buffer_object *bo) > -{ > - struct ttm_bo_device *bdev = bo->bdev; > - int ret; > - > - ret = ttm_bo_individualize_resv(bo); > - if (ret) { > - /* Last resort, if we fail to allocate memory for the > - * fences block for the BO to become idle > - */ > - dma_resv_wait_timeout_rcu(bo->base.resv, true, false, > - 30 * HZ); > - spin_lock(_bo_glob.lru_lock); > - goto error; > - } > - > - spin_lock(_bo_glob.lru_lock); > - ret = dma_resv_trylock(bo->base.resv) ? 0 : -EBUSY; > - if (!ret) { > - if (dma_resv_test_signaled_rcu(>base._resv, true)) { > - ttm_bo_del_from_lru(bo); > - spin_unlock(_bo_glob.lru_lock); > - if (bo->base.resv != >base._resv) > - dma_resv_unlock(>base._resv); > - > - ttm_bo_cleanup_memtype_use(bo); > - dma_resv_unlock(bo->base.resv); > - return; > - } > - > - ttm_bo_flush_all_fences(bo); > - > - /* > - * Make NO_EVICT bos immediately available to > - * shrinkers, now that they are queued for > - * destruction. 
> - */ > - if (bo->mem.placement & TTM_PL_FLAG_NO_EVICT) { > - bo->mem.placement &= ~TTM_PL_FLAG_NO_EVICT; > - ttm_bo_move_to_lru_tail(bo,
Re: [PATCH] drm/amdgpu: Fix bounds checking in amdgpu_ras_is_supported()
Do you mean that something like 1 << 65 might be a non-zero value? From: Dan Carpenter Sent: Saturday, June 8, 2019 5:23:57 PM To: Deucher, Alexander; Pan, Xinhui Cc: Koenig, Christian; Zhou, David(ChunMing); David Airlie; Daniel Vetter; Quan, Evan; Zhu, James; amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; kernel-janit...@vger.kernel.org Subject: [PATCH] drm/amdgpu: Fix bounds checking in amdgpu_ras_is_supported() The "block" variable can be set by the user through debugfs, so it can be quite large, which leads to shift wrapping here. This means we report a "block" as supported when it's not, and that leads to array overflows later on. This bug is not really a security issue in real life, because debugfs is generally root only. Fixes: 36ea1bd2d084 ("drm/amdgpu: add debugfs ctrl node") Signed-off-by: Dan Carpenter --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h index c6b34fbd695f..94c652f5265a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h @@ -173,6 +173,8 @@ static inline int amdgpu_ras_is_supported(struct amdgpu_device *adev, { struct amdgpu_ras *ras = amdgpu_ras_get_context(adev); + if (block >= AMDGPU_RAS_BLOCK_COUNT) + return 0; return ras && (ras->supported & (1 << block)); } -- 2.20.1
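To answer the question above: `1 << block` is undefined behaviour once `block` reaches the bit width of `int`, and on common hardware the shift count is masked, so e.g. block = 33 aliases block = 1 and an unsupported block is reported as supported. A standalone illustration of why the bounds check must come before the shift (`TOY_BLOCK_COUNT` is an illustrative stand-in for `AMDGPU_RAS_BLOCK_COUNT`):

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_BLOCK_COUNT 14u  /* illustrative, not the real driver value */

/* Without the range check, (1u << block) is undefined for block >= 32;
 * x86 masks the count to block % 32, so block = 33 would test bit 1
 * instead of being rejected. Checking first makes the shift safe. */
static bool toy_is_supported(uint32_t supported, unsigned int block)
{
    if (block >= TOY_BLOCK_COUNT)  /* the fix: reject before shifting */
        return false;
    return (supported & (1u << block)) != 0;
}
```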
Re: [PATCH] gpu: drm: use struct_size() in kmalloc()
Daniel, what you are talking about is totally wrong. 1) AFAIK, only one zero-size array can be at the end of a struct. 2) Two struct_size() calls will add up the struct itself twice, so the sum is wrong. No offense. I can't help feeling lucky that you are in Intel. From: Daniel Vetter on behalf of Daniel Vetter Sent: May 21, 2019, 0:28 To: Pan, Xinhui Cc: Deucher, Alexander; Koenig, Christian; Zhou, David(ChunMing); airl...@linux.ie; dan...@ffwll.ch; Quan, Evan; xiaolinkui; amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-ker...@vger.kernel.org Subject: Re: [PATCH] gpu: drm: use struct_size() in kmalloc() [CAUTION: External Email] On Fri, May 17, 2019 at 04:44:30PM +, Pan, Xinhui wrote: > I am going to put more members which are also array after this struct, > not only obj[]. Looks like this struct_size did not help on multiple > array case. Thanks anyway. You can then add them up, e.g. kmalloc(struct_size()+struct_size(), GFP_KERNEL), so this patch here still looks like a good idea. Reviewed-by: Daniel Vetter Cheers, Daniel > From: xiaolinkui > Sent: Friday, May 17, 2019 4:46:00 PM > To: Deucher, Alexander; Koenig, Christian; Zhou, David(ChunMing); > airl...@linux.ie; dan...@ffwll.ch; Pan, Xinhui; Quan, Evan > Cc: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; > linux-ker...@vger.kernel.org; xiaolin...@kylinos.cn > Subject: [PATCH] gpu: drm: use struct_size() in kmalloc() > > [CAUTION: External Email] > > Use struct_size() helper to keep code simple.
> > Signed-off-by: xiaolinkui > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c > index 22bd21e..4717a64 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c > @@ -1375,8 +1375,7 @@ int amdgpu_ras_init(struct amdgpu_device *adev) > if (con) > return 0; > > - con = kmalloc(sizeof(struct amdgpu_ras) + > - sizeof(struct ras_manager) * AMDGPU_RAS_BLOCK_COUNT, > + con = kmalloc(struct_size(con, objs, AMDGPU_RAS_BLOCK_COUNT), > GFP_KERNEL|__GFP_ZERO); > if (!con) > return -ENOMEM; > -- > 2.7.4 > > > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH] gpu: drm: use struct_size() in kmalloc()
Daniel, your idea is obviously and totally wrong. There can NOT be more than one zero-size array in a struct. Nack for them all. From: Daniel Vetter on behalf of Daniel Vetter Sent: Tuesday, May 21, 2019 12:28:07 AM To: Pan, Xinhui Cc: Deucher, Alexander; Koenig, Christian; Zhou, David(ChunMing); airl...@linux.ie; dan...@ffwll.ch; Quan, Evan; xiaolinkui; amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-ker...@vger.kernel.org Subject: Re: [PATCH] gpu: drm: use struct_size() in kmalloc() [CAUTION: External Email] On Fri, May 17, 2019 at 04:44:30PM +, Pan, Xinhui wrote: > I am going to put more members which are also array after this struct, > not only obj[]. Looks like this struct_size did not help on multiple > array case. Thanks anyway. You can then add them up, e.g. kmalloc(struct_size()+struct_size(), GFP_KERNEL), so this patch here still looks like a good idea. Reviewed-by: Daniel Vetter Cheers, Daniel > From: xiaolinkui > Sent: Friday, May 17, 2019 4:46:00 PM > To: Deucher, Alexander; Koenig, Christian; Zhou, David(ChunMing); > airl...@linux.ie; dan...@ffwll.ch; Pan, Xinhui; Quan, Evan > Cc: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; > linux-ker...@vger.kernel.org; xiaolin...@kylinos.cn > Subject: [PATCH] gpu: drm: use struct_size() in kmalloc() > > [CAUTION: External Email] > > Use struct_size() helper to keep code simple. 
> Signed-off-by: xiaolinkui
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> index 22bd21e..4717a64 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> @@ -1375,8 +1375,7 @@ int amdgpu_ras_init(struct amdgpu_device *adev)
>         if (con)
>                 return 0;
>
> -       con = kmalloc(sizeof(struct amdgpu_ras) +
> -                     sizeof(struct ras_manager) * AMDGPU_RAS_BLOCK_COUNT,
> +       con = kmalloc(struct_size(con, objs, AMDGPU_RAS_BLOCK_COUNT),
>                       GFP_KERNEL|__GFP_ZERO);
>         if (!con)
>                 return -ENOMEM;
> --
> 2.7.4

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] gpu: drm: use struct_size() in kmalloc()
I am going to put more members which are also arrays after this struct, not only obj[]. It looks like struct_size() does not help in the multiple-array case. Thanks anyway.

From: xiaolinkui
Sent: Friday, May 17, 2019 4:46:00 PM
To: Deucher, Alexander; Koenig, Christian; Zhou, David(ChunMing); airl...@linux.ie; dan...@ffwll.ch; Pan, Xinhui; Quan, Evan
Cc: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-ker...@vger.kernel.org; xiaolin...@kylinos.cn
Subject: [PATCH] gpu: drm: use struct_size() in kmalloc()

[CAUTION: External Email]

Use struct_size() helper to keep code simple.

Signed-off-by: xiaolinkui
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index 22bd21e..4717a64 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -1375,8 +1375,7 @@ int amdgpu_ras_init(struct amdgpu_device *adev)
        if (con)
                return 0;

-       con = kmalloc(sizeof(struct amdgpu_ras) +
-                     sizeof(struct ras_manager) * AMDGPU_RAS_BLOCK_COUNT,
+       con = kmalloc(struct_size(con, objs, AMDGPU_RAS_BLOCK_COUNT),
                      GFP_KERNEL|__GFP_ZERO);
        if (!con)
                return -ENOMEM;
--
2.7.4
RE: Clang warning in drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
I am going to apply your fix patch in my branch.

Thanks
xinhui

-----Original Message-----
From: Nathan Chancellor
Sent: March 20, 2019 23:45
To: Stephen Hines
Cc: Koenig, Christian; Pan, Xinhui; Deucher, Alexander; Zhou, David(ChunMing); amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-ker...@vger.kernel.org; clang-built-li...@googlegroups.com
Subject: Re: Clang warning in drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c

On Wed, Mar 20, 2019 at 07:58:09AM -0700, Stephen Hines wrote:
> Why are there 2 different enums for this same thing at all? By
> casting, you are reducing type safety in the kernel, which can cause
> bugs later (should the two enums diverge in encoding). In my opinion,
> the proper solution is to remove one of the enums or provide an
> explicit helper that does the conversion (along with assertions for
> handling any unexpected cases). The helper function should not be
> doing a direct cast, since a bug could change the integer value of one
> (or both) of these enums so that they don't match up.
> Thanks,
> Steve

Indeed, I would suggest something like this (if this was to be a build time error, this would need to be a macro instead of a function):

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index a71668b8a7d0..451478cf4f35 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -541,13 +541,13 @@ int amdgpu_ras_feature_enable(struct amdgpu_device *adev,
        if (!enable) {
                info.disable_features = (struct ta_ras_disable_features_input) {
-                       .block_id = head->block,
-                       .error_type = head->type,
+                       .block_id = amdgpu_ras_block_to_ta(head->block),
+                       .error_type = amdgpu_ras_error_to_ta(head->type),
                };
        } else {
                info.enable_features = (struct ta_ras_enable_features_input) {
-                       .block_id = head->block,
-                       .error_type = head->type,
+                       .block_id = amdgpu_ras_block_to_ta(head->block),
+                       .error_type = amdgpu_ras_error_to_ta(head->type),
                };
        }
@@ -647,8 +647,8 @@ int amdgpu_ras_error_inject(struct amdgpu_device *adev,
 {
        struct ras_manager *obj = amdgpu_ras_find_obj(adev, &info->head);
        struct ta_ras_trigger_error_input block_info = {
-               .block_id = info->head.block,
-               .inject_error_type = info->head.type,
+               .block_id = amdgpu_ras_block_to_ta(info->head.block),
+               .inject_error_type = amdgpu_ras_error_to_ta(info->head.type),
                .sub_block_index = info->head.sub_block_index,
                .address = info->address,
                .value = info->value,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
index 7a35316baab0..c8576ab6e057 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
@@ -197,6 +197,64 @@ static inline int amdgpu_ras_reset_gpu(struct amdgpu_device *adev,
        return 0;
 }

+static inline enum ta_ras_block
+amdgpu_ras_block_to_ta(enum amdgpu_ras_block block) {
+       switch (block) {
+       case AMDGPU_RAS_BLOCK__UMC:
+               return TA_RAS_BLOCK__UMC;
+       case AMDGPU_RAS_BLOCK__SDMA:
+               return TA_RAS_BLOCK__SDMA;
+       case AMDGPU_RAS_BLOCK__GFX:
+               return TA_RAS_BLOCK__GFX;
+       case AMDGPU_RAS_BLOCK__MMHUB:
+               return TA_RAS_BLOCK__MMHUB;
+       case AMDGPU_RAS_BLOCK__ATHUB:
+               return TA_RAS_BLOCK__ATHUB;
+       case AMDGPU_RAS_BLOCK__PCIE_BIF:
+               return TA_RAS_BLOCK__PCIE_BIF;
+       case AMDGPU_RAS_BLOCK__HDP:
+               return TA_RAS_BLOCK__HDP;
+       case AMDGPU_RAS_BLOCK__XGMI_WAFL:
+               return TA_RAS_BLOCK__XGMI_WAFL;
+       case AMDGPU_RAS_BLOCK__DF:
+               return TA_RAS_BLOCK__DF;
+       case AMDGPU_RAS_BLOCK__SMN:
+               return TA_RAS_BLOCK__SMN;
+       case AMDGPU_RAS_BLOCK__SEM:
+               return TA_RAS_BLOCK__SEM;
+       case AMDGPU_RAS_BLOCK__MP0:
+               return TA_RAS_BLOCK__MP0;
+       case AMDGPU_RAS_BLOCK__MP1:
+               return TA_RAS_BLOCK__MP1;
+       case AMDGPU_RAS_BLOCK__FUSE:
+               return TA_RAS_BLOCK__FUSE;
+       default:
+               WARN(1, "amdgpu_ras_block_to_ta fell through with a value of %d\n", block);
+               return TA_RAS_BLOCK__UMC;
+       }
+}
+
+static inline enum ta_ras_error_type
+amdgpu_ras_error_to_ta(enum amdgpu_ras_error_type error) {
+       switch (error) {
+       case AMDGPU_RAS_ERROR__NONE:
+               return TA_RAS_ERROR__NONE;
+       case AMDGPU_RAS_ERROR__PARITY:
+               return T
RE: Clang warning in drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
These two enumerated types are the same for now; both of them might change in the future. I have not used clang, but would

        .block_id = (int)head->block

fix your warning? If such a change is acceptable, I can make one then.

Thanks
xinhui

-----Original Message-----
From: Nathan Chancellor
Sent: March 20, 2019 8:54
To: Deucher, Alexander; Koenig, Christian; Zhou, David(ChunMing); Pan, Xinhui
Cc: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-ker...@vger.kernel.org; clang-built-li...@googlegroups.com
Subject: Clang warning in drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c

Hi all,

The introduction of this file in commit dbd249c24427 ("drm/amdgpu: add amdgpu_ras.c to support ras (v2)") introduces the following Clang warnings:

drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:544:23: warning: implicit conversion from enumeration type 'enum amdgpu_ras_block' to different enumeration type 'enum ta_ras_block' [-Wenum-conversion]
                .block_id = head->block,
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:545:24: warning: implicit conversion from enumeration type 'enum amdgpu_ras_error_type' to different enumeration type 'enum ta_ras_error_type' [-Wenum-conversion]
                .error_type = head->type,
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:549:23: warning: implicit conversion from enumeration type 'enum amdgpu_ras_block' to different enumeration type 'enum ta_ras_block' [-Wenum-conversion]
                .block_id = head->block,
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:550:24: warning: implicit conversion from enumeration type 'enum amdgpu_ras_error_type' to different enumeration type 'enum ta_ras_error_type' [-Wenum-conversion]
                .error_type = head->type,
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:650:26: warning: implicit conversion from enumeration type 'enum amdgpu_ras_block' to different enumeration type 'enum ta_ras_block' [-Wenum-conversion]
                .block_id = info->head.block,
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:651:35: warning: implicit conversion from enumeration type 'enum amdgpu_ras_error_type' to different enumeration type 'enum ta_ras_error_type' [-Wenum-conversion]
                .inject_error_type = info->head.type,
6 warnings generated.

Normally, I would send a fix for this myself but I am not entirely sure why these two enumerated types exist when one would do, since they have the same values minus the prefix. In fact, the ta_ras_{block,error_type} values are never used aside from being defined. Some clarification would be appreciated.

Thank you,
Nathan