Hi Christian,

Are you adding new traces or turning on existing ones? Would you like me to try them out in my setup?

Tom

On 2018-09-10 8:49 a.m., Christian König wrote:
On 10.09.2018 at 14:05, Huang Rui wrote:
On Mon, Sep 10, 2018 at 05:25:48PM +0800, Koenig, Christian wrote:
On 10.09.2018 at 11:23, Huang Rui wrote:
On Mon, Sep 10, 2018 at 11:00:04AM +0200, Christian König wrote:
Hi Ray,

well, those patches don't make sense; the pointer is only local to
the function.
You're right.
I narrowed it down with a gdb dump of ttm_bo_bulk_move_lru_tail+0x2b; the
use-after-free should be in the code below:

man = &bulk->tt[i].first->bdev->man[TTM_PL_TT];
ttm_bo_bulk_move_helper(&bulk->tt[i], &man->lru[i], false);

Is there a case where the original BO in the bulk pos is destroyed without
the pos->first pointer being updated, so that we still use it during the
bulk move?
Only when a per VM BO is freed or the VM destroyed.

The first case should now be handled by "drm/amdgpu: set bulk_moveable
to false when a per VM is released" and when we use a destroyed VM we
would see other problems as well.

If a VM instance is torn down, all BOs which belong to that VM should be
removed from the LRU. But how can we submit a command based on a destroyed
VM? You know, we do the bulk move at the last step of submission.

Well, that's exactly the point: this can't happen :)

Otherwise we would crash because of using freed up memory much earlier in the command submission.

The best idea I have to track this down further is to add some trace_printk calls in ttm_bo_bulk_move_helper and amdgpu_bo_destroy and see why and when we are actually using a destroyed BO.

Christian.



Thanks,
Ray

BTW: Just pushed this commit to the repository, should show up any second.

Christian.

Thanks,
Ray

Regards,
Christian.

On 10.09.2018 at 10:57, Huang Rui wrote:
This avoids the pointer being referenced again after it is freed.

Signed-off-by: Huang Rui <ray.hu...@amd.com>
Cc: Christian König <christian.koe...@amd.com>
Cc: Tom StDenis <tom.stde...@amd.com>
---
   drivers/gpu/drm/ttm/ttm_bo.c | 1 +
   1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 138c989..d3ef5f8 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -54,6 +54,7 @@ static struct attribute ttm_bo_count = {
   static void ttm_bo_default_destroy(struct ttm_buffer_object *bo)
   {
       kfree(bo);
+    bo = NULL;
   }
   static inline int ttm_mem_type_from_place(const struct ttm_place *place,
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
dri-devel mailing list
dri-de...@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


