Re: [PATCH] drm/amdgpu: prevent double kfree ttm->sg

2020-09-16 Thread Christian König

On 15.09.20 at 23:52, Philip Yang wrote:

Set ttm->sg to NULL after kfree, to avoid memory corruption backtrace:

[  420.932812] kernel BUG at
/build/linux-do9eLF/linux-4.15.0/mm/slub.c:295!
[  420.934182] invalid opcode:  [#1] SMP NOPTI
[  420.935445] Modules linked in: xt_conntrack ipt_MASQUERADE
[  420.951332] Hardware name: Dell Inc. PowerEdge R7525/0PYVT1, BIOS
1.5.4 07/09/2020
[  420.952887] RIP: 0010:__slab_free+0x180/0x2d0
[  420.954419] RSP: 0018:be426291fa60 EFLAGS: 00010246
[  420.955963] RAX: 9e29263e9c30 RBX: 9e29263e9c30 RCX:
0001814b
[  420.957512] RDX: 9e29263e9c30 RSI: f3d33e98fa40 RDI:
9e297e407a80
[  420.959055] RBP: be426291fb00 R08: 0001 R09:
c0d39ade
[  420.960587] R10: be426291fb20 R11: 9e49ffdd4000 R12:
9e297e407a80
[  420.962105] R13: f3d33e98fa40 R14: 9e29263e9c30 R15:
9e2954464fd8
[  420.963611] FS:  7fa2ea097780() GS:9e297e84()
knlGS:
[  420.965144] CS:  0010 DS:  ES:  CR0: 80050033
[  420.93] CR2: 7f16bfffefb8 CR3: 001ff0c62000 CR4:
00340ee0
[  420.968193] Call Trace:
[  420.969703]  ? __page_cache_release+0x3c/0x220
[  420.971294]  ? amdgpu_ttm_tt_unpopulate+0x5e/0x80 [amdgpu]
[  420.972789]  kfree+0x168/0x180
[  420.974353]  ? amdgpu_ttm_tt_set_user_pages+0x64/0xc0 [amdgpu]
[  420.975850]  ? kfree+0x168/0x180
[  420.977403]  amdgpu_ttm_tt_unpopulate+0x5e/0x80 [amdgpu]
[  420.97]  ttm_tt_unpopulate.part.10+0x53/0x60 [amdttm]
[  420.980357]  ttm_tt_destroy.part.11+0x4f/0x60 [amdttm]
[  420.981814]  ttm_tt_destroy+0x13/0x20 [amdttm]
[  420.983273]  ttm_bo_cleanup_memtype_use+0x36/0x80 [amdttm]
[  420.984725]  ttm_bo_release+0x1c9/0x360 [amdttm]
[  420.986167]  amdttm_bo_put+0x24/0x30 [amdttm]
[  420.987663]  amdgpu_bo_unref+0x1e/0x30 [amdgpu]
[  420.989165]  amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x9ca/0xb10
[amdgpu]
[  420.990666]  kfd_ioctl_alloc_memory_of_gpu+0xef/0x2c0 [amdgpu]

Signed-off-by: Philip Yang 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 8b704451a18c..4b3ab9a25e91 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1076,6 +1076,7 @@ static int amdgpu_ttm_tt_pin_userptr(struct ttm_tt *ttm)
  
 release_sg:
 	kfree(ttm->sg);
+	ttm->sg = NULL;
 	return r;
 }
  


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: prevent double kfree ttm->sg

2020-09-15 Thread Felix Kuehling
On 2020-09-15 at 5:52 p.m., Philip Yang wrote:
> Set ttm->sg to NULL after kfree, to avoid memory corruption backtrace:
>
> [  420.932812] kernel BUG at
> /build/linux-do9eLF/linux-4.15.0/mm/slub.c:295!
> [  420.934182] invalid opcode:  [#1] SMP NOPTI
> [  420.935445] Modules linked in: xt_conntrack ipt_MASQUERADE
> [  420.951332] Hardware name: Dell Inc. PowerEdge R7525/0PYVT1, BIOS
> 1.5.4 07/09/2020
> [  420.952887] RIP: 0010:__slab_free+0x180/0x2d0
> [  420.954419] RSP: 0018:be426291fa60 EFLAGS: 00010246
> [  420.955963] RAX: 9e29263e9c30 RBX: 9e29263e9c30 RCX:
> 0001814b
> [  420.957512] RDX: 9e29263e9c30 RSI: f3d33e98fa40 RDI:
> 9e297e407a80
> [  420.959055] RBP: be426291fb00 R08: 0001 R09:
> c0d39ade
> [  420.960587] R10: be426291fb20 R11: 9e49ffdd4000 R12:
> 9e297e407a80
> [  420.962105] R13: f3d33e98fa40 R14: 9e29263e9c30 R15:
> 9e2954464fd8
> [  420.963611] FS:  7fa2ea097780() GS:9e297e84()
> knlGS:
> [  420.965144] CS:  0010 DS:  ES:  CR0: 80050033
> [  420.93] CR2: 7f16bfffefb8 CR3: 001ff0c62000 CR4:
> 00340ee0
> [  420.968193] Call Trace:
> [  420.969703]  ? __page_cache_release+0x3c/0x220
> [  420.971294]  ? amdgpu_ttm_tt_unpopulate+0x5e/0x80 [amdgpu]
> [  420.972789]  kfree+0x168/0x180
> [  420.974353]  ? amdgpu_ttm_tt_set_user_pages+0x64/0xc0 [amdgpu]
> [  420.975850]  ? kfree+0x168/0x180
> [  420.977403]  amdgpu_ttm_tt_unpopulate+0x5e/0x80 [amdgpu]
> [  420.97]  ttm_tt_unpopulate.part.10+0x53/0x60 [amdttm]
> [  420.980357]  ttm_tt_destroy.part.11+0x4f/0x60 [amdttm]
> [  420.981814]  ttm_tt_destroy+0x13/0x20 [amdttm]
> [  420.983273]  ttm_bo_cleanup_memtype_use+0x36/0x80 [amdttm]
> [  420.984725]  ttm_bo_release+0x1c9/0x360 [amdttm]
> [  420.986167]  amdttm_bo_put+0x24/0x30 [amdttm]
> [  420.987663]  amdgpu_bo_unref+0x1e/0x30 [amdgpu]
> [  420.989165]  amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x9ca/0xb10
> [amdgpu]
> [  420.990666]  kfd_ioctl_alloc_memory_of_gpu+0xef/0x2c0 [amdgpu]
>
> Signed-off-by: Philip Yang 

Reviewed-by: Felix Kuehling 


> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index 8b704451a18c..4b3ab9a25e91 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -1076,6 +1076,7 @@ static int amdgpu_ttm_tt_pin_userptr(struct ttm_tt *ttm)
>  
>  release_sg:
>   kfree(ttm->sg);
> + ttm->sg = NULL;
>   return r;
>  }
>  


[PATCH] drm/amdgpu: prevent double kfree ttm->sg

2020-09-15 Thread Philip Yang
Set ttm->sg to NULL after kfree, to avoid memory corruption backtrace:

[  420.932812] kernel BUG at
/build/linux-do9eLF/linux-4.15.0/mm/slub.c:295!
[  420.934182] invalid opcode:  [#1] SMP NOPTI
[  420.935445] Modules linked in: xt_conntrack ipt_MASQUERADE
[  420.951332] Hardware name: Dell Inc. PowerEdge R7525/0PYVT1, BIOS
1.5.4 07/09/2020
[  420.952887] RIP: 0010:__slab_free+0x180/0x2d0
[  420.954419] RSP: 0018:be426291fa60 EFLAGS: 00010246
[  420.955963] RAX: 9e29263e9c30 RBX: 9e29263e9c30 RCX:
0001814b
[  420.957512] RDX: 9e29263e9c30 RSI: f3d33e98fa40 RDI:
9e297e407a80
[  420.959055] RBP: be426291fb00 R08: 0001 R09:
c0d39ade
[  420.960587] R10: be426291fb20 R11: 9e49ffdd4000 R12:
9e297e407a80
[  420.962105] R13: f3d33e98fa40 R14: 9e29263e9c30 R15:
9e2954464fd8
[  420.963611] FS:  7fa2ea097780() GS:9e297e84()
knlGS:
[  420.965144] CS:  0010 DS:  ES:  CR0: 80050033
[  420.93] CR2: 7f16bfffefb8 CR3: 001ff0c62000 CR4:
00340ee0
[  420.968193] Call Trace:
[  420.969703]  ? __page_cache_release+0x3c/0x220
[  420.971294]  ? amdgpu_ttm_tt_unpopulate+0x5e/0x80 [amdgpu]
[  420.972789]  kfree+0x168/0x180
[  420.974353]  ? amdgpu_ttm_tt_set_user_pages+0x64/0xc0 [amdgpu]
[  420.975850]  ? kfree+0x168/0x180
[  420.977403]  amdgpu_ttm_tt_unpopulate+0x5e/0x80 [amdgpu]
[  420.97]  ttm_tt_unpopulate.part.10+0x53/0x60 [amdttm]
[  420.980357]  ttm_tt_destroy.part.11+0x4f/0x60 [amdttm]
[  420.981814]  ttm_tt_destroy+0x13/0x20 [amdttm]
[  420.983273]  ttm_bo_cleanup_memtype_use+0x36/0x80 [amdttm]
[  420.984725]  ttm_bo_release+0x1c9/0x360 [amdttm]
[  420.986167]  amdttm_bo_put+0x24/0x30 [amdttm]
[  420.987663]  amdgpu_bo_unref+0x1e/0x30 [amdgpu]
[  420.989165]  amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x9ca/0xb10
[amdgpu]
[  420.990666]  kfd_ioctl_alloc_memory_of_gpu+0xef/0x2c0 [amdgpu]

Signed-off-by: Philip Yang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 8b704451a18c..4b3ab9a25e91 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1076,6 +1076,7 @@ static int amdgpu_ttm_tt_pin_userptr(struct ttm_tt *ttm)
 
 release_sg:
 	kfree(ttm->sg);
+	ttm->sg = NULL;
 	return r;
 }
 
-- 
2.17.1
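
For readers who want the pattern in isolation: the backtrace shows kfree being reached from amdgpu_ttm_tt_unpopulate, and the patch clears the pointer on the error path of amdgpu_ttm_tt_pin_userptr. Below is a minimal userspace sketch of that idea, not the driver code. struct fake_tt, fake_pin() and fake_unpopulate() are hypothetical names, and malloc/free stand in for the kernel allocator. It relies only on the fact that freeing a NULL pointer is a no-op, which holds for both free() and kfree().

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct ttm_tt; only the sg pointer matters here. */
struct fake_tt {
	void *sg;
};

/* Mimics an error path like release_sg: allocate, fail, free, then clear. */
static int fake_pin(struct fake_tt *tt, int simulate_failure)
{
	tt->sg = malloc(64);
	if (!tt->sg)
		return -1;
	if (!simulate_failure)
		return 0;

	free(tt->sg);
	tt->sg = NULL;	/* without this line, fake_unpopulate() frees twice */
	return -1;
}

/* Mimics a teardown path that frees sg unconditionally. */
static void fake_unpopulate(struct fake_tt *tt)
{
	free(tt->sg);	/* free(NULL) is a no-op, as is kfree(NULL) */
	tt->sg = NULL;
}
```

Calling fake_pin() with simulate_failure set and then fake_unpopulate() on the same object is safe only because the error path leaves tt->sg as NULL; if that assignment is dropped, the second free operates on a stale pointer.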
