Philip already stumbled over this issue as well, but amdgpu_vm_init() is the wrong place to fix it.

dma_resv_reserve_shared() needs to be called after we have reserved the page tables and before we do the update in amdgpu_vm_handle_fault().

Reserved slots are freed (in a debug build) as soon as we release the reservation, so slots reserved in amdgpu_vm_init() are no longer available when the fault handler runs.
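
To illustrate the ordering problem, here is a minimal userspace sketch. It is a toy model under stated assumptions, not the kernel's struct dma_resv: all toy_* names are hypothetical, running out of slots returns -1 where the kernel would hit the BUG, and the unlock path always drops unused slots the way a debug build does.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; hypothetical names, not the kernel's struct dma_resv. */
struct toy_resv {
	bool locked;
	unsigned int shared_count;	/* fences currently stored */
	unsigned int shared_max;	/* slots available for fences */
};

static void toy_lock(struct toy_resv *r)
{
	assert(!r->locked);
	r->locked = true;
}

/* Models the debug-build behaviour: unused reserved slots are dropped. */
static void toy_unlock(struct toy_resv *r)
{
	assert(r->locked);
	r->shared_max = r->shared_count;
	r->locked = false;
}

/* Make room for num additional fences; must hold the lock. */
static int toy_reserve_shared(struct toy_resv *r, unsigned int num)
{
	assert(r->locked);
	if (r->shared_count + num > r->shared_max)
		r->shared_max = r->shared_count + num;
	return 0;
}

/* Returns 0 on success, -1 where the real code would BUG. */
static int toy_add_shared_fence(struct toy_resv *r)
{
	assert(r->locked);
	if (r->shared_count >= r->shared_max)
		return -1;
	r->shared_count++;
	return 0;
}

/* Reserve during init, add the fence later under a new lock: fails. */
static int toy_reserve_at_init_then_fault(void)
{
	struct toy_resv r = { 0 };
	int ret;

	toy_lock(&r);
	toy_reserve_shared(&r, 2);
	toy_unlock(&r);		/* the two reserved slots are dropped here */

	toy_lock(&r);
	ret = toy_add_shared_fence(&r);
	toy_unlock(&r);
	return ret;
}

/* Reserve and add the fence under the same lock: succeeds. */
static int toy_reserve_in_fault_path(void)
{
	struct toy_resv r = { 0 };
	int ret;

	toy_lock(&r);
	ret = toy_reserve_shared(&r, 1);
	if (!ret)
		ret = toy_add_shared_fence(&r);
	toy_unlock(&r);
	return ret;
}
```

In this model toy_reserve_at_init_then_fault() fails no matter how many slots were reserved up front, while toy_reserve_in_fault_path() succeeds with a single slot, which is why bumping the count in the init path does not address the problem.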

Christian.

On 29.09.20 at 07:57, xinhui pan wrote:
[  179.556745] kernel BUG at drivers/dma-buf/dma-resv.c:282!
[snip]
[  179.702910] Call Trace:
[  179.705696]  amdgpu_bo_fence+0x21/0x50 [amdgpu]
[  179.710707]  amdgpu_vm_sdma_commit+0x299/0x430 [amdgpu]
[  179.716497]  amdgpu_vm_bo_update_mapping.constprop.0+0x29f/0x390 [amdgpu]
[  179.723927]  ? find_held_lock+0x38/0x90
[  179.728183]  amdgpu_vm_handle_fault+0x1af/0x420 [amdgpu]
[  179.734063]  gmc_v9_0_process_interrupt+0x245/0x2e0 [amdgpu]
[  179.740347]  ? kgd2kfd_interrupt+0xb8/0x1e0 [amdgpu]
[  179.745808]  amdgpu_irq_dispatch+0x10a/0x3c0 [amdgpu]
[  179.751380]  ? amdgpu_irq_dispatch+0x10a/0x3c0 [amdgpu]
[  179.757159]  amdgpu_ih_process+0xbb/0x1a0 [amdgpu]
[  179.762466]  amdgpu_irq_handle_ih1+0x27/0x40 [amdgpu]
[  179.767997]  process_one_work+0x23c/0x580
[  179.772371]  worker_thread+0x50/0x3b0
[  179.776356]  ? process_one_work+0x580/0x580
[  179.780939]  kthread+0x128/0x160
[  179.784462]  ? kthread_park+0x90/0x90
[  179.788466]  ret_from_fork+0x1f/0x30

We have two scheduler entities, immediate and delayed, so there are
two kinds of scheduler finished fences.
We might add both of these fences to the root BO's resv at the same
time while only one slot is reserved.

Signed-off-by: xinhui pan <[email protected]>
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 37221b99ca96..9e0116c7f8d1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2869,7 +2869,7 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
        if (r)
                goto error_free_root;
-       r = dma_resv_reserve_shared(root->tbo.base.resv, 1);
+       r = dma_resv_reserve_shared(root->tbo.base.resv, 2);
        if (r)
                goto error_unreserve;

_______________________________________________
amd-gfx mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
