On 3/25/26 09:44, Prike Liang wrote: > amdgpu_vm_fini() relies on vm->last_tlb_flush to wait for TLB activity > to complete before calling amdgpu_vm_pt_free_root(). Because > vm->last_tlb_flush tracks only the vm commit fence and never the TLB > fence itself, so fini() may proceed to free page tables while the TLB fence > work item is still running amdgpu_gmc_flush_gpu_tlb_pasid().
That is not correct. The PDs/PTs can be freed without waiting for the TLB flush to complete because dma_resv_add_fence() adds the fence to the PDs/PTs and make sure that their backing store stays around until that is finished. The reason why we have vm->last_tlb_flush is to track the SDMA job which will trigger the TLB flush callback for kernel queues. And your patch here breaks exactly that, so absolutely clear NAK from my side for this. Regards, Christian. > > Signed-off-by: Prike Liang <[email protected]> > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > index 937a6dd3a4b5..53d0ac8bf98f 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > @@ -1093,6 +1093,13 @@ amdgpu_vm_tlb_flush(struct amdgpu_vm_update_params > *params, > if (!params->unlocked && vm->need_tlb_fence) { > amdgpu_vm_tlb_fence_create(params->adev, vm, fence); > > + /* > + * Update last_tlb_flush to the TLB fence so that > + * amdgpu_vm_fini() waits for the actual TLB flush to > + * complete, not just its commit fence. > + */ > + dma_fence_put(vm->last_tlb_flush); > + vm->last_tlb_flush = dma_fence_get(*fence); > /* Makes sure no PD/PT is freed before the flush */ > dma_resv_add_fence(vm->root.bo->tbo.base.resv, *fence, > DMA_RESV_USAGE_BOOKKEEP);
