[Public] We might want to add a TODO tag around the TLB fence creation to track a follow-up check from the KIQ/MES side.
With it or not, the patch is
Reviewed-by: Prike Liang <[email protected]>

Regards,
Prike

> -----Original Message-----
> From: Alex Deucher <[email protected]>
> Sent: Monday, March 16, 2026 11:17 PM
> To: [email protected]
> Cc: Deucher, Alexander <[email protected]>; Koenig, Christian
> <[email protected]>; Liang, Prike <[email protected]>
> Subject: [PATCH] drm/amdgpu: rework how we handle TLB fences
>
> Add a new VM flag to indicate whether or not we need a TLB fence.
> Userqs (KFD or KGD) require a TLB fence.
> A TLB fence is not strictly required for kernel queues, but it
> shouldn't hurt. That said, enabling this unconditionally should be
> fine, but it seems to tickle some issues in KIQ/MES. Only enable them
> for KFD, or when KGD userq queues are enabled (currently via module
> parameter).
>
> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4798
> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4749
> Fixes: f3854e04b708 ("drm/amdgpu: attach tlb fence to the PTs update")
> Cc: Christian König <[email protected]>
> Cc: Prike Liang <[email protected]>
> Signed-off-by: Alex Deucher <[email protected]>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 +++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 2 ++
>  2 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index b89013a6aa0b6..497464f50ea7d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -1041,7 +1041,7 @@ amdgpu_vm_tlb_flush(struct amdgpu_vm_update_params *params,
>  	}
>
>  	/* Prepare a TLB flush fence to be attached to PTs */
> -	if (!params->unlocked) {
> +	if (!params->unlocked && vm->need_tlb_fence) {
>  		amdgpu_vm_tlb_fence_create(params->adev, vm, fence);
>
>  		/* Makes sure no PD/PT is freed before the flush */
> @@ -2573,6 +2573,7 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
>  	ttm_lru_bulk_move_init(&vm->lru_bulk_move);
>
>  	vm->is_compute_context = false;
> +	vm->need_tlb_fence = amdgpu_userq_enabled(&adev->ddev);
>
>  	vm->use_cpu_for_update = !!(adev->vm_manager.vm_update_mode &
>  				    AMDGPU_VM_USE_CPU_FOR_GFX);
> @@ -2710,6 +2711,7 @@ int amdgpu_vm_make_compute(struct amdgpu_device *adev, struct amdgpu_vm *vm)
>  	dma_fence_put(vm->last_update);
>  	vm->last_update = dma_fence_get_stub();
>  	vm->is_compute_context = true;
> +	vm->need_tlb_fence = true;
>
>  unreserve_bo:
>  	amdgpu_bo_unreserve(vm->root.bo);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> index ae9449d5b00cd..25d176d1350ef 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> @@ -444,6 +444,8 @@ struct amdgpu_vm {
>  	struct ttm_lru_bulk_move lru_bulk_move;
>  	/* Flag to indicate if VM is used for compute */
>  	bool is_compute_context;
> +	/* Flag to indicate if VM needs a TLB fence (KFD or KGD) */
> +	bool need_tlb_fence;
>
>  	/* Memory partition number, -1 means any partition */
>  	int8_t mem_id;
> --
> 2.53.0
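
For reference, the gating that the quoted patch introduces can be modelled on its own. This is only a userspace sketch, not driver code: `struct vm_model` and the `vm_model_*` helpers are hypothetical names made up for illustration, and the `userq_enabled` argument stands in for whatever `amdgpu_userq_enabled()` returns in the real driver. It just mirrors the three hunks: init sets the flag from the userq state, make_compute forces it on, and the flush path only wants a fence when the update is locked and the flag is set.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two relevant fields of struct amdgpu_vm. */
struct vm_model {
	bool is_compute_context;
	bool need_tlb_fence;
};

/*
 * Mirrors the amdgpu_vm_init() hunk: a freshly created VM only needs a
 * TLB fence when user queues are enabled. The caller passes that state
 * in directly (assumption: in the driver it comes from
 * amdgpu_userq_enabled()).
 */
static inline void vm_model_init(struct vm_model *vm, bool userq_enabled)
{
	vm->is_compute_context = false;
	vm->need_tlb_fence = userq_enabled;
}

/* Mirrors the amdgpu_vm_make_compute() hunk: KFD VMs always need it. */
static inline void vm_model_make_compute(struct vm_model *vm)
{
	vm->is_compute_context = true;
	vm->need_tlb_fence = true;
}

/*
 * Mirrors the condition in amdgpu_vm_tlb_flush(): a fence is created
 * only for locked updates on a VM that has the flag set.
 */
static inline bool vm_model_wants_tlb_fence(const struct vm_model *vm,
					    bool unlocked)
{
	return !unlocked && vm->need_tlb_fence;
}
```

With this model, a VM initialised with user queues disabled does not ask for a fence until it is converted to a compute VM, and unlocked updates never ask for one regardless of the flag.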
