Yeah, but that is still not the root cause. Attaching the TLB fence all the time just makes more use of the MES; it doesn't cause any additional problems that wouldn't have been there before.

Regards,
Christian.

On 3/12/26 22:08, Mario Limonciello wrote:
> There is actually a contingent of two people who claim that this patch is the
> cause for MES resets here:
>
> https://gitlab.freedesktop.org/drm/amd/-/issues/4749
>
>
> On 3/5/2026 3:43 AM, Christian König wrote:
>> The original reporter already mentioned on the ticket that this patch is not
>> the actual cause of the issues.
>>
>> It basically just changes timing to create and eventually wait for the TLB
>> fence to signal.
>>
>> Let's see what the reporter finds with his extended bisect.
>>
>> Regards,
>> Christian.
>>
>> On 3/5/26 07:48, Liang, Prike wrote:
>>> [Public]
>>>
>>> It’s possible that we failed to save and invalidate some active pages
>>> during suspend, which then prevents those pages from being restored
>>> correctly on resume.
>>>
>>> For now, we still rely on this patch to keep the userq page tables updated
>>> and synchronized. Until the full solution is ready, how about we fall back
>>> to the initial approach and restrict this TLB flush to only the userq path?
>>>
>>> Regards,
>>> Prike
>>>
>>>> -----Original Message-----
>>>> From: Koenig, Christian <[email protected]>
>>>> Sent: Wednesday, March 4, 2026 9:57 PM
>>>> To: Deucher, Alexander <[email protected]>; amd-
>>>> [email protected]
>>>> Cc: Liang, Prike <[email protected]>
>>>> Subject: Re: [PATCH] Revert "drm/amdgpu: attach tlb fence to the PTs
>>>> update"
>>>>
>>>> On 3/4/26 14:54, Alex Deucher wrote:
>>>>> This reverts commit f3854e04b708d73276c4488231a8bd66d30b4671.
>>>>>
>>>>> This causes framebuffer corruption after suspend.
>>>>
>>>> But prevents massive memory corruption with userqueues.
>>>>
>>>> I have strong doubts that this is related to the FB corruption in any way,
>>>> it will just change the timing.
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>>
>>>>> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4798
>>>>> Cc: Christian König <[email protected]>
>>>>> Cc: Prike Liang <[email protected]>
>>>>> Signed-off-by: Alex Deucher <[email protected]>
>>>>> ---
>>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>> index 01fef0e4f4085..25b1d679ba262 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>> @@ -1073,7 +1073,7 @@ amdgpu_vm_tlb_flush(struct amdgpu_vm_update_params *params,
>>>>>  	}
>>>>>
>>>>>  	/* Prepare a TLB flush fence to be attached to PTs */
>>>>> -	if (!params->unlocked) {
>>>>> +	if (!params->unlocked && vm->is_compute_context) {
>>>>>  		amdgpu_vm_tlb_fence_create(params->adev, vm, fence);
>>>>>
>>>>>  		/* Makes sure no PD/PT is freed before the flush */
>>>
>
>>
>>
>>
>
