On Tuesday, June 16, 2026 2:16:35 PM Central European Summer Time Tvrtko Ursulin wrote: > On 16/06/2026 12:57, Timur Kristóf wrote: > > On Tuesday, June 16, 2026 10:09:53 AM Central European Summer Time Tvrtko > > > > Ursulin wrote: > >> On 25/05/2026 12:45, Timur Kristóf wrote: > >>> When retry faults are disabled (amdgpu.noretry=1), > >>> the ENABLE_RETRY_FAULT_INTERRUPT bit should be programmed to 0. > >>> > >>> Note that retry faults are enabled by default on GFX12.1 > >>> so this just fixes the case when they are explicitly disabled. > >>> > >>> Signed-off-by: Timur Kristóf <[email protected]> > >>> --- > >>> > >>> drivers/gpu/drm/amd/amdgpu/gfxhub_v12_1.c | 2 +- > >>> 1 file changed, 1 insertion(+), 1 deletion(-) > >>> > >>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v12_1.c > >>> b/drivers/gpu/drm/amd/amdgpu/gfxhub_v12_1.c index > >>> 4c2fd1e6616e..d2edfe037da8 100644 > >>> --- a/drivers/gpu/drm/amd/amdgpu/gfxhub_v12_1.c > >>> +++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v12_1.c > >>> @@ -243,7 +243,7 @@ static void > >>> gfxhub_v12_1_xcc_init_system_aperture_regs(struct amdgpu_device *ade> > >>> > >>> tmp = REG_SET_FIELD(tmp, > > > > GCVM_L2_PROTECTION_FAULT_CNTL2, > > > > > > ACTIVE_PAGE_MIGRATION_PTE_READ_RETRY, 1); > > > >>> tmp = REG_SET_FIELD(tmp, > > > > GCVM_L2_PROTECTION_FAULT_CNTL2, > > > >>> - ENABLE_RETRY_FAULT_INTERRUPT, > > > > 0x1); > > > >>> + ENABLE_RETRY_FAULT_INTERRUPT, > > > > !adev->gmc.noretry); > > > >>> WREG32_SOC15(GC, GET_INST(GC, i), > >>> > >>> regGCVM_L2_PROTECTION_FAULT_CNTL2, > > > > tmp); > > > >>> } > >> > >> If I look at 6f894c92490b ("drm/amdgpu: Enable retry faults for GFX > >> 12.1") which added this code, it also touched > >> > >> gfxhub_v12_1_xcc_setup_vmid_config(): > >> tmp = REG_SET_FIELD(tmp, GCVM_CONTEXT1_CNTL, > >> > >> RETRY_PERMISSION_OR_INVALID_PAGE_FAULT, > >> > >> - !amdgpu_noretry); > >> + 1); > >> > >> Should that be changed as well? > > > > I personally don't have a GFX12.1 GPU so I have no way to verify how that > > works, which is why I try to avoid changing it unless it's pretty obvious > > that the upstream code is wrong. > > > > Can you elaborate on what you are suggesting exactly? > > I'm asking. :) > > commit 6f894c92490be1bb27492a82544b4b1e4ad20915 > Author: Mukul Joshi <[email protected]> > Date: Wed Mar 26 22:06:39 2025 -0400 > > drm/amdgpu: Enable retry faults for GFX 12.1 > > Made these three changes: > > gfxhub_v12_1_xcc_init_system_aperture_regs: > + tmp = REG_SET_FIELD(tmp, > GCVM_L2_PROTECTION_FAULT_CNTL2, > + > ENABLE_RETRY_FAULT_INTERRUPT, 0x1); > > > gfxhub_v12_1_xcc_setup_vmid_config: > - !amdgpu_noretry); > + 1); > > > mmhub_v4_2_0_mid_init_system_aperture_regs: > + tmp = REG_SET_FIELD(tmp, MMVM_L2_PROTECTION_FAULT_CNTL2, > + ENABLE_RETRY_FAULT_INTERRUPT, 0x1); > > > The claim from that one was that it is enabling retry faults on gfx > 12.1. If that is correct, and we look at your patch which wants respect > the noretry modparam, but only changes one of those three. > > So question is are you confident it is only that one you need to change > to make it respect the modparam? I don't know to be clear, those are > just things I spot while reading you patch and the relevant history > trying to familiarise myself with this area.
I think I see what you mean. Indeed it would make sense to change the patch to use the gmc->noretry flag there as well. Will add that to the next version. Thanks, Timur
