On Tuesday, June 16, 2026 2:16:35 PM Central European Summer Time Tvrtko 
Ursulin wrote:
> On 16/06/2026 12:57, Timur Kristóf wrote:
> > On Tuesday, June 16, 2026 10:09:53 AM Central European Summer Time Tvrtko
> > 
> > Ursulin wrote:
> >> On 25/05/2026 12:45, Timur Kristóf wrote:
> >>> When retry faults are disabled (amdgpu.noretry=1),
> >>> the ENABLE_RETRY_FAULT_INTERRUPT bit should be programmed to 0.
> >>> 
> >>> Note that retry faults are enabled by default on GFX12.1
> >>> so this just fixes the case when they are explicitly disabled.
> >>> 
> >>> Signed-off-by: Timur Kristóf <[email protected]>
> >>> ---
> >>> 
> >>>    drivers/gpu/drm/amd/amdgpu/gfxhub_v12_1.c | 2 +-
> >>>    1 file changed, 1 insertion(+), 1 deletion(-)
> >>> 
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v12_1.c
> >>> b/drivers/gpu/drm/amd/amdgpu/gfxhub_v12_1.c index
> >>> 4c2fd1e6616e..d2edfe037da8 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/gfxhub_v12_1.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v12_1.c
> >>> @@ -243,7 +243,7 @@ static void
> >>> gfxhub_v12_1_xcc_init_system_aperture_regs(struct amdgpu_device *ade>
> >>> 
> >>>                   tmp = REG_SET_FIELD(tmp,
> > 
> > GCVM_L2_PROTECTION_FAULT_CNTL2,
> > 
> > 
> > ACTIVE_PAGE_MIGRATION_PTE_READ_RETRY, 1);
> > 
> >>>                   tmp = REG_SET_FIELD(tmp,
> > 
> > GCVM_L2_PROTECTION_FAULT_CNTL2,
> > 
> >>> -                             ENABLE_RETRY_FAULT_INTERRUPT,
> > 
> > 0x1);
> > 
> >>> +                             ENABLE_RETRY_FAULT_INTERRUPT,
> > 
> > !adev->gmc.noretry);
> > 
> >>>                   WREG32_SOC15(GC, GET_INST(GC, i),
> >>>                   
> >>>                                regGCVM_L2_PROTECTION_FAULT_CNTL2,
> > 
> > tmp);
> > 
> >>>           }
> >> 
> >> If I look at 6f894c92490b ("drm/amdgpu: Enable retry faults for GFX
> >> 12.1") which added this code, it also touched
> >> 
> >> gfxhub_v12_1_xcc_setup_vmid_config():
> >>       tmp = REG_SET_FIELD(tmp, GCVM_CONTEXT1_CNTL,
> >>       
> >>                           RETRY_PERMISSION_OR_INVALID_PAGE_FAULT,
> >> 
> >> -                       !amdgpu_noretry);
> >> +                       1);
> >> 
> >> Should that be changed as well?
> > 
> > I personally don't have a GFX12.1 GPU so I have no way to verify how that
> > works, which is why I try to avoid changing it unless it's pretty obvious
> > that the upstream code is wrong.
> > 
> > Can you elaborate on what you are suggesting exactly?
> 
> I'm asking. :)
> 
> commit 6f894c92490be1bb27492a82544b4b1e4ad20915
> Author: Mukul Joshi <[email protected]>
> Date:   Wed Mar 26 22:06:39 2025 -0400
> 
>      drm/amdgpu: Enable retry faults for GFX 12.1
> 
> Made these three changes:
> 
> gfxhub_v12_1_xcc_init_system_aperture_regs:
> +                       tmp = REG_SET_FIELD(tmp,
> GCVM_L2_PROTECTION_FAULT_CNTL2,
> +
> ENABLE_RETRY_FAULT_INTERRUPT, 0x1);
> 
> 
> gfxhub_v12_1_xcc_setup_vmid_config:
> -                                           !amdgpu_noretry);
> +                                           1);
> 
> 
> mmhub_v4_2_0_mid_init_system_aperture_regs:
> +               tmp = REG_SET_FIELD(tmp, MMVM_L2_PROTECTION_FAULT_CNTL2,
> +                                   ENABLE_RETRY_FAULT_INTERRUPT, 0x1);
> 
> 
> The claim from that one was that it is enabling retry faults on gfx
> 12.1. If that is correct, and we look at your patch which wants respect
> the noretry modparam, but only changes one of those three.
> 
> So question is are you confident it is only that one you need to change
> to make it respect the modparam? I don't know to be clear, those are
> just things I spot while reading you patch and the relevant history
> trying to familiarise myself with this area.

I think I see what you mean.
Indeed it would make sense to change the patch to use the gmc->noretry flag 
there as well. Will add that to the next version.

Thanks,
Timur





Reply via email to