On 16/06/2026 12:57, Timur Kristóf wrote:
On Tuesday, June 16, 2026 10:09:53 AM Central European Summer Time Tvrtko
Ursulin wrote:
On 25/05/2026 12:45, Timur Kristóf wrote:
When retry faults are disabled (amdgpu.noretry=1),
the ENABLE_RETRY_FAULT_INTERRUPT bit should be programmed to 0.

Note that retry faults are enabled by default on GFX12.1
so this just fixes the case when they are explicitly disabled.

Signed-off-by: Timur Kristóf <[email protected]>
---

   drivers/gpu/drm/amd/amdgpu/gfxhub_v12_1.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v12_1.c
b/drivers/gpu/drm/amd/amdgpu/gfxhub_v12_1.c index
4c2fd1e6616e..d2edfe037da8 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfxhub_v12_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v12_1.c
@@ -243,7 +243,7 @@ static void
gfxhub_v12_1_xcc_init_system_aperture_regs(struct amdgpu_device *ade>
                tmp = REG_SET_FIELD(tmp,
GCVM_L2_PROTECTION_FAULT_CNTL2,
                
                                
ACTIVE_PAGE_MIGRATION_PTE_READ_RETRY, 1);
                
                tmp = REG_SET_FIELD(tmp,
GCVM_L2_PROTECTION_FAULT_CNTL2,

-                                   ENABLE_RETRY_FAULT_INTERRUPT,
0x1);
+                                   ENABLE_RETRY_FAULT_INTERRUPT,
!adev->gmc.noretry);

                WREG32_SOC15(GC, GET_INST(GC, i),
                
                             regGCVM_L2_PROTECTION_FAULT_CNTL2,
tmp);
        
        }

If I look at 6f894c92490b ("drm/amdgpu: Enable retry faults for GFX
12.1") which added this code, it also touched
gfxhub_v12_1_xcc_setup_vmid_config():

      tmp = REG_SET_FIELD(tmp, GCVM_CONTEXT1_CNTL,
                          RETRY_PERMISSION_OR_INVALID_PAGE_FAULT,
-                       !amdgpu_noretry);
+                       1);

Should that be changed as well?


I personally don't have a GFX12.1 GPU so I have no way to verify how that
works, which is why I try to avoid changing it unless it's pretty obvious that
the upstream code is wrong.

Can you elaborate on what you are suggesting exactly?

I'm asking. :)

commit 6f894c92490be1bb27492a82544b4b1e4ad20915
Author: Mukul Joshi <[email protected]>
Date:   Wed Mar 26 22:06:39 2025 -0400

    drm/amdgpu: Enable retry faults for GFX 12.1

Made these three changes:

gfxhub_v12_1_xcc_init_system_aperture_regs:
+ tmp = REG_SET_FIELD(tmp, GCVM_L2_PROTECTION_FAULT_CNTL2, + ENABLE_RETRY_FAULT_INTERRUPT, 0x1);


gfxhub_v12_1_xcc_setup_vmid_config:
-                                           !amdgpu_noretry);
+                                           1);


mmhub_v4_2_0_mid_init_system_aperture_regs:
+               tmp = REG_SET_FIELD(tmp, MMVM_L2_PROTECTION_FAULT_CNTL2,
+                                   ENABLE_RETRY_FAULT_INTERRUPT, 0x1);


The claim from that one was that it is enabling retry faults on gfx 12.1. If that is correct, and we look at your patch which wants respect the noretry modparam, but only changes one of those three.

So question is are you confident it is only that one you need to change to make it respect the modparam? I don't know to be clear, those are just things I spot while reading you patch and the relevant history trying to familiarise myself with this area.

Regards,

Tvrtko

Reply via email to