On 11/26/2025 7:42 PM, Alex Deucher wrote:
Ping on this series?


This looks like the same generic logic repeated across different IP versions. Would it make sense to add a common amdgpu_gmc_handle_retry_fault() helper and call it from each of them?

Thanks,
Lijo
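
To illustrate the suggestion, here is a minimal userspace sketch of the decision order that a shared helper could centralize. It is not driver code: the enum names and retry_fault_action() are hypothetical stand-ins, and the real helper would call amdgpu_gmc_filter_faults(), amdgpu_irq_delegate() and amdgpu_vm_handle_fault() as the patch does, rather than return an action code.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the IH rings the patch compares against. */
enum ih_ring {
	IH_RING_PRIMARY,	/* models adev->irq.ih */
	IH_RING_OTHER,		/* models any other hardware ring */
	IH_RING_SOFT,		/* models adev->irq.ih_soft */
};

/* Hypothetical action codes; the patch returns 1 or falls through instead. */
enum retry_action {
	RETRY_DROP_DUPLICATE,	/* filter says this fault was already seen */
	RETRY_DELEGATE,		/* move the entry to a different ring */
	RETRY_HANDLE,		/* try to resolve the fault by filling page tables */
};

/*
 * Mirrors the order of checks in the patch:
 *  1. entries not on the soft ring are filtered for duplicates,
 *  2. entries still on the primary ring are delegated,
 *  3. everything else goes to the VM fault handler.
 * 'is_duplicate' models the result of the fault filter.
 */
static enum retry_action retry_fault_action(enum ih_ring ring, bool is_duplicate)
{
	if (ring != IH_RING_SOFT && is_duplicate)
		return RETRY_DROP_DUPLICATE;

	if (ring == IH_RING_PRIMARY)
		return RETRY_DELEGATE;

	return RETRY_HANDLE;
}
```

Each gmc_vX_0_process_interrupt() would then only decode its own IV bits and pass the result to the common helper, so the ordering of the three checks lives in one place.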

On Wed, Nov 19, 2025 at 10:16 AM Alex Deucher <[email protected]> wrote:

On Wed, Nov 19, 2025 at 3:14 AM Pierre-Eric Pelloux-Prayer
<[email protected]> wrote:



On 18/11/2025 at 23:06, Alex Deucher wrote:
We need to call amdgpu_vm_handle_fault() on page fault
on all gfx9 and newer parts to properly update the
page tables, not just for recoverable page faults.

Signed-off-by: Alex Deucher <[email protected]>
---
   drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c | 27 ++++++++++++++++++++++++++
   1 file changed, 27 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
index 7bc389d9f5c48..25cdcb850416c 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
@@ -103,12 +103,39 @@ static int gmc_v11_0_process_interrupt(struct amdgpu_device *adev,
       uint32_t vmhub_index = entry->client_id == SOC21_IH_CLIENTID_VMC ?
                              AMDGPU_MMHUB0(0) : AMDGPU_GFXHUB(0);
       struct amdgpu_vmhub *hub = &adev->vmhub[vmhub_index];
+     bool retry_fault = !!(entry->src_data[1] & 0x80);
+     bool write_fault = !!(entry->src_data[1] & 0x20);
       uint32_t status = 0;
       u64 addr;

       addr = (u64)entry->src_data[0] << 12;
       addr |= ((u64)entry->src_data[1] & 0xf) << 44;

+     if (retry_fault) {
+             /* Returning 1 here also prevents sending the IV to the KFD */
+
+             /* Process it onyl if it's the first fault for this address */

typo: onyl -> only (same for patch 2/3)

Fixed locally. Thanks!

Alex


Pierre-Eric



+             if (entry->ih != &adev->irq.ih_soft &&
+                 amdgpu_gmc_filter_faults(adev, entry->ih, addr, entry->pasid,
+                                          entry->timestamp))
+                     return 1;
+
+             /* Delegate it to a different ring if the hardware hasn't
+              * already done it.
+              */
+             if (entry->ih == &adev->irq.ih) {
+                     amdgpu_irq_delegate(adev, entry, 8);
+                     return 1;
+             }
+
+             /* Try to handle the recoverable page faults by filling page
+              * tables
+              */
+             if (amdgpu_vm_handle_fault(adev, entry->pasid, 0, 0, addr,
+                                        entry->timestamp, write_fault))
+                     return 1;
+     }
+
       if (!amdgpu_sriov_vf(adev)) {
               /*
                * Issue a dummy read to wait for the status register to
