On old GPUs, it may be an issue that handling the interrupts from
VM faults is too slow and the interrupt handler (IH) ring may
overflow, which can cause an eventual hang.

Delegate the processing of all VM faults to the soft
IRQ handler ring.

As a result, we spend much less time in the IRQ handler that
interacts with the HW IH ring, which significantly reduces the
chance of hangs/reboots.

Signed-off-by: Timur Kristóf <[email protected]>
---
 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
index e1509480dfc2..6551b60f2584 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
@@ -1439,6 +1439,12 @@ static int gmc_v8_0_process_interrupt(struct 
amdgpu_device *adev,
                return 0;
        }
 
+       /* Delegate to the soft IRQ handler ring */
+       if (adev->irq.ih_soft.enabled && entry->ih != &adev->irq.ih_soft) {
+               amdgpu_irq_delegate(adev, entry, 4);
+               return 1;
+       }
+
        addr = RREG32(mmVM_CONTEXT1_PROTECTION_FAULT_ADDR);
        status = RREG32(mmVM_CONTEXT1_PROTECTION_FAULT_STATUS);
        mc_client = RREG32(mmVM_CONTEXT1_PROTECTION_FAULT_MCCLIENT);
-- 
2.51.1

Reply via email to