On old GPUs, it may be an issue that handling the interrupts from
VM faults is too slow and the interrupt handler (IH) ring may
overflow, which can cause an eventual hang.

Delegate the processing of all VM faults to the soft
IRQ handler ring.

As a result, we spend much less time in the IRQ handler that
interacts with the HW IH ring, which significantly reduces the
chance of hangs/reboots.

Signed-off-by: Timur Kristóf <[email protected]>
---
 drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
index 0e5e54d0a9a5..fbd0bf147f50 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
@@ -1261,6 +1261,12 @@ static int gmc_v7_0_process_interrupt(struct 
amdgpu_device *adev,
 {
        u32 addr, status, mc_client, vmid;
 
+       /* Delegate to the soft IRQ handler ring */
+       if (adev->irq.ih_soft.enabled && entry->ih != &adev->irq.ih_soft) {
+               amdgpu_irq_delegate(adev, entry, 4);
+               return 1;
+       }
+
        addr = RREG32(mmVM_CONTEXT1_PROTECTION_FAULT_ADDR);
        status = RREG32(mmVM_CONTEXT1_PROTECTION_FAULT_STATUS);
        mc_client = RREG32(mmVM_CONTEXT1_PROTECTION_FAULT_MCCLIENT);
-- 
2.51.1

Reply via email to