Hi Alex,

INT_MAX is used instead of MAX_SCHEDULE_TIMEOUT (which we discussed in another
mail thread) because amdgpu_lockup_timeout is of type int.
Assigning MAX_SCHEDULE_TIMEOUT (type long) to it would produce compile warnings.
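
To illustrate the type mismatch, here is a minimal userspace sketch. It is
not driver code. It assumes an LP64 target (64-bit long, 32-bit int) and
relies on the kernel defining MAX_SCHEDULE_TIMEOUT as LONG_MAX. Every name
prefixed with sketch_ or SKETCH_ is hypothetical and exists only for this
example.

```c
#include <limits.h>

/* Hypothetical stand-in for the kernel's MAX_SCHEDULE_TIMEOUT (LONG_MAX). */
#define SKETCH_MAX_SCHEDULE_TIMEOUT LONG_MAX

/* Milliseconds in one day; small enough to fit in a 32-bit int. */
#define SKETCH_MS_PER_DAY 86400000

/*
 * Return 1 if 'value' can be stored in an int without truncation,
 * 0 otherwise. On LP64 the LONG_MAX stand-in fails this check, which
 * is why assigning it to an int parameter draws a compiler warning.
 */
int sketch_fits_in_int(long value)
{
	return value >= INT_MIN && value <= INT_MAX;
}

/*
 * Convert a timeout in milliseconds to whole days (rounded down), to
 * show how long an INT_MAX timeout lasts in practice.
 */
int sketch_timeout_ms_to_days(int timeout_ms)
{
	return timeout_ms / SKETCH_MS_PER_DAY;
}
```

With a 32-bit int, sketch_timeout_ms_to_days(INT_MAX) gives 24 whole days, so
INT_MAX milliseconds is effectively "no timeout" for a single job, while
still fitting the int type of amdgpu_lockup_timeout.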

Regards,
Evan
-----Original Message-----
From: Evan Quan [mailto:[email protected]] 
Sent: Monday, March 19, 2018 2:08 PM
To: [email protected]
Cc: Deucher, Alexander <[email protected]>; Quan, Evan 
<[email protected]>
Subject: [PATCH] drm/amdgpu: disable job timeout on GPU reset disabled

Under some heavy compute workloads (e.g. the dgemm test), the ASIC can take
more than 10 seconds to finish a single dispatched job, which triggers the
job timeout. The resulting timeout message is quite confusing, although it
does not seem to cause any real problems.
As a quick workaround, disable the timeout when GPU reset is disabled.

Change-Id: I3a95d856ba4993094dc7b6269649e470c5b053d2
Signed-off-by: Evan Quan <[email protected]>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 8bd9c3f..9d6a775 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -861,6 +861,13 @@ static void amdgpu_device_check_arguments(struct amdgpu_device *adev)
                amdgpu_lockup_timeout = 10000;
        }
 
+       /*
+        * Disable timeout when GPU reset is disabled to avoid confusing
+        * timeout messages in the kernel log.
+        */
+       if (amdgpu_gpu_recovery == 0 || amdgpu_gpu_recovery == -1)
+               amdgpu_lockup_timeout = INT_MAX;
+
        adev->firmware.load_type = amdgpu_ucode_get_load_type(adev, amdgpu_fw_load_type);
 }
 
--
2.7.4

_______________________________________________
amd-gfx mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
