[AMD Official Use Only - Internal Distribution Only]

Hello Guchun,

In addition, could you please prepare a patch that replaces DRM_INFO with 
dev_info in amdgpu_ras_check_supported? We'd prefer RAS-related kernel 
messages to be prefixed with the device BDF rather than the DRM prefix.

Please also review the other RAS messages again in case any of them still 
use DRM_INFO. Users should be able to tell exactly which GPU device any RAS 
error information refers to.
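
To illustrate the difference in prefixes, here is a rough user-space sketch. 
It is not kernel code: struct mock_dev, format_drm_info and format_dev_info 
are hypothetical helpers, and the BDF value is made up. It only mimics the 
general shape of the two kinds of log line, a generic "[drm]" prefix versus 
the driver name plus the device BDF.

```c
#include <stdio.h>

/* Hypothetical stand-in for a PCI device; only the BDF string matters here. */
struct mock_dev {
    const char *driver;  /* e.g. "amdgpu" */
    const char *bdf;     /* e.g. "0000:03:00.0" (made-up address) */
};

/* Approximates a DRM_INFO-style line: generic prefix, no device identity. */
static int format_drm_info(char *buf, size_t len, const char *msg)
{
    return snprintf(buf, len, "[drm] %s", msg);
}

/* Approximates a dev_info-style line: driver name and BDF come first. */
static int format_dev_info(char *buf, size_t len,
                           const struct mock_dev *dev, const char *msg)
{
    return snprintf(buf, len, "%s %s: %s", dev->driver, dev->bdf, msg);
}
```

On a system with several GPUs, only the second form tells the reader which 
device reported the message.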

Regards,
Hawking
-----Original Message-----
From: Chen, Guchun <[email protected]> 
Sent: Friday, April 10, 2020 11:55
To: [email protected]; Zhang, Hawking <[email protected]>; Li, 
Dennis <[email protected]>; Zhou1, Tao <[email protected]>; Clements, John 
<[email protected]>
Cc: Chen, Guchun <[email protected]>
Subject: [PATCH] drm/amdgpu: add uncorrectable error count print in UMC ecc irq cb

The uncorrectable error count is not printed when a UMC UE injection is 
issued. By the time the error count log function runs in the GPU recovery 
work thread, it can no longer obtain and print the count from the last error 
injection, because the error status register is automatically cleared once it 
has been read in the UMC ecc irq callback. So print the message in the UMC 
ecc irq cb, consistent with the other RAS error interrupt cases.

Signed-off-by: Guchun Chen <[email protected]>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c
index f4d40855147b..267f7c30f4dd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c
@@ -121,6 +121,9 @@ int amdgpu_umc_process_ras_data_cb(struct amdgpu_device *adev,
 
        /* only uncorrectable error needs gpu reset */
        if (err_data->ue_count) {
+               dev_info(adev->dev, "%ld uncorrectable errors detected in UMC block\n",
+                       err_data->ue_count);
+
                if (err_data->err_addr_cnt &&
                    amdgpu_ras_add_bad_pages(adev, err_data->err_addr,
                                                err_data->err_addr_cnt))
--
2.17.1
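
The commit message's point about the status register being cleared on read 
can be sketched in user space as follows. This is only an assumed model, not 
the driver's code: struct mock_umc_status, mock_read_ue_count, mock_irq_cb 
and mock_recovery_log are hypothetical names introduced for illustration.

```c
#include <stdio.h>

/* Hypothetical model of an error status register that clears on read. */
struct mock_umc_status {
    unsigned long ue_count;
};

/* Reading returns the latched count and resets it to zero. */
static unsigned long mock_read_ue_count(struct mock_umc_status *reg)
{
    unsigned long count = reg->ue_count;

    reg->ue_count = 0;
    return count;
}

/* Stand-in for the irq callback: the first reader, so it sees the count. */
static unsigned long mock_irq_cb(struct mock_umc_status *reg)
{
    unsigned long ue = mock_read_ue_count(reg);

    if (ue)
        printf("%lu uncorrectable errors detected in UMC block\n", ue);
    return ue;
}

/* Stand-in for the later recovery work: a second read finds nothing. */
static unsigned long mock_recovery_log(struct mock_umc_status *reg)
{
    return mock_read_ue_count(reg);
}
```

In this model, calling mock_irq_cb and then mock_recovery_log on the same 
register yields the real count only in the first call, which is why the 
message has to be printed from the callback.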
_______________________________________________
amd-gfx mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
