On 3/5/2026 11:07 PM, Lazar, Lijo wrote:
On 06-Mar-26 3:35 AM, Mario Limonciello wrote:
I found more cases where a NULL version causes problems.
Add NULL checks as applicable.
Fixes: 39fc2bc4da00 ("drm/amdgpu: Protect GPU register accesses in powergated state in some paths")
Signed-off-by: Mario Limonciello <[email protected]>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index bc6f714e8763a..74cbe58484fe2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3463,6 +3463,9 @@ static void amdgpu_ip_block_hw_fini(struct amdgpu_ip_block *ip_block)
struct amdgpu_device *adev = ip_block->adev;
int r;
+ if (!ip_block->version)
+ return;
+
IP block versions are set during the discovery phase itself. This is a very
early init failure
Yes; this case is an NPI system where not all blocks are in discovery yet.
The system panics at bootup with NULL pointer dereferences in multiple places
instead of recovering cleanly and keeping fbdev. This patch series sorts it out.
and ideally the fix should be not to call any fini
for such an early failure.
As an alternative to this series?
Thanks,
Lijo
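
To make the two guards under discussion concrete, here is a minimal standalone sketch of the pattern. It is not the driver code: every `mock_*` name, the simplified structs, and the return values are assumptions introduced only for illustration. It models a block whose `version` was never populated and shows that both the per-block fini and the early SMC lookup skip it instead of dereferencing NULL.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the amdgpu structures. */
enum mock_ip_type { MOCK_IP_GFX, MOCK_IP_SMC };

struct mock_ip_version {
	enum mock_ip_type type;
	int (*hw_fini)(void);		/* may be NULL, as in the real funcs table */
};

struct mock_ip_block {
	const struct mock_ip_version *version;	/* NULL if discovery never set it */
	bool hw;				/* loosely mirrors status.hw */
};

static int mock_fini_calls;

static int mock_hw_fini(void)
{
	mock_fini_calls++;
	return 0;
}

/* Returns true only if the block's hw_fini callback actually ran. */
static bool mock_ip_block_hw_fini(struct mock_ip_block *blk)
{
	if (!blk->version)		/* the guard added by the first hunk */
		return false;
	if (!blk->version->hw_fini)
		return false;

	blk->version->hw_fini();
	blk->hw = false;		/* simplification chosen for this sketch */
	return true;
}

/* Returns the index of the SMC block that was torn down, or -1 if none. */
static int mock_smu_fini_early(struct mock_ip_block *blocks, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!blocks[i].hw)
			continue;
		if (!blocks[i].version)	/* the guard added by the second hunk */
			continue;
		if (blocks[i].version->type == MOCK_IP_SMC) {
			mock_ip_block_hw_fini(&blocks[i]);
			return (int)i;
		}
	}
	return -1;
}
```

The alternative raised in the review, not calling any fini at all after such an early failure, would move the check to the caller rather than into each helper; the sketch only models the per-helper guards from this patch.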
if (!ip_block->version->funcs->hw_fini) {
dev_err(adev->dev, "hw_fini of IP block <%s> not defined\n",
ip_block->version->funcs->name);
@@ -3496,6 +3499,8 @@ static void amdgpu_device_smu_fini_early(struct amdgpu_device *adev)
for (i = 0; i < adev->num_ip_blocks; i++) {
if (!adev->ip_blocks[i].status.hw)
continue;
+ if (!adev->ip_blocks[i].version)
+ continue;
if (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_SMC) {
amdgpu_ip_block_hw_fini(&adev->ip_blocks[i]);
break;