On 3/5/2026 11:07 PM, Lazar, Lijo wrote:


On 06-Mar-26 3:35 AM, Mario Limonciello wrote:
I found more cases where a NULL version causes problems.
Add NULL checks as applicable.

Fixes: 39fc2bc4da00 ("drm/amdgpu: Protect GPU register accesses in powergated state in some paths")
Signed-off-by: Mario Limonciello <[email protected]>
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 5 +++++
  1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index bc6f714e8763a..74cbe58484fe2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3463,6 +3463,9 @@ static void amdgpu_ip_block_hw_fini(struct amdgpu_ip_block *ip_block)
      struct amdgpu_device *adev = ip_block->adev;
      int r;
+    if (!ip_block->version)
+        return;
+

IP block versions are set during the discovery phase itself. This is a very early init failure

Yes; this case is an NPI system where not all blocks are in discovery yet. The system panics at bootup with NULL pointer dereferences in multiple places instead of recovering cleanly and keeping fbdev. This patch series sorts it out.

and ideally the fix should be not to call any fini for such an early failure.

As an alternative to this series?
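
For reference, the guard pattern under discussion can be sketched in standalone C. This is only an illustration under stated assumptions: the sketch_* structs, fields, and functions are hypothetical, simplified stand-ins and not the real amdgpu definitions. It shows a teardown loop that skips any block whose version pointer was never filled in, rather than dereferencing it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver structures. */
struct sketch_ip_funcs {
    const char *name;
    int (*hw_fini)(void *ctx);
};

struct sketch_ip_version {
    int type;
    const struct sketch_ip_funcs *funcs;
};

struct sketch_ip_block {
    const struct sketch_ip_version *version; /* NULL if never populated */
    bool hw;                                 /* marked as hardware-initialized */
    void *ctx;
};

/* Example callback: counts how many times teardown actually ran. */
static int sketch_count_fini(void *ctx)
{
    (*(int *)ctx)++;
    return 0;
}

/* Guarded teardown: return before touching a NULL version. */
static void sketch_ip_block_hw_fini(struct sketch_ip_block *block)
{
    if (!block->version)
        return;
    if (!block->version->funcs || !block->version->funcs->hw_fini)
        return;
    block->version->funcs->hw_fini(block->ctx);
    block->hw = false;
}

/* Walk the blocks in reverse order; return how many were handed to fini. */
static int sketch_fini_all(struct sketch_ip_block *blocks, int count)
{
    int done = 0;

    assert(blocks != NULL || count == 0);
    for (int i = count - 1; i >= 0; i--) {
        if (!blocks[i].hw)
            continue;
        if (!blocks[i].version) /* skip blocks that have no version */
            continue;
        sketch_ip_block_hw_fini(&blocks[i]);
        done++;
    }
    return done;
}
```

With one populated block and one block whose version is NULL, sketch_fini_all() tears down only the populated one and leaves the other untouched. The alternative raised above, not calling any fini at all after such an early failure, would instead move the decision to the caller of a loop like this one.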


Thanks,
Lijo

      if (!ip_block->version->funcs->hw_fini) {
          dev_err(adev->dev, "hw_fini of IP block <%s> not defined\n",
              ip_block->version->funcs->name);
@@ -3496,6 +3499,8 @@ static void amdgpu_device_smu_fini_early(struct amdgpu_device *adev)
      for (i = 0; i < adev->num_ip_blocks; i++) {
          if (!adev->ip_blocks[i].status.hw)
              continue;
+        if (!adev->ip_blocks[i].version)
+            continue;
          if (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_SMC) {
              amdgpu_ip_block_hw_fini(&adev->ip_blocks[i]);
              break;

