On Tue, Nov 18, 2025 at 9:59 PM Yang Wang <[email protected]> wrote:
>
> Fix the unbalanced amdgpu_irq enabled counter in
> smu_v11_0_disable_thermal_alert().
>
> [  357.773144] ------------[ cut here ]------------
> [  357.773156] WARNING: CPU: 21 PID: 2202 at drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:639 amdgpu_irq_put+0xd8/0xf0 [amdgpu]
> ...
> [  357.774651] Tainted: [E]=UNSIGNED_MODULE
> [  357.774656] Hardware name: GIGABYTE MZ01-CE0-00/MZ01-CE0-00, BIOS F14a 08/14/2020
> [  357.774664] RIP: 0010:amdgpu_irq_put+0xd8/0xf0 [amdgpu]
> [  357.775563] Code: 31 f6 31 ff e9 f9 c3 4f cb 44 89 f2 4c 89 e6 4c 89 ef e8 db fc ff ff 5b 41 5c 41 5d 41 5e 5d 31 d2 31 f6 31 ff e9 d8 c3 4f cb <0f> 0b eb c3 b8 fe ff ff ff eb 97 e9 d3 8d 8b 00 0f 1f 84 00 00 00
> [  357.775573] RSP: 0018:ffffd28616ecba58 EFLAGS: 00010246
> [  357.775584] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
> [  357.775592] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> [  357.775598] RBP: ffffd28616ecba78 R08: 0000000000000000 R09: 0000000000000000
> [  357.775605] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8aac201a8008
> [  357.775611] R13: ffff8aac0e600000 R14: 0000000000000000 R15: ffff8aac201a8000
> [  357.775618] FS:  0000751c697b7c40(0000) GS:ffff8acb4fba2000(0000) knlGS:0000000000000000
> [  357.775627] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  357.775642] CR2: 00005a844a5e7028 CR3: 0000001039a0f000 CR4: 00000000003506f0
> [  357.775642] Call Trace:
> [  357.775649]  <TASK>
> [  357.775663]  smu_v11_0_disable_thermal_alert+0x17/0x30 [amdgpu]
> [  357.776704]  smu_smc_hw_cleanup+0x79/0x500 [amdgpu]
> [  357.777857]  smu_hw_fini+0x139/0x200 [amdgpu]
> [  357.778908]  amdgpu_ip_block_hw_fini+0x29/0xc0 [amdgpu]
> [  357.779698]  amdgpu_device_fini_hw+0x2e5/0x560 [amdgpu]
> [  357.780487]  ? blocking_notifier_chain_unregister+0x3e/0x70
> [  357.780511]  amdgpu_driver_unload_kms+0x4b/0x70 [amdgpu]
> [  357.781334]  amdgpu_pci_remove+0x50/0x90 [amdgpu]
> [  357.782126]  pci_device_remove+0x41/0xc0
> [  357.782145]  device_remove+0x46/0x80
> [  357.782159]  device_release_driver_internal+0x203/0x270
> [  357.782169]  ? srso_return_thunk+0x5/0x5f
> [  357.782189]  driver_detach+0x4a/0xa0
> [  357.782201]  bus_remove_driver+0x83/0x110
> [  357.782216]  driver_unregister+0x31/0x60
> [  357.782227]  pci_unregister_driver+0x40/0x90
> [  357.782244]  amdgpu_exit+0x15/0x3b [amdgpu]
>
> Signed-off-by: Yang Wang <[email protected]>
> ---
>  drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
> index 78e4186d06cc..24d9f576846b 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
> @@ -1022,7 +1022,12 @@ int smu_v11_0_enable_thermal_alert(struct smu_context *smu)
>
>  int smu_v11_0_disable_thermal_alert(struct smu_context *smu)
>  {
> -       return amdgpu_irq_put(smu->adev, &smu->irq_source, 0);
> +       int ret = 0;
> +
> +       if (smu->smu_table.thermal_controller_type)
> +               ret = amdgpu_irq_get(smu->adev, &smu->irq_source, 0);

Shouldn't this be amdgpu_irq_put()?  With that fixed,
Reviewed-by: Alex Deucher <[email protected]>

> +
> +       return ret;
>  }
>
>  static uint16_t convert_to_vddc(uint8_t vid)
> --
> 2.34.1
>
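
To spell out the get/put question above, here is a small userspace sketch of the counter balance I'd expect. It is only a model: the mock_* names and structs are hypothetical stand-ins, not the amdgpu API, the -EINVAL return is a choice made for the mock, and it assumes the enable path takes its reference under the same thermal_controller_type check, which the quoted hunk does not show.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for an IRQ source with an enable counter. */
struct mock_irq_src {
	int enabled;	/* outstanding get() calls not yet matched by put() */
};

/* Hypothetical stand-in for the SMU context fields used in the patch. */
struct mock_smu {
	struct mock_irq_src irq_source;
	int thermal_controller_type;	/* 0: no thermal controller */
};

static int mock_irq_get(struct mock_irq_src *src)
{
	src->enabled++;
	return 0;
}

/* An unmatched put is reported instead of letting the counter go negative. */
static int mock_irq_put(struct mock_irq_src *src)
{
	if (src->enabled == 0) {
		fprintf(stderr, "WARNING: unbalanced irq put\n");
		return -EINVAL;
	}
	src->enabled--;
	return 0;
}

/* Assumed shape of the enable path: reference taken only with a controller. */
static int mock_enable_thermal_alert(struct mock_smu *smu)
{
	if (!smu->thermal_controller_type)
		return 0;
	return mock_irq_get(&smu->irq_source);
}

/* Disable path with the same guard, dropping the reference with put. */
static int mock_disable_thermal_alert(struct mock_smu *smu)
{
	int ret = 0;

	if (smu->thermal_controller_type)
		ret = mock_irq_put(&smu->irq_source);

	return ret;
}
```

In this model, enable followed by disable leaves the counter at 0 whether or not a thermal controller is present. With a get in the disable path, the counter would instead end at 2 on parts that have a controller.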
