Thank you for your contribution! Sashiko AI review found 2 potential issue(s) 
to consider:

Pre-existing issues:
- [High] panfrost_remove() and the panfrost_probe() error path unconditionally 
execute panfrost_device_fini() without ensuring the device is awake, leading to 
unclocked MMIO accesses.
- [High] Shared IRQ handlers remain active during and after device teardown, 
leading to unclocked MMIO reads and Use-After-Free/NULL-dereference crashes.
--

commit 0beb45b6c65f2fa448d72c39c4f224fe73270c9f
Author: Ruoyu Wang <[email protected]>

drm/panfrost: fix runtime PM leak on remove

pm_runtime_get_sync() increments the runtime PM usage counter even when it
returns an error. panfrost_remove() uses it to keep the GPU active while
runtime PM is disabled and the device is finalized, but the usage counter
is not dropped afterwards.

> diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
> b/drivers/gpu/drm/panfrost/panfrost_drv.c
> index 784e36d72c2ba..0b475ad79b83b 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
> @@ -1027,6 +1027,7 @@ static void panfrost_remove(struct platform_device 
> *pdev)
>  
>       pm_runtime_get_sync(pfdev->base.dev);

[Severity: High]
This is a pre-existing issue, but is it safe to ignore the return value of
pm_runtime_get_sync() here?

If resuming the device fails (for example, due to a clock enablement failure),
the device remains suspended. The driver then proceeds to call
panfrost_device_fini(), which executes hardware teardown routines that write
to registers. Accessing MMIO registers of a suspended device with disabled
clocks on ARM causes a synchronous external abort.

The exact same missing state management appears to exist in the
panfrost_probe() error path, which calls panfrost_device_fini()
without attempting to resume the device, even though it may have
autosuspended after drm_dev_register():

drivers/gpu/drm/panfrost/panfrost_drv.c:panfrost_probe() {
    ...
err_out1:
    pm_runtime_disable(pfdev->base.dev);
    panfrost_device_fini(pfdev);
    ...
}

>       pm_runtime_disable(pfdev->base.dev);
> +     pm_runtime_put_noidle(pfdev->base.dev);
>       panfrost_device_fini(pfdev);
>       pm_runtime_set_suspended(pfdev->base.dev);

[Severity: High]
This isn't a bug introduced by this patch, but does the teardown sequence
leave a window where shared IRQ handlers can run and access unclocked MMIO
or freed memory?

During panfrost_remove(), hardware and software are torn down, but the
shared devm-managed IRQs are not synchronized or disabled. They remain
active until driver core cleanup runs after panfrost_remove() returns.

If another device triggers the shared interrupt during this window, the
Panfrost handlers will execute:

drivers/gpu/drm/panfrost/panfrost_gpu.c:panfrost_gpu_irq_handler() {
    if (test_bit(PANFROST_COMP_BIT_GPU, pfdev->is_suspended))
        return IRQ_NONE;

    fault_status = gpu_read(pfdev, GPU_FAULT_STATUS);
    ...
}

Because panfrost_device_fini() does not set the PANFROST_COMP_BIT_* flags to
indicate the device is suspended, the handlers will incorrectly assume the
device is active. They will attempt to read MMIO registers (causing an abort
because clocks are off) and may dereference freed pointers.

>  }

-- 
Sashiko AI review · 
https://sashiko.dev/#/patchset/[email protected]?part=1

Reply via email to