Thank you for your contribution! Sashiko AI review found 4 potential issue(s) 
to consider:

New issues:
- [High] Incorrect error handling of pm_runtime_get_if_active() return value 
causes unbalanced pm_runtime_put() and unsafe hardware access.
- [High] PM usage count leak due to threaded IRQ coalescing on shared 
interrupts.

Pre-existing issues:
- [High] Missing synchronize_irq() in the runtime suspend callback for a device 
with a shared IRQ handler.
- [High] Attacker-controlled DMA faults leave the device interrupt asserted 
without clearing it, causing an interrupt storm and kernel log spam.
--

commit 57f30fe728d7f765520f0d3ccb419368e6f945d0
Author: ZhaoJinming <[email protected]>

accel/rocket: Fix iommu_group leak and unsafe IRQ register access

This commit fixes an iommu_group reference leak in rocket_job_handle_irq()
by using the stored iommu_group instead of getting a new reference. It also
attempts to fix unsafe hardware register accesses in the shared IRQ handler
by adding pm_runtime_get_if_active() checks to ensure the device is awake.

> 2) Unsafe hardware register access in shared IRQ handler:
>    rocket_job_irq_handler() is registered with IRQF_SHARED but accesses
>    hardware registers without checking runtime PM status.

[Severity: High]
This is a pre-existing issue, but since the device uses a shared IRQ handler,
does the runtime suspend callback (rocket_device_runtime_suspend() in
rocket_drv.c) need to call synchronize_irq() before disabling clocks? 

Without it, the IRQ handler could execute mid-flight while the hardware is
turned off, leading to invalid register accesses and spurious interrupts.

> diff --git a/drivers/accel/rocket/rocket_job.c 
> b/drivers/accel/rocket/rocket_job.c
> index cf9f61682dd8b..8eaa3d427a51d 100644
> --- a/drivers/accel/rocket/rocket_job.c
> +++ b/drivers/accel/rocket/rocket_job.c
[ ... ]
> @@ -428,14 +429,20 @@ static irqreturn_t rocket_job_irq_handler_thread(int 
> irq, void *data)
>  static irqreturn_t rocket_job_irq_handler(int irq, void *data)
>  {
>       struct rocket_core *core = data;
> +
> +     if (!pm_runtime_get_if_active(core->dev))
> +             return IRQ_NONE;
> +

[Severity: High]
Does this correctly handle negative error codes? 

pm_runtime_get_if_active() returns negative error codes (e.g., -EINVAL) when
runtime PM is disabled. The condition here only traps a return value of 0.
When it returns a negative error code, the condition evaluates to false,
allowing execution to proceed to read hardware registers on a potentially
suspended device.

Because the PM usage counter is not incremented on error, wouldn't the
subsequent unconditional pm_runtime_put() calls cause a PM usage count
underflow?

>       u32 raw_status = rocket_pc_readl(core, INTERRUPT_RAW_STATUS);
>  
>       WARN_ON(raw_status & PC_INTERRUPT_RAW_STATUS_DMA_READ_ERROR);
>       WARN_ON(raw_status & PC_INTERRUPT_RAW_STATUS_DMA_WRITE_ERROR);
>  

[Severity: High]
This is a pre-existing issue, but if a DMA error occurs and the DPU_0 and
DPU_1 bits are not set, the handler returns IRQ_NONE without clearing the
interrupt.

Since this is a shared level-triggered IRQ line, won't this leave the
interrupt asserted and cause an interrupt storm, permanently disabling the
shared IRQ line?

>       if (!(raw_status & PC_INTERRUPT_RAW_STATUS_DPU_0 ||
> -           raw_status & PC_INTERRUPT_RAW_STATUS_DPU_1))
> +           raw_status & PC_INTERRUPT_RAW_STATUS_DPU_1)) {
> +             pm_runtime_put(core->dev);
>               return IRQ_NONE;
> +     }
>  
>       rocket_pc_writel(core, INTERRUPT_MASK, 0x0);
>  

[Severity: High]
Does this leak the PM usage count if the IRQ subsystem coalesces the threaded
interrupt?

If another device triggers an interrupt on this shared line while
rocket_job_irq_handler_thread() is already pending, this hardirq handler
will run again. It will increment the PM usage count via
pm_runtime_get_if_active() and eventually return IRQ_WAKE_THREAD.

Because the thread is already pending, the IRQ core coalesces the request
and the thread only runs once. Doesn't this mean pm_runtime_put() is only
called once, permanently leaking the extra PM reference acquired here?

-- 
Sashiko AI review ยท 
https://sashiko.dev/#/patchset/[email protected]?part=2

Reply via email to