panthor: Protect events processing with a separate spinlock

sashiko-bot Tue, 23 Jun 2026 06:07:42 -0700

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) 
to consider:
- [High] Data race and lost updates on `group->fatal_queues` due to split 
locking.
- [Medium] Data race on `group->timedout` boolean flag due to split locking.
- [Medium] Data race on `sched->might_have_idle_groups` due to split locking.
--


commit db6f3f8002805cdf65efff6be24738f26e5f60a4
Author: Boris Brezillon <[email protected]>

drm/panthor: Protect events processing with a separate spinlock

This commit introduces a new events_lock spinlock in the panthor scheduler.
It aims to protect event processing and command stream group (CSG) slot
updates, allowing some event processing to be moved to a threaded IRQ
handler.

> diff --git a/drivers/gpu/drm/panthor/panthor_sched.c 
> b/drivers/gpu/drm/panthor/panthor_sched.c
> index 5b34032deff81..4f5d874e5d468 100644
> --- a/drivers/gpu/drm/panthor/panthor_sched.c
> +++ b/drivers/gpu/drm/panthor/panthor_sched.c
[ ... ]
> @@ -1484,7 +1497,7 @@ cs_slot_process_fatal_event_locked(struct 
> panthor_device *ptdev,
>       u32 fatal;
>       u64 info;
>  
> -     lockdep_assert_held(&sched->lock);
> +     lockdep_assert_held(&sched->events_lock);
>  
>       cs_iface = panthor_fw_get_cs_iface(ptdev, csg_id, cs_id);
>       fatal = cs_iface->output->fatal;

[Severity: High]
With this change, cs_slot_process_fatal_event_locked() now executes with only
events_lock held, which means the subsequent update to group->fatal_queues
inside this function is no longer protected by sched->lock:

    group->fatal_queues |= BIT(cs_id);

However, in tick_ctx_init(), we update this same field while holding
sched->lock, but after dropping the scoped events_lock:

    if (!group->fatal_queues)
        group->fatal_queues |= GENMASK(group->queue_count - 1, 0);

Could this result in a data race and lost updates on group->fatal_queues if
these two paths execute concurrently on different CPUs? If the non-atomic
read-modify-write operations overlap, one of the updates might be lost,
potentially leaving a faulty queue unflagged.

This is a pre-existing issue, but it's worth noting that
group_process_tiler_oom() also performs a lockless read-modify-write on
group->fatal_queues:

    group->fatal_queues |= BIT(cs_id);

[ ... ]
> @@ -1731,7 +1744,7 @@ static void csg_slot_process_idle_event_locked(struct 
> panthor_device *ptdev, u32
>  {
>       struct panthor_scheduler *sched = ptdev->scheduler;
>  
> -     lockdep_assert_held(&sched->lock);
> +     lockdep_assert_held(&sched->events_lock);
>  
>       sched->might_have_idle_groups = true;
>  

[Severity: Medium]
Will this lead to a data race on sched->might_have_idle_groups? 

csg_slot_process_idle_event_locked() sets this flag to true while holding
events_lock. But in tick_ctx_apply(), the flag is overwritten under
sched->lock without holding events_lock:

    sched->might_have_idle_groups = ctx->idle_group_count > 0;

Because these paths do not share a common lock, they can execute
concurrently, which violates the C memory model and might trigger KCSAN
warnings.

[ ... ]
> @@ -1763,7 +1776,7 @@ csg_slot_process_progress_timer_event_locked(struct 
> panthor_device *ptdev, u32 c
>       struct panthor_csg_slot *csg_slot = &sched->csg_slots[csg_id];
>       struct panthor_group *group = csg_slot->group;
>  
> -     lockdep_assert_held(&sched->lock);
> +     lockdep_assert_held(&sched->events_lock);
>  
>       group = csg_slot->group;
>       if (!drm_WARN_ON(&ptdev->base, !group)) {

[Severity: Medium]
Since csg_slot_process_progress_timer_event_locked() now holds events_lock
instead of sched->lock, is it still safe to modify group->timedout here?

    group->timedout = true;

Other paths like panthor_sched_suspend() and panthor_job_timedout() modify
and read group->timedout while holding only sched->lock:

    csg_slot->group->timedout = true;

Concurrent writes from the event processing workqueue and the suspend/timeout
paths could cause a data race on this boolean flag.

-- 
Sashiko AI review · 
https://sashiko.dev/#/patchset/[email protected]?part=4

Re: [PATCH v3 4/7] drm/panthor: Protect events processing with a separate spinlock

Reply via email to