On 31/10/2025 15:48, Boris Brezillon wrote:
> A group can become runnable even after reset.in_progress has
> been set to true and panthor_sched_suspend() has been called,
> because the drm_sched queues are still running at that point,
> and ::run_job() might call group_schedule_locked() which moves
> the group to the runnable list. And that's fine, because we're
> moving those groups to the stopped list anyway when we call
> panthor_group_stop(), so just drop the misleading WARN_ON().

If we've got another thread mutating the runnable list between
panthor_sched_suspend() and list_for_each_entry_safe(), doesn't that
make the list iterator unsafe? (_safe only protects against deleting the
current item, not against concurrent access).

It feels to me like we should be holding the sched mutex - at least
while iterating. I agree the WARN_ON is unnecessary, and will need
removing if we simply guard the iteration - the alternative is to
recolour panthor_sched_suspend() to assume the lock is held (and take
the lock in panthor_sched_pre_reset), but I suspect that's a more ugly
change.

Thanks,
Steve

> 
> Signed-off-by: Boris Brezillon <[email protected]>
> ---
>  drivers/gpu/drm/panthor/panthor_sched.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_sched.c 
> b/drivers/gpu/drm/panthor/panthor_sched.c
> index fc0826db8f48..51a8d842a7a3 100644
> --- a/drivers/gpu/drm/panthor/panthor_sched.c
> +++ b/drivers/gpu/drm/panthor/panthor_sched.c
> @@ -2835,8 +2835,6 @@ void panthor_sched_pre_reset(struct panthor_device 
> *ptdev)
>        * new jobs while we're resetting.
>        */
>       for (i = 0; i < ARRAY_SIZE(sched->groups.runnable); i++) {
> -             /* All groups should be in the idle lists. */
> -             drm_WARN_ON(&ptdev->base, 
> !list_empty(&sched->groups.runnable[i]));
>               list_for_each_entry_safe(group, group_tmp, 
> &sched->groups.runnable[i], run_node)
>                       panthor_group_stop(group);
>       }

Reply via email to