On 22/10/2025 07:34, Philipp Stanner wrote:
In a past bug fix it was forgotten that entity access must be protected
by the entity lock. That's a data race and potentially UB.

Move the spin_unlock() to the appropriate position.

Cc: [email protected] # v5.13+
Fixes: ac4eb83ab255 ("drm/sched: select new rq even if there is only one v3")
Signed-off-by: Philipp Stanner <[email protected]>
---
  drivers/gpu/drm/scheduler/sched_entity.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index 5a4697f636f2..aa222166de58 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -552,10 +552,11 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
 		drm_sched_rq_remove_entity(entity->rq, entity);
 		entity->rq = rq;
 	}
-	spin_unlock(&entity->lock);
 
 	if (entity->num_sched_list == 1)
 		entity->sched_list = NULL;
+
+	spin_unlock(&entity->lock);
 }
 
 /**

Agreed, the unlock at the end is correct.

Reviewed-by: Tvrtko Ursulin <[email protected]>

Luckily only amdgpu calls drm_sched_entity_modify_sched(), and I am not sure whether a hypothetical multi-threaded submit could actually hit this or not.

One thread would have to clear the sched_list while another has already passed the !sched_list check at the beginning of the function, as well as the job_queue count and last_scheduled-completed guards. That would mean the job has to complete between the unlock and the clearing of the pointer.
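
To make the window concrete, here is a minimal userspace sketch of the ordering in question (plain pthreads, not the scheduler code itself; fake_entity, select_rq_buggy(), select_rq_fixed() and reader() are made-up names for illustration). The buggy variant mirrors the old code by clearing the pointer after dropping the lock, the fixed one mirrors the patch:

/* Illustration only -- compile with: gcc -pthread race_sketch.c */
#include <pthread.h>
#include <stdio.h>

struct fake_entity {
	pthread_mutex_t lock;
	void *sched_list;		/* stands in for entity->sched_list */
	unsigned int num_sched_list;
};

/* Old ordering: the pointer is cleared after the lock has been dropped. */
static void select_rq_buggy(struct fake_entity *e)
{
	pthread_mutex_lock(&e->lock);
	/* ... rq selection would happen here ... */
	pthread_mutex_unlock(&e->lock);

	/* race window: a concurrent reader can observe either value */
	if (e->num_sched_list == 1)
		e->sched_list = NULL;
}

/* Fixed ordering, mirroring the patch: clear before unlocking. */
static void select_rq_fixed(struct fake_entity *e)
{
	pthread_mutex_lock(&e->lock);
	/* ... rq selection would happen here ... */
	if (e->num_sched_list == 1)
		e->sched_list = NULL;
	pthread_mutex_unlock(&e->lock);
}

/* Stands in for a path that checks sched_list under the lock. */
static void *reader(void *arg)
{
	struct fake_entity *e = arg;

	pthread_mutex_lock(&e->lock);
	printf("sched_list is %s\n", e->sched_list ? "set" : "NULL");
	pthread_mutex_unlock(&e->lock);
	return NULL;
}

int main(void)
{
	static int dummy;
	struct fake_entity e = {
		.lock = PTHREAD_MUTEX_INITIALIZER,
		.sched_list = &dummy,
		.num_sched_list = 1,
	};
	pthread_t t;

	pthread_create(&t, NULL, reader, &e);
	select_rq_fixed(&e);	/* swap in select_rq_buggy() for the old ordering */
	pthread_join(&t, NULL);
	return 0;
}

With the fixed ordering a reader holding the lock can never see the list being cleared outside of it, which is exactly what moving the spin_unlock() achieves.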

Also, please take a look at this when you can:

https://lore.kernel.org/dri-devel/[email protected]/

That would also have fixed this issue by removing the clearing of entity->sched_list altogether, the latter being just a hack to work around confused container ownership rules.

Regards,

Tvrtko
