On Wed, 2025-10-15 at 17:52 +0200, Christian König wrote: > > > On 15.10.25 10:23, Philipp Stanner wrote: > > On Tue, 2025-10-14 at 16:24 +0200, Christian König wrote: > > > From: David Rosca <[email protected]> > > > > > > The DRM scheduler tracks who last uses an entity and when that process > > > is killed blocks all further submissions to that entity. > > > > > > The problem is that we didn't tracked who initialy created an entity, so > > > > s/tracked/track > > > > > when an process accidentially leaked its file descriptor to a child and > > > > s/an/a > > > > > that child got killed we killed the parents entities. > > > > s/parents/parent's > > > > > > > > Avoid that and instead initialize the entities last user on entity > > > creation. > > > > > > Signed-off-by: David Rosca <[email protected]> > > > Signed-off-by: Christian König <[email protected]> > > > CC: [email protected] > > > --- > > > drivers/gpu/drm/scheduler/sched_entity.c | 3 ++- > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c > > > b/drivers/gpu/drm/scheduler/sched_entity.c > > > index 5a4697f636f2..3e2f83dc3f24 100644 > > > --- a/drivers/gpu/drm/scheduler/sched_entity.c > > > +++ b/drivers/gpu/drm/scheduler/sched_entity.c > > > @@ -70,6 +70,7 @@ int drm_sched_entity_init(struct drm_sched_entity > > > *entity, > > > entity->guilty = guilty; > > > entity->num_sched_list = num_sched_list; > > > entity->priority = priority; > > > + entity->last_user = current->group_leader; > > > /* > > > * It's perfectly valid to initialize an entity without having a valid > > > * scheduler attached. It's just not valid to use the scheduler before > > > it > > > @@ -302,7 +303,7 @@ long drm_sched_entity_flush(struct drm_sched_entity > > > *entity, long timeout) > > > > > > /* For a killed process disallow further enqueueing of jobs. */ > > > last_user = cmpxchg(&entity->last_user, current->group_leader, NULL); > > > - if ((!last_user || last_user == current->group_leader) && > > > + if (last_user == current->group_leader && > > > > It's not super clear from the commit message why the NULL check is > > being removed. Previously entities could have been killed if there was > > no last user. That's not desired anymore. Why? > > The reason that we don't need the NULL check any more is because we now don't > encounter the NULL pointer any more. > > In other words the pointer is now always initialized, even when the entity > was never used. > > I've added another sentence to the commit message, but I'm not sure how to > better describe that. Is that sufficient? If not suggestions welcome.
Sounds good enough, thank you. P. > > Thanks, > Christian. > > > > > > > > P. > > > > > (current->flags & PF_EXITING) && (current->exit_code == SIGKILL)) > > > drm_sched_entity_kill(entity); > > > > > >
