On Wed, 2025-10-15 at 17:52 +0200, Christian König wrote:
> 
> 
> On 15.10.25 10:23, Philipp Stanner wrote:
> > On Tue, 2025-10-14 at 16:24 +0200, Christian König wrote:
> > > From: David Rosca <[email protected]>
> > > 
> > > The DRM scheduler tracks who last uses an entity and when that process
> > > is killed blocks all further submissions to that entity.
> > > 
> > > The problem is that we didn't tracked who initialy created an entity, so
> > 
> > s/tracked/track
> > 
> > > when an process accidentially leaked its file descriptor to a child and
> > 
> > s/an/a
> > 
> > > that child got killed we killed the parents entities.
> > 
> > s/parents/parent's
> > 
> > > 
> > > Avoid that and instead initialize the entities last user on entity
> > > creation.
> > > 
> > > Signed-off-by: David Rosca <[email protected]>
> > > Signed-off-by: Christian König <[email protected]>
> > > CC: [email protected]
> > > ---
> > >  drivers/gpu/drm/scheduler/sched_entity.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
> > > b/drivers/gpu/drm/scheduler/sched_entity.c
> > > index 5a4697f636f2..3e2f83dc3f24 100644
> > > --- a/drivers/gpu/drm/scheduler/sched_entity.c
> > > +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> > > @@ -70,6 +70,7 @@ int drm_sched_entity_init(struct drm_sched_entity 
> > > *entity,
> > >   entity->guilty = guilty;
> > >   entity->num_sched_list = num_sched_list;
> > >   entity->priority = priority;
> > > + entity->last_user = current->group_leader;
> > >   /*
> > >    * It's perfectly valid to initialize an entity without having a valid
> > >    * scheduler attached. It's just not valid to use the scheduler before 
> > > it
> > > @@ -302,7 +303,7 @@ long drm_sched_entity_flush(struct drm_sched_entity 
> > > *entity, long timeout)
> > >  
> > >   /* For a killed process disallow further enqueueing of jobs. */
> > >   last_user = cmpxchg(&entity->last_user, current->group_leader, NULL);
> > > - if ((!last_user || last_user == current->group_leader) &&
> > > + if (last_user == current->group_leader &&
> > 
> > It's not super clear from the commit message why the NULL check is
> > being removed. Previously entities could have been killed if there was
> > no last user. That's not desired anymore. Why?
> 
> The reason that we don't need the NULL check any more is because we now don't 
> encounter the NULL pointer any more.
> 
> In other words the pointer is now always initialized, even when the entity 
> was never used.
> 
> I've added another sentence to the commit message, but I'm not sure how to 
> better describe that. Is that sufficient? If not suggestions welcome.

Sounds good enough, thank you.

P.

> 
> Thanks,
> Christian.
> 
> 
> > 
> > 
> > P.
> > 
> > >       (current->flags & PF_EXITING) && (current->exit_code == SIGKILL))
> > >           drm_sched_entity_kill(entity);
> > >  
> > 
> 

Reply via email to