On 2020-11-17 13:45:03 [+0100], Peter Zijlstra wrote:
> On Tue, Nov 10, 2020 at 12:38:47PM +0100, Sebastian Andrzej Siewior wrote:
> > With enabled threaded interrupts the nouveau driver reported the
> > following:
> > | Chain exists of:
> > |   &mm->mmap_lock#2 --> &device->mutex --> &cpuset_rwsem
> > |
> > |  Possible unsafe locking scenario:
> > |
> > |        CPU0                    CPU1
> > |        ----                    ----
> > |   lock(&cpuset_rwsem);
> > |                                lock(&device->mutex);
> > |                                lock(&cpuset_rwsem);
> > |   lock(&mm->mmap_lock#2);
> > 
> > The device->mutex is nvkm_device::mutex.
> > 
> > Breaking the lock chain at `cpuset_rwsem' is probably the easiest thing
> > to do.
> > Move the priority reset to the start of the newly created thread.
> > 
> > Fixes: 710da3c8ea7df ("sched/core: Prevent race condition between cpuset and __sched_setscheduler()")
> > Reported-by: Mike Galbraith <[email protected]>
> > Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
> > Link: https://lkml.kernel.org/r/[email protected]
> 
> Moo... yes this is certainly the easiest solution, because nouveau is a
> horrible rat's nest. But when I spoke to Greg KH about this, he suggested
> nouveau ought to be fixed.
> 
> Ben, I got terminally lost when trying to untangle nouveau init. Is there
> any chance this can be fixed to not hold that nvkm_device::mutex thing
> while doing request_irq()?

Ben, did you have a chance to peek at this?
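
In case it helps whoever picks this up, the reordering Peter is asking about
would look roughly like the sketch below. This only illustrates the pattern
(drop the device mutex before registering the interrupt); all of the demo_*
names are made up and this is not the actual nouveau init code:

static irqreturn_t demo_irq_handler(int irq, void *data)
{
	return IRQ_HANDLED;
}

static int demo_device_init(struct demo_device *ddev)
{
	int ret;

	mutex_lock(&ddev->mutex);
	/* setup that actually needs the mutex */
	ret = demo_device_setup(ddev);
	mutex_unlock(&ddev->mutex);
	if (ret)
		return ret;

	/*
	 * With (force-)threaded interrupts, request_irq() spawns the handler
	 * kthread. Calling it without ddev->mutex held keeps cpuset_rwsem,
	 * which thread setup takes via sched_setscheduler_nocheck(), out from
	 * under the device mutex.
	 */
	return request_irq(ddev->irq, demo_irq_handler, 0, "demo", ddev);
}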

> > ---
> >  kernel/kthread.c | 16 ++++++++--------
> >  1 file changed, 8 insertions(+), 8 deletions(-)
> > 
> > diff --git a/kernel/kthread.c b/kernel/kthread.c
> > index 933a625621b8d..4a31127c6efbf 100644
> > --- a/kernel/kthread.c
> > +++ b/kernel/kthread.c
> > @@ -243,6 +243,7 @@ EXPORT_SYMBOL_GPL(kthread_parkme);
> >  
> >  static int kthread(void *_create)
> >  {
> > +   static const struct sched_param param = { .sched_priority = 0 };
> >     /* Copy data: it's on kthread's stack */
> >     struct kthread_create_info *create = _create;
> >     int (*threadfn)(void *data) = create->threadfn;
> > @@ -273,6 +274,13 @@ static int kthread(void *_create)
> >     init_completion(&self->parked);
> >     current->vfork_done = &self->exited;
> >  
> > +   /*
> > +    * The new thread inherited kthreadd's priority and CPU mask. Reset
> > +    * back to default in case they have been changed.
> > +    */
> > +   sched_setscheduler_nocheck(current, SCHED_NORMAL, &param);
> > +   set_cpus_allowed_ptr(current, housekeeping_cpumask(HK_FLAG_KTHREAD));
> > +
> >     /* OK, tell user we're spawned, wait for stop or wakeup */
> >     __set_current_state(TASK_UNINTERRUPTIBLE);
> >     create->result = current;
> > @@ -370,7 +378,6 @@ struct task_struct *__kthread_create_on_node(int (*threadfn)(void *data),
> >     }
> >     task = create->result;
> >     if (!IS_ERR(task)) {
> > -           static const struct sched_param param = { .sched_priority = 0 };
> >             char name[TASK_COMM_LEN];
> >  
> >             /*
> > @@ -379,13 +386,6 @@ struct task_struct *__kthread_create_on_node(int (*threadfn)(void *data),
> >              */
> >             vsnprintf(name, sizeof(name), namefmt, args);
> >             set_task_comm(task, name);
> > -           /*
> > -            * root may have changed our (kthreadd's) priority or CPU mask.
> > -            * The kernel thread should not inherit these properties.
> > -            */
> > -           sched_setscheduler_nocheck(task, SCHED_NORMAL, &param);
> > -           set_cpus_allowed_ptr(task,
> > -                                housekeeping_cpumask(HK_FLAG_KTHREAD));
> >     }
> >     kfree(create);
> >     return task;

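One ordering note on the patch itself (my reading, as far as I can see from
kthread()): the reset now happens before create->result is published and the
creator is woken, so callers that hand their thread a non-default policy right
after kthread_create() should be unaffected, e.g. something like:

	struct task_struct *t;

	t = kthread_create(demo_thread_fn, ddev, "demo/%d", id);
	if (IS_ERR(t))
		return PTR_ERR(t);
	sched_set_fifo(t);	/* applied after the in-thread reset */
	wake_up_process(t);

(demo_thread_fn, ddev and id are placeholders, not code from this patch.)
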
Sebastian
