On Wed, 26 Feb 2025 03:42:53 +0000
Ran Xiaokai <[email protected]> wrote:

> >> @@ -2105,7 +2104,12 @@ static void osnoise_hotplug_workfn(struct work_struct *dummy)
> >>      if (!cpumask_test_cpu(cpu, &osnoise_cpumask))
> >>          return;
> >>  
> >> -    start_kthread(cpu);
> >> +    if (start_kthread(cpu)) {
> >> +        cpus_read_unlock();
> >> +        stop_per_cpu_kthreads();
> >> +        return;  
> >
> >If all you want to do is to unlock before calling stop_per_cpu_kthreads(),
> >then this should simply be:
> >
> >   if (start_kthread(cpu)) {
> >       cpus_read_unlock();
> >       stop_per_cpu_kthreads();
> >       cpus_read_lock(); // The guard() above will unlock this
> >       return;
> >   }  
> 
> This is the deadlock scenario:
> start_per_cpu_kthreads()
>   cpus_read_lock();                  // first lock call
>   start_kthread(cpu)
>     ... kthread_run_on_cpu() fails:
>     if (IS_ERR(kthread)) {
>       stop_per_cpu_kthreads(); {
>         cpus_read_lock();      // second lock call, causes the AA deadlock scenario
>       }
>     }
>   stop_per_cpu_kthreads();
> 
> Besides, stop_per_cpu_kthreads() is called both in start_per_cpu_kthreads() and
> start_kthread(), which is unnecessary.
> 
> So the fix should be inside start_kthread()?
> How about this?

No! You misunderstood what I wrote above.

Instead of removing the guard, keep it!

Do everything the same, but instead of having the error path of:

[..]
    if (start_kthread(cpu)) {
        cpus_read_unlock();
        stop_per_cpu_kthreads();
        return;
    }
    cpus_read_unlock();
 }

which requires removing the guard, just do:

    if (start_kthread(cpu)) {
        cpus_read_unlock();
        stop_per_cpu_kthreads();
        cpus_read_lock(); // The guard() will unlock this
    }
 }

I'm just saying not to replace the guard with open-coded
cpus_read_lock()/cpus_read_unlock() calls.
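
To illustrate the pattern outside the kernel, here is a minimal userspace
sketch. It assumes GCC or Clang, since it relies on the cleanup variable
attribute that the kernel's guard() is built on. Every fake_* name is
hypothetical and only stands in for the real function; the "lock" is a plain
depth counter that asserts on a second acquisition, which models the AA
deadlock rather than reproducing it. This is not the osnoise code itself.

```c
#include <assert.h>

/* Hypothetical stand-in for the CPU hotplug read lock: a depth counter. */
static int lock_depth;
static int stop_calls;

static void fake_cpus_read_lock(void)
{
    /* Taking the lock while already holding it models the AA deadlock. */
    assert(lock_depth == 0);
    lock_depth++;
}

static void fake_cpus_read_unlock(void)
{
    assert(lock_depth == 1);
    lock_depth--;
}

/* Cleanup handler: runs when the guard variable goes out of scope. */
static void fake_guard_release(int *unused)
{
    (void)unused;
    fake_cpus_read_unlock();
}

/* Simplified imitation of a scope-based lock guard (an assumption). */
#define FAKE_GUARD_CPUS_READ_LOCK() \
    int fake_guard __attribute__((cleanup(fake_guard_release))) = \
        (fake_cpus_read_lock(), 0)

/* Like the real function in this thread, it takes the lock itself. */
static void fake_stop_per_cpu_kthreads(void)
{
    fake_cpus_read_lock();
    stop_calls++;
    fake_cpus_read_unlock();
}

/* Returns nonzero on failure; the failure is injected by the caller. */
static int fake_start_kthread(int fail)
{
    return fail ? -1 : 0;
}

static void fake_hotplug_workfn(int fail)
{
    FAKE_GUARD_CPUS_READ_LOCK();

    if (fake_start_kthread(fail)) {
        fake_cpus_read_unlock();
        fake_stop_per_cpu_kthreads();
        fake_cpus_read_lock(); /* the guard will unlock this */
    }
}
```

Calling fake_hotplug_workfn(1) exercises the error path: the lock is dropped
before fake_stop_per_cpu_kthreads() takes it, then retaken so that the
guard's unlock at scope exit stays balanced. Without the retake, the guard
would unlock a lock that is no longer held.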

-- Steve
