On Wed, 9 Oct 2024 15:47:23 +0800
"liwei (GF)" <[email protected]> wrote:

> On 2024/10/4 4:19, Steven Rostedt wrote:
> > On Tue, 24 Sep 2024 17:45:15 +0800
> > Wei Li <[email protected]> wrote:
> >   
> >> Another "hung task" error was reported during the test, and i figured out
> >> the deadlock scenario is as follows:
> >>
> >> T1 [BP]               | T2 [AP]                     | T3 [hwlatd/1]        
> >>           | T4
> >> work_for_cpu_fn()     | cpuhp_thread_fun()          | kthread_fn()         
> >>           | hwlat_hotplug_workfn()
> >>   _cpu_down()         |   stop_cpu_kthread()        |                      
> >>           |   mutex_lock(&hwlat_data.lock)
> >>     cpus_write_lock() |     kthread_stop(hwlatd/1)  |   
> >> mutex_lock(&hwlat_data.lock) |
> >>     __cpuhp_kick_ap() |       wait_for_completion() |                      
> >>           |   cpus_read_lock()

So, if we can make T3 not take the mutex_lock then that should be a
solution, right?

> >>
> >> It constitutes ABBA deadlock indirectly beAs it calls 
> >> msleep_interruptible() and 'break' if signal pending below, i choosed 
> >> 'break' here too.tween "cpu_hotplug_lock" and
> >> "hwlat_data.lock", make the mutex obtaining in kthread_fn() interruptible
> >> to fix this.
> >>
> >> Fixes: ba998f7d9531 ("trace/hwlat: Support hotplug operations")
> >> Signed-off-by: Wei Li <[email protected]>
> >> ---
> >>  kernel/trace/trace_hwlat.c | 3 ++-
> >>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/kernel/trace/trace_hwlat.c b/kernel/trace/trace_hwlat.c
> >> index 3bd6071441ad..4c228ccb8a38 100644
> >> --- a/kernel/trace/trace_hwlat.c
> >> +++ b/kernel/trace/trace_hwlat.c
> >> @@ -370,7 +370,8 @@ static int kthread_fn(void *data)
> >>            get_sample();
> >>            local_irq_enable();
> >>  
> >> -          mutex_lock(&hwlat_data.lock);
> >> +          if (mutex_lock_interruptible(&hwlat_data.lock))
> >> +                  break;  
> > 
> > So basically this requires as signal to break it out of the loop?
> > 
> > But if it receives a signal for any other reason, it breaks out of the loop
> > too. Which is not what we want. If anything, it should be:
> > 
> >             if (mutex_lock_interruptible(&hwlat_data.lock))
> >                     continue;  
> 
> As it calls msleep_interruptible() below and 'break' if signal pending, i
> choosed 'break' here too.
> 
> > But I still don't really like this solution, as it will still report a
> >     deadlock.
> > 
> > Is it possible to switch the cpu_read_lock() to be taken before the
> > hwlat_data.lock?  
> 
> It's a little hard to change the sequence of these two locks, we'll hold
> "cpu_hotplug_lock" for longer unnecessarily if we do that.
> 
> But maybe we can remove the "hwlat_data.lock" in kthread_fn(), let me try
> another modification.

Have you found something yet? Looking at the code we have:

                mutex_lock(&hwlat_data.lock);
                interval = hwlat_data.sample_window - hwlat_data.sample_width;
                mutex_unlock(&hwlat_data.lock);

Where the lock is only there to synchronize the calculation of the
interval. We could add a counter for when sample_window and sample_width
are updated, and we could simply do:

 again:
                counter = atomic_read(&hwlat_data.counter);
                smp_rmb();
                if (!(counter & 1)) {
                        new_interval = hwlat_data.sample_window - 
hwlat_data.sample_width;
                        smp_rmb();
                        if (counter == atomic_read(&hwlat_data.counter))
                                interval = new_interval;
                }

Then we could do something like:

        atomic_inc(&hwlat_data.counter);
        smp_wmb();
        /* update sample_window or sample_width */
        smp_wmb();
        atomic_inc(&hwlat_data.counter);

And then the interval will only be updated if the values are not being
updated. Otherwise it just keeps the previous value.

-- Steve

Reply via email to