Another "hung task" error was reported during the test, and i figured out
the deadlock scenario is as follows:
T1 [BP] | T2 [AP] | T3 [hwlatd/1]
| T4
work_for_cpu_fn() | cpuhp_thread_fun() | kthread_fn()
| hwlat_hotplug_workfn()
_cpu_down() | stop_cpu_kthread() |
| mutex_lock(&hwlat_data.lock)
cpus_write_lock() | kthread_stop(hwlatd/1) |
mutex_lock(&hwlat_data.lock) |
__cpuhp_kick_ap() | wait_for_completion() |
| cpus_read_lock()
It constitutes ABBA deadlock indirectly between "cpu_hotplug_lock" and
"hwlat_data.lock", make the mutex obtaining in kthread_fn() interruptible
to fix this.
Fixes: ba998f7d9531 ("trace/hwlat: Support hotplug operations")
Signed-off-by: Wei Li <[email protected]>
---
kernel/trace/trace_hwlat.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/trace/trace_hwlat.c b/kernel/trace/trace_hwlat.c
index 3bd6071441ad..4c228ccb8a38 100644
--- a/kernel/trace/trace_hwlat.c
+++ b/kernel/trace/trace_hwlat.c
@@ -370,7 +370,8 @@ static int kthread_fn(void *data)
get_sample();
local_irq_enable();
- mutex_lock(&hwlat_data.lock);
+ if (mutex_lock_interruptible(&hwlat_data.lock))
+ break;
interval = hwlat_data.sample_window - hwlat_data.sample_width;
mutex_unlock(&hwlat_data.lock);
--
2.25.1