Re: [RFC] lockdep: Put graph lock/unlock under lock_recursion protection

2020-11-13 Thread Peter Zijlstra
On Fri, Nov 13, 2020 at 07:05:03PM +0800, Boqun Feng wrote:
> A warning was hit when running xfstests/generic/068 in a Hyper-V guest:
> 
> [...] [ cut here ]
> [...] DEBUG_LOCKS_WARN_ON(lockdep_hardirqs_enabled())
> [...] WARNING: CPU: 2 PID: 1350 at kernel/locking/lockdep.c:5280 check_flags.part.0+0x165/0x170
> [...] ...
> [...] Workqueue: events pwq_unbound_release_workfn
> [...] RIP: 0010:check_flags.part.0+0x165/0x170
> [...] ...
> [...] Call Trace:
> [...]  lock_is_held_type+0x72/0x150
> [...]  ? lock_acquire+0x16e/0x4a0
> [...]  rcu_read_lock_sched_held+0x3f/0x80
> [...]  __send_ipi_one+0x14d/0x1b0
> [...]  hv_send_ipi+0x12/0x30
> [...]  __pv_queued_spin_unlock_slowpath+0xd1/0x110
> [...]  __raw_callee_save___pv_queued_spin_unlock_slowpath+0x11/0x20
> [...]  .slowpath+0x9/0xe
> [...]  lockdep_unregister_key+0x128/0x180
> [...]  pwq_unbound_release_workfn+0xbb/0xf0
> [...]  process_one_work+0x227/0x5c0
> [...]  worker_thread+0x55/0x3c0
> [...]  ? process_one_work+0x5c0/0x5c0
> [...]  kthread+0x153/0x170
> [...]  ? __kthread_bind_mask+0x60/0x60
> [...]  ret_from_fork+0x1f/0x30
> 
> The cause of the problem is that we have the call chain
> lockdep_unregister_key() -> lockdep_unlock() -> arch_spin_unlock() ->
> __pv_queued_spin_unlock_slowpath() -> pv_kick() ->
> __send_ipi_one() -> trace_hyperv_send_ipi_one().
> 
> Although this particular warning is triggered because Hyper-V has a
> trace point in the ipi sending path, in general arch_spin_unlock() may
> call another function that has a trace point in it, so put the
> arch_spin_lock() and arch_spin_unlock() under lock_recursion protection
> to fix this problem and avoid similar problems.
> 
> Signed-off-by: Boqun Feng 

Works for me, thanks!


[RFC] lockdep: Put graph lock/unlock under lock_recursion protection

2020-11-13 Thread Boqun Feng
A warning was hit when running xfstests/generic/068 in a Hyper-V guest:

[...] [ cut here ]
[...] DEBUG_LOCKS_WARN_ON(lockdep_hardirqs_enabled())
[...] WARNING: CPU: 2 PID: 1350 at kernel/locking/lockdep.c:5280 check_flags.part.0+0x165/0x170
[...] ...
[...] Workqueue: events pwq_unbound_release_workfn
[...] RIP: 0010:check_flags.part.0+0x165/0x170
[...] ...
[...] Call Trace:
[...]  lock_is_held_type+0x72/0x150
[...]  ? lock_acquire+0x16e/0x4a0
[...]  rcu_read_lock_sched_held+0x3f/0x80
[...]  __send_ipi_one+0x14d/0x1b0
[...]  hv_send_ipi+0x12/0x30
[...]  __pv_queued_spin_unlock_slowpath+0xd1/0x110
[...]  __raw_callee_save___pv_queued_spin_unlock_slowpath+0x11/0x20
[...]  .slowpath+0x9/0xe
[...]  lockdep_unregister_key+0x128/0x180
[...]  pwq_unbound_release_workfn+0xbb/0xf0
[...]  process_one_work+0x227/0x5c0
[...]  worker_thread+0x55/0x3c0
[...]  ? process_one_work+0x5c0/0x5c0
[...]  kthread+0x153/0x170
[...]  ? __kthread_bind_mask+0x60/0x60
[...]  ret_from_fork+0x1f/0x30

The cause of the problem is that we have the call chain
lockdep_unregister_key() -> lockdep_unlock() -> arch_spin_unlock() ->
__pv_queued_spin_unlock_slowpath() -> pv_kick() ->
__send_ipi_one() -> trace_hyperv_send_ipi_one().

Although this particular warning is triggered because Hyper-V has a
trace point in the ipi sending path, in general arch_spin_unlock() may
call another function that has a trace point in it, so put the
arch_spin_lock() and arch_spin_unlock() under lock_recursion protection
to fix this problem and avoid similar problems.

Signed-off-by: Boqun Feng 
Cc: "K. Y. Srinivasan" 
Cc: Haiyang Zhang 
Cc: Stephen Hemminger 
Cc: Wei Liu 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: x...@kernel.org
Cc: "H. Peter Anvin" 
---
 kernel/locking/lockdep.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index b71ad8d9f1c9..b98e44f88c6a 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -108,19 +108,21 @@ static inline void lockdep_lock(void)
 {
DEBUG_LOCKS_WARN_ON(!irqs_disabled());
 
+   __this_cpu_inc(lockdep_recursion);
arch_spin_lock(&__lock);
__owner = current;
-   __this_cpu_inc(lockdep_recursion);
 }
 
 static inline void lockdep_unlock(void)
 {
+   DEBUG_LOCKS_WARN_ON(!irqs_disabled());
+
if (debug_locks && DEBUG_LOCKS_WARN_ON(__owner != current))
return;
 
-   __this_cpu_dec(lockdep_recursion);
__owner = NULL;
arch_spin_unlock(&__lock);
+   __this_cpu_dec(lockdep_recursion);
 }
 
 static inline bool lockdep_assert_locked(void)
-- 
2.29.2
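
For illustration, the following is a minimal standalone userspace sketch of the
same pattern, not the kernel code itself; the names graph_lock, lock_recursion,
checker_lock(), checker_unlock() and instrumented_event() are made up for this
sketch. The idea mirrors the patch: raise a per-context recursion counter before
taking the internal lock and lower it only after the lock is released, so that
any instrumentation hook fired from inside the lock/unlock path sees the counter
set and returns early instead of re-entering the checker.

/*
 * Userspace sketch of the recursion-guard pattern: bump a per-thread
 * counter *before* taking the internal lock and drop it only *after*
 * releasing the lock, so an instrumentation hook fired from inside the
 * lock/unlock path bails out early instead of recursing.
 */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t graph_lock = PTHREAD_MUTEX_INITIALIZER;
static _Thread_local int lock_recursion;	/* plays the role of the per-CPU counter */

/* A hook that may be called from anywhere, including the unlock path. */
static void instrumented_event(const char *what)
{
	if (lock_recursion)
		return;			/* already inside the checker: ignore */
	printf("checking event: %s\n", what);
}

static void checker_lock(void)
{
	lock_recursion++;		/* raise the guard before taking the lock ... */
	pthread_mutex_lock(&graph_lock);
}

static void checker_unlock(void)
{
	pthread_mutex_unlock(&graph_lock);
	lock_recursion--;		/* ... and lower it only after the unlock */
}

int main(void)
{
	checker_lock();
	instrumented_event("fired inside lock/unlock: ignored");
	checker_unlock();
	instrumented_event("fired outside: handled");
	return 0;
}

Built with a C11 compiler and -pthread, the first instrumented_event() call is
silently ignored while the second one is handled, which is the behaviour the
patch relies on to keep trace points reachable from arch_spin_unlock() from
recursing back into lockdep.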