Ok. Unfortunately everything in the discoverer is executing in BH, both 
timeouts and message receptions (meaning that even the other call to 
tipc_net_finalize() in the code is problematic.)
So unless we come up with something really smart I can only see the work queue 
solution as the way forward.

///jon

> -----Original Message-----
> From: Ying Xue <ying....@windriver.com>
> Sent: 6-Nov-18 06:18
> To: Jon Maloy <jon.ma...@ericsson.com>
> Cc: tipc-discussion@lists.sourceforge.net; tipc-...@dektech.com.au
> Subject: Re: Deadlock warning
> 
> Hi Jon,
> 
> You are right. We cannot call rhashtable_walk_enter() in BH context. But its
> reason is not related to the lock in tipc_discoverers struct.
> Instead, its root cause is that rhashtable_rehash_table() takes ht->lock
> without disabling BH.
> 
> Please image the scenario below:
> 
>    Thread context:
> =====================
> worker_thread()
> rht_deferred_worker()
> spin_lock(&ht->lock);
>        -------->interrupted by irq
> 
>                                        BH context
>                                    =================
>                                     tipc_disc_timeout()
>                                     tipc_net_finalize()
>                                     tipc_sk_reinit()
>                                     rhashtable_walk_enter()
>                                     spin_lock(&ht->lock); --->Deadlock
> 
> 
> But as rht_deferred_worker() is common code of rhashtable, we cannot
> change its code, in my opinion, the only proper approach is that we should
> avoid to call tipc_net_finalize() in BH. Otherwise, as you mentioned, we have
> to call tipc_net_finalize() in a work queue.
> 
> Thanks,
> Ying
> 
> On 11/06/2018 08:59 AM, Jon Maloy wrote:
> > Hi Ying,
> > It seems the mail server I am using to send out patches is down at the
> moment, so I will just paste my own solution into the mail browser. See
> below.
> > This patch works, and the warning disappears, but we are still violating the
> restriction that rhashtable_walk_enter() should not be called in interrupt
> context.
> > The alternative is to launch the net_finalize() call into the system work
> queue, but that is far from any 'elegant' solution in my view.
> >
> > ///jon
> >
> >
> >>From c18f89cb99c9d5f9a950b031acebfaf0751fa488 Mon Sep 17 00:00:00
> 2001
> > From: Jon Maloy <jon.ma...@ericsson.com>
> > Date: Mon, 5 Nov 2018 01:30:04 +0100
> > Subject: [net  1/1] tipc: fix lockdep warning
> >
> > We get the following warning:
> >
> > [   47.926140] 32-bit node address hash set to 2010a0a
> > [   47.927202]
> > [   47.927433] ================================
> > [   47.928050] WARNING: inconsistent lock state
> > [   47.928661] 4.19.0+ #37 Tainted: G            E
> > [   47.929346] --------------------------------
> > [   47.929954] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
> > [   47.930116] swapper/3/0 [HC0[0]:SC1[3]:HE1:SE0] takes:
> > [   47.930116] 00000000af8bc31e (&(&ht->lock)->rlock){+.?.}, at:
> rhashtable_walk_enter+0x36/0xb0
> > [   47.930116] {SOFTIRQ-ON-W} state was registered at:
> > [   47.930116]   _raw_spin_lock+0x29/0x60
> > [   47.930116]   rht_deferred_worker+0x556/0x810
> > [   47.930116]   process_one_work+0x1f5/0x540
> > [   47.930116]   worker_thread+0x64/0x3e0
> > [   47.930116]   kthread+0x112/0x150
> > [   47.930116]   ret_from_fork+0x3a/0x50
> > [   47.930116] irq event stamp: 14044
> > [   47.930116] hardirqs last  enabled at (14044): [<ffffffff9a07fbba>]
> __local_bh_enable_ip+0x7a/0xf0
> > [   47.938117] hardirqs last disabled at (14043): [<ffffffff9a07fb81>]
> __local_bh_enable_ip+0x41/0xf0
> > [   47.938117] softirqs last  enabled at (14028): [<ffffffff9a0803ee>]
> irq_enter+0x5e/0x60
> > [   47.938117] softirqs last disabled at (14029): [<ffffffff9a0804a5>]
> irq_exit+0xb5/0xc0
> > [   47.938117]
> > [   47.938117] other info that might help us debug this:
> > [   47.938117]  Possible unsafe locking scenario:
> > [   47.938117]
> > [   47.938117]        CPU0
> > [   47.938117]        ----
> > [   47.938117]   lock(&(&ht->lock)->rlock);
> > [   47.938117]   <Interrupt>
> > [   47.938117]     lock(&(&ht->lock)->rlock);
> > [   47.938117]
> > [   47.938117]  *** DEADLOCK ***
> > [   47.938117]
> > [   47.938117] 2 locks held by swapper/3/0:
> > [   47.938117]  #0: 0000000062c64f90 ((&d->timer)){+.-.}, at:
> call_timer_fn+0x5/0x280
> > [   47.938117]  #1: 00000000ee39619c (&(&d->lock)->rlock){+.-.}, at:
> tipc_disc_timeout+0xc8/0x540 [tipc]
> > [   47.938117]
> > [   47.938117] stack backtrace:
> > [   47.938117] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G            E     
> > 4.19.0+
> #37
> > [   47.938117] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> > [   47.938117] Call Trace:
> > [   47.938117]  <IRQ>
> > [   47.938117]  dump_stack+0x5e/0x8b
> > [   47.938117]  print_usage_bug+0x1ed/0x1ff
> > [   47.938117]  mark_lock+0x5b5/0x630
> > [   47.938117]  __lock_acquire+0x4c0/0x18f0
> > [   47.938117]  ? lock_acquire+0xa6/0x180
> > [   47.938117]  lock_acquire+0xa6/0x180
> > [   47.938117]  ? rhashtable_walk_enter+0x36/0xb0
> > [   47.938117]  _raw_spin_lock+0x29/0x60
> > [   47.938117]  ? rhashtable_walk_enter+0x36/0xb0
> > [   47.938117]  rhashtable_walk_enter+0x36/0xb0
> > [   47.938117]  tipc_sk_reinit+0xb0/0x410 [tipc]
> > [   47.938117]  ? mark_held_locks+0x6f/0x90
> > [   47.938117]  ? __local_bh_enable_ip+0x7a/0xf0
> > [   47.938117]  ? lockdep_hardirqs_on+0x20/0x1a0
> > [   47.938117]  tipc_net_finalize+0xbf/0x180 [tipc]
> > [   47.938117]  tipc_disc_timeout+0x509/0x540 [tipc]
> > [   47.938117]  ? call_timer_fn+0x5/0x280
> > [   47.938117]  ? tipc_disc_msg_xmit.isra.19+0xa0/0xa0 [tipc]
> > [   47.938117]  ? tipc_disc_msg_xmit.isra.19+0xa0/0xa0 [tipc]
> > [   47.938117]  call_timer_fn+0xa1/0x280
> > [   47.938117]  ? tipc_disc_msg_xmit.isra.19+0xa0/0xa0 [tipc]
> > [   47.938117]  run_timer_softirq+0x1f2/0x4d0
> > [   47.938117]  __do_softirq+0xfc/0x413
> > [   47.938117]  irq_exit+0xb5/0xc0
> > [   47.938117]  smp_apic_timer_interrupt+0xac/0x210
> > [   47.938117]  apic_timer_interrupt+0xf/0x20
> > [   47.938117]  </IRQ>
> > [   47.938117] RIP: 0010:default_idle+0x1c/0x140
> > [   47.938117] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 
> > 00 41
> 54 55 53 65 8b 2d d8 2b 74 65 0f 1f 44 00 00 e8 c6 2c 8b ff fb f4 <65> 8b 2d 
> c5 2b
> 74 65 0f 1f 44 00 00 5b 5d 41 5c c3 65 8b 05 b4 2b
> > [   47.938117] RSP: 0018:ffffaf6ac0207ec8 EFLAGS: 00000206 ORIG_RAX:
> ffffffffffffff13
> > [   47.938117] RAX: ffff8f5b3735e200 RBX: 0000000000000003 RCX:
> 0000000000000001
> > [   47.938117] RDX: 0000000000000001 RSI: 0000000000000001 RDI:
> ffff8f5b3735e200
> > [   47.938117] RBP: 0000000000000003 R08: 0000000000000001 R09:
> 0000000000000000
> > [   47.938117] R10: 0000000000000000 R11: 0000000000000000 R12:
> 0000000000000000
> > [   47.938117] R13: 0000000000000000 R14: ffff8f5b3735e200 R15:
> ffff8f5b3735e200
> > [   47.938117]  ? default_idle+0x1a/0x140
> > [   47.938117]  do_idle+0x1bc/0x280
> > [   47.938117]  cpu_startup_entry+0x19/0x20
> > [   47.938117]  start_secondary+0x187/0x1c0
> > [   47.938117]  secondary_startup_64+0xa4/0xb0
> >
> > The reason seems to be that tipc_net_finalize()->tipc_sk_reinit() is
> > calling the function rhashtable_walk_enter() within a timer interrupt
> > with dicoverer::lock set. It is safe to release this lock before we
> > call tipc_net_finalize(), so that is what we do.
> >
> > Signed-off-by: Jon Maloy <jon.ma...@ericsson.com>
> > ---
> >  net/tipc/discover.c | 17 +++++++++--------
> >  1 file changed, 9 insertions(+), 8 deletions(-)
> >
> > diff --git a/net/tipc/discover.c b/net/tipc/discover.c index
> > 2830709..5fff91c 100644
> > --- a/net/tipc/discover.c
> > +++ b/net/tipc/discover.c
> > @@ -167,6 +167,7 @@ static bool tipc_disc_addr_trial_msg(struct
> tipc_discoverer *d,
> >         /* Apply trial address if we just left trial period */
> >         if (!trial && !self) {
> >                 tipc_net_finalize(net, tn->trial_addr);
> > +               msg_set_prevnode(buf_msg(d->skb), tn->trial_addr);
> >                 msg_set_type(buf_msg(d->skb), DSC_REQ_MSG);
> >         }
> >
> > @@ -300,14 +301,12 @@ static void tipc_disc_timeout(struct timer_list *t)
> >                 goto exit;
> >         }
> >
> > -       /* Trial period over ? */
> > -       if (!time_before(jiffies, tn->addr_trial_end)) {
> > -               /* Did we just leave it ? */
> > -               if (!tipc_own_addr(net))
> > -                       tipc_net_finalize(net, tn->trial_addr);
> > -
> > -               msg_set_type(buf_msg(d->skb), DSC_REQ_MSG);
> > -               msg_set_prevnode(buf_msg(d->skb), tipc_own_addr(net));
> > +       /* Did we just leave trial period ? */
> > +       if (!time_before(jiffies, tn->addr_trial_end) && 
> > !tipc_own_addr(net))
> {
> > +               mod_timer(&d->timer, jiffies + TIPC_DISC_INIT);
> > +               spin_unlock_bh(&d->lock);
> > +               tipc_net_finalize(net, tn->trial_addr);
> > +               return;
> >         }
> >
> >         /* Adjust timeout interval according to discovery phase */ @@
> > -319,6 +318,8 @@ static void tipc_disc_timeout(struct timer_list *t)
> >                         d->timer_intv = TIPC_DISC_SLOW;
> >                 else if (!d->num_nodes && d->timer_intv > TIPC_DISC_FAST)
> >                         d->timer_intv = TIPC_DISC_FAST;
> > +               msg_set_type(buf_msg(d->skb), DSC_REQ_MSG);
> > +               msg_set_prevnode(buf_msg(d->skb), tn->trial_addr);
> >         }
> >
> >         mod_timer(&d->timer, jiffies + d->timer_intv);
> > --
> > 2.1.4
> >
> >
> >
> >
> >> -----Original Message-----
> >> From: Ying Xue <ying....@windriver.com>
> >> Sent: 5-Nov-18 06:34
> >> To: Jon Maloy <jon.ma...@ericsson.com>
> >> Subject: Re: Deadlock warning
> >>
> >> FYI: I have found its root cause. Now I am figuring out how to a bit
> >> elegantly solve it.
> >>
> >> On 10/17/2018 10:36 AM, Jon Maloy wrote:
> >>> Hi Ying,
> >>>
> >>> I sometimes get the following deadlock warning. As I understand it
> >>> the reason is that we are calling rhashtable_walk_enter() in a tmer,
> >>> i.e., in an SW interrupt, something that is not permitted. Do you
> >>> agree with this interpretation? Since you have worked more with
> >>> these hash tables than I have, can you see any easy solution to it.
> >>> I would hate to introduce a work queue to solve this.
> >>>
> >>>
> >>>
> >>> ///jon
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> [346769.617370] ================================
> >>>
> >>> [346769.618331] WARNING: inconsistent lock state
> >>>
> >>> [346769.619187] 4.19.0-rc6+ #27 Tainted: G            E
> >>>
> >>> [346769.619651] --------------------------------
> >>>
> >>> [346769.619651] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W}
> usage.
> >>>
> >>> [346769.619651] swapper/3/0 [HC0[0]:SC1[3]:HE1:SE0] takes:
> >>>
> >>> [346769.619651] 000000009e3eefe1 (&(&ht->lock)->rlock){+.?.}, at:
> >>> rhashtable_walk_enter+0x36/0xb0
> >>>
> >>> [346769.619651] {SOFTIRQ-ON-W} state was registered at:
> >>>
> >>> [346769.619651]   _raw_spin_lock+0x29/0x60
> >>>
> >>> [346769.619651]   rht_deferred_worker+0x556/0x810
> >>>
> >>> [346769.619651]   process_one_work+0x1f5/0x540
> >>>
> >>> [346769.619651]   worker_thread+0x64/0x3e0
> >>>
> >>> [346769.619651]   kthread+0x112/0x150
> >>>
> >>> [346769.619651]   ret_from_fork+0x3a/0x50
> >>>
> >>> [346769.619651] irq event stamp: 8581694
> >>>
> >>> [346769.619651] hardirqs last  enabled at (8581694):
> >>> [<ffffffff9107f290>] __local_bh_enable_ip+0x80           /0x100
> >>>
> >>> [346769.619651] hardirqs last disabled at (8581693):
> >>> [<ffffffff9107f257>] __local_bh_enable_ip+0x47           /0x100
> >>>
> >>> [346769.619651] softirqs last  enabled at (8581678):
> >>> [<ffffffff9107fa1e>] irq_enter+0x5e/0x60
> >>>
> >>> [346769.619651] softirqs last disabled at (8581679):
> >>> [<ffffffff9107fadb>] irq_exit+0xbb/0xc0
> >>>
> >>> [346769.619651]
> >>>
> >>> [346769.619651] other info that might help us debug this:
> >>>
> >>> [346769.619651]  Possible unsafe locking scenario:
> >>>
> >>> [346769.619651]
> >>>
> >>> [346769.619651]        CPU0
> >>>
> >>> [346769.619651]        ----
> >>>
> >>> [346769.619651]   lock(&(&ht->lock)->rlock);
> >>>
> >>> [346769.619651]   <Interrupt>
> >>>
> >>> [346769.619651]     lock(&(&ht->lock)->rlock);
> >>>
> >>> [346769.619651]
> >>>
> >>> [346769.619651]  *** DEADLOCK ***
> >>>
> >>> [346769.619651]
> >>>
> >>> [346769.619651] 2 locks held by swapper/3/0:
> >>>
> >>> [346769.619651]  #0: 00000000d9e59d74 ((&d->timer)){+.-.}, at:
> >>> call_timer_fn+0x5/0x280
> >>>
> >>> [346769.619651]  #1: 000000000bf452d8 (&(&d->lock)->rlock){+.-.}, at:
> >>> tipc_disc_timeout+0xc8/0x540            [tipc]
> >>>
> >>> [346769.619651]
> >>>
> >>> [346769.619651] stack backtrace:
> >>>
> >>> [346769.619651] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G E
> >>> 4.19.0-rc6+ #27
> >>>
> >>> [346769.619651] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> >>>
> >>> [346769.619651] Call Trace:
> >>>
> >>> [346769.619651]  <IRQ>
> >>>
> >>> [346769.619651]  dump_stack+0x78/0xb3
> >>>
> >>> [346769.619651]  print_usage_bug+0x1ed/0x20b
> >>>
> >>> [346769.619651]  mark_lock+0x5ca/0x640
> >>>
> >>> [346769.619651]  __lock_acquire+0x41f/0x1b10
> >>>
> >>> [346769.619651]  ? sched_clock_local+0x12/0x80
> >>>
> >>> [346769.619651]  ? lock_acquire+0xb3/0x190
> >>>
> >>> [346769.619651]  lock_acquire+0xb3/0x190
> >>>
> >>> [346769.619651]  ? rhashtable_walk_enter+0x36/0xb0
> >>>
> >>> [346769.651609]  _raw_spin_lock+0x29/0x60
> >>>
> >>> [346769.651609]  ? rhashtable_walk_enter+0x36/0xb0
> >>>
> >>> [346769.651609]  rhashtable_walk_enter+0x36/0xb0
> >>>
> >>> [346769.651609]  tipc_sk_reinit+0xb0/0x410 [tipc]
> >>>
> >>> [346769.651609]  ? mark_held_locks+0x6f/0x90
> >>>
> >>> [346769.651609]  ? __local_bh_enable_ip+0x80/0x100
> >>>
> >>> [346769.651609]  ? debug_show_all_locks+0x170/0x190
> >>>
> >>> [346769.651609]  tipc_net_finalize+0xbf/0x180 [tipc]
> >>>
> >>> [346769.651609]  tipc_disc_timeout+0x509/0x540 [tipc]
> >>>
> >

_______________________________________________
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion

Reply via email to