Ok. Unfortunately everything in the discoverer is executing in BH, both timeouts and message receptions (meaning that even the other call to tipc_net_finalize() in the code is problematic.) So unless we come up with something really smart I can only see the work queue solution as the way forward.
///jon > -----Original Message----- > From: Ying Xue <ying....@windriver.com> > Sent: 6-Nov-18 06:18 > To: Jon Maloy <jon.ma...@ericsson.com> > Cc: tipc-discussion@lists.sourceforge.net; tipc-...@dektech.com.au > Subject: Re: Deadlock warning > > Hi Jon, > > You are right. We cannot call rhashtable_walk_enter() in BH context. But its > reason is not related to the lock in tipc_discoverers struct. > Instead, its root cause is that rhashtable_rehash_table() takes ht->lock > without disabling BH. > > Please image the scenario below: > > Thread context: > ===================== > worker_thread() > rht_deferred_worker() > spin_lock(&ht->lock); > -------->interrupted by irq > > BH context > ================= > tipc_disc_timeout() > tipc_net_finalize() > tipc_sk_reinit() > rhashtable_walk_enter() > spin_lock(&ht->lock); --->Deadlock > > > But as rht_deferred_worker() is common code of rhashtable, we cannot > change its code, in my opinion, the only proper approach is that we should > avoid to call tipc_net_finalize() in BH. Otherwise, as you mentioned, we have > to call tipc_net_finalize() in a work queue. > > Thanks, > Ying > > On 11/06/2018 08:59 AM, Jon Maloy wrote: > > Hi Ying, > > It seems the mail server I am using to send out patches is down at the > moment, so I will just paste my own solution into the mail browser. See > below. > > This patch works, and the warning disappears, but we are still violating the > restriction that rhashtable_walk_enter() should not be called in interrupt > context. > > The alternative is to launch the net_finalize() call into the system work > queue, but that is far from any 'elegant' solution in my view. > > > > ///jon > > > > > >>From c18f89cb99c9d5f9a950b031acebfaf0751fa488 Mon Sep 17 00:00:00 > 2001 > > From: Jon Maloy <jon.ma...@ericsson.com> > > Date: Mon, 5 Nov 2018 01:30:04 +0100 > > Subject: [net 1/1] tipc: fix lockdep warning > > > > We get the following warning: > > > > [ 47.926140] 32-bit node address hash set to 2010a0a > > [ 47.927202] > > [ 47.927433] ================================ > > [ 47.928050] WARNING: inconsistent lock state > > [ 47.928661] 4.19.0+ #37 Tainted: G E > > [ 47.929346] -------------------------------- > > [ 47.929954] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. > > [ 47.930116] swapper/3/0 [HC0[0]:SC1[3]:HE1:SE0] takes: > > [ 47.930116] 00000000af8bc31e (&(&ht->lock)->rlock){+.?.}, at: > rhashtable_walk_enter+0x36/0xb0 > > [ 47.930116] {SOFTIRQ-ON-W} state was registered at: > > [ 47.930116] _raw_spin_lock+0x29/0x60 > > [ 47.930116] rht_deferred_worker+0x556/0x810 > > [ 47.930116] process_one_work+0x1f5/0x540 > > [ 47.930116] worker_thread+0x64/0x3e0 > > [ 47.930116] kthread+0x112/0x150 > > [ 47.930116] ret_from_fork+0x3a/0x50 > > [ 47.930116] irq event stamp: 14044 > > [ 47.930116] hardirqs last enabled at (14044): [<ffffffff9a07fbba>] > __local_bh_enable_ip+0x7a/0xf0 > > [ 47.938117] hardirqs last disabled at (14043): [<ffffffff9a07fb81>] > __local_bh_enable_ip+0x41/0xf0 > > [ 47.938117] softirqs last enabled at (14028): [<ffffffff9a0803ee>] > irq_enter+0x5e/0x60 > > [ 47.938117] softirqs last disabled at (14029): [<ffffffff9a0804a5>] > irq_exit+0xb5/0xc0 > > [ 47.938117] > > [ 47.938117] other info that might help us debug this: > > [ 47.938117] Possible unsafe locking scenario: > > [ 47.938117] > > [ 47.938117] CPU0 > > [ 47.938117] ---- > > [ 47.938117] lock(&(&ht->lock)->rlock); > > [ 47.938117] <Interrupt> > > [ 47.938117] lock(&(&ht->lock)->rlock); > > [ 47.938117] > > [ 47.938117] *** DEADLOCK *** > > [ 47.938117] > > [ 47.938117] 2 locks held by swapper/3/0: > > [ 47.938117] #0: 0000000062c64f90 ((&d->timer)){+.-.}, at: > call_timer_fn+0x5/0x280 > > [ 47.938117] #1: 00000000ee39619c (&(&d->lock)->rlock){+.-.}, at: > tipc_disc_timeout+0xc8/0x540 [tipc] > > [ 47.938117] > > [ 47.938117] stack backtrace: > > [ 47.938117] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G E > > 4.19.0+ > #37 > > [ 47.938117] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 > > [ 47.938117] Call Trace: > > [ 47.938117] <IRQ> > > [ 47.938117] dump_stack+0x5e/0x8b > > [ 47.938117] print_usage_bug+0x1ed/0x1ff > > [ 47.938117] mark_lock+0x5b5/0x630 > > [ 47.938117] __lock_acquire+0x4c0/0x18f0 > > [ 47.938117] ? lock_acquire+0xa6/0x180 > > [ 47.938117] lock_acquire+0xa6/0x180 > > [ 47.938117] ? rhashtable_walk_enter+0x36/0xb0 > > [ 47.938117] _raw_spin_lock+0x29/0x60 > > [ 47.938117] ? rhashtable_walk_enter+0x36/0xb0 > > [ 47.938117] rhashtable_walk_enter+0x36/0xb0 > > [ 47.938117] tipc_sk_reinit+0xb0/0x410 [tipc] > > [ 47.938117] ? mark_held_locks+0x6f/0x90 > > [ 47.938117] ? __local_bh_enable_ip+0x7a/0xf0 > > [ 47.938117] ? lockdep_hardirqs_on+0x20/0x1a0 > > [ 47.938117] tipc_net_finalize+0xbf/0x180 [tipc] > > [ 47.938117] tipc_disc_timeout+0x509/0x540 [tipc] > > [ 47.938117] ? call_timer_fn+0x5/0x280 > > [ 47.938117] ? tipc_disc_msg_xmit.isra.19+0xa0/0xa0 [tipc] > > [ 47.938117] ? tipc_disc_msg_xmit.isra.19+0xa0/0xa0 [tipc] > > [ 47.938117] call_timer_fn+0xa1/0x280 > > [ 47.938117] ? tipc_disc_msg_xmit.isra.19+0xa0/0xa0 [tipc] > > [ 47.938117] run_timer_softirq+0x1f2/0x4d0 > > [ 47.938117] __do_softirq+0xfc/0x413 > > [ 47.938117] irq_exit+0xb5/0xc0 > > [ 47.938117] smp_apic_timer_interrupt+0xac/0x210 > > [ 47.938117] apic_timer_interrupt+0xf/0x20 > > [ 47.938117] </IRQ> > > [ 47.938117] RIP: 0010:default_idle+0x1c/0x140 > > [ 47.938117] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 > > 00 41 > 54 55 53 65 8b 2d d8 2b 74 65 0f 1f 44 00 00 e8 c6 2c 8b ff fb f4 <65> 8b 2d > c5 2b > 74 65 0f 1f 44 00 00 5b 5d 41 5c c3 65 8b 05 b4 2b > > [ 47.938117] RSP: 0018:ffffaf6ac0207ec8 EFLAGS: 00000206 ORIG_RAX: > ffffffffffffff13 > > [ 47.938117] RAX: ffff8f5b3735e200 RBX: 0000000000000003 RCX: > 0000000000000001 > > [ 47.938117] RDX: 0000000000000001 RSI: 0000000000000001 RDI: > ffff8f5b3735e200 > > [ 47.938117] RBP: 0000000000000003 R08: 0000000000000001 R09: > 0000000000000000 > > [ 47.938117] R10: 0000000000000000 R11: 0000000000000000 R12: > 0000000000000000 > > [ 47.938117] R13: 0000000000000000 R14: ffff8f5b3735e200 R15: > ffff8f5b3735e200 > > [ 47.938117] ? default_idle+0x1a/0x140 > > [ 47.938117] do_idle+0x1bc/0x280 > > [ 47.938117] cpu_startup_entry+0x19/0x20 > > [ 47.938117] start_secondary+0x187/0x1c0 > > [ 47.938117] secondary_startup_64+0xa4/0xb0 > > > > The reason seems to be that tipc_net_finalize()->tipc_sk_reinit() is > > calling the function rhashtable_walk_enter() within a timer interrupt > > with dicoverer::lock set. It is safe to release this lock before we > > call tipc_net_finalize(), so that is what we do. > > > > Signed-off-by: Jon Maloy <jon.ma...@ericsson.com> > > --- > > net/tipc/discover.c | 17 +++++++++-------- > > 1 file changed, 9 insertions(+), 8 deletions(-) > > > > diff --git a/net/tipc/discover.c b/net/tipc/discover.c index > > 2830709..5fff91c 100644 > > --- a/net/tipc/discover.c > > +++ b/net/tipc/discover.c > > @@ -167,6 +167,7 @@ static bool tipc_disc_addr_trial_msg(struct > tipc_discoverer *d, > > /* Apply trial address if we just left trial period */ > > if (!trial && !self) { > > tipc_net_finalize(net, tn->trial_addr); > > + msg_set_prevnode(buf_msg(d->skb), tn->trial_addr); > > msg_set_type(buf_msg(d->skb), DSC_REQ_MSG); > > } > > > > @@ -300,14 +301,12 @@ static void tipc_disc_timeout(struct timer_list *t) > > goto exit; > > } > > > > - /* Trial period over ? */ > > - if (!time_before(jiffies, tn->addr_trial_end)) { > > - /* Did we just leave it ? */ > > - if (!tipc_own_addr(net)) > > - tipc_net_finalize(net, tn->trial_addr); > > - > > - msg_set_type(buf_msg(d->skb), DSC_REQ_MSG); > > - msg_set_prevnode(buf_msg(d->skb), tipc_own_addr(net)); > > + /* Did we just leave trial period ? */ > > + if (!time_before(jiffies, tn->addr_trial_end) && > > !tipc_own_addr(net)) > { > > + mod_timer(&d->timer, jiffies + TIPC_DISC_INIT); > > + spin_unlock_bh(&d->lock); > > + tipc_net_finalize(net, tn->trial_addr); > > + return; > > } > > > > /* Adjust timeout interval according to discovery phase */ @@ > > -319,6 +318,8 @@ static void tipc_disc_timeout(struct timer_list *t) > > d->timer_intv = TIPC_DISC_SLOW; > > else if (!d->num_nodes && d->timer_intv > TIPC_DISC_FAST) > > d->timer_intv = TIPC_DISC_FAST; > > + msg_set_type(buf_msg(d->skb), DSC_REQ_MSG); > > + msg_set_prevnode(buf_msg(d->skb), tn->trial_addr); > > } > > > > mod_timer(&d->timer, jiffies + d->timer_intv); > > -- > > 2.1.4 > > > > > > > > > >> -----Original Message----- > >> From: Ying Xue <ying....@windriver.com> > >> Sent: 5-Nov-18 06:34 > >> To: Jon Maloy <jon.ma...@ericsson.com> > >> Subject: Re: Deadlock warning > >> > >> FYI: I have found its root cause. Now I am figuring out how to a bit > >> elegantly solve it. > >> > >> On 10/17/2018 10:36 AM, Jon Maloy wrote: > >>> Hi Ying, > >>> > >>> I sometimes get the following deadlock warning. As I understand it > >>> the reason is that we are calling rhashtable_walk_enter() in a tmer, > >>> i.e., in an SW interrupt, something that is not permitted. Do you > >>> agree with this interpretation? Since you have worked more with > >>> these hash tables than I have, can you see any easy solution to it. > >>> I would hate to introduce a work queue to solve this. > >>> > >>> > >>> > >>> ///jon > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> [346769.617370] ================================ > >>> > >>> [346769.618331] WARNING: inconsistent lock state > >>> > >>> [346769.619187] 4.19.0-rc6+ #27 Tainted: G E > >>> > >>> [346769.619651] -------------------------------- > >>> > >>> [346769.619651] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} > usage. > >>> > >>> [346769.619651] swapper/3/0 [HC0[0]:SC1[3]:HE1:SE0] takes: > >>> > >>> [346769.619651] 000000009e3eefe1 (&(&ht->lock)->rlock){+.?.}, at: > >>> rhashtable_walk_enter+0x36/0xb0 > >>> > >>> [346769.619651] {SOFTIRQ-ON-W} state was registered at: > >>> > >>> [346769.619651] _raw_spin_lock+0x29/0x60 > >>> > >>> [346769.619651] rht_deferred_worker+0x556/0x810 > >>> > >>> [346769.619651] process_one_work+0x1f5/0x540 > >>> > >>> [346769.619651] worker_thread+0x64/0x3e0 > >>> > >>> [346769.619651] kthread+0x112/0x150 > >>> > >>> [346769.619651] ret_from_fork+0x3a/0x50 > >>> > >>> [346769.619651] irq event stamp: 8581694 > >>> > >>> [346769.619651] hardirqs last enabled at (8581694): > >>> [<ffffffff9107f290>] __local_bh_enable_ip+0x80 /0x100 > >>> > >>> [346769.619651] hardirqs last disabled at (8581693): > >>> [<ffffffff9107f257>] __local_bh_enable_ip+0x47 /0x100 > >>> > >>> [346769.619651] softirqs last enabled at (8581678): > >>> [<ffffffff9107fa1e>] irq_enter+0x5e/0x60 > >>> > >>> [346769.619651] softirqs last disabled at (8581679): > >>> [<ffffffff9107fadb>] irq_exit+0xbb/0xc0 > >>> > >>> [346769.619651] > >>> > >>> [346769.619651] other info that might help us debug this: > >>> > >>> [346769.619651] Possible unsafe locking scenario: > >>> > >>> [346769.619651] > >>> > >>> [346769.619651] CPU0 > >>> > >>> [346769.619651] ---- > >>> > >>> [346769.619651] lock(&(&ht->lock)->rlock); > >>> > >>> [346769.619651] <Interrupt> > >>> > >>> [346769.619651] lock(&(&ht->lock)->rlock); > >>> > >>> [346769.619651] > >>> > >>> [346769.619651] *** DEADLOCK *** > >>> > >>> [346769.619651] > >>> > >>> [346769.619651] 2 locks held by swapper/3/0: > >>> > >>> [346769.619651] #0: 00000000d9e59d74 ((&d->timer)){+.-.}, at: > >>> call_timer_fn+0x5/0x280 > >>> > >>> [346769.619651] #1: 000000000bf452d8 (&(&d->lock)->rlock){+.-.}, at: > >>> tipc_disc_timeout+0xc8/0x540 [tipc] > >>> > >>> [346769.619651] > >>> > >>> [346769.619651] stack backtrace: > >>> > >>> [346769.619651] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G E > >>> 4.19.0-rc6+ #27 > >>> > >>> [346769.619651] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 > >>> > >>> [346769.619651] Call Trace: > >>> > >>> [346769.619651] <IRQ> > >>> > >>> [346769.619651] dump_stack+0x78/0xb3 > >>> > >>> [346769.619651] print_usage_bug+0x1ed/0x20b > >>> > >>> [346769.619651] mark_lock+0x5ca/0x640 > >>> > >>> [346769.619651] __lock_acquire+0x41f/0x1b10 > >>> > >>> [346769.619651] ? sched_clock_local+0x12/0x80 > >>> > >>> [346769.619651] ? lock_acquire+0xb3/0x190 > >>> > >>> [346769.619651] lock_acquire+0xb3/0x190 > >>> > >>> [346769.619651] ? rhashtable_walk_enter+0x36/0xb0 > >>> > >>> [346769.651609] _raw_spin_lock+0x29/0x60 > >>> > >>> [346769.651609] ? rhashtable_walk_enter+0x36/0xb0 > >>> > >>> [346769.651609] rhashtable_walk_enter+0x36/0xb0 > >>> > >>> [346769.651609] tipc_sk_reinit+0xb0/0x410 [tipc] > >>> > >>> [346769.651609] ? mark_held_locks+0x6f/0x90 > >>> > >>> [346769.651609] ? __local_bh_enable_ip+0x80/0x100 > >>> > >>> [346769.651609] ? debug_show_all_locks+0x170/0x190 > >>> > >>> [346769.651609] tipc_net_finalize+0xbf/0x180 [tipc] > >>> > >>> [346769.651609] tipc_disc_timeout+0x509/0x540 [tipc] > >>> > > _______________________________________________ tipc-discussion mailing list tipc-discussion@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/tipc-discussion