Re: VXLAN RCU error
On Fri, 24 Feb 2017 11:30:05 -0800, Cong Wang wrote: > On Thu, Feb 23, 2017 at 12:20 AM, Jakub Kicinskiwrote: > > > > Ugh. Looks like this may not work even if it makes the splat go away. > > synchronize_net() doesn't seem to wait for the _bh() flavor of RCU, so > > we need to add syncronize_rcu_bh() call before freeing the socket or do > > a normal rcu_read_lock()/unlock() on the fast path. Any RCU experts > > want to comment? :) > > But rcu_read_lock_bh() is enforced by upper layer, __dev_queue_xmit(), > I am afraid you can't change it. > > Does changing these kfree() to kfree_rcu() work too since there is no > kfree_rcu_bh()? If I read the code correctly the VXLAN/GENEVE does standard RCU dereferencing (i.e. non-_bh()) but as you point out __dev_queue_xmit() takes rcu_read_lock_bh() and not rcu_read_lock() so we either have to make VXLAN code take the correct lock itself or make VXLAN reconfiguration code do synchronize_rcu_bh() as well as synchronize_net() (most likely with synchronize_rcu_mult() but then we will have to open code expediting under rtnl...) I was waiting for some internal reviews but let me just post the code I have now implementing the first option.
Re: VXLAN RCU error
On Thu, Feb 23, 2017 at 12:20 AM, Jakub Kicinskiwrote: > > Ugh. Looks like this may not work even if it makes the splat go away. > synchronize_net() doesn't seem to wait for the _bh() flavor of RCU, so > we need to add syncronize_rcu_bh() call before freeing the socket or do > a normal rcu_read_lock()/unlock() on the fast path. Any RCU experts > want to comment? :) But rcu_read_lock_bh() is enforced by upper layer, __dev_queue_xmit(), I am afraid you can't change it. Does changing these kfree() to kfree_rcu() work too since there is no kfree_rcu_bh()?
Re: VXLAN RCU error
On Thu, Feb 23, 2017 at 12:20 AM, Jakub Kicinskiwrote: > On Wed, 22 Feb 2017 20:30:31 -0800, Jakub Kicinski wrote: >> On Wed, 22 Feb 2017 14:27:45 -0800, Jakub Kicinski wrote: >> > [ 1571.067134] === >> > [ 1571.071842] [ ERR: suspicious RCU usage. ] >> > [ 1571.076546] 4.10.0-debug-03232-g12d656af4e3d #1 Tainted: GW O >> > [ 1571.084166] --- >> > [ 1571.088867] ../drivers/net/vxlan.c:2111 suspicious >> > rcu_dereference_check() usage! >> > [ 1571.097286] >> > [ 1571.097286] other info that might help us debug this: >> > [ 1571.097286] >> > [ 1571.106305] >> > [ 1571.106305] rcu_scheduler_active = 2, debug_locks = 1 >> > [ 1571.113654] 3 locks held by ping/13826: >> > [ 1571.117968] #0: (sk_lock-AF_INET){+.+.+.}, at: [] >> > raw_sendmsg+0x14e2/0x2e40 >> > [ 1571.127758] #1: (rcu_read_lock_bh){..}, at: [] >> > ip_finish_output2+0x274/0x1390 >> > [ 1571.138135] #2: (rcu_read_lock_bh){..}, at: [] >> > __dev_queue_xmit+0x1ec/0x2750 >> >> >> diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c >> index 4e27c5b09600..8aa3e837cd6c 100644 >> --- a/drivers/net/vxlan.c >> +++ b/drivers/net/vxlan.c >> @@ -2109,7 +2109,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct >> net_device *dev, >> vxlan->cfg.port_max, true); >> >> if (dst->sa.sa_family == AF_INET) { >> - struct vxlan_sock *sock4 = rcu_dereference(vxlan->vn4_sock); >> + struct vxlan_sock *sock4 = >> rcu_dereference_bh(vxlan->vn4_sock); >> struct rtable *rt; >> __be16 df = 0; >> >> @@ -2148,7 +2148,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct >> net_device *dev, >> src_port, dst_port, xnet, !udp_sum); >> #if IS_ENABLED(CONFIG_IPV6) >> } else { >> - struct vxlan_sock *sock6 = rcu_dereference(vxlan->vn6_sock); >> + struct vxlan_sock *sock6 = >> rcu_dereference_bh(vxlan->vn6_sock); >> >> ndst = vxlan6_get_route(vxlan, dev, sock6, skb, >> rdst ? rdst->remote_ifindex : 0, tos, >> > > Ugh. Looks like this may not work even if it makes the splat go away. > synchronize_net() doesn't seem to wait for the _bh() flavor of RCU, so > we need to add syncronize_rcu_bh() call before freeing the socket or do > a normal rcu_read_lock()/unlock() on the fast path. Any RCU experts > want to comment? :) > I think both solutions would work. I prefer using rcu_read-lock in fast path. We already execute vxlan rcv path in rcu read lock so it would make it consistent with xmit path. > FWIW geneve will need similar fix, I presume. I agree.
Re: VXLAN RCU error
On Wed, 22 Feb 2017 20:30:31 -0800, Jakub Kicinski wrote: > On Wed, 22 Feb 2017 14:27:45 -0800, Jakub Kicinski wrote: > > [ 1571.067134] === > > [ 1571.071842] [ ERR: suspicious RCU usage. ] > > [ 1571.076546] 4.10.0-debug-03232-g12d656af4e3d #1 Tainted: GW O > > [ 1571.084166] --- > > [ 1571.088867] ../drivers/net/vxlan.c:2111 suspicious > > rcu_dereference_check() usage! > > [ 1571.097286] > > [ 1571.097286] other info that might help us debug this: > > [ 1571.097286] > > [ 1571.106305] > > [ 1571.106305] rcu_scheduler_active = 2, debug_locks = 1 > > [ 1571.113654] 3 locks held by ping/13826: > > [ 1571.117968] #0: (sk_lock-AF_INET){+.+.+.}, at: [] > > raw_sendmsg+0x14e2/0x2e40 > > [ 1571.127758] #1: (rcu_read_lock_bh){..}, at: [] > > ip_finish_output2+0x274/0x1390 > > [ 1571.138135] #2: (rcu_read_lock_bh){..}, at: [] > > __dev_queue_xmit+0x1ec/0x2750 > > > diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c > index 4e27c5b09600..8aa3e837cd6c 100644 > --- a/drivers/net/vxlan.c > +++ b/drivers/net/vxlan.c > @@ -2109,7 +2109,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct > net_device *dev, > vxlan->cfg.port_max, true); > > if (dst->sa.sa_family == AF_INET) { > - struct vxlan_sock *sock4 = rcu_dereference(vxlan->vn4_sock); > + struct vxlan_sock *sock4 = > rcu_dereference_bh(vxlan->vn4_sock); > struct rtable *rt; > __be16 df = 0; > > @@ -2148,7 +2148,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct > net_device *dev, > src_port, dst_port, xnet, !udp_sum); > #if IS_ENABLED(CONFIG_IPV6) > } else { > - struct vxlan_sock *sock6 = rcu_dereference(vxlan->vn6_sock); > + struct vxlan_sock *sock6 = > rcu_dereference_bh(vxlan->vn6_sock); > > ndst = vxlan6_get_route(vxlan, dev, sock6, skb, > rdst ? rdst->remote_ifindex : 0, tos, > Ugh. Looks like this may not work even if it makes the splat go away. synchronize_net() doesn't seem to wait for the _bh() flavor of RCU, so we need to add syncronize_rcu_bh() call before freeing the socket or do a normal rcu_read_lock()/unlock() on the fast path. Any RCU experts want to comment? :) FWIW geneve will need similar fix, I presume.
Re: VXLAN RCU error
On Wed, 22 Feb 2017 14:27:45 -0800, Jakub Kicinski wrote: > Hi Roopa! Ah, sorry, it seems like this splat may be coming all the way from c6fcc4fc5f8b ("vxlan: avoid using stale vxlan socket."). > I get this RCU error on net 12d656af4e3d2781b9b9f52538593e1717e7c979: > > [ 1571.067134] === > [ 1571.071842] [ ERR: suspicious RCU usage. ] > [ 1571.076546] 4.10.0-debug-03232-g12d656af4e3d #1 Tainted: GW O > [ 1571.084166] --- > [ 1571.088867] ../drivers/net/vxlan.c:2111 suspicious rcu_dereference_check() > usage! > [ 1571.097286] > [ 1571.097286] other info that might help us debug this: > [ 1571.097286] > [ 1571.106305] > [ 1571.106305] rcu_scheduler_active = 2, debug_locks = 1 > [ 1571.113654] 3 locks held by ping/13826: > [ 1571.117968] #0: (sk_lock-AF_INET){+.+.+.}, at: [] > raw_sendmsg+0x14e2/0x2e40 > [ 1571.127758] #1: (rcu_read_lock_bh){..}, at: [] > ip_finish_output2+0x274/0x1390 > [ 1571.138135] #2: (rcu_read_lock_bh){..}, at: [] > __dev_queue_xmit+0x1ec/0x2750 > [ 1571.148408] > [ 1571.148408] stack backtrace: > [ 1571.153326] CPU: 10 PID: 13826 Comm: ping Tainted: GW O > 4.10.0-debug-03232-g12d656af4e3d #1 > [ 1571.163877] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.3.4 > 11/08/2016 > [ 1571.172290] Call Trace: > [ 1571.175053] dump_stack+0xcd/0x134 > [ 1571.178881] ? _atomic_dec_and_lock+0xcc/0xcc > [ 1571.183782] ? print_lock+0xb2/0xb5 > [ 1571.187711] lockdep_rcu_suspicious+0x123/0x170 > [ 1571.192807] vxlan_xmit_one+0x1931/0x4270 [vxlan] > [ 1571.198126] ? encap_bypass_if_local+0x380/0x380 [vxlan] > [ 1571.204109] ? sched_clock+0x9/0x10 > [ 1571.208034] ? sched_clock_cpu+0x20/0x2c0 > [ 1571.212541] ? unwind_get_return_address+0x1b8/0x2b0 > [ 1571.218132] ? __lock_acquire+0x6d6/0x3160 > [ 1571.222740] vxlan_xmit+0x756/0x4f90 [vxlan] > [ 1571.227541] ? vxlan_xmit_one+0x4270/0x4270 [vxlan] > [ 1571.233014] ? netif_skb_features+0x2be/0xba0 > [ 1571.237919] dev_hard_start_xmit+0x1ab/0xa70 > [ 1571.242724] __dev_queue_xmit+0x137b/0x2750 > [ 1571.247425] ? __dev_queue_xmit+0x1ec/0x2750 > [ 1571.252228] ? netdev_pick_tx+0x330/0x330 > [ 1571.256735] ? debug_smp_processor_id+0x17/0x20 > [ 1571.261826] ? get_lock_stats+0x1d/0x160 > [ 1571.266241] ? mark_held_locks+0x105/0x280 > [ 1571.270850] ? memcpy+0x45/0x50 > [ 1571.274391] dev_queue_xmit+0x10/0x20 > [ 1571.278511] neigh_resolve_output+0x43e/0x7f0 > [ 1571.283405] ? ip_finish_output2+0x69d/0x1390 > [ 1571.288308] ip_finish_output2+0x69d/0x1390 > [ 1571.293008] ? ip_finish_output2+0x274/0x1390 > [ 1571.297909] ? ip_copy_metadata+0x7e0/0x7e0 > [ 1571.302610] ? get_lock_stats+0x1d/0x160 > [ 1571.307027] ip_finish_output+0x598/0xc50 > [ 1571.311537] ip_output+0x371/0x630 > [ 1571.315362] ? ip_output+0x1dc/0x630 > [ 1571.319383] ? ip_mc_output+0xe70/0xe70 > [ 1571.323694] ? kfree+0x372/0x5a0 > [ 1571.327325] ? mark_held_locks+0x105/0x280 > [ 1571.331933] ? __ip_make_skb+0xdd1/0x2200 > [ 1571.336457] ip_local_out+0x8f/0x180 > [ 1571.340480] ip_send_skb+0x44/0xf0 > [ 1571.344306] ip_push_pending_frames+0x5a/0x80 > [ 1571.349203] raw_sendmsg+0x164d/0x2e40 > [ 1571.353422] ? debug_check_no_locks_freed+0x350/0x350 > [ 1571.359099] ? dst_output+0x1b0/0x1b0 > [ 1571.363217] ? get_lock_stats+0x1d/0x160 > [ 1571.367640] ? __might_fault+0x199/0x230 > [ 1571.372052] ? kasan_check_write+0x14/0x20 > [ 1571.382002] ? _copy_from_user+0xb9/0x130 > [ 1571.386513] ? rw_copy_check_uvector+0x8d/0x490 > [ 1571.391609] ? import_iovec+0xae/0x5d0 > [ 1571.395826] ? push_pipe+0xd00/0xd00 > [ 1571.399847] ? kasan_check_write+0x14/0x20 > [ 1571.404450] ? _copy_from_user+0xb9/0x130 > [ 1571.408960] inet_sendmsg+0x19f/0x5f0 > [ 1571.413071] ? inet_recvmsg+0x980/0x980 > [ 1571.417386] sock_sendmsg+0xe2/0x170 > [ 1571.421408] ___sys_sendmsg+0x66e/0x960 > [ 1571.425726] ? mem_cgroup_commit_charge+0x144/0x2720 > [ 1571.431303] ? copy_msghdr_from_user+0x610/0x610 > [ 1571.436495] ? debug_smp_processor_id+0x17/0x20 > [ 1571.441584] ? get_lock_stats+0x1d/0x160 > [ 1571.445995] ? mem_cgroup_uncharge_swap+0x250/0x250 > [ 1571.451474] ? page_add_new_anon_rmap+0x173/0x3a0 > [ 1571.456762] ? handle_mm_fault+0x1589/0x3820 > [ 1571.461566] ? handle_mm_fault+0x1589/0x3820 > [ 1571.466362] ? handle_mm_fault+0x191/0x3820 > [ 1571.471070] ? __fdget+0x13/0x20 > [ 1571.474702] ? get_lock_stats+0x1d/0x160 > [ 1571.479116] __sys_sendmsg+0xc6/0x150 > [ 1571.483234] ? SyS_shutdown+0x1b0/0x1b0 > [ 1571.487551] ? __do_page_fault+0x556/0xe50 > [ 1571.492158] ? trace_hardirqs_on_thunk+0x1a/0x1c > [ 1571.497340] SyS_sendmsg+0x12/0x20 > [ 1571.501166] entry_SYSCALL_64_fastpath+0x23/0xc6 > [ 1571.506354] RIP: 0033:0x7fca2d0384a0 > [ 1571.510374] RSP: 002b:7ffd18d7fe88 EFLAGS: 0246 ORIG_RAX: > 002e > [ 1571.518886] RAX: ffda RBX: 0040 RCX: > 7fca2d0384a0 > [ 1571.526889] RDX:
VXLAN RCU error
Hi Roopa! I get this RCU error on net 12d656af4e3d2781b9b9f52538593e1717e7c979: [ 1571.067134] === [ 1571.071842] [ ERR: suspicious RCU usage. ] [ 1571.076546] 4.10.0-debug-03232-g12d656af4e3d #1 Tainted: GW O [ 1571.084166] --- [ 1571.088867] ../drivers/net/vxlan.c:2111 suspicious rcu_dereference_check() usage! [ 1571.097286] [ 1571.097286] other info that might help us debug this: [ 1571.097286] [ 1571.106305] [ 1571.106305] rcu_scheduler_active = 2, debug_locks = 1 [ 1571.113654] 3 locks held by ping/13826: [ 1571.117968] #0: (sk_lock-AF_INET){+.+.+.}, at: [] raw_sendmsg+0x14e2/0x2e40 [ 1571.127758] #1: (rcu_read_lock_bh){..}, at: [] ip_finish_output2+0x274/0x1390 [ 1571.138135] #2: (rcu_read_lock_bh){..}, at: [] __dev_queue_xmit+0x1ec/0x2750 [ 1571.148408] [ 1571.148408] stack backtrace: [ 1571.153326] CPU: 10 PID: 13826 Comm: ping Tainted: GW O 4.10.0-debug-03232-g12d656af4e3d #1 [ 1571.163877] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.3.4 11/08/2016 [ 1571.172290] Call Trace: [ 1571.175053] dump_stack+0xcd/0x134 [ 1571.178881] ? _atomic_dec_and_lock+0xcc/0xcc [ 1571.183782] ? print_lock+0xb2/0xb5 [ 1571.187711] lockdep_rcu_suspicious+0x123/0x170 [ 1571.192807] vxlan_xmit_one+0x1931/0x4270 [vxlan] [ 1571.198126] ? encap_bypass_if_local+0x380/0x380 [vxlan] [ 1571.204109] ? sched_clock+0x9/0x10 [ 1571.208034] ? sched_clock_cpu+0x20/0x2c0 [ 1571.212541] ? unwind_get_return_address+0x1b8/0x2b0 [ 1571.218132] ? __lock_acquire+0x6d6/0x3160 [ 1571.222740] vxlan_xmit+0x756/0x4f90 [vxlan] [ 1571.227541] ? vxlan_xmit_one+0x4270/0x4270 [vxlan] [ 1571.233014] ? netif_skb_features+0x2be/0xba0 [ 1571.237919] dev_hard_start_xmit+0x1ab/0xa70 [ 1571.242724] __dev_queue_xmit+0x137b/0x2750 [ 1571.247425] ? __dev_queue_xmit+0x1ec/0x2750 [ 1571.252228] ? netdev_pick_tx+0x330/0x330 [ 1571.256735] ? debug_smp_processor_id+0x17/0x20 [ 1571.261826] ? get_lock_stats+0x1d/0x160 [ 1571.266241] ? mark_held_locks+0x105/0x280 [ 1571.270850] ? memcpy+0x45/0x50 [ 1571.274391] dev_queue_xmit+0x10/0x20 [ 1571.278511] neigh_resolve_output+0x43e/0x7f0 [ 1571.283405] ? ip_finish_output2+0x69d/0x1390 [ 1571.288308] ip_finish_output2+0x69d/0x1390 [ 1571.293008] ? ip_finish_output2+0x274/0x1390 [ 1571.297909] ? ip_copy_metadata+0x7e0/0x7e0 [ 1571.302610] ? get_lock_stats+0x1d/0x160 [ 1571.307027] ip_finish_output+0x598/0xc50 [ 1571.311537] ip_output+0x371/0x630 [ 1571.315362] ? ip_output+0x1dc/0x630 [ 1571.319383] ? ip_mc_output+0xe70/0xe70 [ 1571.323694] ? kfree+0x372/0x5a0 [ 1571.327325] ? mark_held_locks+0x105/0x280 [ 1571.331933] ? __ip_make_skb+0xdd1/0x2200 [ 1571.336457] ip_local_out+0x8f/0x180 [ 1571.340480] ip_send_skb+0x44/0xf0 [ 1571.344306] ip_push_pending_frames+0x5a/0x80 [ 1571.349203] raw_sendmsg+0x164d/0x2e40 [ 1571.353422] ? debug_check_no_locks_freed+0x350/0x350 [ 1571.359099] ? dst_output+0x1b0/0x1b0 [ 1571.363217] ? get_lock_stats+0x1d/0x160 [ 1571.367640] ? __might_fault+0x199/0x230 [ 1571.372052] ? kasan_check_write+0x14/0x20 [ 1571.382002] ? _copy_from_user+0xb9/0x130 [ 1571.386513] ? rw_copy_check_uvector+0x8d/0x490 [ 1571.391609] ? import_iovec+0xae/0x5d0 [ 1571.395826] ? push_pipe+0xd00/0xd00 [ 1571.399847] ? kasan_check_write+0x14/0x20 [ 1571.404450] ? _copy_from_user+0xb9/0x130 [ 1571.408960] inet_sendmsg+0x19f/0x5f0 [ 1571.413071] ? inet_recvmsg+0x980/0x980 [ 1571.417386] sock_sendmsg+0xe2/0x170 [ 1571.421408] ___sys_sendmsg+0x66e/0x960 [ 1571.425726] ? mem_cgroup_commit_charge+0x144/0x2720 [ 1571.431303] ? copy_msghdr_from_user+0x610/0x610 [ 1571.436495] ? debug_smp_processor_id+0x17/0x20 [ 1571.441584] ? get_lock_stats+0x1d/0x160 [ 1571.445995] ? mem_cgroup_uncharge_swap+0x250/0x250 [ 1571.451474] ? page_add_new_anon_rmap+0x173/0x3a0 [ 1571.456762] ? handle_mm_fault+0x1589/0x3820 [ 1571.461566] ? handle_mm_fault+0x1589/0x3820 [ 1571.466362] ? handle_mm_fault+0x191/0x3820 [ 1571.471070] ? __fdget+0x13/0x20 [ 1571.474702] ? get_lock_stats+0x1d/0x160 [ 1571.479116] __sys_sendmsg+0xc6/0x150 [ 1571.483234] ? SyS_shutdown+0x1b0/0x1b0 [ 1571.487551] ? __do_page_fault+0x556/0xe50 [ 1571.492158] ? trace_hardirqs_on_thunk+0x1a/0x1c [ 1571.497340] SyS_sendmsg+0x12/0x20 [ 1571.501166] entry_SYSCALL_64_fastpath+0x23/0xc6 [ 1571.506354] RIP: 0033:0x7fca2d0384a0 [ 1571.510374] RSP: 002b:7ffd18d7fe88 EFLAGS: 0246 ORIG_RAX: 002e [ 1571.518886] RAX: ffda RBX: 0040 RCX: 7fca2d0384a0 [ 1571.526889] RDX: RSI: 0060a300 RDI: 0003 [ 1571.534892] RBP: 0046 R08: 0020 R09: 003e [ 1571.542897] R10: 7ffd18d7fc50 R11: 0246 R12: 00c0 [ 1571.550900] R13: 0004 R14: 7ffd18d81608 R15: 7ffd18d810b0 Some of Netronome's VXLAN tests are also failing but I need to dig a bit to see what's wrong