Re: VXLAN RCU error

2017-02-24 Thread Jakub Kicinski
On Fri, 24 Feb 2017 11:30:05 -0800, Cong Wang wrote:
> On Thu, Feb 23, 2017 at 12:20 AM, Jakub Kicinski  wrote:
> >
> > Ugh.  Looks like this may not work even if it makes the splat go away.
> > synchronize_net() doesn't seem to wait for the _bh() flavor of RCU, so
> > we need to add syncronize_rcu_bh() call before freeing the socket or do
> > a normal rcu_read_lock()/unlock() on the fast path.  Any RCU experts
> > want to comment? :)  
> 
> But rcu_read_lock_bh() is enforced by upper layer, __dev_queue_xmit(),
> I am afraid you can't change it.
> 
> Does changing these kfree() to kfree_rcu() work too since there is no
> kfree_rcu_bh()?

If I read the code correctly the VXLAN/GENEVE does standard RCU
dereferencing (i.e. non-_bh()) but as you point out __dev_queue_xmit()
takes rcu_read_lock_bh() and not rcu_read_lock() so we either have to
make VXLAN code take the correct lock itself or make VXLAN
reconfiguration code do synchronize_rcu_bh() as well as
synchronize_net() (most likely with synchronize_rcu_mult() but then
we will have to open code expediting under rtnl...)  

I was waiting for some internal reviews but let me just post the code I
have now implementing the first option.


Re: VXLAN RCU error

2017-02-24 Thread Cong Wang
On Thu, Feb 23, 2017 at 12:20 AM, Jakub Kicinski  wrote:
>
> Ugh.  Looks like this may not work even if it makes the splat go away.
> synchronize_net() doesn't seem to wait for the _bh() flavor of RCU, so
> we need to add syncronize_rcu_bh() call before freeing the socket or do
> a normal rcu_read_lock()/unlock() on the fast path.  Any RCU experts
> want to comment? :)

But rcu_read_lock_bh() is enforced by upper layer, __dev_queue_xmit(),
I am afraid you can't change it.

Does changing these kfree() to kfree_rcu() work too since there is no
kfree_rcu_bh()?


Re: VXLAN RCU error

2017-02-23 Thread Pravin Shelar
On Thu, Feb 23, 2017 at 12:20 AM, Jakub Kicinski  wrote:
> On Wed, 22 Feb 2017 20:30:31 -0800, Jakub Kicinski wrote:
>> On Wed, 22 Feb 2017 14:27:45 -0800, Jakub Kicinski wrote:
>> > [ 1571.067134] ===
>> > [ 1571.071842] [ ERR: suspicious RCU usage.  ]
>> > [ 1571.076546] 4.10.0-debug-03232-g12d656af4e3d #1 Tainted: GW  O
>> > [ 1571.084166] ---
>> > [ 1571.088867] ../drivers/net/vxlan.c:2111 suspicious 
>> > rcu_dereference_check() usage!
>> > [ 1571.097286]
>> > [ 1571.097286] other info that might help us debug this:
>> > [ 1571.097286]
>> > [ 1571.106305]
>> > [ 1571.106305] rcu_scheduler_active = 2, debug_locks = 1
>> > [ 1571.113654] 3 locks held by ping/13826:
>> > [ 1571.117968]  #0:  (sk_lock-AF_INET){+.+.+.}, at: [] 
>> > raw_sendmsg+0x14e2/0x2e40
>> > [ 1571.127758]  #1:  (rcu_read_lock_bh){..}, at: [] 
>> > ip_finish_output2+0x274/0x1390
>> > [ 1571.138135]  #2:  (rcu_read_lock_bh){..}, at: [] 
>> > __dev_queue_xmit+0x1ec/0x2750
>> 
>>
>> diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
>> index 4e27c5b09600..8aa3e837cd6c 100644
>> --- a/drivers/net/vxlan.c
>> +++ b/drivers/net/vxlan.c
>> @@ -2109,7 +2109,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct 
>> net_device *dev,
>>  vxlan->cfg.port_max, true);
>>
>> if (dst->sa.sa_family == AF_INET) {
>> -   struct vxlan_sock *sock4 = rcu_dereference(vxlan->vn4_sock);
>> +   struct vxlan_sock *sock4 = 
>> rcu_dereference_bh(vxlan->vn4_sock);
>> struct rtable *rt;
>> __be16 df = 0;
>>
>> @@ -2148,7 +2148,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct 
>> net_device *dev,
>> src_port, dst_port, xnet, !udp_sum);
>>  #if IS_ENABLED(CONFIG_IPV6)
>> } else {
>> -   struct vxlan_sock *sock6 = rcu_dereference(vxlan->vn6_sock);
>> +   struct vxlan_sock *sock6 = 
>> rcu_dereference_bh(vxlan->vn6_sock);
>>
>> ndst = vxlan6_get_route(vxlan, dev, sock6, skb,
>> rdst ? rdst->remote_ifindex : 0, tos,
>>
>
> Ugh.  Looks like this may not work even if it makes the splat go away.
> synchronize_net() doesn't seem to wait for the _bh() flavor of RCU, so
> we need to add syncronize_rcu_bh() call before freeing the socket or do
> a normal rcu_read_lock()/unlock() on the fast path.  Any RCU experts
> want to comment? :)
>
I think both solutions would work. I prefer using rcu_read-lock in
fast path. We already execute vxlan rcv path in rcu read lock so it
would make it consistent with xmit path.

> FWIW geneve will need similar fix, I presume.

I agree.


Re: VXLAN RCU error

2017-02-23 Thread Jakub Kicinski
On Wed, 22 Feb 2017 20:30:31 -0800, Jakub Kicinski wrote:
> On Wed, 22 Feb 2017 14:27:45 -0800, Jakub Kicinski wrote:
> > [ 1571.067134] ===
> > [ 1571.071842] [ ERR: suspicious RCU usage.  ]
> > [ 1571.076546] 4.10.0-debug-03232-g12d656af4e3d #1 Tainted: GW  O   
> > [ 1571.084166] ---
> > [ 1571.088867] ../drivers/net/vxlan.c:2111 suspicious 
> > rcu_dereference_check() usage!
> > [ 1571.097286] 
> > [ 1571.097286] other info that might help us debug this:
> > [ 1571.097286] 
> > [ 1571.106305] 
> > [ 1571.106305] rcu_scheduler_active = 2, debug_locks = 1
> > [ 1571.113654] 3 locks held by ping/13826:
> > [ 1571.117968]  #0:  (sk_lock-AF_INET){+.+.+.}, at: [] 
> > raw_sendmsg+0x14e2/0x2e40
> > [ 1571.127758]  #1:  (rcu_read_lock_bh){..}, at: [] 
> > ip_finish_output2+0x274/0x1390
> > [ 1571.138135]  #2:  (rcu_read_lock_bh){..}, at: [] 
> > __dev_queue_xmit+0x1ec/0x2750
> 
>
> diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
> index 4e27c5b09600..8aa3e837cd6c 100644
> --- a/drivers/net/vxlan.c
> +++ b/drivers/net/vxlan.c
> @@ -2109,7 +2109,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct 
> net_device *dev,
>  vxlan->cfg.port_max, true);
>  
> if (dst->sa.sa_family == AF_INET) {
> -   struct vxlan_sock *sock4 = rcu_dereference(vxlan->vn4_sock);
> +   struct vxlan_sock *sock4 = 
> rcu_dereference_bh(vxlan->vn4_sock);
> struct rtable *rt;
> __be16 df = 0;
>  
> @@ -2148,7 +2148,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct 
> net_device *dev,
> src_port, dst_port, xnet, !udp_sum);
>  #if IS_ENABLED(CONFIG_IPV6)
> } else {
> -   struct vxlan_sock *sock6 = rcu_dereference(vxlan->vn6_sock);
> +   struct vxlan_sock *sock6 = 
> rcu_dereference_bh(vxlan->vn6_sock);
>  
> ndst = vxlan6_get_route(vxlan, dev, sock6, skb,
> rdst ? rdst->remote_ifindex : 0, tos,
> 

Ugh.  Looks like this may not work even if it makes the splat go away.
synchronize_net() doesn't seem to wait for the _bh() flavor of RCU, so
we need to add syncronize_rcu_bh() call before freeing the socket or do
a normal rcu_read_lock()/unlock() on the fast path.  Any RCU experts
want to comment? :)

FWIW geneve will need similar fix, I presume.


Re: VXLAN RCU error

2017-02-22 Thread Jakub Kicinski
On Wed, 22 Feb 2017 14:27:45 -0800, Jakub Kicinski wrote:
> Hi Roopa!

Ah, sorry, it seems like this splat may be coming all the way from
c6fcc4fc5f8b ("vxlan: avoid using stale vxlan socket.").

> I get this RCU error on net 12d656af4e3d2781b9b9f52538593e1717e7c979:
> 
> [ 1571.067134] ===
> [ 1571.071842] [ ERR: suspicious RCU usage.  ]
> [ 1571.076546] 4.10.0-debug-03232-g12d656af4e3d #1 Tainted: GW  O   
> [ 1571.084166] ---
> [ 1571.088867] ../drivers/net/vxlan.c:2111 suspicious rcu_dereference_check() 
> usage!
> [ 1571.097286] 
> [ 1571.097286] other info that might help us debug this:
> [ 1571.097286] 
> [ 1571.106305] 
> [ 1571.106305] rcu_scheduler_active = 2, debug_locks = 1
> [ 1571.113654] 3 locks held by ping/13826:
> [ 1571.117968]  #0:  (sk_lock-AF_INET){+.+.+.}, at: [] 
> raw_sendmsg+0x14e2/0x2e40
> [ 1571.127758]  #1:  (rcu_read_lock_bh){..}, at: [] 
> ip_finish_output2+0x274/0x1390
> [ 1571.138135]  #2:  (rcu_read_lock_bh){..}, at: [] 
> __dev_queue_xmit+0x1ec/0x2750
> [ 1571.148408] 
> [ 1571.148408] stack backtrace:
> [ 1571.153326] CPU: 10 PID: 13826 Comm: ping Tainted: GW  O
> 4.10.0-debug-03232-g12d656af4e3d #1
> [ 1571.163877] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.3.4 
> 11/08/2016
> [ 1571.172290] Call Trace:
> [ 1571.175053]  dump_stack+0xcd/0x134
> [ 1571.178881]  ? _atomic_dec_and_lock+0xcc/0xcc
> [ 1571.183782]  ? print_lock+0xb2/0xb5
> [ 1571.187711]  lockdep_rcu_suspicious+0x123/0x170
> [ 1571.192807]  vxlan_xmit_one+0x1931/0x4270 [vxlan]
> [ 1571.198126]  ? encap_bypass_if_local+0x380/0x380 [vxlan]
> [ 1571.204109]  ? sched_clock+0x9/0x10
> [ 1571.208034]  ? sched_clock_cpu+0x20/0x2c0
> [ 1571.212541]  ? unwind_get_return_address+0x1b8/0x2b0
> [ 1571.218132]  ? __lock_acquire+0x6d6/0x3160
> [ 1571.222740]  vxlan_xmit+0x756/0x4f90 [vxlan]
> [ 1571.227541]  ? vxlan_xmit_one+0x4270/0x4270 [vxlan]
> [ 1571.233014]  ? netif_skb_features+0x2be/0xba0
> [ 1571.237919]  dev_hard_start_xmit+0x1ab/0xa70
> [ 1571.242724]  __dev_queue_xmit+0x137b/0x2750
> [ 1571.247425]  ? __dev_queue_xmit+0x1ec/0x2750
> [ 1571.252228]  ? netdev_pick_tx+0x330/0x330
> [ 1571.256735]  ? debug_smp_processor_id+0x17/0x20
> [ 1571.261826]  ? get_lock_stats+0x1d/0x160
> [ 1571.266241]  ? mark_held_locks+0x105/0x280
> [ 1571.270850]  ? memcpy+0x45/0x50
> [ 1571.274391]  dev_queue_xmit+0x10/0x20
> [ 1571.278511]  neigh_resolve_output+0x43e/0x7f0
> [ 1571.283405]  ? ip_finish_output2+0x69d/0x1390
> [ 1571.288308]  ip_finish_output2+0x69d/0x1390
> [ 1571.293008]  ? ip_finish_output2+0x274/0x1390
> [ 1571.297909]  ? ip_copy_metadata+0x7e0/0x7e0
> [ 1571.302610]  ? get_lock_stats+0x1d/0x160
> [ 1571.307027]  ip_finish_output+0x598/0xc50
> [ 1571.311537]  ip_output+0x371/0x630
> [ 1571.315362]  ? ip_output+0x1dc/0x630
> [ 1571.319383]  ? ip_mc_output+0xe70/0xe70
> [ 1571.323694]  ? kfree+0x372/0x5a0
> [ 1571.327325]  ? mark_held_locks+0x105/0x280
> [ 1571.331933]  ? __ip_make_skb+0xdd1/0x2200
> [ 1571.336457]  ip_local_out+0x8f/0x180
> [ 1571.340480]  ip_send_skb+0x44/0xf0
> [ 1571.344306]  ip_push_pending_frames+0x5a/0x80
> [ 1571.349203]  raw_sendmsg+0x164d/0x2e40
> [ 1571.353422]  ? debug_check_no_locks_freed+0x350/0x350
> [ 1571.359099]  ? dst_output+0x1b0/0x1b0
> [ 1571.363217]  ? get_lock_stats+0x1d/0x160
> [ 1571.367640]  ? __might_fault+0x199/0x230
> [ 1571.372052]  ? kasan_check_write+0x14/0x20
> [ 1571.382002]  ? _copy_from_user+0xb9/0x130
> [ 1571.386513]  ? rw_copy_check_uvector+0x8d/0x490
> [ 1571.391609]  ? import_iovec+0xae/0x5d0
> [ 1571.395826]  ? push_pipe+0xd00/0xd00
> [ 1571.399847]  ? kasan_check_write+0x14/0x20
> [ 1571.404450]  ? _copy_from_user+0xb9/0x130
> [ 1571.408960]  inet_sendmsg+0x19f/0x5f0
> [ 1571.413071]  ? inet_recvmsg+0x980/0x980
> [ 1571.417386]  sock_sendmsg+0xe2/0x170
> [ 1571.421408]  ___sys_sendmsg+0x66e/0x960
> [ 1571.425726]  ? mem_cgroup_commit_charge+0x144/0x2720
> [ 1571.431303]  ? copy_msghdr_from_user+0x610/0x610
> [ 1571.436495]  ? debug_smp_processor_id+0x17/0x20
> [ 1571.441584]  ? get_lock_stats+0x1d/0x160
> [ 1571.445995]  ? mem_cgroup_uncharge_swap+0x250/0x250
> [ 1571.451474]  ? page_add_new_anon_rmap+0x173/0x3a0
> [ 1571.456762]  ? handle_mm_fault+0x1589/0x3820
> [ 1571.461566]  ? handle_mm_fault+0x1589/0x3820
> [ 1571.466362]  ? handle_mm_fault+0x191/0x3820
> [ 1571.471070]  ? __fdget+0x13/0x20
> [ 1571.474702]  ? get_lock_stats+0x1d/0x160
> [ 1571.479116]  __sys_sendmsg+0xc6/0x150
> [ 1571.483234]  ? SyS_shutdown+0x1b0/0x1b0
> [ 1571.487551]  ? __do_page_fault+0x556/0xe50
> [ 1571.492158]  ? trace_hardirqs_on_thunk+0x1a/0x1c
> [ 1571.497340]  SyS_sendmsg+0x12/0x20
> [ 1571.501166]  entry_SYSCALL_64_fastpath+0x23/0xc6
> [ 1571.506354] RIP: 0033:0x7fca2d0384a0
> [ 1571.510374] RSP: 002b:7ffd18d7fe88 EFLAGS: 0246 ORIG_RAX: 
> 002e
> [ 1571.518886] RAX: ffda RBX: 0040 RCX: 
> 7fca2d0384a0
> [ 1571.526889] RDX: 

VXLAN RCU error

2017-02-22 Thread Jakub Kicinski
Hi Roopa!

I get this RCU error on net 12d656af4e3d2781b9b9f52538593e1717e7c979:

[ 1571.067134] ===
[ 1571.071842] [ ERR: suspicious RCU usage.  ]
[ 1571.076546] 4.10.0-debug-03232-g12d656af4e3d #1 Tainted: GW  O   
[ 1571.084166] ---
[ 1571.088867] ../drivers/net/vxlan.c:2111 suspicious rcu_dereference_check() 
usage!
[ 1571.097286] 
[ 1571.097286] other info that might help us debug this:
[ 1571.097286] 
[ 1571.106305] 
[ 1571.106305] rcu_scheduler_active = 2, debug_locks = 1
[ 1571.113654] 3 locks held by ping/13826:
[ 1571.117968]  #0:  (sk_lock-AF_INET){+.+.+.}, at: [] 
raw_sendmsg+0x14e2/0x2e40
[ 1571.127758]  #1:  (rcu_read_lock_bh){..}, at: [] 
ip_finish_output2+0x274/0x1390
[ 1571.138135]  #2:  (rcu_read_lock_bh){..}, at: [] 
__dev_queue_xmit+0x1ec/0x2750
[ 1571.148408] 
[ 1571.148408] stack backtrace:
[ 1571.153326] CPU: 10 PID: 13826 Comm: ping Tainted: GW  O
4.10.0-debug-03232-g12d656af4e3d #1
[ 1571.163877] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.3.4 
11/08/2016
[ 1571.172290] Call Trace:
[ 1571.175053]  dump_stack+0xcd/0x134
[ 1571.178881]  ? _atomic_dec_and_lock+0xcc/0xcc
[ 1571.183782]  ? print_lock+0xb2/0xb5
[ 1571.187711]  lockdep_rcu_suspicious+0x123/0x170
[ 1571.192807]  vxlan_xmit_one+0x1931/0x4270 [vxlan]
[ 1571.198126]  ? encap_bypass_if_local+0x380/0x380 [vxlan]
[ 1571.204109]  ? sched_clock+0x9/0x10
[ 1571.208034]  ? sched_clock_cpu+0x20/0x2c0
[ 1571.212541]  ? unwind_get_return_address+0x1b8/0x2b0
[ 1571.218132]  ? __lock_acquire+0x6d6/0x3160
[ 1571.222740]  vxlan_xmit+0x756/0x4f90 [vxlan]
[ 1571.227541]  ? vxlan_xmit_one+0x4270/0x4270 [vxlan]
[ 1571.233014]  ? netif_skb_features+0x2be/0xba0
[ 1571.237919]  dev_hard_start_xmit+0x1ab/0xa70
[ 1571.242724]  __dev_queue_xmit+0x137b/0x2750
[ 1571.247425]  ? __dev_queue_xmit+0x1ec/0x2750
[ 1571.252228]  ? netdev_pick_tx+0x330/0x330
[ 1571.256735]  ? debug_smp_processor_id+0x17/0x20
[ 1571.261826]  ? get_lock_stats+0x1d/0x160
[ 1571.266241]  ? mark_held_locks+0x105/0x280
[ 1571.270850]  ? memcpy+0x45/0x50
[ 1571.274391]  dev_queue_xmit+0x10/0x20
[ 1571.278511]  neigh_resolve_output+0x43e/0x7f0
[ 1571.283405]  ? ip_finish_output2+0x69d/0x1390
[ 1571.288308]  ip_finish_output2+0x69d/0x1390
[ 1571.293008]  ? ip_finish_output2+0x274/0x1390
[ 1571.297909]  ? ip_copy_metadata+0x7e0/0x7e0
[ 1571.302610]  ? get_lock_stats+0x1d/0x160
[ 1571.307027]  ip_finish_output+0x598/0xc50
[ 1571.311537]  ip_output+0x371/0x630
[ 1571.315362]  ? ip_output+0x1dc/0x630
[ 1571.319383]  ? ip_mc_output+0xe70/0xe70
[ 1571.323694]  ? kfree+0x372/0x5a0
[ 1571.327325]  ? mark_held_locks+0x105/0x280
[ 1571.331933]  ? __ip_make_skb+0xdd1/0x2200
[ 1571.336457]  ip_local_out+0x8f/0x180
[ 1571.340480]  ip_send_skb+0x44/0xf0
[ 1571.344306]  ip_push_pending_frames+0x5a/0x80
[ 1571.349203]  raw_sendmsg+0x164d/0x2e40
[ 1571.353422]  ? debug_check_no_locks_freed+0x350/0x350
[ 1571.359099]  ? dst_output+0x1b0/0x1b0
[ 1571.363217]  ? get_lock_stats+0x1d/0x160
[ 1571.367640]  ? __might_fault+0x199/0x230
[ 1571.372052]  ? kasan_check_write+0x14/0x20
[ 1571.382002]  ? _copy_from_user+0xb9/0x130
[ 1571.386513]  ? rw_copy_check_uvector+0x8d/0x490
[ 1571.391609]  ? import_iovec+0xae/0x5d0
[ 1571.395826]  ? push_pipe+0xd00/0xd00
[ 1571.399847]  ? kasan_check_write+0x14/0x20
[ 1571.404450]  ? _copy_from_user+0xb9/0x130
[ 1571.408960]  inet_sendmsg+0x19f/0x5f0
[ 1571.413071]  ? inet_recvmsg+0x980/0x980
[ 1571.417386]  sock_sendmsg+0xe2/0x170
[ 1571.421408]  ___sys_sendmsg+0x66e/0x960
[ 1571.425726]  ? mem_cgroup_commit_charge+0x144/0x2720
[ 1571.431303]  ? copy_msghdr_from_user+0x610/0x610
[ 1571.436495]  ? debug_smp_processor_id+0x17/0x20
[ 1571.441584]  ? get_lock_stats+0x1d/0x160
[ 1571.445995]  ? mem_cgroup_uncharge_swap+0x250/0x250
[ 1571.451474]  ? page_add_new_anon_rmap+0x173/0x3a0
[ 1571.456762]  ? handle_mm_fault+0x1589/0x3820
[ 1571.461566]  ? handle_mm_fault+0x1589/0x3820
[ 1571.466362]  ? handle_mm_fault+0x191/0x3820
[ 1571.471070]  ? __fdget+0x13/0x20
[ 1571.474702]  ? get_lock_stats+0x1d/0x160
[ 1571.479116]  __sys_sendmsg+0xc6/0x150
[ 1571.483234]  ? SyS_shutdown+0x1b0/0x1b0
[ 1571.487551]  ? __do_page_fault+0x556/0xe50
[ 1571.492158]  ? trace_hardirqs_on_thunk+0x1a/0x1c
[ 1571.497340]  SyS_sendmsg+0x12/0x20
[ 1571.501166]  entry_SYSCALL_64_fastpath+0x23/0xc6
[ 1571.506354] RIP: 0033:0x7fca2d0384a0
[ 1571.510374] RSP: 002b:7ffd18d7fe88 EFLAGS: 0246 ORIG_RAX: 
002e
[ 1571.518886] RAX: ffda RBX: 0040 RCX: 7fca2d0384a0
[ 1571.526889] RDX:  RSI: 0060a300 RDI: 0003
[ 1571.534892] RBP: 0046 R08: 0020 R09: 003e
[ 1571.542897] R10: 7ffd18d7fc50 R11: 0246 R12: 00c0
[ 1571.550900] R13: 0004 R14: 7ffd18d81608 R15: 7ffd18d810b0

Some of Netronome's VXLAN tests are also failing but I need to dig a
bit to see what's wrong