Re: [ovs-dev] [PATCH net v3] net: openvswitch: fix race on port output

2023-04-07 Thread patchwork-bot+netdevbpf
Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski :

On Wed, 5 Apr 2023 07:53:41 + you wrote:
> assume the following setup on a single machine:
> 1. An openvswitch instance with one bridge and default flows
> 2. two network namespaces "server" and "client"
> 3. two ovs interfaces "server" and "client" on the bridge
> 4. for each ovs interface a veth pair with a matching name and 32 rx and
>tx queues
> 5. move the ends of the veth pairs to the respective network namespaces
> 6. assign ip addresses to each of the veth ends in the namespaces (needs
>to be the same subnet)
> 7. start some http server on the server network namespace
> 8. test if a client in the client namespace can reach the http server
> 
> [...]

Here is the summary with links:
  - [net,v3] net: openvswitch: fix race on port output
https://git.kernel.org/netdev/net/c/066b86787fa3

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net v3] net: openvswitch: fix race on port output

2023-04-07 Thread Simon Horman
On Fri, Apr 07, 2023 at 10:44:35AM +0200, Simon Horman wrote:
> On Thu, Apr 06, 2023 at 07:05:13PM -0700, Jakub Kicinski wrote:
> > On Wed, 5 Apr 2023 07:53:41 + Felix Huettner wrote:
> > > assume the following setup on a single machine:
> > > 1. An openvswitch instance with one bridge and default flows
> > > 2. two network namespaces "server" and "client"
> > > 3. two ovs interfaces "server" and "client" on the bridge
> > > 4. for each ovs interface a veth pair with a matching name and 32 rx and
> > >tx queues
> > > 5. move the ends of the veth pairs to the respective network namespaces
> > > 6. assign ip addresses to each of the veth ends in the namespaces (needs
> > >to be the same subnet)
> > > 7. start some http server on the server network namespace
> > > 8. test if a client in the client namespace can reach the http server
> > 
> > Hi Simon, looks good?
> 
> Thanks Jakub, will check.

Yes, this does look good to me.

Reviewed-by: Simon Horman 

nit: somewhere in the patch description, 'inifinite' -> 'infinite'

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net v3] net: openvswitch: fix race on port output

2023-04-07 Thread Eric Dumazet via dev
On Wed, Apr 5, 2023 at 9:53 AM Felix Huettner
 wrote:
>
> assume the following setup on a single machine:
> 1. An openvswitch instance with one bridge and default flows
> 2. two network namespaces "server" and "client"
> 3. two ovs interfaces "server" and "client" on the bridge
> 4. for each ovs interface a veth pair with a matching name and 32 rx and
>tx queues
> 5. move the ends of the veth pairs to the respective network namespaces
> 6. assign ip addresses to each of the veth ends in the namespaces (needs
>to be the same subnet)
> 7. start some http server on the server network namespace
> 8. test if a client in the client namespace can reach the http server
>
> when following the actions below the host has a chance of getting a cpu
> stuck in a infinite loop:
> 1. send a large amount of parallel requests to the http server (around
>3000 curls should work)
> 2. in parallel delete the network namespace (do not delete interfaces or
>stop the server, just kill the namespace)
>
> there is a low chance that this will cause the below kernel cpu stuck
> message. If this does not happen just retry.
> Below there is also the output of bpftrace for the functions mentioned
> in the output.
>
...
Reviewed-by: Eric Dumazet 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net v3] net: openvswitch: fix race on port output

2023-04-07 Thread Simon Horman
On Thu, Apr 06, 2023 at 07:05:13PM -0700, Jakub Kicinski wrote:
> On Wed, 5 Apr 2023 07:53:41 + Felix Huettner wrote:
> > assume the following setup on a single machine:
> > 1. An openvswitch instance with one bridge and default flows
> > 2. two network namespaces "server" and "client"
> > 3. two ovs interfaces "server" and "client" on the bridge
> > 4. for each ovs interface a veth pair with a matching name and 32 rx and
> >tx queues
> > 5. move the ends of the veth pairs to the respective network namespaces
> > 6. assign ip addresses to each of the veth ends in the namespaces (needs
> >to be the same subnet)
> > 7. start some http server on the server network namespace
> > 8. test if a client in the client namespace can reach the http server
> 
> Hi Simon, looks good?

Thanks Jakub, will check.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net v3] net: openvswitch: fix race on port output

2023-04-06 Thread Jakub Kicinski
On Wed, 5 Apr 2023 07:53:41 + Felix Huettner wrote:
> assume the following setup on a single machine:
> 1. An openvswitch instance with one bridge and default flows
> 2. two network namespaces "server" and "client"
> 3. two ovs interfaces "server" and "client" on the bridge
> 4. for each ovs interface a veth pair with a matching name and 32 rx and
>tx queues
> 5. move the ends of the veth pairs to the respective network namespaces
> 6. assign ip addresses to each of the veth ends in the namespaces (needs
>to be the same subnet)
> 7. start some http server on the server network namespace
> 8. test if a client in the client namespace can reach the http server

Hi Simon, looks good?
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH net v3] net: openvswitch: fix race on port output

2023-04-05 Thread Felix Huettner via dev
assume the following setup on a single machine:
1. An openvswitch instance with one bridge and default flows
2. two network namespaces "server" and "client"
3. two ovs interfaces "server" and "client" on the bridge
4. for each ovs interface a veth pair with a matching name and 32 rx and
   tx queues
5. move the ends of the veth pairs to the respective network namespaces
6. assign ip addresses to each of the veth ends in the namespaces (needs
   to be the same subnet)
7. start some http server on the server network namespace
8. test if a client in the client namespace can reach the http server

when following the actions below the host has a chance of getting a cpu
stuck in a infinite loop:
1. send a large amount of parallel requests to the http server (around
   3000 curls should work)
2. in parallel delete the network namespace (do not delete interfaces or
   stop the server, just kill the namespace)

there is a low chance that this will cause the below kernel cpu stuck
message. If this does not happen just retry.
Below there is also the output of bpftrace for the functions mentioned
in the output.

The series of events happening here is:
1. the network namespace is deleted calling
   `unregister_netdevice_many_notify` somewhere in the process
2. this sets first `NETREG_UNREGISTERING` on both ends of the veth and
   then runs `synchronize_net`
3. it then calls `call_netdevice_notifiers` with `NETDEV_UNREGISTER`
4. this is then handled by `dp_device_event` which calls
   `ovs_netdev_detach_dev` (if a vport is found, which is the case for
   the veth interface attached to ovs)
5. this removes the rx_handlers of the device but does not prevent
   packages to be sent to the device
6. `dp_device_event` then queues the vport deletion to work in
   background as a ovs_lock is needed that we do not hold in the
   unregistration path
7. `unregister_netdevice_many_notify` continues to call
   `netdev_unregister_kobject` which sets `real_num_tx_queues` to 0
8. port deletion continues (but details are not relevant for this issue)
9. at some future point the background task deletes the vport

If after 7. but before 9. a packet is send to the ovs vport (which is
not deleted at this point in time) which forwards it to the
`dev_queue_xmit` flow even though the device is unregistering.
In `skb_tx_hash` (which is called in the `dev_queue_xmit`) path there is
a while loop (if the packet has a rx_queue recorded) that is infinite if
`dev->real_num_tx_queues` is zero.

To prevent this from happening we update `do_output` to handle devices
without carrier the same as if the device is not found (which would
be the code path after 9. is done).

Additionally we now produce a warning in `skb_tx_hash` if we will hit
the inifinite loop.

bpftrace (first word is function name):

__dev_queue_xmit server: real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, 
skb_addr: 0x9edb6f207000, reg_state: 1
netdev_core_pick_tx server: addr: 0x9f0a46d4a000 real_num_tx_queues: 1, 
cpu: 2, pid: 28024, tid: 28024, skb_addr: 0x9edb6f207000, reg_state: 1
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, 
event 2, reg_state: 1
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, 
event 6, reg_state: 2
ovs_netdev_detach_dev server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 
21024, reg_state: 2
netdev_rx_handler_unregister server: real_num_tx_queues: 1, cpu: 9, pid: 21024, 
tid: 21024, reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
netdev_rx_handler_unregister ret server: real_num_tx_queues: 1, cpu: 9, pid: 
21024, tid: 21024, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, 
event 27, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, 
event 22, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, 
event 18, reg_state: 2
netdev_unregister_kobject: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
ovs_vport_send server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, 
skb_addr: 0x9edb6f207000, reg_state: 2
__dev_queue_xmit server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, 
skb_addr: 0x9edb6f207000, reg_state: 2
netdev_core_pick_tx server: addr: 0x9f0a46d4a000 real_num_tx_queues: 0, 
cpu: 2, pid: 28024, tid: 28024, skb_addr: 0x9edb6f207000, reg_state: 2
broken device server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024
ovs_dp_detach_port server: real_num_tx_queues: 0 cpu 9, pid: 9124, tid: 9124, 
reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 33604, tid: 33604

stuck message:

watchdog: BUG: soft lockup - CPU#5 stuck