Re: [ovs-dev] [PATCH net v3] net: openvswitch: fix race on port output
Hello: This patch was applied to netdev/net.git (main) by Jakub Kicinski : On Wed, 5 Apr 2023 07:53:41 + you wrote: > assume the following setup on a single machine: > 1. An openvswitch instance with one bridge and default flows > 2. two network namespaces "server" and "client" > 3. two ovs interfaces "server" and "client" on the bridge > 4. for each ovs interface a veth pair with a matching name and 32 rx and >tx queues > 5. move the ends of the veth pairs to the respective network namespaces > 6. assign ip addresses to each of the veth ends in the namespaces (needs >to be the same subnet) > 7. start some http server on the server network namespace > 8. test if a client in the client namespace can reach the http server > > [...] Here is the summary with links: - [net,v3] net: openvswitch: fix race on port output https://git.kernel.org/netdev/net/c/066b86787fa3 You are awesome, thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/patchwork/pwbot.html ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH net v3] net: openvswitch: fix race on port output
On Fri, Apr 07, 2023 at 10:44:35AM +0200, Simon Horman wrote: > On Thu, Apr 06, 2023 at 07:05:13PM -0700, Jakub Kicinski wrote: > > On Wed, 5 Apr 2023 07:53:41 + Felix Huettner wrote: > > > assume the following setup on a single machine: > > > 1. An openvswitch instance with one bridge and default flows > > > 2. two network namespaces "server" and "client" > > > 3. two ovs interfaces "server" and "client" on the bridge > > > 4. for each ovs interface a veth pair with a matching name and 32 rx and > > >tx queues > > > 5. move the ends of the veth pairs to the respective network namespaces > > > 6. assign ip addresses to each of the veth ends in the namespaces (needs > > >to be the same subnet) > > > 7. start some http server on the server network namespace > > > 8. test if a client in the client namespace can reach the http server > > > > Hi Simon, looks good? > > Thanks Jakub, will check. Yes, this does look good to me. Reviewed-by: Simon Horman nit: somewhere in the patch description, 'inifinite' -> 'infinite' ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH net v3] net: openvswitch: fix race on port output
On Wed, Apr 5, 2023 at 9:53 AM Felix Huettner wrote: > > assume the following setup on a single machine: > 1. An openvswitch instance with one bridge and default flows > 2. two network namespaces "server" and "client" > 3. two ovs interfaces "server" and "client" on the bridge > 4. for each ovs interface a veth pair with a matching name and 32 rx and >tx queues > 5. move the ends of the veth pairs to the respective network namespaces > 6. assign ip addresses to each of the veth ends in the namespaces (needs >to be the same subnet) > 7. start some http server on the server network namespace > 8. test if a client in the client namespace can reach the http server > > when following the actions below the host has a chance of getting a cpu > stuck in a infinite loop: > 1. send a large amount of parallel requests to the http server (around >3000 curls should work) > 2. in parallel delete the network namespace (do not delete interfaces or >stop the server, just kill the namespace) > > there is a low chance that this will cause the below kernel cpu stuck > message. If this does not happen just retry. > Below there is also the output of bpftrace for the functions mentioned > in the output. > ... Reviewed-by: Eric Dumazet ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH net v3] net: openvswitch: fix race on port output
On Thu, Apr 06, 2023 at 07:05:13PM -0700, Jakub Kicinski wrote: > On Wed, 5 Apr 2023 07:53:41 + Felix Huettner wrote: > > assume the following setup on a single machine: > > 1. An openvswitch instance with one bridge and default flows > > 2. two network namespaces "server" and "client" > > 3. two ovs interfaces "server" and "client" on the bridge > > 4. for each ovs interface a veth pair with a matching name and 32 rx and > >tx queues > > 5. move the ends of the veth pairs to the respective network namespaces > > 6. assign ip addresses to each of the veth ends in the namespaces (needs > >to be the same subnet) > > 7. start some http server on the server network namespace > > 8. test if a client in the client namespace can reach the http server > > Hi Simon, looks good? Thanks Jakub, will check. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH net v3] net: openvswitch: fix race on port output
On Wed, 5 Apr 2023 07:53:41 + Felix Huettner wrote: > assume the following setup on a single machine: > 1. An openvswitch instance with one bridge and default flows > 2. two network namespaces "server" and "client" > 3. two ovs interfaces "server" and "client" on the bridge > 4. for each ovs interface a veth pair with a matching name and 32 rx and >tx queues > 5. move the ends of the veth pairs to the respective network namespaces > 6. assign ip addresses to each of the veth ends in the namespaces (needs >to be the same subnet) > 7. start some http server on the server network namespace > 8. test if a client in the client namespace can reach the http server Hi Simon, looks good? ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH net v3] net: openvswitch: fix race on port output
assume the following setup on a single machine: 1. An openvswitch instance with one bridge and default flows 2. two network namespaces "server" and "client" 3. two ovs interfaces "server" and "client" on the bridge 4. for each ovs interface a veth pair with a matching name and 32 rx and tx queues 5. move the ends of the veth pairs to the respective network namespaces 6. assign ip addresses to each of the veth ends in the namespaces (needs to be the same subnet) 7. start some http server on the server network namespace 8. test if a client in the client namespace can reach the http server when following the actions below the host has a chance of getting a cpu stuck in a infinite loop: 1. send a large amount of parallel requests to the http server (around 3000 curls should work) 2. in parallel delete the network namespace (do not delete interfaces or stop the server, just kill the namespace) there is a low chance that this will cause the below kernel cpu stuck message. If this does not happen just retry. Below there is also the output of bpftrace for the functions mentioned in the output. The series of events happening here is: 1. the network namespace is deleted calling `unregister_netdevice_many_notify` somewhere in the process 2. this sets first `NETREG_UNREGISTERING` on both ends of the veth and then runs `synchronize_net` 3. it then calls `call_netdevice_notifiers` with `NETDEV_UNREGISTER` 4. this is then handled by `dp_device_event` which calls `ovs_netdev_detach_dev` (if a vport is found, which is the case for the veth interface attached to ovs) 5. this removes the rx_handlers of the device but does not prevent packages to be sent to the device 6. `dp_device_event` then queues the vport deletion to work in background as a ovs_lock is needed that we do not hold in the unregistration path 7. `unregister_netdevice_many_notify` continues to call `netdev_unregister_kobject` which sets `real_num_tx_queues` to 0 8. port deletion continues (but details are not relevant for this issue) 9. at some future point the background task deletes the vport If after 7. but before 9. a packet is send to the ovs vport (which is not deleted at this point in time) which forwards it to the `dev_queue_xmit` flow even though the device is unregistering. In `skb_tx_hash` (which is called in the `dev_queue_xmit`) path there is a while loop (if the packet has a rx_queue recorded) that is infinite if `dev->real_num_tx_queues` is zero. To prevent this from happening we update `do_output` to handle devices without carrier the same as if the device is not found (which would be the code path after 9. is done). Additionally we now produce a warning in `skb_tx_hash` if we will hit the inifinite loop. bpftrace (first word is function name): __dev_queue_xmit server: real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0x9edb6f207000, reg_state: 1 netdev_core_pick_tx server: addr: 0x9f0a46d4a000 real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0x9edb6f207000, reg_state: 1 dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 2, reg_state: 1 synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024 synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024 synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024 synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024 dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 6, reg_state: 2 ovs_netdev_detach_dev server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, reg_state: 2 netdev_rx_handler_unregister server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2 synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024 netdev_rx_handler_unregister ret server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2 dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 27, reg_state: 2 dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 22, reg_state: 2 dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 18, reg_state: 2 netdev_unregister_kobject: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024 synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024 ovs_vport_send server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0x9edb6f207000, reg_state: 2 __dev_queue_xmit server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0x9edb6f207000, reg_state: 2 netdev_core_pick_tx server: addr: 0x9f0a46d4a000 real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0x9edb6f207000, reg_state: 2 broken device server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024 ovs_dp_detach_port server: real_num_tx_queues: 0 cpu 9, pid: 9124, tid: 9124, reg_state: 2 synchronize_rcu_expedited: cpu 9, pid: 33604, tid: 33604 stuck message: watchdog: BUG: soft lockup - CPU#5 stuck