Hi Kevin,

Sorry for the late reply; I set up a new environment and ran a full test again.

On Fri, Jun 6, 2025 at 6:52 PM Kevin Traynor <ktray...@redhat.com> wrote:
>
> On 06/06/2025 04:18, Changliang Wu wrote:
> > Hi, Kevin
> > Thanks for reviewing.
> >
> > On Thu, Jun 5, 2025 at 8:44 PM Kevin Traynor <ktray...@redhat.com> wrote:
> >>
> >> On 08/05/2025 04:50, Changliang Wu wrote:
> >>> The ovs op hangs when performing:
> >>> 1. Create a DPDK bridge (datapath_type=netdev) with bind-ed DPDK port
> >>> 2. Restart OVS service (systemctl restart openvswitch)
> >>> 3. Delete the bridge (ovs-vsctl del-br)
> >>> 4. Recreate bridge and re-add DPDK port
> >>>
> >>> Root cause:
> >>> During OVS service restart, ovs-vswitchd reloads all ports from OVSDB via:
> >>> dpdk_init() -> dpdk_init__() -> rte_eal_init() -> ... ->
> >>> rte_eth_dev_probing_finish()
> >>> This leaves port in DPDK's global rte_eth_devices array with
> >>> dev->state=RTE_ETH_DEV_ATTACHED
> >>>
> >>> When recreating port:
> >>> netdev_dpdk_process_devargs() calls rte_eth_dev_is_valid_port(port_id)
> >>> Since port_id is valid, skips rte_dev_probe() and does not set
> >>> netdev->attached=true
> >>>
> >>
> >> Hi,
> >>
> >> Do you hit an issue with this when running, or is this from review ?
> >
> > I hit this issue in the production environment.
> > I restarted the service because of upgrading ovs,
> > and then deleted the bridge,
> > and later found that these ports on the old bridge
> > could no longer be added to the new bridge.
> >
> >>
> >> rte_eal_init() will probe all devices at init (except when dpdk-extra
> >> has -a/-b args to allow/block specific devices). Probed devices in DPDK
> >> have rte_eth_devices[].state set as RTE_ETH_DEV_ATTACHED. Other devices
> >> are set as RTE_ETH_DEV_UNUSED.
> >>
> >> If a device that is not already probed is added as a port in OVS then
> >> OVS will call probe for it. In DPDK, rte_eth_devices[].state is then set
> >> to RTE_ETH_DEV_ATTACHED.
> >>
> >> Note, OVS dev->attached=true is only used to track whether the device
> >> was explicitly probed from OVS, so that when the port is deleted we can
> >> remove the device and return it to it's previous RTE_ETH_DEV_UNUSED
> >> state in DPDK.
> >>
> >> OVS dev->attached is *not* a mirror of
> >> rte_eth_devices[].state=RTE_ETH_DEV_ATTACHED and may have a different
> >> value.

If the port is added manually, the OVS status should be recorded here.
But when the OVS service is restarted, that record is lost, so the device
is not cleaned up normally when the port is deleted.

> >>
> >> The existing code looks ok to me. Your change would mean that devices
> >> that were not explicitly probed by OVS are being removed by OVS. That's
> >> not something we want to do, perhaps there could be some hotplug
> >> limitations and the device cannot be reused etc.
> >>
> >
> > These above make sense to me. Modifying OVS dev->attached seems not a
> > good option.
> >
> >> If you are having an issue with a device probed during rte_eal_init() it
> >> is probably something that should be reported in DPDK ml. One option is
> >> to set dpdk-extra like below, so any device will be explicitly probed on
> >> port add.
> >>
> >> ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-extra="-a
> >> 0000:00:00.0"
> >
> > It works for my problem, and also found that I can explicitly detach
> > the device with
> > ovs-appctl netdev-dpdk/detach 0000:00:00.0
> >
> > In fact, the problem here is that when deleting the bridge,
> > the status of all ports on the bridge is not reset.
> > In contrast, if delete the port directly, the status is reset normally,
> > regardless of whether the port is added manually or automatically
> > after restarting the service.
>
> The same port destruct code is called regardless of whether the bridge
> is deleted or the port is deleted. In that port destruct code, DPDK
> remove is called dependent on dev->attached, as per above.
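
To make the flag semantics above concrete, here is a minimal standalone
sketch in C. It is only a toy model of the behaviour described in this
thread, not OVS or DPDK code: every toy_* name is hypothetical, and the
assumption that a restart re-runs EAL init and then re-adds the port from
OVSDB is a simplification.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for rte_eth_devices[].state. */
enum toy_state { TOY_UNUSED, TOY_ATTACHED };

/* Toy stand-in for the DPDK side of one device. */
struct toy_dpdk_dev {
    enum toy_state state;
};

/* Toy stand-in for struct netdev_dpdk; only the flag discussed here. */
struct toy_netdev {
    bool attached;  /* true only if OVS itself probed the device */
};

/* EAL init: the device is probed unless an allow/block list excludes it. */
static void toy_eal_init(struct toy_dpdk_dev *d, bool probed_at_init)
{
    d->state = probed_at_init ? TOY_ATTACHED : TOY_UNUSED;
}

/* Port add: probe only if DPDK does not already know the device. */
static void toy_port_add(struct toy_netdev *n, struct toy_dpdk_dev *d)
{
    assert(n && d);
    n->attached = false;
    if (d->state == TOY_UNUSED) {
        d->state = TOY_ATTACHED;  /* stands in for rte_dev_probe() */
        n->attached = true;
    }
}

/* Port destruct: remove the device only if OVS probed it. */
static void toy_port_del(struct toy_netdev *n, struct toy_dpdk_dev *d)
{
    if (n->attached) {
        d->state = TOY_UNUSED;    /* stands in for the DPDK remove call */
        n->attached = false;
    }
}

/* Runs add -> (optional restart) -> delete and returns the final state. */
static enum toy_state toy_scenario(bool probed_at_init, bool restart)
{
    struct toy_dpdk_dev d;
    struct toy_netdev n;

    toy_eal_init(&d, probed_at_init);
    toy_port_add(&n, &d);
    if (restart) {
        /* New process: EAL init runs again and the flag is rebuilt. */
        toy_eal_init(&d, probed_at_init);
        toy_port_add(&n, &d);
    }
    toy_port_del(&n, &d);
    return d.state;
}
```

In this model, toy_scenario(true, true) leaves the device in TOY_ATTACHED
after the delete, while toy_scenario(false, true) returns it to TOY_UNUSED.
The second case corresponds to excluding the device from the initial probe
with dpdk-extra, as suggested above.
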
> >
> > So maybe we need to discuss whether there is a problem with this scenario?
> >
>
> Below is using Intel E810, with or without dpdk-extra="-a 0000:00:00.0"
>
> # ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
> # ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
> options:dpdk-devargs=0000:d8:00.1
> # ovs-vsctl show
> c0ad19cd-bb12-4b7e-b90f-1bd785fb0a53
>     Bridge br0
>         datapath_type: netdev
>         Port br0
>             Interface br0
>                 type: internal
>         Port dpdk0
>             Interface dpdk0
>                 type: dpdk
>                 options: {dpdk-devargs="0000:d8:00.1"}
> # ovs-vsctl del-br br0
> # ovs-vsctl show
> c0ad19cd-bb12-4b7e-b90f-1bd785fb0a53

Restarting the service is the key step here. Here are my new test results.

System environment:
Kernel: 4.19.90-2307 & 5.10.0-247 (same in VM)
CPU: Intel(R) Xeon(R) Gold 6230
QEMU: 6.2.0
DPDK: 23.11.3 (same in VM)
OvS: 3.3.4 (same in VM)
NIC 1: Ethernet Connection X722 for 10GBASE-T 37d2
NIC 2: MT27800 Family [ConnectX-5] 1017
NIC 3: virtio in qemu vm

Test 1

1. Bind the NIC to vfio-pci before OVS starts, or restart OVS.
[root@n-202 ~]# dpdk-devbind.py -s | grep 0000:1a:00.1
0000:1a:00.1 'Ethernet Connection X722 for 10GBASE-T 37d2' drv=vfio-pci unused=i40e,uio_pci_generic

2. The NIC cannot be bound back to the kernel driver; vfio-pci reports errors.
[root@n-202 ~]# dpdk-devbind.py -b i40e 0000:1a:00.1

Kernel messages:
[root@n-202 ~]# dmesg | tail -n 5
[430286.503689] device ovs-netdev entered promiscuous mode
[430289.141854] device ovs-netdev left promiscuous mode
[430290.073351] vfio-pci 0000:1a:00.1: Relaying device request to user (#0)
[430392.502666] vfio-pci 0000:1a:00.1: Relaying device request to user (#10)
[430494.901754] vfio-pci 0000:1a:00.1: Relaying device request to user (#20)

Test 2

1. Bind the NIC to vfio-pci before OVS starts, or restart OVS.
[root@n-202 ~]# dpdk-devbind.py -s | grep 0000:1a:00.1
0000:1a:00.1 'Ethernet Connection X722 for 10GBASE-T 37d2' drv=vfio-pci unused=i40e,uio_pci_generic

2. Create the bridge and add the port.
[root@n-202 ~]# ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
[root@n-202 ~]# ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk options:dpdk-devargs=0000:1a:00.1
[root@n-202 ~]# ovs-vsctl show
8d2863c6-7fa7-446e-917c-d8fbc7492251
    Bridge br0
        datapath_type: netdev
        Port dpdk0
            Interface dpdk0
                type: dpdk
                options: {dpdk-devargs="0000:1a:00.1"}
        Port br0
            Interface br0
                type: internal
    ovs_version: "3.3.4"

3. Delete the bridge.
[root@n-202 ~]# ovs-vsctl del-br br0

4. The NIC can be bound back to the kernel driver.
[root@n-202 ~]# dpdk-devbind.py -b i40e 0000:1a:00.1
[root@n-202 ~]# dpdk-devbind.py -s | grep 0000:1a:00.1
0000:1a:00.1 'Ethernet Connection X722 for 10GBASE-T 37d2' if=eno2 drv=i40e unused=vfio-pci

Test 3

1. Same as Test 2, but restart OVS between step 2 and step 3.
# dpdk-devbind.py -b vfio-pci 0000:1a:00.1
# ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
# ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk options:dpdk-devargs=0000:1a:00.1
# /usr/share/openvswitch/scripts/ovs-ctl restart
# ovs-vsctl del-br br0
# dpdk-devbind.py -b i40e 0000:1a:00.1

The rebind keeps hanging, as in Test 1.

Test 4

The following operations hang at add-port on kernel 4.19 but not on 5.10,
so the problem may be in the kernel uio_pci_generic driver.
# dpdk-devbind.py -s | grep 0000:00:05.0
0000:00:05.0 'Virtio network device 1000' unused=vfio-pci,uio_pci_generic
# dpdk-devbind.py -b uio_pci_generic 0000:00:05.0
# ovs-vsctl show
# ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
# ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk options:dpdk-devargs=0000:00:05.0
# ovs-vsctl show
# /usr/share/openvswitch/scripts/ovs-ctl restart
# ovs-vsctl del-br br0
# ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
# ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk options:dpdk-devargs=0000:00:05.0

Summary:

1. If the vfio-pci driver is used, OVS automatically loads the device into
   DPDK when it starts (even without any ports added), which causes the
   unbind to fail (Test 1). If ports/bridges are added and then deleted,
   OVS releases the DPDK device and the unbind succeeds (Test 2). If OVS is
   restarted after the port is added, the unbind fails after the bridge is
   deleted (Test 3).
2. With uio_pci_generic on some kernel versions, a port cannot be added
   back to OVS after it has been added, OVS has been restarted, and the
   port has been deleted (Test 4). (This is also the problem described at
   the beginning of the current patch. My current environment cannot fully
   reproduce the original problem. Maybe I made a mistake before.)
3. No problems were found with mlx5 NICs, which do not need to be bound to
   DPDK.

Finally, although I agree the modification in this patch is inappropriate,
the original idea still stands: if a port was manually added in OVS, then
after the OVS service restarts it should be restored to the state it had
before the restart.

> # ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
> # ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
> options:dpdk-devargs=0000:d8:00.1
> # ovs-vsctl show
> c0ad19cd-bb12-4b7e-b90f-1bd785fb0a53
>     Bridge br0
>         datapath_type: netdev
>         Port br0
>             Interface br0
>                 type: internal
>         Port dpdk0
>             Interface dpdk0
>                 type: dpdk
>                 options: {dpdk-devargs="0000:d8:00.1"}
>
>
> I don't see a problem with this part of the OVS code, so perhaps you are
> facing some other issue. Which NIC are you using ? Perhaps you need to
> update OVS/DPDK versions if they are old ?
>
> >>
> >> thanks,
> >> Kevin.
> >>
> >>> During bridge/port deletion:
> >>> netdev_dpdk_destruct() checks if(dev->attached)
> >>> and fail to execute cleanup logic for attached devices
> >>> Leave a device in rte_eth_devices with RTE_ETH_DEV_ATTACHED
> >>>
> >>> Subsequent port addition:
> >>> rte_eth_dev_is_valid_port() still returns true,
> >>> critical PCI initialization (rte_dev_probe()) was skip,
> >>> the following configurations hang due to uninitialized device state
> >>>
> >>> Fix implementation:
> >>> Modify netdev_dpdk_process_devargs() to set netdev->attached=true
> >>> explicitly,
> >>> when port_id is not DPDK_ETH_PORT_ID_INVALID.
> >>>
> >>> Signed-off-by: Changliang Wu <changliang...@smartx.com>
> >>> ---
> >>>  lib/netdev-dpdk.c | 3 ++-
> >>>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> >>> index e2708a8a5..a551b204e 100644
> >>> --- a/lib/netdev-dpdk.c
> >>> +++ b/lib/netdev-dpdk.c
> >>> @@ -2129,7 +2129,6 @@ netdev_dpdk_process_devargs(struct netdev_dpdk *dev,
> >>>              new_port_id = netdev_dpdk_get_port_by_devargs(devargs);
> >>>              if (rte_eth_dev_is_valid_port(new_port_id)) {
> >>>                  /* Attach successful */
> >>> -                dev->attached = true;
> >>>                  VLOG_INFO("Device '%s' attached to DPDK", devargs);
> >>>              } else {
> >>>                  /* Attach unsuccessful */
> >>> @@ -2141,6 +2140,8 @@ netdev_dpdk_process_devargs(struct netdev_dpdk *dev,
> >>>
> >>>      if (new_port_id == DPDK_ETH_PORT_ID_INVALID) {
> >>>          VLOG_WARN_BUF(errp, "Error attaching device '%s' to DPDK",
> >>> devargs);
> >>> +    } else {
> >>> +        dev->attached = true;
> >>>      }
> >>>
> >>>      return new_port_id;
> >>
> >
> > thanks,
> > Changliang
> >
>