Hi, Kevin

Sorry for the late reply.
I set up a new environment and ran the full set of tests again.

On Fri, Jun 6, 2025 at 6:52 PM Kevin Traynor <ktray...@redhat.com> wrote:
>
> On 06/06/2025 04:18, Changliang Wu wrote:
> > Hi, Kevin
> > Thanks for reviewing.
> >
> > On Thu, Jun 5, 2025 at 8:44 PM Kevin Traynor <ktray...@redhat.com> wrote:
> >>
> >> On 08/05/2025 04:50, Changliang Wu wrote:
> >>> The ovs op hangs when performing:
> >>> 1. Create a DPDK bridge (datapath_type=netdev) with bind-ed DPDK port
> >>> 2. Restart OVS service (systemctl restart openvswitch)
> >>> 3. Delete the bridge (ovs-vsctl del-br)
> >>> 4. Recreate bridge and re-add DPDK port
> >>>
> >>> Root cause:
> >>> During OVS service restart, ovs-vswitchd reloads all ports from OVSDB via:
> >>> dpdk_init() -> dpdk_init__() -> rte_eal_init() -> ... -> 
> >>> rte_eth_dev_probing_finish()
> >>> This leaves port in DPDK's global rte_eth_devices array with 
> >>> dev->state=RTE_ETH_DEV_ATTACHED
> >>>
> >>> When recreating port:
> >>> netdev_dpdk_process_devargs() calls rte_eth_dev_is_valid_port(port_id)
> >>> Since port_id is valid, skips rte_dev_probe() and does not set 
> >>> netdev->attached=true
> >>>
> >>
> >> Hi,
> >>
> >> Do you hit an issue with this when running, or is this from review ?
> >
> > I hit this issue in the production environment.
> > I restarted the service because of upgrading ovs,
> > and then deleted the bridge,
> > and later found that these ports on the old bridge
> > could no longer be added to the new bridge.
> >
> >>
> >> rte_eal_init() will probe all devices at init (except when dpdk-extra
> >> has -a/-b args to allow/block specific devices). Probed devices in DPDK
> >> have rte_eth_devices[].state set as RTE_ETH_DEV_ATTACHED. Other devices
> >> are set as RTE_ETH_DEV_UNUSED.
> >>
> >> If a device that is not already probed is added as a port in OVS then
> >> OVS will call probe for it. In DPDK, rte_eth_devices[].state is then set
> >> to RTE_ETH_DEV_ATTACHED.
> >>
> >> Note, OVS dev->attached=true is only used to track whether the device
> >> was explicitly probed from OVS, so that when the port is deleted we can
> >> remove the device and return it to it's previous RTE_ETH_DEV_UNUSED
> >> state in DPDK.
> >>
> >> OVS dev->attached is *not* a mirror of
> >> rte_eth_devices[].state=RTE_ETH_DEV_ATTACHED and may have a different 
> >> value.

If the port is added manually, OVS records that in dev->attached.
But when the OVS service is restarted, that record is lost,
so the device is not cleaned up properly when the port is deleted later.
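
To illustrate the point, here is a toy model in C of the behavior as it is
described in this thread. It is only a sketch, not the real OVS or DPDK code:
every name in it (model_dev, model_start, model_add_port, model_del_port,
model_run) is made up for illustration, and it assumes that EAL init probes a
device that is already bound at startup and that the attached flag lives only
in process memory.

```c
#include <stdbool.h>

/* Toy model only. All names are hypothetical, not OVS/DPDK symbols. */

/* Stands in for rte_eth_devices[].state. */
enum eth_state { ETH_UNUSED, ETH_ATTACHED };

struct model_dev {
    enum eth_state dpdk_state; /* DPDK-side state of the device */
    bool ovs_attached;         /* stands in for OVS dev->attached */
};

/* Process start: EAL init probes the device if it is already bound
 * (assumed here when eal_probes is true). The OVS flag starts cleared
 * because it is not stored anywhere across restarts. */
void model_start(struct model_dev *d, bool eal_probes)
{
    d->dpdk_state = eal_probes ? ETH_ATTACHED : ETH_UNUSED;
    d->ovs_attached = false;
}

/* Port add: OVS probes the device only if DPDK does not know it yet,
 * and only then remembers that it did so. */
void model_add_port(struct model_dev *d)
{
    if (d->dpdk_state == ETH_UNUSED) {
        d->dpdk_state = ETH_ATTACHED;
        d->ovs_attached = true;
    }
}

/* Port destruct: the device is removed from DPDK only if OVS probed it. */
void model_del_port(struct model_dev *d)
{
    if (d->ovs_attached) {
        d->dpdk_state = ETH_UNUSED;
        d->ovs_attached = false;
    }
}

/* add-port, optional service restart, then del-br.
 * Returns the DPDK-side state that is left behind. */
enum eth_state model_run(bool restart_before_delete)
{
    struct model_dev d;

    model_start(&d, false);     /* device bound after OVS started */
    model_add_port(&d);         /* OVS probes it: ovs_attached = true */
    if (restart_before_delete) {
        model_start(&d, true);  /* new process: EAL probes the bound device */
        model_add_port(&d);     /* port reloaded from OVSDB, already valid */
    }
    model_del_port(&d);
    return d.dpdk_state;
}
```

In this model, model_run(false) leaves the device released (ETH_UNUSED), while
model_run(true) leaves it held (ETH_ATTACHED), because the flag set by the
manual add does not survive the restart.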


> >>
> >> The existing code looks ok to me. Your change would mean that devices
> >> that were not explicitly probed by OVS are being removed by OVS. That's
> >> not something we want to do, perhaps there could be some hotplug
> >> limitations and the device cannot be reused etc.
> >>
> >
> > These above make sense to me. Modifying OVS dev->attached seems not a
> > good option.
> >
> >> If you are having an issue with a device probed during rte_eal_init() it
> >> is probably something that should be reported in DPDK ml. One option is
> >> to set dpdk-extra like below, so any device will be explicitly probed on
> >> port add.
> >>
> >> ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-extra="-a
> >> 0000:00:00.0"
> >
> > It works for my problem, and also found that I can explicitly detach
> > the device with
> > ovs-appctl netdev-dpdk/detach 0000:00:00.0
> >
> > In fact, the problem here is that when deleting the bridge,
> > the status of all ports on the bridge is not reset.
> > In contrast, if delete the port directly, the status is reset normally,
> > regardless of whether the port is added manually or automatically
> > after restarting the service.
>
> The same port destruct code is called regardless of whether the bridge
> is deleted or the port is deleted. In that port destruct code, DPDK
> remove is called dependent on dev->attached, as per above.
>
> >
> > So maybe we need to discuss whether there is a problem with this scenario?
> >
>
> Below is using Intel E810, with or without dpdk-extra="-a 0000:00:00.0"
>
> # ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
> # ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
> options:dpdk-devargs=0000:d8:00.1
> # ovs-vsctl show
> c0ad19cd-bb12-4b7e-b90f-1bd785fb0a53
>     Bridge br0
>         datapath_type: netdev
>         Port br0
>             Interface br0
>                 type: internal
>         Port dpdk0
>             Interface dpdk0
>                 type: dpdk
>                 options: {dpdk-devargs="0000:d8:00.1"}
> # ovs-vsctl del-br br0
> # ovs-vsctl show
> c0ad19cd-bb12-4b7e-b90f-1bd785fb0a53

Restarting the service is the key step here.

Here are my new test results.
System environment:
Kernel: 4.19.90-2307 & 5.10.0-247 (same in VM)
CPU: Intel(R) Xeon(R) Gold 6230
QEMU : 6.2.0
DPDK : 23.11.3 (same in VM)
OvS : 3.3.4 (same in VM)
NIC 1 : Ethernet Connection X722 for 10GBASE-T 37d2
NIC 2 : MT27800 Family [ConnectX-5] 1017
NIC 3 : virtio in qemu vm

Test 1
1. Bind the NIC to vfio-pci before OVS starts, or restart OVS after binding
[root@n-202 ~]# dpdk-devbind.py -s | grep 0000:1a:00.1
0000:1a:00.1 'Ethernet Connection X722 for 10GBASE-T 37d2'
drv=vfio-pci unused=i40e,uio_pci_generic
2. The NIC cannot be bound back to the kernel driver; vfio-pci reports errors.
[root@n-202 ~]# dpdk-devbind.py -b i40e 0000:1a:00.1
kernel message:
[root@n-202 ~]# dmesg | tail -n 5
[430286.503689] device ovs-netdev entered promiscuous mode
[430289.141854] device ovs-netdev left promiscuous mode
[430290.073351] vfio-pci 0000:1a:00.1: Relaying device request to user (#0)
[430392.502666] vfio-pci 0000:1a:00.1: Relaying device request to user (#10)
[430494.901754] vfio-pci 0000:1a:00.1: Relaying device request to user (#20)

Test 2
1. Bind the NIC to vfio-pci before OVS starts, or restart OVS after binding
[root@n-202 ~]# dpdk-devbind.py -s | grep 0000:1a:00.1
0000:1a:00.1 'Ethernet Connection X722 for 10GBASE-T 37d2'
drv=vfio-pci unused=i40e,uio_pci_generic
2. create bridge and add port
[root@n-202 ~]# ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
[root@n-202 ~]# ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0
type=dpdk options:dpdk-devargs=0000:1a:00.1
[root@n-202 ~]# ovs-vsctl show
8d2863c6-7fa7-446e-917c-d8fbc7492251
    Bridge br0
        datapath_type: netdev
        Port dpdk0
            Interface dpdk0
                type: dpdk
                options: {dpdk-devargs="0000:1a:00.1"}
        Port br0
            Interface br0
                type: internal
    ovs_version: "3.3.4"
3. delete bridge
[root@n-202 ~]# ovs-vsctl del-br br0
4. The NIC can be bound back to the kernel driver
[root@n-202 ~]# dpdk-devbind.py -b i40e 0000:1a:00.1
[root@n-202 ~]# dpdk-devbind.py -s | grep 0000:1a:00.1
0000:1a:00.1 'Ethernet Connection X722 for 10GBASE-T 37d2' if=eno2
drv=i40e unused=vfio-pci

Test 3
Same as Test 2, but restart OVS between step 2 and step 3:
# dpdk-devbind.py -b vfio-pci 0000:1a:00.1
# ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
# ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
options:dpdk-devargs=0000:1a:00.1
# /usr/share/openvswitch/scripts/ovs-ctl restart
# ovs-vsctl del-br br0
# dpdk-devbind.py -b i40e 0000:1a:00.1
The unbind keeps hanging, as in Test 1.


Test 4
The following sequence hangs at add-port on kernel 4.19 but works on 5.10,
so the problem may be in the kernel driver uio_pci_generic.
# dpdk-devbind.py -s | grep 0000:00:05.0
0000:00:05.0 'Virtio network device 1000' unused=vfio-pci,uio_pci_generic
# dpdk-devbind.py -b uio_pci_generic 0000:00:05.0
# ovs-vsctl show
# ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
# ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
options:dpdk-devargs=0000:00:05.0
# ovs-vsctl show
# /usr/share/openvswitch/scripts/ovs-ctl restart
# ovs-vsctl del-br br0
# ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
# ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
options:dpdk-devargs=0000:00:05.0

Summary:
1. If the vfio-pci driver is used, OVS automatically loads the device into
DPDK when it starts (even without any ports added),
which causes the unbind to fail (Test 1).
If ports/bridges are added and then deleted, OVS releases the DPDK device
and the unbind succeeds (Test 2).
If OVS is restarted after the port is added, the unbind fails
once the bridge is deleted (Test 3).
2. On some kernel versions, a uio_pci_generic device cannot be added back
to OVS after adding the port, restarting, and deleting it (Test 4).
(This is also the problem described at the beginning of the current patch.
My current environment cannot fully reproduce the original problem.
Maybe I made a mistake before.)
3. No problems have been found with mlx5 NICs, which do not need
to be bound for DPDK.

Finally, although I agree that the modification in this patch is inappropriate,
I think the original idea still holds: if a port was added manually in OVS,
then after the OVS service restarts,
its state should be restored to what it was before the restart.

> # ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
> # ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
> options:dpdk-devargs=0000:d8:00.1
> # ovs-vsctl show
> c0ad19cd-bb12-4b7e-b90f-1bd785fb0a53
>     Bridge br0
>         datapath_type: netdev
>         Port br0
>             Interface br0
>                 type: internal
>         Port dpdk0
>             Interface dpdk0
>                 type: dpdk
>                 options: {dpdk-devargs="0000:d8:00.1"}
>
>
> I don't see a problem with this part of the OVS code, so perhaps you are
> facing some other issue. Which NIC are you using ? Perhaps you need to
> update OVS/DPDK versions if they are old ?
>
> >>
> >> thanks,
> >> Kevin.
> >>
> >>> During bridge/port deletion:
> >>> netdev_dpdk_destruct() checks if(dev->attached)
> >>> and fail to execute cleanup logic for attached devices
> >>> Leave a device in rte_eth_devices with RTE_ETH_DEV_ATTACHED
> >>>
> >>> Subsequent port addition:
> >>> rte_eth_dev_is_valid_port() still returns true,
> >>> critical PCI initialization (rte_dev_probe()) was skip,
> >>> the following configurations hang due to uninitialized device state
> >>>
> >>> Fix implementation:
> >>> Modify netdev_dpdk_process_devargs() to set netdev->attached=true 
> >>> explicitly,
> >>> when port_id is not DPDK_ETH_PORT_ID_INVALID.
> >>>
> >>> Signed-off-by: Changliang Wu <changliang...@smartx.com>
> >>> ---
> >>>  lib/netdev-dpdk.c | 3 ++-
> >>>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> >>> index e2708a8a5..a551b204e 100644
> >>> --- a/lib/netdev-dpdk.c
> >>> +++ b/lib/netdev-dpdk.c
> >>> @@ -2129,7 +2129,6 @@ netdev_dpdk_process_devargs(struct netdev_dpdk *dev,
> >>>                  new_port_id = netdev_dpdk_get_port_by_devargs(devargs);
> >>>                  if (rte_eth_dev_is_valid_port(new_port_id)) {
> >>>                      /* Attach successful */
> >>> -                    dev->attached = true;
> >>>                      VLOG_INFO("Device '%s' attached to DPDK", devargs);
> >>>                  } else {
> >>>                      /* Attach unsuccessful */
> >>> @@ -2141,6 +2140,8 @@ netdev_dpdk_process_devargs(struct netdev_dpdk *dev,
> >>>
> >>>      if (new_port_id == DPDK_ETH_PORT_ID_INVALID) {
> >>>          VLOG_WARN_BUF(errp, "Error attaching device '%s' to DPDK", 
> >>> devargs);
> >>> +    } else {
> >>> +        dev->attached = true;
> >>>      }
> >>>
> >>>      return new_port_id;
> >>
> >
> > thanks,
> > Changliang
> >
>
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
