On Wed, Jun 12, 2024 at 4:33 PM Kevin Traynor <[email protected]> wrote: > > When a device reset interrupt event (RTE_ETH_EVENT_INTR_RESET) > is detected for a DPDK device added to OVS, a device reset is > performed. > > If a device reset interrupt event is detected for a device before > it is added to OVS, device reset is not called. > > If that device is later attempted to be added to OVS, it may fail > while being configured if it is still pending a reset as pending > reset is not checked when adding a device. > > A simple way to force a reset event from the ice driver for an > iavf device is to set the mac address after binding iavf dev to > vfio but before adding to OVS. (note: should not be set like this > in normal case). e.g. > > $ echo 2 > /sys/class/net/ens3f0/device/sriov_numvfs > $ ./devbind.py -b vfio-pci 0000:d8:01.1 > $ ip link set ens3f0 vf 1 mac 26:ab:e6:6f:79:4d > $ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk \ > options:dpdk-devargs=0000:d8:01.1 > > |dpdk|ERR|Port1 dev_configure = -1 > |netdev_dpdk|WARN|Interface dpdk0 eth_dev setup error > Operation not permitted > |netdev_dpdk|ERR|Interface dpdk0(rxq:1 txq:5 lsc interrupt mode:false) > configure error: Operation not permitted > |dpif_netdev|ERR|Failed to set interface dpdk0 new configuration > > Add a check if there was any previous device reset interrupt events > when a device is added to OVS. If there was, perform the reset > before continuing with the rest of the configuration. > > netdev_dpdk_pending_reset[] already tracks device reset interrupt > events for all devices, so it can be reused to check if there is a > reset needed during configuration of newly added devices. By extending > it's usage, dev->reset_needed is no longer needed. >
Maybe add a Fixes: and I think we should backport this. > Signed-off-by: Kevin Traynor <[email protected]> > --- > lib/netdev-dpdk.c | 20 ++++++++++++-------- > 1 file changed, 12 insertions(+), 8 deletions(-) > > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c > index 0fa37d514..9ceb0d972 100644 > --- a/lib/netdev-dpdk.c > +++ b/lib/netdev-dpdk.c > @@ -465,7 +465,6 @@ struct netdev_dpdk { > /* If true, rte_eth_dev_start() was successfully called */ > bool started; > - bool reset_needed; > - /* 1 pad byte here. */ > struct eth_addr hwaddr; > + /* 2 pad byte here. */ Nit: bytes? > int mtu; > int socket_id; > @@ -1532,5 +1531,4 @@ common_construct(struct netdev *netdev, dpdk_port_t > port_no, > dev->attached = false; > dev->started = false; > - dev->reset_needed = false; > > ovsrcu_init(&dev->qos_conf, NULL); > @@ -2155,5 +2153,4 @@ netdev_dpdk_run(const struct netdev_class *netdev_class > OVS_UNUSED) > continue; > } > - atomic_store_relaxed(&netdev_dpdk_pending_reset[port_id], false); > > ovs_mutex_lock(&dpdk_mutex); > @@ -2161,5 +2158,4 @@ netdev_dpdk_run(const struct netdev_class *netdev_class > OVS_UNUSED) > if (dev) { > ovs_mutex_lock(&dev->mutex); > - dev->reset_needed = true; > netdev_request_reconfigure(&dev->up); > VLOG_DBG_RL(&rl, "%s: Device reset requested.", > @@ -6073,4 +6069,5 @@ netdev_dpdk_reconfigure(struct netdev *netdev) > { > struct netdev_dpdk *dev = netdev_dpdk_cast(netdev); > + bool pending_reset; > bool try_rx_steer; > int err = 0; > @@ -6084,4 +6081,7 @@ netdev_dpdk_reconfigure(struct netdev *netdev) > } > > + atomic_read_relaxed(&netdev_dpdk_pending_reset[dev->port_id], > + &pending_reset); > + > if (netdev->n_txq == dev->requested_n_txq > && netdev->n_rxq == dev->requested_n_rxq > @@ -6093,5 +6093,5 @@ netdev_dpdk_reconfigure(struct netdev *netdev) > && eth_addr_equals(dev->hwaddr, dev->requested_hwaddr) > && dev->socket_id == dev->requested_socket_id > - && dev->started && !dev->reset_needed) { > + && dev->started && !pending_reset) { > /* Reconfiguration is unnecessary */ > > @@ -6102,8 +6102,12 @@ retry: > dpdk_rx_steer_unconfigure(dev); > > - if (dev->reset_needed) { > + if (pending_reset) { > + /* > + * Set false before reset to avoid missing a new reset interrupt > event > + * in a race with event callback. > + */ > + atomic_store_relaxed(&netdev_dpdk_pending_reset[dev->port_id], > false); > rte_eth_dev_reset(dev->port_id); > if_notifier_manual_report(); > - dev->reset_needed = false; > } else { > rte_eth_dev_stop(dev->port_id); The patch looks good to me. Reviewed-by: David Marchand <[email protected]> -- David Marchand _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
