On 01.08.2017 17:19, Ilya Maximets wrote:
> On 01.08.2017 15:46, Keshav Gupta wrote:
>> Hi Ben/Ilya
>>
>> Mainly, the patch does the following:
>> 1) Call the DPDK rx API (rte_eth_rx_burst) only when the OVS netdev state is UP.
>> 2) The rte_eth_dev_stop()/rte_eth_dev_start() calls are removed from the
>>    flag-update path.
>>
>> This fixes the core dump issue, but I see the following side effect with
>> this patch: now only the port's admin state (at the OVS netdev level) is
>> changed (the DPDK/NIC-level state is not changed), so the NIC will continue
>> to receive packets until its Rxq ring is full. If somebody later brings the
>> port up, they will get a burst of the old packets sitting in the Rxq
>> (typically 4K packets, the DPDK Rxq size).
>>
>> Thanks
>> Keshav
>
> Hi Keshav,
>
> Yes, you're right. I think the same situation can be observed right now on
> current master. We can try to fix this for HW NICs in a way similar to how
> we handle the destroy_device event for vhost-user ports. I prepared a quick
> fix (not even compile tested) as a reference for how this could be done.
> Please check the code below. One caveat is that we actually can't fix this
> behaviour for vhost-user, because there is no command to stop a vhost
> device. But we may try to fix it for HW NICs if it's important.
>
> --8<--------------------------------------------------------------------->8--
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index ea17b97..202678f 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -2300,20 +2300,48 @@ netdev_dpdk_update_flags__(struct netdev_dpdk *dev,
>                             enum netdev_flags *old_flagsp)
>      OVS_REQUIRES(dev->mutex)
>  {
> +    enum netdev_flags new_flags = dev->flags;
> +
>      if ((off | on) & ~(NETDEV_UP | NETDEV_PROMISC)) {
>          return EINVAL;
>      }
>
>      *old_flagsp = dev->flags;
> -    dev->flags |= on;
> -    dev->flags &= ~off;
> +    new_flags |= on;
> +    new_flags &= ~off;
>
> -    if (dev->flags == *old_flagsp) {
> +    if (new_flags == dev->flags) {
>          return 0;
>      }
>
>      if (dev->type == DPDK_DEV_ETH) {
> -        if (dev->flags & NETDEV_PROMISC) {
> +        if (NETDEV_UP & ((*old_flagsp ^ on) | (*old_flagsp ^ off))) {
> +            if (NETDEV_UP & off) {
> +                bool quiescent = ovsrcu_is_quiescent();
> +
> +                dev->flags = new_flags;
> +                /* Wait for other threads to quiesce after turning off
> +                 * 'NETDEV_UP' before actually stopping the device, to be
> +                 * sure that all threads have finished their rx/tx
> +                 * operations. */
> +                ovsrcu_synchronize();
> +                if (quiescent) {
> +                    /* As the call to ovsrcu_synchronize() ends the
> +                     * quiescent state, put the thread back into the
> +                     * quiescent state. */
> +                    ovsrcu_quiesce_start();
> +                }
> +                rte_eth_dev_stop(dev->port_id);
> +            } else {
> +                int retval = rte_eth_dev_start(dev->port_id);
> +
> +                if (retval) {
> +                    VLOG_ERR("Interface %s start error: %s", dev->up.name,
> +                             rte_strerror(-retval));
> +                    return -retval;
> +                }
> +            }
> +        }
> +
> +        if (new_flags & NETDEV_PROMISC) {
>              rte_eth_promiscuous_enable(dev->port_id);
>          }
>
> @@ -2336,6 +2364,7 @@ netdev_dpdk_update_flags__(struct netdev_dpdk *dev,
>          }
>      }
>
> +    dev->flags = new_flags;
>      return 0;
>  }
>
> --8<--------------------------------------------------------------------->8--
>
> Also, the diff above was made on top of current master, so you will
> possibly need to port it a little.
>
To make it work on current master, at least the following change is also
required:

--8<--------------------------------------------------------------------->8--
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index ea17b97..7145266 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -761,11 +761,13 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
         return -diag;
     }
 
-    diag = rte_eth_dev_start(dev->port_id);
-    if (diag) {
-        VLOG_ERR("Interface %s start error: %s", dev->up.name,
-                 rte_strerror(-diag));
-        return -diag;
+    if (dev->flags & NETDEV_UP) {
+        diag = rte_eth_dev_start(dev->port_id);
+        if (diag) {
+            VLOG_ERR("Interface %s start error: %s", dev->up.name,
+                     rte_strerror(-diag));
+            return -diag;
+        }
     }
 
     rte_eth_promiscuous_enable(dev->port_id);
--8<--------------------------------------------------------------------->8--

This is to prevent a stopped port from being started again on each
reconfiguration.

> Best regards, Ilya Maximets.
>
>> -----Original Message-----
>> From: Ben Pfaff [mailto:b...@ovn.org]
>> Sent: Monday, July 31, 2017 9:26 PM
>> To: Ilya Maximets
>> Cc: ovs-dev@openvswitch.org; Keshav Gupta
>> Subject: Re: [ovs-dev] Openvswitch crash when bringing down the dpdk bond
>> port using "ovs-ofctl mod-port br-prv dpdk1 down"
>>
>> Thanks.
>>
>> It doesn't cherry-pick cleanly to branch-2.6, so I think I'll leave it for
>> now, since Darrell made the reasonable point that DPDK was experimental in
>> 2.6.
>>
>> On Mon, Jul 31, 2017 at 06:20:28PM +0300, Ilya Maximets wrote:
>>> On 31.07.2017 16:05, Ben Pfaff wrote:
>>>> Ilya, should we apply this patch to branch-2.6?  Are there other
>>>> patches that should be backported?
>>>
>>> It's definitely a bug and should be backported if someone wants to use
>>> branch-2.6 with the DPDK datapath. I traced only this exact case, so it's
>>> hard to say if something else should be backported, but this patch
>>> should fix the described issue without any additional changes.
>>>
>>>> On Wed, Jul 26, 2017 at 03:28:12PM +0300, Ilya Maximets wrote:
>>>>> Hi.
>>>>>
>>>>> You need to backport at least the following patch:
>>>>>
>>>>> commit 3b1fb0779b87788968c1a6a9ff295a9883547485
>>>>> Author: Daniele Di Proietto <diproiet...@vmware.com>
>>>>> Date:   Tue Nov 15 15:40:49 2016 -0800
>>>>>
>>>>>     netdev-dpdk: Don't call rte_dev_stop() in update_flags().
>>>>>
>>>>>     Calling rte_eth_dev_stop() while the device is running causes a crash.
>>>>>
>>>>>     We could use rte_eth_dev_set_link_down(), but not every PMD implements
>>>>>     that, and I found one NIC where that has no effect.
>>>>>
>>>>>     Instead, this commit checks if the device has the NETDEV_UP flag when
>>>>>     transmitting or receiving (similarly to what we do for vhostuser).  I
>>>>>     didn't notice any performance difference with this check in case the
>>>>>     device is up.
>>>>>
>>>>>     An alternative would be to remove the device queues from the pmd
>>>>>     threads tx and receive cache, but that requires reconfiguration and
>>>>>     I'd prefer to avoid it, because the change can come from OpenFlow.
>>>>>
>>>>>     Signed-off-by: Daniele Di Proietto <diproiet...@vmware.com>
>>>>>     Acked-by: Ilya Maximets <i.maxim...@samsung.com>
>>>>>
>>>>> This should fix your issue.
>>>>> In general, I suggest using stable OVS 2.7; there have been too many
>>>>> DPDK-related changes, including stability fixes, since 2.6.
>>>>>
>>>>> Best regards, Ilya Maximets.
>>>>>
>>>>>> Hi
>>>>>>
>>>>>> We are experiencing an openvswitch crash when bringing down the dpdk
>>>>>> bond port using "ovs-ofctl mod-port br-prv dpdk1 down".
>>>>>>
>>>>>> The backtrace of the core is below. Has any issue been reported
>>>>>> earlier for this type of crash in the openvswitch community?
>>>>>>
>>>>>> (gdb) bt
>>>>>> #0  ixgbe_rxq_rearm (rxq=0x7fa45061f800)
>>>>>>     at /home/sdn/new_cloud_sdn_switch_2/cloud-sdn-switch/dpdk/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c:98
>>>>>> #1  _recv_raw_pkts_vec (split_packet=0x0, nb_pkts=32, rx_pkts=<optimized out>, rxq=0x7fa45061f800)
>>>>>>     at /home/sdn/new_cloud_sdn_switch_2/cloud-sdn-switch/dpdk/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c:290
>>>>>> #2  ixgbe_recv_pkts_vec (rx_queue=0x7fa45061f800, rx_pkts=<optimized out>, nb_pkts=<optimized out>)
>>>>>>     at /home/sdn/new_cloud_sdn_switch_2/cloud-sdn-switch/dpdk/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c:474
>>>>>> #3  0x000000e5000000e4 in ?? ()
>>>>>> #4  0x00000046000000e6 in ?? ()
>>>>>> #5  0x0000006a00000069 in ?? ()
>>>>>> #6  0x0000006c0000006b in ?? ()
>>>>>> #7  0x000000ec0000006d in ?? ()
>>>>>> #8  0x000000ee000000ed in ?? ()
>>>>>> #9  0x00000001537f5780 in ?? ()
>>>>>> #10 0x0000000000000000 in ?? ()
>>>>>> (gdb)
>>>>>>
>>>>>> I have analyzed the core, and it appears to be the result of a
>>>>>> device stop and a packet receive on the same port happening at the
>>>>>> same time in two threads: the OVS main thread (device stop) and a
>>>>>> PMD thread (packet receive). More precisely, the main thread is
>>>>>> cleaning the packet buffers out of the rxq sw_ring to avoid leaking
>>>>>> them, while in parallel the PMD thread is filling packet buffers
>>>>>> into the sw_ring/descriptor ring as part of ixgbe_recv_pkts_vec.
>>>>>>
>>>>>> The versions used are openvswitch 2.6.1 with dpdk 16.11.
>>>>>>
>>>>>> This crash is not reproducible every time, but the frequency seems
>>>>>> to be high.
>>>>>>
>>>>>> I am new to the openvswitch community, and this is the first time I
>>>>>> am posting a query. Let me know if you require anything from my side.
>>>>>>
>>>>>> Thanks
>>>>>> Keshav
>>>>>
>>>>> _______________________________________________
>>>>> dev mailing list
>>>>> d...@openvswitch.org
>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev