On 31.07.2017 16:05, Ben Pfaff wrote:
> Ilya, should we apply this patch to branch-2.6?  Are there other patches
> that should be backported?

It's definitely a bug and should be backported if someone wants to use
branch-2.6 with the DPDK datapath. I traced only this exact case, so it's
hard to say whether anything else needs backporting, but this patch should
fix the described issue without any additional changes.
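
For reference, the core of the commit quoted below is an early exit on
!NETDEV_UP in the DPDK rx/tx paths instead of stopping the port, roughly
like this sketch (not the verbatim diff; names follow netdev-dpdk.c):

    static int
    netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct dp_packet_batch *batch)
    {
        struct netdev_dpdk *dev = netdev_dpdk_cast(rxq->netdev);
        int nb_rx;

        /* Port is administratively down: don't touch the HW queues at all,
         * so there is nothing for the main thread to race with. */
        if (OVS_UNLIKELY(!(dev->flags & NETDEV_UP))) {
            return EAGAIN;
        }

        nb_rx = rte_eth_rx_burst(dev->port_id, rxq->queue_id,
                                 (struct rte_mbuf **) batch->packets,
                                 NETDEV_MAX_BURST);
        if (!nb_rx) {
            return EAGAIN;
        }

        batch->count = nb_rx;
        return 0;
    }

A similar check goes into the send path, and update_flags() no longer
calls rte_eth_dev_stop().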
 
> On Wed, Jul 26, 2017 at 03:28:12PM +0300, Ilya Maximets wrote:
>> Hi.
>>
>> You need to backport at least the following patch:
>>
>> commit 3b1fb0779b87788968c1a6a9ff295a9883547485
>> Author: Daniele Di Proietto <[email protected]>
>> Date:   Tue Nov 15 15:40:49 2016 -0800
>>
>>     netdev-dpdk: Don't call rte_dev_stop() in update_flags().
>>     
>>     Calling rte_eth_dev_stop() while the device is running causes a crash.
>>     
>>     We could use rte_eth_dev_set_link_down(), but not every PMD implements
>>     that, and I found one NIC where that has no effect.
>>     
>>     Instead, this commit checks if the device has the NETDEV_UP flag when
>>     transmitting or receiving (similarly to what we do for vhostuser). I
>>     didn't notice any performance difference with this check in case the
>>     device is up.
>>     
>>     An alternative would be to remove the device queues from the pmd threads
>>     tx and receive cache, but that requires reconfiguration and I'd prefer
>>     to avoid it, because the change can come from OpenFlow.
>>     
>>     Signed-off-by: Daniele Di Proietto <[email protected]>
>>     Acked-by: Ilya Maximets <[email protected]>
>>
>> This should fix your issue.
>> In general, I suggest using stable OVS 2.7; there have been too many
>> DPDK-related changes, including stability fixes, since 2.6.
>>
>> Best regards, Ilya Maximets.
>>
>>> Hi
>>>   We are experiencing an openvswitch crash when bringing down the dpdk
>>> bond port using "ovs-ofctl mod-port br-prv dpdk1 down".
>>>
>>> The backtrace of the core is below. Has this type of crash been reported
>>> earlier in the openvswitch community?
>>>
>>> (gdb) bt
>>> #0  ixgbe_rxq_rearm (rxq=0x7fa45061f800) at 
>>> /home/sdn/new_cloud_sdn_switch_2/cloud-sdn-switch/dpdk/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c:98
>>> #1  _recv_raw_pkts_vec (split_packet=0x0, nb_pkts=32, rx_pkts=<optimized 
>>> out>, rxq=0x7fa45061f800)
>>>     at 
>>> /home/sdn/new_cloud_sdn_switch_2/cloud-sdn-switch/dpdk/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c:290
>>> #2  ixgbe_recv_pkts_vec (rx_queue=0x7fa45061f800, rx_pkts=<optimized out>, 
>>> nb_pkts=<optimized out>)
>>>     at 
>>> /home/sdn/new_cloud_sdn_switch_2/cloud-sdn-switch/dpdk/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c:474
>>> #3  0x000000e5000000e4 in ?? ()
>>> #4  0x00000046000000e6 in ?? ()
>>> #5  0x0000006a00000069 in ?? ()
>>> #6  0x0000006c0000006b in ?? ()
>>> #7  0x000000ec0000006d in ?? ()
>>> #8  0x000000ee000000ed in ?? ()
>>> #9  0x00000001537f5780 in ?? ()
>>> #10 0x0000000000000000 in ?? ()
>>> (gdb)
>>>
>>>
>>> I have analyzed the core, and it appears to be the result of a device
>>> stop and a packet receive on the same port happening at the same time in
>>> two threads: the OVS main thread (device stop) and a PMD thread (packet
>>> receive). More precisely, the main thread is cleaning the packet buffers
>>> out of the rxq sw_ring to avoid leaking them, while in parallel the PMD
>>> thread is filling packet buffers into the sw_ring/descriptor ring as part
>>> of ixgbe_recv_pkts_vec.
>>>
>>> Versions used: openvswitch 2.6.1 with dpdk 16.11.
>>>
>>> The crash is not reproducible every time, but it occurs frequently.
>>>
>>> I am new to the openvswitch community, and this is the first time I am
>>> posting a query. Let me know if you require anything from my side.
>>>
>>> Thanks
>>> Keshav
>>
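
To illustrate the race Keshav describes above (a sketch; the exact call
paths are approximate):

    /*
     *   main thread, update_flags():        PMD thread, rx path:
     *     rte_eth_dev_stop(port)              rte_eth_rx_burst(port, q)
     *       -> ixgbe PMD frees the mbufs        -> ixgbe_recv_pkts_vec()
     *          in rxq->sw_ring and resets          -> ixgbe_rxq_rearm()
     *          the descriptor ring                     writes fresh mbufs
     *                                                  into rxq->sw_ring  <-- crash
     *
     * Nothing serializes the two threads, so the PMD thread can re-arm a
     * ring that the main thread is tearing down; that matches the backtrace
     * above.
     */
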
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
