Hi Riccardo,

The issue you are seeing is not related to the VSI queue setup mentioned 
previously. It appears to be specific to using the i350 VF with OVS.

Specifically, OVS attempts to set the MTU of a given device (in this case the 
SR-IOV VF of an i350 interface, handled by the igbvf driver) with the call:

diag = rte_eth_dev_set_mtu(dev->port_id, dev->mtu);

Currently igbvf devices do not support rte_eth_dev_set_mtu. This is confirmed 
by examining the igbvf operations in DPDK (note mtu_set = 0x0 below):

$1 = {dev_configure = 0x5326e1 <igbvf_dev_configure>, dev_start = 0x532772 
<igbvf_dev_start>, dev_stop = 0x532980 <igbvf_dev_stop>, dev_set_link_up = 0x0, 
dev_set_link_down = 0x0, dev_close = 0x532a34 <igbvf_dev_close>,  dev_reset = 
0x0, link_update = 0x530bf1 <eth_igb_link_update>, promiscuous_enable = 
0x532abe <igbvf_promiscuous_enable>, promiscuous_disable = 0x532aee 
<igbvf_promiscuous_disable>,
  allmulticast_enable = 0x532b47 <igbvf_allmulticast_enable>, 
allmulticast_disable = 0x532b8d <igbvf_allmulticast_disable>, mac_addr_remove = 
0x0, mac_addr_add = 0x0, mac_addr_set = 0x532e4d <igbvf_default_mac_addr_set>,  
set_mc_addr_list = 0x535ca2 <eth_igb_set_mc_addr_list>, mtu_set = 0x0, 
stats_get = 0x5303e5 <eth_igbvf_stats_get>, stats_reset = 0x530488 
<eth_igbvf_stats_reset>, xstats_get = 0x53030f <eth_igbvf_xstats_get>,  
xstats_reset = 0x530488 <eth_igbvf_stats_reset>, xstats_get_names = 0x530299 
<eth_igbvf_xstats_get_names>, queue_stats_mapping_set = 0x0, dev_infos_get = 
0x5309cf <eth_igbvf_infos_get>, rxq_info_get = 0x53f4b2 <igb_rxq_info_get>,  
txq_info_get = 0x53f53f <igb_txq_info_get>, fw_version_get = 0x0, 
dev_supported_ptypes_get = 0x53099b <eth_igb_supported_ptypes_get>, 
vlan_filter_set = 0x532d4b <igbvf_vlan_filter_set>, vlan_tpid_set = 0x0,  
vlan_strip_queue_set = 0x0, vlan_offload_set = 0x0, vlan_pvid_set = 0x0, 
rx_queue_start = 0x0, rx_queue_stop = 0x0, tx_queue_start = 0x0, tx_queue_stop 
= 0x0, rx_queue_setup = 0x53c42c <eth_igb_rx_queue_setup>,  rx_queue_release = 
0x53c39f <eth_igb_rx_queue_release>, rx_queue_count = 0x0, rx_descriptor_done = 
0x0, rx_descriptor_status = 0x0, tx_descriptor_status = 0x0, 
rx_queue_intr_enable = 0x0, rx_queue_intr_disable = 0x0,  tx_queue_setup = 
0x53bbb1 <eth_igb_tx_queue_setup>, tx_queue_release = 0x53b42c 
<eth_igb_tx_queue_release>, tx_done_cleanup = 0x0, dev_led_on = 0x0, 
dev_led_off = 0x0, flow_ctrl_get = 0x0, flow_ctrl_set = 0x0,  
priority_flow_ctrl_set = 0x0, uc_hash_table_set = 0x0, uc_all_hash_table_set = 
0x0, mirror_rule_set = 0x0, mirror_rule_reset = 0x0, udp_tunnel_port_add = 0x0, 
udp_tunnel_port_del = 0x0, l2_tunnel_eth_type_conf = 0x0,  
l2_tunnel_offload_set = 0x0, set_queue_rate_limit = 0x0, rss_hash_update = 0x0, 
rss_hash_conf_get = 0x0, reta_update = 0x0, reta_query = 0x0, get_reg = 
0x536a60 <igbvf_get_regs>, get_eeprom_length = 0x0, get_eeprom = 0x0,  
set_eeprom = 0x0, filter_ctrl = 0x0, get_dcb_info = 0x0, timesync_enable = 0x0, 
timesync_disable = 0x0, timesync_read_rx_timestamp = 0x0, 
timesync_read_tx_timestamp = 0x0, timesync_adjust_time = 0x0, 
timesync_read_time = 0x0,  timesync_write_time = 0x0, xstats_get_by_id = 0x0, 
xstats_get_names_by_id = 0x0, tm_ops_get = 0x0, mtr_ops_get = 0x0, 
pool_ops_supported = 0x0}

I confirmed this by also testing an i40evf device and in that case the mtu_set 
function is supported so I didn’t see the error/segfault you reported.
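
To illustrate why the call fails for igbvf but not for i40evf, here is a 
minimal stand-alone sketch of the dispatch pattern: the generic layer looks up 
the driver callback and reports -ENOTSUP when it is NULL. This is not the DPDK 
source; struct mock_dev_ops, mock_set_mtu() and fake_mtu_set() are hypothetical 
names used only for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for a driver ops table. */
struct mock_dev_ops {
    int (*mtu_set)(uint16_t mtu);   /* NULL when the driver lacks support */
};

/* Hypothetical callback for a driver that does implement MTU changes. */
static int fake_mtu_set(uint16_t mtu)
{
    (void) mtu;                     /* a real driver would program the HW */
    return 0;
}

/* Dispatch through the ops table: a missing callback yields -ENOTSUP,
 * which is what shows up as "Operation not supported" in the log. */
static int mock_set_mtu(const struct mock_dev_ops *ops, uint16_t mtu)
{
    if (ops->mtu_set == NULL) {
        return -ENOTSUP;
    }
    return ops->mtu_set(mtu);
}
```

With an ops table whose mtu_set is NULL (the igbvf case above), mock_set_mtu() 
returns -ENOTSUP; with fake_mtu_set installed (the i40evf-like case) it 
returns 0.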

Support for rte_eth_dev_set_mtu was introduced in commit 
67fe6d635193761439f791e48652acfd60076cfb. The purpose was to set the MTU for 
the physical device so that a call to rte_eth_dev_get_mtu() would correctly 
report the MTU set for the port. The best solution here would be for DPDK to 
add rte_eth_dev_set_mtu() support for igbvf functions. However, that would 
have to come in a future release, which means you would be blocked until then.

For testing purposes you could re-introduce the previous method of setting the 
mtu for the device (Note this is compile tested only):

+
+    /*
+     * Need to enable hw_strip_crc specifically for SRIOV devices.
+     */
+    conf.rxmode.hw_strip_crc = 1;
+
     /* A device may report more queues than it makes available (this has
      * been observed for Intel xl710, which reserves some of them for
      * SRIOV):  rte_eth_*_queue_setup will fail if a queue is not
@@ -718,11 +724,25 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int n_rxq, int n_txq)
         }

         diag = rte_eth_dev_set_mtu(dev->port_id, dev->mtu);
-        if (diag) {
+        if (diag && (diag != -ENOTSUP)) {
             VLOG_ERR("Interface %s MTU (%d) setup error: %s",
                     dev->up.name, dev->mtu, rte_strerror(-diag));
             break;
         }
+        else {
+            /* Some devices do not support rte_eth_dev_set_mtu e.g. igbvf.
+             * If the operation is not supported attempt to set MTU manually
+             * in the devices configuration.
+             */
+            if (dev->mtu > ETHER_MTU) {
+                conf.rxmode.jumbo_frame = 1;
+                conf.rxmode.max_rx_pkt_len = dev->max_packet_len;
+            } else {
+                conf.rxmode.jumbo_frame = 0;
+                conf.rxmode.max_rx_pkt_len = 0;
+            }
+            diag = 0;
+        }

This is not a pretty approach but might enable you to test further with the 
igbvf function in the meantime.
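
The fallback in the patch boils down to the following decision, shown here as 
a stand-alone sketch so it can be exercised outside OVS. The struct and the 
constant are simplified stand-ins with hypothetical names (sketch_rxmode, 
SKETCH_ETHER_MTU, sketch_apply_mtu), not the DPDK definitions; the maximum 
packet length is passed in by the caller rather than derived here.

```c
#include <stdint.h>

#define SKETCH_ETHER_MTU 1500       /* standard Ethernet MTU */

/* Hypothetical stand-in for the two rxmode fields the patch touches. */
struct sketch_rxmode {
    uint32_t max_rx_pkt_len;
    unsigned int jumbo_frame;
};

/* Enable jumbo frames only when the requested MTU exceeds the standard
 * Ethernet MTU; otherwise clear both fields, as the patch does. */
static void sketch_apply_mtu(struct sketch_rxmode *rx, int mtu,
                             uint32_t max_packet_len)
{
    if (mtu > SKETCH_ETHER_MTU) {
        rx->jumbo_frame = 1;
        rx->max_rx_pkt_len = max_packet_len;
    } else {
        rx->jumbo_frame = 0;
        rx->max_rx_pkt_len = 0;
    }
}
```

For a 1500-byte MTU both fields stay at zero, so only jumbo MTUs change the 
device configuration.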

As regards the segfault, in this case it was due to the error code returned 
when rte_eth_dev_set_mtu is not supported. The error code returned by DPDK is 
-ENOTSUP (ENOTSUP is 95 in this case). This value is returned to flag a failure 
in the OVS device setup during reconfiguration; however, the error value is 
checked at the dpif_netdev layer in the port_reconfigure() function as below:

    if (err && (err != EOPNOTSUPP)) {
        VLOG_ERR("Failed to set interface %s new configuration",
                 netdev_get_name(netdev));
        return err;
    }

Note that err in this case originates from -ENOTSUP, and on Linux ENOTSUP has 
the same value as EOPNOTSUPP. The check above is intended to distinguish a 
netdev that does not implement a reconfigure() function from one where an error 
occurs during reconfiguration. In this case the error value returned during 
reconfiguration is the same as if the reconfigure function did not exist, so it 
is ignored. This leads to the port being polled at a later stage, resulting in 
a segfault. You could remove the (err != EOPNOTSUPP) check, but that would 
introduce error messages in the case where a reconfigure method is not 
supported. With the changes for setting the MTU outlined above, however, this 
should be avoided, as the error returned is set to 0 when setting the MTU is 
not supported.
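
The collision can be reproduced in isolation. The sketch below assumes Linux 
with glibc, where ENOTSUP and EOPNOTSUPP are the same number; 
sketch_is_reported() and sketch_to_positive_errno() are hypothetical helpers 
that only mimic the shape of the check quoted above, not the OVS code itself.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the quoted check: true when the caller would
 * log the error and abort the port reconfiguration. */
static bool sketch_is_reported(int err)
{
    return err && err != EOPNOTSUPP;
}

/* Hypothetical helper: turn a negative DPDK-style return value into the
 * positive errno convention the caller compares against. */
static int sketch_to_positive_errno(int diag)
{
    return diag < 0 ? -diag : diag;
}
```

On such a system, sketch_is_reported(sketch_to_positive_errno(-ENOTSUP)) is 
false, so the MTU failure is silently treated as "reconfigure not 
implemented", whereas a different error such as -EINVAL is reported.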

I’ll have to look a little bit closer as regards the next steps here. There is 
definitely some work needed around error reporting during queue setup in OVS 
DPDK as well as some work for DPDK itself to enable this case.

Hope this helps.

Regards
Ian


From: scaricapo...@gmail.com [mailto:scaricapo...@gmail.com] On Behalf Of 
Riccardo Ravaioli
Sent: Wednesday, March 7, 2018 3:20 PM
To: Stokes, Ian <ian.sto...@intel.com>
Cc: ovs-discuss@openvswitch.org
Subject: Re: [ovs-discuss] segmentation fault when adding a VF in DPDK to a 
switch

Hi Ian,
Thanks a lot for your patch. I applied your modifications and re-ran the 
setup described in the original post. While the CRC-related error message has 
disappeared, openvswitch still crashes (with no gdb running!):

# tail ovs-vswitchd.log
2018-03-07T14:58:53.311Z|00215|dpdk|INFO|EAL: PCI device 0000:05:10.1 on NUMA 
socket 0
2018-03-07T14:58:53.311Z|00216|dpdk|INFO|EAL:   probe driver: 8086:1520 
net_e1000_igb_vf
2018-03-07T14:58:53.518Z|00217|dpdk|INFO|PMD: eth_igbvf_dev_init():     VF MAC 
address not assigned by Host PF
2018-03-07T14:58:53.518Z|00218|dpdk|INFO|PMD: eth_igbvf_dev_init():     Assign 
randomly generated MAC address fe:00:a5:78:49:2a
2018-03-07T14:58:53.518Z|00219|netdev_dpdk|INFO|Device '0000:05:10.1' attached 
to DPDK
2018-03-07T14:58:53.518Z|00220|netdev_dpdk|ERR|Interface 2.switch1 MTU (1500) 
setup error: Operation not supported
2018-03-07T14:58:53.518Z|00221|netdev_dpdk|ERR|Interface 2.switch1(rxq:1 txq:1) 
configure error: Operation not supported
2018-03-07T14:58:53.884Z|00002|daemon_unix(monitor)|ERR|1 crashes: pid 4171 
died, killed (Segmentation fault), core dumped, restarting

# lspci -s 0000:05:10.1 -v
05:10.1 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual 
Function (rev 01)
    Flags: bus master, fast devsel, latency 0
    [virtual] Memory at fbea0000 (64-bit, prefetchable) [size=16K]
    [virtual] Memory at fbe80000 (64-bit, prefetchable) [size=16K]
    Capabilities: [70] MSI-X: Enable- Count=3 Masked-
    Capabilities: [a0] Express Endpoint, MSI 00
    Capabilities: [100] Advanced Error Reporting
    Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
    Capabilities: [1a0] Transaction Processing Hints
    Capabilities: [1d0] Access Control Services
    Kernel driver in use: vfio-pci

This is indeed the second issue you mentioned in a previous post. Can we get 
past this with a workaround?
Thanks again!
Riccardo


On 7 March 2018 at 14:41, Stokes, Ian <ian.sto...@intel.com> wrote:
Hi Riccardo,

After some more time looking at the issue: you could do something like the 
below to enable CRC stripping for the interface (note I haven’t fully validated 
this).

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index af9843a..28d7d1e 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -700,6 +700,12 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int n_rxq, int n_txq)

     conf.rxmode.hw_ip_checksum = (dev->hw_ol_features &
                                   NETDEV_RX_CHECKSUM_OFFLOAD) != 0;
+
+    /*
+     * Need to enable hw_strip_crc specifically for SRIOV devices.
+     */
+    conf.rxmode.hw_strip_crc = 1;
+

On my system this at least got past the configuration error when adding the 
SRIOV VF port and I was able to pass traffic through the port in a simple VF to 
phy port setup. As I’ve only completed minor validation on this, maybe you 
could give it a shot and see if it works on your setup.

With regards to the VSI queue error I mentioned in previous posts, after some 
more investigation I found it only occurred when running the setup of SRIOV 
VFs in DPDK under GDB. I was able to reproduce the same issue in the DPDK l2fwd 
sample app, so it is not specific to OVS. As long as you are not running OVS 
under GDB during the SRIOV setup it should be OK. I’ll need to look at this in 
a little more detail when I have time, but for the moment it shouldn’t block 
you.

Hope this helps,

Regards
Ian



From: ovs-discuss-boun...@openvswitch.org 
[mailto:ovs-discuss-boun...@openvswitch.org] On Behalf Of Stokes, Ian
Sent: Thursday, February 1, 2018 11:52 AM
To: riccardoravai...@gmail.com
Cc: ovs-discuss@openvswitch.org
Subject: Re: [ovs-discuss] segmentation fault when adding a VF in DPDK to a 
switch

Hi Riccardo,

Apologies for the delay. Unfortunately with the OVS 2.9 release I haven’t had 
much time to look at this further.

At the very least I think work needs to be done for dpdk.c and netdev-dpdk.c to 
enable configuration of VFs specifically (to account for the HW_CRC and VSI 
queue configurations).

There would also be a task to ensure the work required for enabling a VF on the 
i40e driver would also cover enabling a VF for the ixgbe driver. In DPDK it’s 
been the case in the past that driver implementations for different NIC devices 
can differ.

This could be looked at in the OVS 2.10 development cycle at some point. I can 
post an update here when there is progress.

Thanks
Ian

From: scaricapo...@gmail.com [mailto:scaricapo...@gmail.com] On Behalf Of 
Riccardo Ravaioli
Sent: Thursday, January 25, 2018 4:35 PM
To: Stokes, Ian <ian.sto...@intel.com>
Cc: ovs-discuss@openvswitch.org
Subject: Re: [ovs-discuss] segmentation fault when adding a VF in DPDK to a 
switch

Hi Ian,
Thanks for looking into the issue. Anything new?
Thanks a lot!
Riccardo

On 11 January 2018 at 23:50, Stokes, Ian <ian.sto...@intel.com> wrote:
Hi Riccardo,

Thanks for reporting the issue and providing the steps to reproduce.

I was able to reproduce this with an i40e VF using igb_uio.

In short it seems there is no support currently for ixgbe and i40e VF devices 
in OVS with DPDK.

There are two issues at play here. The first is the configuration error when 
creating and starting the VF in DPDK; the second is the segfault in OVS.

The configuration of the VF fails (For the i40e device at least) because of the 
expectation in DPDK that the HW_CRC stripping flag is enabled in the device 
configuration for VFs. In your logs you will see an error reporting this. By 
default this seems to be disabled for VFs in OVS.

Looking in the DPDK code, this is confirmed by the following check in 
i40evf_dev_configure(), which code execution hits:

    /* For non-DPDK PF drivers, VF has no ability to disable HW
     * CRC strip, and is implicitly enabled by the PF.
     */
    if (!conf->rxmode.hw_strip_crc) {
            vf = I40EVF_DEV_PRIVATE_TO_VF(dev->data->dev_private);
            if ((vf->version_major == VIRTCHNL_VERSION_MAJOR) &&
                (vf->version_minor <= VIRTCHNL_VERSION_MINOR)) {
                    /* Peer is running non-DPDK PF driver. */
                    PMD_INIT_LOG(ERR, "VF can't disable HW CRC Strip");
                    return -EINVAL;
            }
    }

Out of interest I enabled HW_CRC in the configuration for the device manually 
in the OVS code for testing purposes. Although this allows the queue 
configuration to succeed, the VF will later fail to start due to an issue with 
VSI queue mapping when DPDK attempts to start the device. I’ll have to take 
another look to see what exactly is going wrong here; I suspect there is more 
configuration needed for VFs than PFs.

The segmentation fault happens due to the error occurring during the 
dpdk_eth_dev_queue_setup() function. This is a separate issue and unrelated to 
VFs. I have seen failures in this area cause segmentation faults before in OVS, 
so it’s an area that needs to be looked at again to handle DPDK errors properly 
IMO.

I hope this answers your question and I’ll follow up once I have a little more 
info on how to enable the VF functionality.

Thanks
Ian



From: ovs-discuss-boun...@openvswitch.org 
[mailto:ovs-discuss-boun...@openvswitch.org] On Behalf Of Riccardo Ravaioli
Sent: Thursday, January 11, 2018 10:27 AM
To: ovs-discuss@openvswitch.org
Subject: Re: [ovs-discuss] segmentation fault when adding a VF in DPDK to a 
switch

Here are the steps to reproduce the issue:
1. Create one Virtual Function (VF) on a physical interface that supports 
SR-IOV (in my case it's an Intel i350 interface):
$ echo 1 > /sys/class/net/eth10/device/sriov_numvfs
2. Lookup its PCI address, for example with dpdk-devbind.py:
$ dpdk-devbind.py --status-dev net
0000:05:10.3 'I350 Ethernet Controller Virtual Function 1520' if=eth11 
drv=igbvf unused=igb_uio,vfio-pci,uio_pci_generic
3. Bind the VF to a DPDK-compatible driver. I'll use vfio-pci, but igb_uio too 
will reproduce the issue:
$ dpdk-devbind.py --bind=vfio-pci 0000:05:10.3
4. Create an OVS bridge and set its datapath type to netdev:
$ ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
5. Add the VF to the bridge as a DPDK interface:
$ ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk \
    options:dpdk-devargs=0000:05:10.3
6. Now ovs-vswitchd.log reports that OVS repeatedly crashes (segmentation 
fault) and restarts itself, in a loop:
2018-01-11T09:28:28.338Z|00139|dpdk|INFO|EAL: PCI device 0000:05:10.3 on NUMA 
socket 0
2018-01-11T09:28:28.338Z|00140|dpdk|INFO|EAL:   probe driver: 8086:1520 
net_e1000_igb_vf
2018-01-11T09:28:28.338Z|00141|dpdk|INFO|EAL:   using IOMMU type 1 (Type 1)
2018-01-11T09:28:28.560Z|00142|dpdk|INFO|PMD: eth_igbvf_dev_init():     VF MAC 
address not assigned by Host PF
2018-01-11T09:28:28.560Z|00143|dpdk|INFO|PMD: eth_igbvf_dev_init():     Assign 
randomly generated MAC address c6:13:67:7b:31:6b
2018-01-11T09:28:28.560Z|00144|netdev_dpdk|INFO|Device '0000:05:10.3' attached 
to DPDK
2018-01-11T09:28:28.563Z|00145|dpif_netdev|INFO|PMD thread on numa_id: 0, core 
id:  3 created.
2018-01-11T09:28:28.566Z|00146|dpif_netdev|INFO|PMD thread on numa_id: 0, core 
id:  2 created.
2018-01-11T09:28:28.566Z|00147|dpif_netdev|INFO|There are 2 pmd threads on numa 
node 0
2018-01-11T09:28:28.646Z|00148|dpdk|INFO|PMD: igbvf_dev_configure(): VF can't 
disable HW CRC Strip
2018-01-11T09:28:28.646Z|00149|netdev_dpdk|ERR|Interface dpdk-p0 MTU (1500) 
setup error: Operation not supported
2018-01-11T09:28:28.646Z|00150|netdev_dpdk|ERR|Interface dpdk-p0(rxq:1 txq:1) 
configure error: Operation not supported
2018-01-11T09:28:29.062Z|00002|daemon_unix(monitor)|ERR|1 crashes: pid 2494 
died, killed (Segmentation fault), core dumped, restarting
7. Removing the VF from the bridge stops this behaviour:
$ ovs-vsctl del-port br0 dpdk-p0

The same happens if I restart openvswitch between steps 4 and 5 and let it 
initialize itself with the list of DPDK devices, instead of hotplugging them at 
runtime, as described above.
Riccardo


On 11 January 2018 at 01:27, Riccardo Ravaioli <riccardoravai...@gmail.com> 
wrote:
Hi,

I was going through the openvswitch+dpdk tutorial and wanted to add a virtual 
function (VF) to a bridge as a dpdk interface.

I can bind the VF to the vfio-pci driver successfully with dpdk-devbind.py, but 
as soon as I add the interface to an OVS bridge (in netdev mode), openvswitch 
crashes with a segmentation fault and continuously tries to restart itself.

I'm running openvswitch 2.8.1 and dpdk 17.11 on Debian jessie with Linux kernel 
4.6.

Is this a known problem? Is there a fix?
I have the same issue with VFs bound to igb_uio, whereas with real physical 
interfaces it works just fine.

Here are the relevant lines from ovs-vswitchd.log:

2018-01-10T15:53:26.949Z|00157|dpdk|INFO|PMD: igbvf_dev_configure(): VF can't 
disable HW CRC Strip
2018-01-10T15:53:26.949Z|00158|netdev_dpdk|ERR|Interface 0.extra2 MTU (1500) 
setup error: Operation not supported
2018-01-10T15:53:26.949Z|00159|netdev_dpdk|ERR|Interface 0.extra2(rxq:1 txq:1) 
configure error: Operation not supported
2018-01-10T15:53:27.333Z|00066|daemon_unix(monitor)|ERR|fork child died before 
signaling startup (killed (Segmentation fault))
2018-01-10T15:53:27.333Z|00067|daemon_unix(monitor)|WARN|23 crashes: pid 21413 
died, killed (Segmentation fault), core dumped, waiting until 10 seconds since 
last restart
2018-01-10T15:53:33.333Z|00068|daemon_unix(monitor)|ERR|23 crashes: pid 21413 
died, killed (Segmentation fault), core dumped, restarting
Thanks!

Riccardo



_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
