On 20.11.2019 18:54, William Tu wrote:
> On Wed, Nov 20, 2019 at 05:43:48PM +0100, Ilya Maximets wrote:
>> On 20.11.2019 8:35, Eelco Chaudron wrote:
>>>
>>>
>>> On 19 Nov 2019, at 17:52, Ilya Maximets wrote:
>>>
>>>> On 19.11.2019 17:16, Eelco Chaudron wrote:
>>>>>
>>>>>
>>>>> On 7 Nov 2019, at 12:36, Ilya Maximets wrote:
>>>>>
>>>>>> Until now there were only two options for XDP mode in OVS: SKB or DRV.
>>>>>> i.e. 'generic XDP' or 'native XDP with zero-copy enabled'.
>>>>>>
>>>>>> Devices like 'veth' interfaces in Linux support native XDP, but
>>>>>> don't support zero-copy mode. This case cannot be covered by the
>>>>>> existing API and we have to use slower generic XDP for such devices.
>>>>>> There are a few more issues, e.g. TCP is not supported in generic XDP
>>>>>> mode for veth interfaces due to kernel limitations, however it is
>>>>>> supported in native mode.
>>>>>>
>>>>>> This change introduces the ability to use native XDP without zero-copy
>>>>>> along with a best-effort configuration option that is enabled by default.
>>>>>> In the best-effort case OVS will sequentially try different modes starting
>>>>>> from the fastest one and will choose the first one acceptable for the
>>>>>> current interface. This will guarantee the best possible performance.
>>>>>>
>>>>>> If a user wants to choose a specific mode, that is still possible by
>>>>>> setting 'options:xdp-mode'.
>>>>>>
>>>>>> This change additionally changes the API by renaming the configuration
>>>>>> knob from 'xdpmode' to 'xdp-mode' and also renaming the modes
>>>>>> themselves to be more user-friendly.
>>>>>>
>>>>>> The full list of currently supported modes:
>>>>>> * native-with-zerocopy - former DRV
>>>>>> * native - new one, DRV without zero-copy
>>>>>> * generic - former SKB
>>>>>> * best-effort - new one, chooses the best available of
>>>>>> the 3 modes above
>>>>>>
>>>>>> Since 'best-effort' is the default mode, users will not need to
>>>>>> explicitly set 'xdp-mode' in most cases.
>>>>>>
>>>>>> TCP related tests are re-enabled in the system afxdp testsuite, because
>>>>>> 'best-effort' will choose 'native' mode for veth interfaces
>>>>>> and this mode has no issues with TCP.
>>>>> Patch in general looks good, two small comments inline.
>>>>
>>>> Thanks for review.
>>>>
>>>>>
>>>>> The only thing that bothers me is the worse performance of the TAP
>>>>> interface with the new default config. Can we somehow keep the old
>>>>> behavior for TAP interfaces?
>>>>
>>>> Could you check if TCP works over tap interfaces in generic mode?
>>>> For me the point is that correctness is better than performance.
>>>> I also hope that native implementation for tap will be improved
>>>> over time.
>>>
>>> So if I understood your email chain with William correctly, TCP is not
>>> working, so I'm afraid correctness is better than performance.
>>
>> Not exactly. William didn't test the actual TAP interfaces.
>>
>> I tested today with kernel vhost backed virtio-user port and it seems to pass
>> TCP frames in generic mode.
>>
>> The setup is the following:
>>
>> tap1 <-- ovs-vswitchd --> tap0 <-- testpmd --> tap2
>>
>> tap1 -- tap port created by OVS (type=tap)
>> tap0 -- tap port, virtio-user, created by testpmd, opened by OVS with
>> type=afxdp
>> tap2 -- tap port created by testpmd (net_tap)
>>
>> tap1 and tap2 are in their own network namespaces and iperf works on them.
>
> Hi Ilya,
>
> This is an interesting setup.
> Can you share roughly your commands to do this test?
I just copied and modified one of the system-dpdk tests like this:
diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at
index a015d52f7..b0d10fcdd 100644
--- a/tests/system-dpdk.at
+++ b/tests/system-dpdk.at
@@ -232,3 +232,83 @@ OVS_VSWITCHD_STOP(["\@does not exist. The Open vSwitch kernel module is probably
 \@EAL: No free hugepages reported in hugepages-1048576kB@d"])
 AT_CLEANUP
 dnl --------------------------------------------------------------------------
+
+dnl --------------------------------------------------------------------------
+dnl Ping afxdp port
+AT_SETUP([OVS-DPDK - ping afxdp ports])
+AT_KEYWORDS([afxdp tap])
+OVS_DPDK_PRE_CHECK()
+AT_SKIP_IF([! which testpmd >/dev/null 2>/dev/null])
+OVS_DPDK_START()
+
+dnl Find number of sockets
+AT_CHECK([lscpu], [], [stdout])
+AT_CHECK([cat stdout | grep "NUMA node(s)" | awk '{c=1; while (c++<$(3)) {printf "512,"}; print "512"}' > NUMA_NODE])
+
+dnl Add userspace bridge and attach it to OVS
+AT_CHECK([ovs-vsctl add-br br10 -- set bridge br10 datapath_type=netdev])
+dnl Set up namespaces
+ADD_NAMESPACES(ns1, ns2)
+
+dnl Add veth device
+ADD_VETH(tap1, ns2, br10, "172.31.110.12/24")
+
+dnl Execute testpmd in background
+on_exit "pkill -f -x -9 'tail -f /dev/null'"
+tail -f /dev/null | testpmd --socket-mem="$(cat NUMA_NODE)" --no-pci\
+ --vdev="net_virtio_user,path=/dev/vhost-net,queue_size=1024" \
+ --vdev="net_tap7,iface=tap7" --file-prefix page0 \
+ --single-file-segments -- -a >$OVS_RUNDIR/testpmd-dpdkvhostuserclient0.log 2>&1 &
+
+dnl Give settling time to the testpmd processes - NOTE: this is bad form.
+sleep 10
+
+ip link set tap0 up
+ethtool -K tap7 tx off
+ip netns exec ns2 ethtool -K tap1 tx off
+ip link show
+
+AT_CHECK([ovs-vsctl add-port br10 tap0 -- set Interface tap0 \
+ type=afxdp \
+ options:xdp-mode=generic], [],
+ [stdout], [stderr])
+AT_CHECK([ovs-vsctl show], [], [stdout])
+
+
+dnl Move the tap devices to the namespaces
+AT_CHECK([ps aux | grep testpmd], [], [stdout], [stderr])
+AT_CHECK([ip link show], [], [stdout], [stderr])
+AT_CHECK([ip link set tap7 netns ns1], [], [stdout], [stderr])
+
+AT_CHECK([ip netns exec ns1 ip link show], [], [stdout], [stderr])
+AT_CHECK([ip netns exec ns1 ip link show | grep tap7], [], [stdout], [stderr])
+AT_CHECK([ip netns exec ns1 ip link set tap7 up], [], [stdout], [stderr])
+AT_CHECK([ip netns exec ns1 ip addr add 172.31.110.11/24 dev tap7], [],
+ [stdout], [stderr])
+
+AT_CHECK([ip netns exec ns1 ip link show], [], [stdout], [stderr])
+AT_CHECK([ip netns exec ns2 ip link show], [], [stdout], [stderr])
+AT_CHECK([ip netns exec ns1 ping -c 4 -I tap7 172.31.110.12], [], [stdout],
+ [stderr])
+
+dnl NETNS_DAEMONIZE([ns1], [nc -l -k 1234 > /dev/null], [nc1.pid])
+dnl NS_CHECK_EXEC([ns2], [echo "foobar" | nc $NC_EOF_OPT 10.1.1.1 1234])
+sleep 180
+
+dnl Clean up the testpmd now
+pkill -f -x -9 'tail -f /dev/null'
+
+dnl Clean up
+AT_CHECK([ovs-vsctl del-port br10 tap0], [], [stdout], [stderr])
+OVS_VSWITCHD_STOP(["\@does not exist. The Open vSwitch kernel module is probably not loaded.@d
+\@Failed to enable flow control@d
+\@VHOST_CONFIG: recvmsg failed@d
+\@VHOST_CONFIG: failed to connect to $OVS_RUNDIR/dpdkvhostclient0: No such file or directory@d
+\@Global register is changed during@d
+\@dpdkvhostuser ports are considered deprecated; please migrate to dpdkvhostuserclient ports.@d
+\@failed to enumerate system datapaths: No such file or directory@d
+\@EAL: Invalid NUMA socket, default to 0@d
+\@EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable clock cycles !@d
+\@EAL: No free hugepages reported in hugepages-1048576kB@d"])
+AT_CLEANUP
+dnl --------------------------------------------------------------------------
---
tap2 in the diagram is tap7 in the script.
The test is not well-shaped, and I was too lazy to invoke iperf properly from
the test, so I just added 'sleep 180' and ran iperf by hand from separate
terminals:
(term 1)#ip netns exec ns1 iperf3 -s -i 1
(term 2)#ip netns exec ns2 iperf3 -c 172.31.110.11 -i 1
BTW, to run 'make check-dpdk' you need testpmd available in PATH and
hugepages configured.
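
As an aside on the 'best-effort' mode from the quoted commit message: the
fallback order it describes can be sketched roughly as below. This is only an
illustration, not the OVS implementation. The mode names come from the commit
message; pick_xdp_mode(), try_mode() and veth_supports() are hypothetical
names made up for the example, and the real per-interface socket setup is
replaced by a simple callback.

```python
# Modes ordered from fastest to slowest, as listed in the quoted commit message.
XDP_MODES = ["native-with-zerocopy", "native", "generic"]


def pick_xdp_mode(requested, try_mode):
    """Return the first mode accepted by try_mode(), or None if none works.

    try_mode is a hypothetical callback standing in for the real
    per-interface AF_XDP setup; it returns True when the mode is usable.
    """
    # 'best-effort' walks the whole list; an explicit mode is tried alone.
    candidates = XDP_MODES if requested == "best-effort" else [requested]
    for mode in candidates:
        if try_mode(mode):
            return mode
    return None


# Illustrative stand-in for a veth device: native XDP works, zero-copy does not.
def veth_supports(mode):
    return mode in ("native", "generic")


print(pick_xdp_mode("best-effort", veth_supports))  # prints "native"
```

With an explicit 'options:xdp-mode' there is no fallback in this sketch, so a
mode the device cannot do simply yields no result.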
Best regards, Ilya Maximets.