On Tue, Apr 21, 2020 at 11:47:00PM +0900, Toshiaki Makita wrote:
> This patch adds an XDP-based flow cache using the OVS netdev-offload
> flow API provider. When XDP offload is enabled on an OVS device,
> packets are first processed in the XDP flow cache (with parsing and
> table lookup implemented in eBPF), and on a hit the actions are also
> executed in the context of XDP, which has minimal overhead.
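To make that fast path concrete for readers: the in-XDP part is essentially a
parse / flow-table lookup / action sequence. Below is a rough sketch of that
shape only; the struct, map, and function names are mine for illustration and
are not taken from bpf/flowtable_afxdp.c, which uses subtables via map-in-map
as described further down.

/* Sketch only: the parse -> lookup -> act shape of an XDP flow cache.
 * Names (flow_key, flow_table, ...) are hypothetical. */
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/ip.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

struct flow_key {
    __u32 src_ip;
    __u32 dst_ip;
    __u8 proto;
};

struct flow_action {
    __u32 out_ifindex;          /* port to redirect to */
};

struct bpf_map_def SEC("maps") flow_table = {
    .type = BPF_MAP_TYPE_HASH,
    .key_size = sizeof(struct flow_key),
    .value_size = sizeof(struct flow_action),
    .max_entries = 8192,
};

SEC("xdp")
int xdp_flow_cache(struct xdp_md *ctx)
{
    void *data = (void *)(long)ctx->data;
    void *data_end = (void *)(long)ctx->data_end;
    struct ethhdr *eth = data;
    struct iphdr *ip;
    struct flow_key key = {};
    struct flow_action *act;

    /* 1. Parse headers into a key (IPv4 only in this sketch), with the
     *    bounds checks the verifier requires. */
    if ((void *)(eth + 1) > data_end || eth->h_proto != bpf_htons(ETH_P_IP))
        return XDP_PASS;
    ip = (void *)(eth + 1);
    if ((void *)(ip + 1) > data_end)
        return XDP_PASS;
    key.src_ip = ip->saddr;
    key.dst_ip = ip->daddr;
    key.proto = ip->protocol;

    /* 2. Look up the offloaded flow. */
    act = bpf_map_lookup_elem(&flow_table, &key);
    if (!act)
        return XDP_PASS;        /* Miss: pass up the kernel stack.  (The real
                                 * program presumably redirects misses to the
                                 * AF_XDP socket so the userspace datapath
                                 * sees them.) */

    /* 3. Execute the action entirely in XDP context. */
    return bpf_redirect(act->out_ifindex, 0);
}

char _license[] SEC("license") = "GPL";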
>
> This provider is based on top of William's recently posted patch for
> custom XDP load. When a custom XDP program is loaded, the provider detects
> whether the program supports the classifier, and if so it starts offloading
> flows to the XDP program.
>
> The patches are derived from xdp_flow[1], which is a similar mechanism
> implemented in the kernel.
>
>
> * Motivation
>
> While the userspace datapath using netdev-afxdp or netdev-dpdk shows good
> performance, there are use cases where packets are better processed in the
> kernel, for example TCP/IP connections or container-to-container
> connections. The current solution is to use a tap device or af_packet, with
> extra kernel-to/from-userspace overhead. With XDP, a better solution is to
> steer packets earlier, in the XDP program, and decide whether to send them
> to the userspace datapath or keep them in the kernel.
>
> One problem with the current netdev-afxdp is that it forwards all packets to
> userspace. The first patch from William (netdev-afxdp: Enable loading XDP
> program.) only provides the interface to load an XDP program; however, users
> usually don't know how to write their own XDP program.
>
> XDP also supports HW offload, so it may be possible to offload flows to
> HW through this provider in the future, although not currently.
> The reason is that map-in-map is required for our program to support a
> classifier with subtables in XDP, but map-in-map is not offloadable.
> If map-in-map becomes offloadable, HW offload of our program will also
> be doable.
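For readers unfamiliar with map-in-map: the classifier keeps one exact-match
hash subtable per flow mask, and the outer map holds references to those
per-subtable inner maps. A declarative sketch of such a layout follows, written
with BTF-defined maps (which need a newer libbpf than the 0.0.4 minimum
mentioned below); the names and sizes are illustrative, not the ones used in
flowtable_afxdp.c.

/* Sketch of a map-in-map layout for a subtable-based classifier.
 * This is a fragment for illustration only. */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct flow_key { __u32 src_ip; __u32 dst_ip; __u8 proto; };
struct flow_action { __u32 out_ifindex; };

/* Inner map: one exact-match hash table per subtable (i.e. per mask). */
struct inner_subtable {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 4096);
    __type(key, struct flow_key);
    __type(value, struct flow_action);
};

/* Outer map: an array of subtables indexed by mask id.  This
 * BPF_MAP_TYPE_ARRAY_OF_MAPS is the map-in-map that HW offload
 * currently cannot handle. */
struct {
    __uint(type, BPF_MAP_TYPE_ARRAY_OF_MAPS);
    __uint(max_entries, 32);
    __uint(key_size, sizeof(__u32));
    __array(values, struct inner_subtable);
} flow_tables SEC(".maps");

The lookup path then does one bpf_map_lookup_elem() on the outer map per mask
to fetch a subtable, and a second bpf_map_lookup_elem() on that subtable with
the masked key; it is this nesting that HW offload cannot currently express.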
>
>
> * How to use
>
> 1. Install clang/llvm >= 9, libbpf >= 0.0.4, and kernel >= 5.3.
>
> 2. make with --enable-afxdp --enable-bpf
> --enable-bpf will generate the XDP program "bpf/flowtable_afxdp.o". Note
> that the BPF object is not installed anywhere by "make install" at this
> point.
When running configure, there is a missing include, causing an error
because __u64 is undefined:
checking bpf/bpf_helpers.h usability... no
checking bpf/bpf_helpers.h presence... yes
configure: WARNING: bpf/bpf_helpers.h: present but cannot be compiled
configure: WARNING: bpf/bpf_helpers.h: check for missing prerequisite headers?
configure:18876: gcc -c -g -O2 conftest.c >&5
In file included from /usr/local/include/bpf/bpf_helpers.h:5:0,
                 from conftest.c:73:
/usr/local/include/bpf/bpf_helper_defs.h:55:82: error: unknown type name '__u64'
 static int (*bpf_map_update_elem)(void *map, const void *key, const void *value, __u64 flags) = (void *) 2;
/usr/local/include/bpf/bpf_helper_defs.h:79:41: error: unknown type name '__u32'
 static int (*bpf_probe_read)(void *dst, __u32 size, const void *unsafe_ptr) = (void *) 4;
I applied this to fix it:
diff --git a/acinclude.m4 b/acinclude.m4
index 5eeab6feb9cc..39dfce565182 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -326,7 +326,8 @@ AC_DEFUN([OVS_CHECK_LINUX_BPF], [
     [AC_MSG_ERROR([unable to find llc to compile BPF program])])
   AC_CHECK_HEADER([bpf/bpf_helpers.h], [],
-    [AC_MSG_ERROR([unable to find bpf/bpf_helpers.h to compile BPF program])])
+    [AC_MSG_ERROR([unable to find bpf/bpf_helpers.h to compile BPF program])],
+    [#include <linux/types.h>])
   AC_CHECK_HEADER([linux/bpf.h], [],
     [AC_MSG_ERROR([unable to find linux/bpf.h to compile BPF program])])
>
> 3. Load custom XDP program
> E.g.
> $ ovs-vsctl add-port ovsbr0 veth0 -- set int veth0 options:xdp-mode=native \
> options:xdp-obj="path/to/ovs/bpf/flowtable_afxdp.o"
> $ ovs-vsctl add-port ovsbr0 veth1 -- set int veth1 options:xdp-mode=native \
> options:xdp-obj="path/to/ovs/bpf/flowtable_afxdp.o"
>
> 4. Enable XDP_REDIRECT
> If you use veth devices, make sure to load some (possibly dummy) programs
> on the peers of veth devices.
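If you don't have a dummy program at hand, the smallest thing that works on
the veth peers is an XDP program that just returns XDP_PASS, like the sketch
below. Attach it with, for example, iproute2's
"ip link set dev <peer> xdp obj xdp_dummy.o sec xdp" (file, section, and
interface names are up to you).

/* Minimal "dummy" XDP program for the peer of a veth pair, so that
 * XDP_REDIRECT into the veth works.  It passes every packet up the
 * normal stack unmodified. */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

SEC("xdp")
int xdp_dummy(struct xdp_md *ctx)
{
    return XDP_PASS;
}

char _license[] SEC("license") = "GPL";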
>
> 5. Enable hw-offload
> $ ovs-vsctl set Open_vSwitch . other_config:offload-driver=linux_xdp
> $ ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
> This will start offloading flows to the XDP program.
>
Thanks, I can successfully get it working...
When applying your patch on current master, I hit a bug.
I will send a patch to fix it.
diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index 40d0cc1105ea..b52071e92ec7 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -3588,6 +3588,7 @@ const struct netdev_class netdev_internal_class = {
 #ifdef HAVE_AF_XDP
 #define NETDEV_AFXDP_CLASS_COMMON                       \
+    .init = netdev_afxdp_init,                          \
     .construct = netdev_afxdp_construct,                \
     .destruct = netdev_afxdp_destruct,                  \
     .get_stats = netdev_afxdp_get_stats,                \
The other parts work fine.
I'm planning to play with more rules and do some performance testing.
William
> You should be able to see some maps installed, including "debug_stats".
> $ bpftool map
>
> If packets are successfully redirected by the XDP program,
> debug_stats[2] will be incremented.
> $ bpftool map dump id <ID of debug_stats>
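If you prefer reading the counters programmatically rather than via bpftool,
here is a minimal user-space sketch using libbpf's bpf syscall wrappers. It
assumes, without having checked flowtable_afxdp.c, that debug_stats is a plain
(non-per-CPU) array of 64-bit counters; a per-CPU array would need one value
per CPU instead.

/* Sketch: dump the "debug_stats" counters given the map ID reported by
 * "bpftool map".  Build with: cc -o dump_debug_stats dump_debug_stats.c -lbpf
 * and run as root. */
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <bpf/bpf.h>

int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <map id of debug_stats>\n", argv[0]);
        return 1;
    }

    uint32_t id = (uint32_t)strtoul(argv[1], NULL, 0);
    int fd = bpf_map_get_fd_by_id(id);
    if (fd < 0) {
        perror("bpf_map_get_fd_by_id");
        return 1;
    }

    /* 8 entries is a guess at max_entries; stop when lookup fails. */
    for (uint32_t key = 0; key < 8; key++) {
        uint64_t value;
        if (bpf_map_lookup_elem(fd, &key, &value) != 0)
            break;
        printf("debug_stats[%u] = %llu\n", key, (unsigned long long)value);
    }
    return 0;
}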
>
> Currently only very limited keys and output actions are supported.
> For example, a NORMAL action entry and IP-based matching work with the
> current key support.
>
>
> * Performance
>
> Tested 2 cases: 1) i40e to veth, 2) i40e to i40e.
> Test 1 measured the drop rate at the veth interface with a redirect action
> from the physical interface (i40e 25G NIC, XXV710) to veth. The CPU is a
> Xeon Silver 4114 (2.20 GHz).
> XDP_DROP
> +------+ +-------+ +-------+
> pktgen -- wire --> | eth0 | -- NORMAL ACTION --> | veth0 |----| veth2 |
> +------+ +-------+ +-------+
>
> Test 2 uses i40e instead of veth and measures the tx packet rate at the
> output device.
>
> Single-flow performance test results:
>
> 1) i40e-veth
>
> a) no-zerocopy in i40e
>
> - xdp 3.7 Mpps
> - afxdp 820 kpps
>
> b) zerocopy in i40e (veth does not have zc)
>
> - xdp 1.8 Mpps
> - afxdp 800 Kpps
>
> 2) i40e-i40e
>
> a) no-zerocopy
>
> - xdp 3.0 Mpps
> - afxdp 1.1 Mpps
>
> b) zerocopy
>
> - xdp 1.7 Mpps
> - afxdp 4.0 Mpps
>
> ** xdp is better when zc is disabled. The reason for the poor performance
>    with zc is that xdp_frame requires packet memory allocation and a memcpy
>    on XDP_REDIRECT to other devices when zc is enabled.
>
> ** afxdp with zc is better than xdp without zc, but afxdp uses 2 cores
>    in this case: one for the pmd and the other for softirq. When the pmd
>    and softirq ran on the same core, the performance was extremely poor,
>    as the pmd consumes the cpu.
>    When offloading to xdp, xdp only uses softirq while the pmd still
>    consumes 100% cpu. This means we probably need only one pmd for xdp
>    even when we want to use more cores for multi-flow.
>    I'll also test afxdp-nonpmd when it is applied.
>
>
> This patch set is based on top of commit 82b7e6d19 ("compat: Fix broken
> partial backport of extack op parameter").
>
> [1] https://lwn.net/Articles/802653/
>
> v2:
> - Add uninit callback of netdev-offload-xdp.
> - Introduce "offload-driver" other_config to specify offload driver.
> - Add --enable-bpf (HAVE_BPF) config option to build bpf programs.
> - Workaround incorrect UINTPTR_MAX in x64 clang bpf build.
> - Fix boot.sh autoconf warning.
>
> TODO:
> - CI fails due to the missing function "bpf_program__get_type", which is not
>   provided by the libbpf included in linux 5.3. Although we could require
>   linux >= 5.5 to fix it, maybe it's time to switch to the standalone libbpf
>   repository?
> - Fix a crash bug in patch 1 which has been reported by Eelco Chaudron.
> - Add test for XDP offload driver.
> - Add documentation.
> - Implement more actions like vlan push/pop.
>
> Toshiaki Makita (3):
> netdev-offload: Add "offload-driver" other_config to specify offload
> driver
> netdev-offload: Add xdp flow api provider
> bpf: Add reference XDP program implementation for netdev-offload-xdp
>
> William Tu (1):
> netdev-afxdp: Enable loading XDP program.
>
> Documentation/intro/install/afxdp.rst | 59 ++
> Makefile.am | 9 +-
> NEWS | 2 +
> acinclude.m4 | 56 ++
> bpf/.gitignore | 4 +
> bpf/Makefile.am | 59 ++
> bpf/bpf_miniflow.h | 179 ++++
> bpf/bpf_netlink.h | 34 +
> bpf/bpf_workaround.h | 28 +
> bpf/flowtable_afxdp.c | 515 +++++++++++
> configure.ac | 2 +
> lib/automake.mk | 6 +-
> lib/bpf-util.c | 38 +
> lib/bpf-util.h | 22 +
> lib/netdev-afxdp.c | 342 +++++++-
> lib/netdev-afxdp.h | 3 +
> lib/netdev-linux-private.h | 5 +
> lib/netdev-offload-provider.h | 6 +
> lib/netdev-offload-xdp.c | 1143 +++++++++++++++++++++++++
> lib/netdev-offload-xdp.h | 49 ++
> lib/netdev-offload.c | 40 +-
> 21 files changed, 2582 insertions(+), 19 deletions(-)
> create mode 100644 bpf/.gitignore
> create mode 100644 bpf/Makefile.am
> create mode 100644 bpf/bpf_miniflow.h
> create mode 100644 bpf/bpf_netlink.h
> create mode 100644 bpf/bpf_workaround.h
> create mode 100644 bpf/flowtable_afxdp.c
> create mode 100644 lib/bpf-util.c
> create mode 100644 lib/bpf-util.h
> create mode 100644 lib/netdev-offload-xdp.c
> create mode 100644 lib/netdev-offload-xdp.h
>
> --
> 2.25.1
>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev