Adding a couple of people who might be interested in this feature.
On Tue, Mar 10, 2020 at 8:29 AM Toshiaki Makita <[email protected]> wrote:
>
> This patch adds an XDP-based flow cache using the OVS netdev-offload
> flow API provider. When XDP offload is enabled on an OVS device,
> packets are first processed in the XDP flow cache (with parsing and
> table lookup implemented in eBPF), and on a hit the action processing
> is also done in the context of XDP, which has minimal overhead.
>
> This provider is built on top of William's recently posted patch for
> custom XDP loading. When a custom XDP program is loaded, the provider
> detects whether the program supports the classifier, and if so it
> starts offloading flows to the XDP program.
>
> The patches are derived from xdp_flow [1], a similar mechanism
> implemented in the kernel.
>
>
> * Motivation
>
> While the userspace datapath using netdev-afxdp or netdev-dpdk shows
> good performance, there are use cases where packets are better
> processed in the kernel, for example TCP/IP connections or
> container-to-container connections. The current solution is to use a
> tap device or af_packet, with extra kernel-to/from-userspace overhead.
> With XDP there is a better option: steer packets earlier, in the XDP
> program, and decide whether to send them to the userspace datapath or
> keep them in the kernel.
>
> One problem with the current netdev-afxdp is that it forwards all
> packets to userspace. The first patch from William ("netdev-afxdp:
> Enable loading XDP program.") only provides the interface to load an
> XDP program; however, users usually don't know how to write their own
> XDP program.
>
> XDP also supports HW offload, so it may be possible to offload flows
> to HW through this provider in the future, although not currently.
> The reason is that map-in-map is required for our program to support a
> classifier with subtables in XDP, but map-in-map is not offloadable.
> If map-in-map becomes offloadable, HW offload of our program will be
> doable as well.
>
>
> * How to use
>
> 1. Install clang/llvm >= 9, libbpf >= 0.0.4, and kernel >= 5.3.
>
> 2. make with --enable-afxdp
>    This generates the XDP program "bpf/flowtable_afxdp.o". Note that
>    the BPF object is not installed anywhere by "make install" at this
>    point.
>
> 3. Load the custom XDP program, e.g.:
>    $ ovs-vsctl add-port ovsbr0 veth0 -- set int veth0 \
>        options:xdp-mode=native \
>        options:xdp-obj="path/to/ovs/bpf/flowtable_afxdp.o"
>    $ ovs-vsctl add-port ovsbr0 veth1 -- set int veth1 \
>        options:xdp-mode=native \
>        options:xdp-obj="path/to/ovs/bpf/flowtable_afxdp.o"
>
> 4. Enable XDP_REDIRECT
>    If you use veth devices, make sure to load some (possibly dummy)
>    XDP programs on the peers of the veth devices.
>
> 5. Enable hw-offload
>    $ ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
>    This starts offloading flows to the XDP program.
>
> You should be able to see some maps installed, including "debug_stats":
>    $ bpftool map
>
> If packets are successfully redirected by the XDP program,
> debug_stats[2] is counted:
>    $ bpftool map dump id <ID of debug_stats>
>
> Currently only a very limited set of keys and output actions is
> supported. For example, a NORMAL action entry and IP-based matching
> work with the current key support.
>
>
> * Performance
>
> Tested 2 cases: 1) i40e to veth, 2) i40e to i40e.
>
> Test 1 measured the drop rate at the veth interface with a redirect
> action from the physical interface (i40e 25G NIC, XXV710) to veth.
> The CPU is a Xeon Silver 4114 (2.20 GHz).
>
>                                                        XDP_DROP
>                   +------+                     +-------+    +-------+
>  pktgen -- wire --> | eth0 | -- NORMAL ACTION --> | veth0 |----| veth2 |
>                   +------+                     +-------+    +-------+
>
> Test 2 uses i40e instead of veth and measured the tx packet rate at
> the output device.
>
> Single-flow performance test results:
>
> 1) i40e-veth
>
>   a) no-zerocopy in i40e
>
>   - xdp:   3.7 Mpps
>   - afxdp: 820 Kpps
>
>   b) zerocopy in i40e (veth does not have zc)
>
>   - xdp:   1.8 Mpps
>   - afxdp: 800 Kpps
>
> 2) i40e-i40e
>
>   a) no-zerocopy
>
>   - xdp:   3.0 Mpps
>   - afxdp: 1.1 Mpps
>
>   b) zerocopy
>
>   - xdp:   1.7 Mpps
>   - afxdp: 4.0 Mpps
>
> ** xdp is better when zc is disabled. The reason for the poor
>    performance with zc is that building an xdp_frame requires packet
>    memory allocation and a memcpy on XDP_REDIRECT to other devices iff
>    zc is enabled.
>
> ** afxdp with zc is better than xdp without zc, but afxdp uses 2 cores
>    in this case: one for the pmd and the other for softirq. When the
>    pmd and softirq were running on the same core, performance was
>    extremely poor, as the pmd consumes the CPU.
>    When offloading to xdp, xdp only uses softirq while the pmd still
>    consumes 100% of a CPU. This means we probably need only one pmd
>    for xdp even when we want to use more cores for multi-flow.
>    I'll also test afxdp-nonpmd when it is applied.
>
>
> This patch set is based on top of commit 59e994426 ("datapath: Update
> kernel test list, news and FAQ").
>
> [1] https://lwn.net/Articles/802653/
>
> Toshiaki Makita (4):
>   netdev-offload: Add xdp flow api provider
>   netdev-offload: Register xdp flow api provider
>   tun_metadata: Use OVS_ALIGNED_VAR to align opts field
>   bpf: Add reference XDP program implementation for netdev-offload-xdp
>
> William Tu (1):
>   netdev-afxdp: Enable loading XDP program.
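A side note on step 4 of the quoted instructions: the cover letter says to load "some (possibly dummy)" XDP programs on the veth peers but doesn't show one. Here is a minimal sketch of such a dummy program (the file name pass.c, the function name, and the section name are my own choices, not part of the patch set); redirecting into a veth requires an XDP program attached on the peer, and unconditionally returning XDP_PASS is enough for that:

```c
/* pass.c: minimal "dummy" XDP program for the veth peer devices.
 * XDP_REDIRECT into a veth only works if the peer has an XDP program
 * attached; simply passing every packet up the stack is sufficient. */
#include <linux/bpf.h>

#define SEC(name) __attribute__((section(name), used))

SEC("xdp")
int xdp_pass(struct xdp_md *ctx)
{
    return XDP_PASS;   /* let the packet continue up the normal stack */
}

char _license[] SEC("license") = "GPL";
```

Compile and attach with something like "clang -O2 -target bpf -c pass.c -o pass.o" and "ip link set dev <peer> xdp obj pass.o sec xdp"; the exact flags and section handling depend on your clang and iproute2 versions.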
>
>  Documentation/intro/install/afxdp.rst |   59 ++
>  Makefile.am                           |   10 +-
>  NEWS                                  |    2 +
>  bpf/.gitignore                        |    4 +
>  bpf/Makefile.am                       |   56 ++
>  bpf/bpf_miniflow.h                    |  199 +++++
>  bpf/bpf_netlink.h                     |   34 +
>  bpf/flowtable_afxdp.c                 |  510 +++++++++++
>  configure.ac                          |    1 +
>  include/openvswitch/tun-metadata.h    |    6 +-
>  lib/automake.mk                       |    6 +-
>  lib/bpf-util.c                        |   38 +
>  lib/bpf-util.h                        |   22 +
>  lib/netdev-afxdp.c                    |  342 +++++++-
>  lib/netdev-afxdp.h                    |    3 +
>  lib/netdev-linux-private.h            |    5 +
>  lib/netdev-offload-provider.h         |    3 +
>  lib/netdev-offload-xdp.c              | 1116 +++++++++++++++++++++++++
>  lib/netdev-offload-xdp.h              |   49 ++
>  lib/netdev.c                          |    4 +-
>  20 files changed, 2452 insertions(+), 17 deletions(-)
>  create mode 100644 bpf/.gitignore
>  create mode 100644 bpf/Makefile.am
>  create mode 100644 bpf/bpf_miniflow.h
>  create mode 100644 bpf/bpf_netlink.h
>  create mode 100644 bpf/flowtable_afxdp.c
>  create mode 100644 lib/bpf-util.c
>  create mode 100644 lib/bpf-util.h
>  create mode 100644 lib/netdev-offload-xdp.c
>  create mode 100644 lib/netdev-offload-xdp.h
>
> --
> 2.24.1
>
> _______________________________________________
> dev mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
