The patch series introduces AF_XDP support for OVS netdev.
AF_XDP is a new address family that works together with eBPF.
In short, a socket of the AF_XDP family can receive and send
packets through an eBPF/XDP program attached to the netdev.
For more details about AF_XDP, please see the Linux kernel's
Documentation/networking/af_xdp.rst.
OVS has several netdev types, e.g., system, tap, and
internal. This series first adds a new netdev type called
"afxdp" and implements its configuration, packet reception,
and transmit functions. Since the AF_XDP socket, xsk,
operates in userspace, once ovs-vswitchd receives packets
from the xsk, the proposed architecture re-uses the existing
userspace dpif-netdev datapath. As a result, most of the
packet processing happens in userspace instead of in the
Linux kernel.
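As background, v4 of the series uses the AF_XDP API provided by
libbpf (see the changelog below). The following is a minimal,
hypothetical sketch of what socket and umem creation with that API
looks like; the function name, buffer sizes, and ring variables are
illustrative and not taken from the patches:

    /* Minimal AF_XDP setup sketch using the libbpf xsk API.
     * Illustrative only; this is not the code from the series. */
    #include <stdlib.h>
    #include <unistd.h>
    #include <bpf/xsk.h>

    #define NUM_FRAMES 4096
    #define FRAME_SIZE XSK_UMEM__DEFAULT_FRAME_SIZE

    static struct xsk_umem *umem;
    static struct xsk_socket *xsk;
    static struct xsk_ring_prod fq, tx;
    static struct xsk_ring_cons cq, rx;

    int setup_xsk(const char *ifname, __u32 queue_id)
    {
        void *bufs;
        int ret;

        /* The umem is a userspace buffer area shared with the
         * kernel; rx/tx descriptors point into it. */
        ret = posix_memalign(&bufs, getpagesize(),
                             NUM_FRAMES * FRAME_SIZE);
        if (ret) {
            return -1;
        }

        ret = xsk_umem__create(&umem, bufs, NUM_FRAMES * FRAME_SIZE,
                               &fq, &cq, NULL);
        if (ret) {
            return ret;
        }

        /* Bind an AF_XDP socket to one rx queue of the netdev.
         * With the default config, libbpf also loads a built-in
         * XDP program that redirects packets from that queue
         * into this socket. */
        return xsk_socket__create(&xsk, ifname, queue_id, umem,
                                  &rx, &tx, NULL);
    }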
Architecture
============
_
| +-------------------+
| | ovs-vswitchd |<-->ovsdb-server
| +-------------------+
| | ofproto |<-->OpenFlow controllers
| +--------+-+--------+
| | netdev | |ofproto-|
userspace | +--------+ | dpif |
| | netdev | +--------+
| |provider| | dpif |
| +---||---+ +--------+
| || | dpif- |
| || | netdev |
|_ || +--------+
||
_ +---||-----+--------+
| | af_xdp prog + |
kernel | | xsk_map |
|_ +--------||---------+
||
physical
NIC
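The "af_xdp prog + xsk_map" box above is the kernel-side half: an
XDP program that steers packets from a NIC queue into the AF_XDP
socket registered in a BPF map of type XSKMAP. Below is a
hypothetical minimal sketch of such a program; libbpf installs its
own built-in equivalent, so this is for illustration only (compiled
with clang -target bpf, with bpf_helpers.h from the kernel tree or
libbpf):

    /* Hypothetical XDP program sketch: redirect packets received
     * on a queue to the AF_XDP socket registered for that queue
     * in xsks_map. Illustrative only. */
    #include <linux/bpf.h>
    #include "bpf_helpers.h"

    struct bpf_map_def SEC("maps") xsks_map = {
        .type        = BPF_MAP_TYPE_XSKMAP,
        .key_size    = sizeof(int),
        .value_size  = sizeof(int),
        .max_entries = 64,
    };

    SEC("xdp")
    int xdp_sock_prog(struct xdp_md *ctx)
    {
        int index = ctx->rx_queue_index;

        /* If an AF_XDP socket is bound to this rx queue, hand the
         * packet to it; otherwise let the kernel stack see it. */
        if (bpf_map_lookup_elem(&xsks_map, &index)) {
            return bpf_redirect_map(&xsks_map, index, 0);
        }
        return XDP_PASS;
    }

    char _license[] SEC("license") = "GPL";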
To get started, create an OVS userspace bridge using dpif-netdev
by setting its datapath_type to netdev:
# ovs-vsctl -- add-br br0 -- set Bridge br0 datapath_type=netdev
Then attach a Linux netdev with type afxdp:
# ovs-vsctl add-port br0 afxdp-p0 -- \
set interface afxdp-p0 type="afxdp"
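To confirm the port was added, the standard ovs-vsctl queries work
as usual (nothing here is specific to this series):
# ovs-vsctl show
# ovs-vsctl get interface afxdp-p0 type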
Performance
===========
For this version, v4, I mainly focused on getting the features right
with the libbpf AF_XDP API, using the AF_XDP SKB mode, which is the
slower setup. The next version will measure performance and add
optimizations.
Documentation
=============
Most of the design details are described in the paper presented at
Linux Plumbers Conference 2018, "Bringing the Power of eBPF to Open
vSwitch"[1], section 4, and in the slides[2].
Earlier revisions of this series used a not-yet-upstreamed feature
called XDP_ATTACH[3], described in section 3.1: a built-in XDP
program for AF_XDP that greatly simplifies the management of
XDP/eBPF programs. As noted in the changelog below, v4 drops that
dependency in favor of the AF_XDP API provided by libbpf.
[1] http://vger.kernel.org/lpc_net2018_talks/ovs-ebpf-afxdp.pdf
[2] http://vger.kernel.org/lpc_net2018_talks/ovs-ebpf-lpc18-presentation.pdf
[3] http://vger.kernel.org/lpc_net2018_talks/lpc18_paper_af_xdp_perf-v2.pdf
For the installation and configuration guide, see
Documentation/intro/install/bpf.rst.
Test Cases
==========
Test cases are created using namespaces and veth peers, with the
AF_XDP socket attached to a veth device (hence the SKB mode).
Running "make check-afxdp" produces the following results (a sketch
of the underlying veth topology follows the list):
AF_XDP netdev datapath-sanity
1: datapath - ping between two ports ok
2: datapath - ping between two ports on vlan ok
3: datapath - ping6 between two ports ok
4: datapath - ping6 between two ports on vlan ok
5: datapath - ping over vxlan tunnel ok
6: datapath - ping over vxlan6 tunnel ok
7: datapath - ping over gre tunnel ok
8: datapath - ping over erspan v1 tunnel ok
9: datapath - ping over erspan v2 tunnel ok
10: datapath - ping over ip6erspan v1 tunnel ok
11: datapath - ping over ip6erspan v2 tunnel ok
12: datapath - ping over geneve tunnel ok
13: datapath - ping over geneve6 tunnel ok
14: datapath - clone action ok
15: datapath - basic truncate action ok
conntrack
16: conntrack - controller ok
17: conntrack - force commit ok
18: conntrack - ct flush by 5-tuple ok
19: conntrack - IPv4 ping ok
20: conntrack - get_nconns and get/set_maxconns ok
21: conntrack - IPv6 ping ok
system-ovn
22: ovn -- 2 LRs connected via LS, gateway router, SNAT and DNAT ok
23: ovn -- 2 LRs connected via LS, gateway router, easy SNAT ok
24: ovn -- multiple gateway routers, SNAT and DNAT ok
25: ovn -- load-balancing ok
26: ovn -- load-balancing - same subnet. ok
27: ovn -- load balancing in gateway router ok
28: ovn -- multiple gateway routers, load-balancing ok
29: ovn -- load balancing in router with gateway router port ok
30: ovn -- DNAT and SNAT on distributed router - N/S ok
31: ovn -- DNAT and SNAT on distributed router - E/W ok
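For reference, a veth topology like the one these tests set up can
be reproduced by hand with standard iproute2 commands; the namespace
and device names below are illustrative, not the ones used by the
test macros:
# ip netns add at_ns0
# ip link add p0 type veth peer name afxdp-p0
# ip link set p0 netns at_ns0
# ip link set dev afxdp-p0 up
# ip netns exec at_ns0 ip link set dev p0 up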
---
v1->v2:
- add a list to maintain unused umem elements
- remove copy from rx umem to ovs internal buffer
- use hugetlb to reduce misses (not much difference)
- use pmd mode netdev in OVS (huge performance improvement)
- remove malloc dp_packet, instead put dp_packet in umem
v2->v3:
- rebase on the OVS master, 7ab4b0653784
("configure: Check for more specific function to pull in pthread library.")
- remove the dependency on libbpf and dpif-bpf;
  instead, use the built-in XDP_ATTACH feature
- data structure optimizations for better performance, see [1]
- more test cases support
v3: https://mail.openvswitch.org/pipermail/ovs-dev/2018-November/354179.html
v3->v4:
- Use AF_XDP API provided by libbpf
- Remove the dependency on XDP_ATTACH kernel patch set
- Add documentation, bpf.rst
William Tu (4):
Add libbpf build support.
netdev-afxdp: add new netdev type for AF_XDP
tests: add AF_XDP netdev test cases.
afxdp netdev: add documentation and configuration.
Documentation/automake.mk | 1 +
Documentation/index.rst | 1 +
Documentation/intro/install/bpf.rst | 182 +++++++
Documentation/intro/install/index.rst | 1 +
acinclude.m4 | 20 +
configure.ac | 1 +
lib/automake.mk | 7 +-
lib/dp-packet.c | 12 +
lib/dp-packet.h | 32 +-
lib/dpif-netdev.c | 2 +-
lib/netdev-afxdp.c | 491 +++++++++++++++++
lib/netdev-afxdp.h | 39 ++
lib/netdev-linux.c | 78 ++-
lib/netdev-provider.h | 1 +
lib/netdev.c | 1 +
lib/xdpsock.c | 179 +++++++
lib/xdpsock.h | 129 +++++
tests/automake.mk | 17 +
tests/system-afxdp-macros.at | 153 ++++++
tests/system-afxdp-testsuite.at | 26 +
tests/system-afxdp-traffic.at | 978 ++++++++++++++++++++++++++++++++++
21 files changed, 2345 insertions(+), 6 deletions(-)
create mode 100644 Documentation/intro/install/bpf.rst
create mode 100644 lib/netdev-afxdp.c
create mode 100644 lib/netdev-afxdp.h
create mode 100644 lib/xdpsock.c
create mode 100644 lib/xdpsock.h
create mode 100644 tests/system-afxdp-macros.at
create mode 100644 tests/system-afxdp-testsuite.at
create mode 100644 tests/system-afxdp-traffic.at
--
2.7.4