On Fri, Oct 30, 2020 at 10:49 AM Toshiaki Makita
<[email protected]> wrote:
>
> Hi all,
>
> It's about 3 months since I submitted this patch set.
> Could someone review this?
> Or should I resubmit the patch set on the top of current master?
Since the patches don't apply cleanly, I think you can rebase and
repost them, and/or provide the OVS commit id on top of which these
patches apply cleanly.
Thanks
Numan
>
> Thanks,
> Toshiaki Makita
>
> On 2020/08/15 10:54, Toshiaki Makita wrote:
> > Ping.
> > Any feedback is welcome.
> >
> > Thanks,
> > Toshiaki Makita
> >
> > On 2020/07/31 11:55, Toshiaki Makita wrote:
> >> This patch set adds an XDP-based flow cache using the OVS netdev-offload
> >> flow API provider. When XDP offload is enabled on an OVS device,
> >> packets are first processed in the XDP flow cache (with parsing and
> >> table lookup implemented in eBPF), and on a hit the actions are also
> >> executed in the context of XDP, which has minimal overhead.
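> >>
> >> To illustrate the idea, the XDP fast path conceptually looks like the
> >> sketch below. This is a heavily simplified, illustrative example, not
> >> the actual bpf/flowtable_afxdp.c: it uses a single exact-match table
> >> keyed only on IPv4 addresses, while the real program implements a
> >> classifier with masked subtable lookups.
> >>
> >> #include <linux/bpf.h>
> >> #include <linux/if_ether.h>
> >> #include <linux/ip.h>
> >> #include <bpf/bpf_helpers.h>
> >> #include <bpf/bpf_endian.h>
> >>
> >> /* Simplified flow key: IPv4 src/dst only, for illustration. */
> >> struct flow_key {
> >>     __u32 src_ip;
> >>     __u32 dst_ip;
> >> };
> >>
> >> /* Simplified action: just an output port (slot in the devmap). */
> >> struct flow_action {
> >>     __u32 out_port;
> >> };
> >>
> >> struct {
> >>     __uint(type, BPF_MAP_TYPE_HASH);
> >>     __uint(max_entries, 1024);
> >>     __type(key, struct flow_key);
> >>     __type(value, struct flow_action);
> >> } flow_table SEC(".maps");
> >>
> >> struct {
> >>     __uint(type, BPF_MAP_TYPE_DEVMAP);
> >>     __uint(max_entries, 64);
> >>     __type(key, __u32);
> >>     __type(value, __u32);
> >> } output_map SEC(".maps");
> >>
> >> /* AF_XDP sockets used by the userspace datapath. */
> >> struct {
> >>     __uint(type, BPF_MAP_TYPE_XSKMAP);
> >>     __uint(max_entries, 64);
> >>     __type(key, __u32);
> >>     __type(value, __u32);
> >> } xsks_map SEC(".maps");
> >>
> >> SEC("xdp")
> >> int flow_cache(struct xdp_md *ctx)
> >> {
> >>     void *data = (void *)(long)ctx->data;
> >>     void *data_end = (void *)(long)ctx->data_end;
> >>     struct ethhdr *eth = data;
> >>     struct iphdr *ip;
> >>     struct flow_key key = {};
> >>     struct flow_action *act;
> >>
> >>     /* Parse Ethernet + IPv4 headers in eBPF. */
> >>     if ((void *)(eth + 1) > data_end)
> >>         goto upcall;
> >>     if (eth->h_proto != bpf_htons(ETH_P_IP))
> >>         goto upcall;               /* Unsupported key. */
> >>     ip = (void *)(eth + 1);
> >>     if ((void *)(ip + 1) > data_end)
> >>         goto upcall;
> >>
> >>     /* Table lookup on the (simplified) flow key. */
> >>     key.src_ip = ip->saddr;
> >>     key.dst_ip = ip->daddr;
> >>     act = bpf_map_lookup_elem(&flow_table, &key);
> >>     if (!act)
> >>         goto upcall;               /* Miss: no offloaded flow. */
> >>
> >>     /* Hit: execute the action in XDP context, e.g. redirect to a port. */
> >>     return bpf_redirect_map(&output_map, act->out_port, 0);
> >>
> >> upcall:
> >>     /* Deliver to the AF_XDP socket so the userspace datapath handles it. */
> >>     return bpf_redirect_map(&xsks_map, ctx->rx_queue_index, 0);
> >> }
> >>
> >> char _license[] SEC("license") = "GPL";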
> >>
> >> This provider is built on top of William's recently posted patch for
> >> loading custom XDP programs. When a custom XDP program is loaded, the
> >> provider detects whether the program supports the classifier, and if so
> >> it starts offloading flows to the XDP program.
> >>
> >> The patches are derived from xdp_flow[1], which is a similar mechanism
> >> but implemented in the kernel.
> >>
> >>
> >> * Motivation
> >>
> >> While the userspace datapath using netdev-afxdp or netdev-dpdk shows good
> >> performance, there are use cases where packets are better processed in the
> >> kernel, for example TCP/IP connections or container-to-container
> >> connections. The current solution is to use a tap device or af_packet,
> >> with extra kernel-to/from-userspace overhead. With XDP, a better solution
> >> is to steer packets earlier, in the XDP program, and decide whether to
> >> send them to the userspace datapath or keep them in the kernel.
> >>
> >> One problem with the current netdev-afxdp is that it forwards all packets
> >> to userspace. The first patch from William (netdev-afxdp: Enable loading
> >> XDP program.) only provides the interface to load an XDP program; however,
> >> users usually don't know how to write their own XDP program.
> >>
> >> XDP also supports HW offload, so it may be possible to offload flows to
> >> HW through this provider in the future, although not currently.
> >> The reason is that map-in-map is required for our program to support a
> >> classifier with subtables in XDP, but map-in-map is not offloadable.
> >> If map-in-map becomes offloadable, HW offload of our program may also
> >> become possible.
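> >>
> >> For reference, the map-in-map layout referred to above looks roughly
> >> like the BTF-style declaration below (illustrative only: it assumes
> >> <linux/bpf.h> and <bpf/bpf_helpers.h> are included and a libbpf recent
> >> enough for BTF-defined map-in-map; the actual subtable declarations in
> >> flowtable_afxdp.c may differ):
> >>
> >> /* One inner hash map per subtable; sizes here are only placeholders. */
> >> struct subtable {
> >>     __uint(type, BPF_MAP_TYPE_HASH);
> >>     __uint(max_entries, 1024);
> >>     __uint(key_size, 128);      /* masked miniflow key */
> >>     __uint(value_size, 256);    /* encoded actions */
> >> };
> >>
> >> /* Outer map holding the subtables; this map-in-map is the part that
> >>  * is not HW-offloadable today. */
> >> struct {
> >>     __uint(type, BPF_MAP_TYPE_ARRAY_OF_MAPS);
> >>     __uint(max_entries, 16);    /* max number of subtables */
> >>     __uint(key_size, sizeof(__u32));
> >>     __array(values, struct subtable);
> >> } subtables SEC(".maps");
> >>
> >> With map-in-map, a lookup is two-level: bpf_map_lookup_elem() on the
> >> outer map selects a subtable, and a second lookup with the masked key
> >> is done in that inner hash map.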
> >>
> >>
> >> * How to use
> >>
> >> 1. Install clang/llvm >= 9, libbpf >= 0.0.6 (included in kernel 5.5), and
> >> kernel >= 5.3.
> >>
> >> 2. Configure with --enable-afxdp --enable-xdp-offload and run make.
> >> --enable-xdp-offload will generate the XDP program "bpf/flowtable_afxdp.o".
> >> Note that the BPF object will not be installed anywhere by "make install"
> >> at this point.
> >>
> >> 3. Load custom XDP program
> >> E.g.
> >> $ ovs-vsctl add-port ovsbr0 veth0 -- set int veth0 options:xdp-mode=native \
> >>     options:xdp-obj="/path/to/ovs/bpf/flowtable_afxdp.o"
> >> $ ovs-vsctl add-port ovsbr0 veth1 -- set int veth1 options:xdp-mode=native \
> >>     options:xdp-obj="/path/to/ovs/bpf/flowtable_afxdp.o"
> >>
> >> 4. Enable XDP_REDIRECT
> >> If you use veth devices, make sure to load some (possibly dummy) XDP
> >> programs on the peers of the veth devices. This patch set includes a
> >> program which does nothing but return XDP_PASS. You can use it for the
> >> veth peer like this:
> >> $ ip link set veth1 xdpdrv object /path/to/ovs/bpf/xdp_noop.o section xdp
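> >>
> >> Such a no-op program is essentially just the following (a sketch; the
> >> shipped bpf/xdp_noop.c may differ in details):
> >>
> >> #include <linux/bpf.h>
> >> #include <bpf/bpf_helpers.h>
> >>
> >> SEC("xdp")
> >> int xdp_noop_prog(struct xdp_md *ctx)
> >> {
> >>     /* Do nothing; having an XDP program attached on the veth peer is
> >>      * what makes XDP_REDIRECT to the veth work. */
> >>     return XDP_PASS;
> >> }
> >>
> >> char _license[] SEC("license") = "GPL";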
> >>
> >> Some HW NIC drivers require as many queues as there are cores on the
> >> system. Tweak the number of queues using "ethtool -L".
> >>
> >> 5. Enable hw-offload
> >> $ ovs-vsctl set Open_vSwitch . other_config:offload-driver=linux_xdp
> >> $ ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
> >> This will start offloading flows to the XDP program.
> >>
> >> You should be able to see some maps installed, including "debug_stats".
> >> $ bpftool map
> >>
> >> If packets are successfully redirected by the XDP program,
> >> debug_stats[2] will be incremented.
> >> $ bpftool map dump id <ID of debug_stats>
> >>
> >> Currently only very limited keys and output actions are supported.
> >> For example, a NORMAL action entry and IP-based matching work with the
> >> current key support. VLAN actions used by port tags/trunks are also
> >> supported.
> >>
> >>
> >> * Performance
> >>
> >> Tested 2 cases: 1) i40e to veth, 2) i40e to i40e.
> >> Test 1 measured the drop rate at the veth interface with a redirect action
> >> from the physical interface (i40e 25G NIC, XXV710) to veth. The CPU is a
> >> Xeon Silver 4114 (2.20 GHz).
> >>                                                               XDP_DROP
> >>                     +------+                      +-------+    +-------+
> >>  pktgen -- wire --> | eth0 | -- NORMAL ACTION --> | veth0 |----| veth2 |
> >>                     +------+                      +-------+    +-------+
> >>
> >> Test 2 used i40e instead of veth, and measured the TX packet rate at the
> >> output device.
> >>
> >> Single-flow performance test results:
> >>
> >> 1) i40e-veth
> >>
> >> a) no-zerocopy in i40e
> >>
> >> - xdp 3.7 Mpps
> >> - afxdp 980 Kpps
> >>
> >> b) zerocopy in i40e (veth does not have zc)
> >>
> >> - xdp 1.9 Mpps
> >> - afxdp 980 Kpps
> >>
> >> 2) i40e-i40e
> >>
> >> a) no-zerocopy
> >>
> >> - xdp 3.5 Mpps
> >> - afxdp 1.5 Mpps
> >>
> >> b) zerocopy
> >>
> >> - xdp 2.0 Mpps
> >> - afxdp 4.4 Mpps
> >>
> >> ** xdp is better when zc is disabled. The reason for the poor performance
> >> with zc is that xdp_frame requires packet memory allocation and a memcpy
> >> on XDP_REDIRECT to other devices only when zc is enabled.
> >>
> >> ** afxdp with zc is better than xdp without zc, but afxdp uses 2 cores in
> >> this case: one for the pmd and the other for softirq. When the pmd and
> >> softirq were running on the same core, the performance was extremely poor,
> >> as the pmd consumes the cpu. I also tested afxdp-nonpmd, which runs softirq
> >> and userspace processing on the same core, but the result was lower than
> >> (pmd result) / 2.
> >> With nonpmd, xdp performance was the same as xdp with pmd. This means xdp
> >> uses only one core (for softirq only). Even with pmd, we need only one pmd
> >> thread for xdp, even when we want to use more cores for multiple flows.
> >>
> >>
> >> This patch set is based on top of commit e8bf77748 ("odp-util: Fix clearing
> >> match mask if set action is partially unnecessary.").
> >>
> >> To make review easier, I left the pre-squashed commits from v3 here:
> >> https://github.com/tmakita/ovs/compare/xdp_offload_v3...tmakita:xdp_offload_v4_history?expand=1
> >>
> >> [1] https://lwn.net/Articles/802653/
> >>
> >> v4:
> >> - Fix checkpatch errors.
> >> - Fix duplicate flow api register.
> >> - Don't call unnecessary flow api init callbacks when default flow api
> >> provider can be used.
> >> - Fix typo in comments.
> >> - Improve bpf Makefile.am to support automatic dependencies.
> >> - Add a dummy XDP program for veth peers.
> >> - Rename netdev_info to netdev_xdp_info.
> >> - Use id-pool for free subtable entry management and devmap indexes.
> >> - Rename --enable-bpf to --enable-xdp-offload.
> >> - Compile xdp flow api provider only with --enable-xdp-offload.
> >> - Tested again and updated performance numbers in the cover letter (got
> >> slightly better numbers).
> >>
> >> v3:
> >> - Use ".ovs_meta" section to inform vswitchd of metadata like supported
> >> keys.
> >> - Rewrite action loop logic in bpf to support multiple actions.
> >> - Add missing linux/types.h in acinclude.m4, as per William Tu.
> >> - Fix infinite reconfiguration loop when xsks_map is missing.
> >> - Add vlan-related actions in bpf program.
> >> - Fix CI build error.
> >> - Fix inability to delete subtable entries.
> >>
> >> v2:
> >> - Add uninit callback of netdev-offload-xdp.
> >> - Introduce "offload-driver" other_config to specify offload driver.
> >> - Add --enable-bpf (HAVE_BPF) config option to build bpf programs.
> >> - Workaround incorrect UINTPTR_MAX in x64 clang bpf build.
> >> - Fix boot.sh autoconf warning.
> >>
> >>
> >> Toshiaki Makita (4):
> >> netdev-offload: Add "offload-driver" other_config to specify offload
> >> driver
> >> netdev-offload: Add xdp flow api provider
> >> bpf: Add reference XDP program implementation for netdev-offload-xdp
> >> bpf: Add dummy program for veth devices
> >>
> >> William Tu (1):
> >> netdev-afxdp: Enable loading XDP program.
> >>
> >> .travis.yml | 2 +-
> >> Documentation/intro/install/afxdp.rst | 59 ++
> >> Makefile.am | 9 +-
> >> NEWS | 2 +
> >> acinclude.m4 | 60 ++
> >> bpf/.gitignore | 4 +
> >> bpf/Makefile.am | 83 ++
> >> bpf/bpf_compiler.h | 25 +
> >> bpf/bpf_miniflow.h | 179 ++++
> >> bpf/bpf_netlink.h | 63 ++
> >> bpf/bpf_workaround.h | 28 +
> >> bpf/flowtable_afxdp.c | 585 ++++++++++++
> >> bpf/xdp_noop.c | 31 +
> >> configure.ac | 2 +
> >> lib/automake.mk | 8 +
> >> lib/bpf-util.c | 38 +
> >> lib/bpf-util.h | 22 +
> >> lib/netdev-afxdp.c | 373 +++++++-
> >> lib/netdev-afxdp.h | 3 +
> >> lib/netdev-linux-private.h | 5 +
> >> lib/netdev-offload-provider.h | 8 +-
> >> lib/netdev-offload-xdp.c | 1213 +++++++++++++++++++++++++
> >> lib/netdev-offload-xdp.h | 49 +
> >> lib/netdev-offload.c | 42 +
> >> 24 files changed, 2881 insertions(+), 12 deletions(-)
> >> create mode 100644 bpf/.gitignore
> >> create mode 100644 bpf/Makefile.am
> >> create mode 100644 bpf/bpf_compiler.h
> >> create mode 100644 bpf/bpf_miniflow.h
> >> create mode 100644 bpf/bpf_netlink.h
> >> create mode 100644 bpf/bpf_workaround.h
> >> create mode 100644 bpf/flowtable_afxdp.c
> >> create mode 100644 bpf/xdp_noop.c
> >> create mode 100644 lib/bpf-util.c
> >> create mode 100644 lib/bpf-util.h
> >> create mode 100644 lib/netdev-offload-xdp.c
> >> create mode 100644 lib/netdev-offload-xdp.h
> >>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev