On Fri, Oct 30, 2020 at 10:49 AM Toshiaki Makita
<[email protected]> wrote:
>
> Hi all,
>
> It's about 3 months since I submitted this patch set.
> Could someone review this?
> Or should I resubmit the patch set on the top of current master?

Since the patches don't apply cleanly, I think you can rebase and repost
them and/or provide the OVS commit id on top of which these patches apply
cleanly.

Thanks
Numan


>
> Thanks,
> Toshiaki Makita
>
> On 2020/08/15 10:54, Toshiaki Makita wrote:
> > Ping.
> > Any feedback is welcome.
> >
> > Thanks,
> > Toshiaki Makita
> >
> > On 2020/07/31 11:55, Toshiaki Makita wrote:
> >> This patch set adds an XDP-based flow cache using the OVS netdev-offload
> >> flow API provider.  When XDP offload is enabled on an OVS device,
> >> packets are first processed in the XDP flow cache (with parsing and
> >> table lookup implemented in eBPF), and on a hit the action processing
> >> is also done in the context of XDP, which has minimal overhead.
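> >>
> >> Roughly, the per-packet fast path in the XDP program looks like the
> >> following minimal sketch (illustrative only, not the actual
> >> bpf/flowtable_afxdp.c; the map names, key layout and sizes are made up):
> >>
> >> #include <linux/bpf.h>
> >> #include <bpf/bpf_helpers.h>
> >>
> >> struct flow_key {              /* hypothetical parsed key */
> >>     __u32 nw_src;
> >>     __u32 nw_dst;
> >> };
> >>
> >> struct flow_actions {          /* hypothetical action list */
> >>     __u32 output_port;
> >> };
> >>
> >> struct {
> >>     __uint(type, BPF_MAP_TYPE_HASH);
> >>     __uint(max_entries, 8192);
> >>     __type(key, struct flow_key);
> >>     __type(value, struct flow_actions);
> >> } flow_table SEC(".maps");
> >>
> >> struct {
> >>     __uint(type, BPF_MAP_TYPE_DEVMAP);
> >>     __uint(max_entries, 64);
> >>     __uint(key_size, sizeof(__u32));
> >>     __uint(value_size, sizeof(__u32));
> >> } output_map SEC(".maps");
> >>
> >> SEC("xdp")
> >> int flow_cache(struct xdp_md *ctx)
> >> {
> >>     struct flow_key key = {};
> >>     struct flow_actions *acts;
> >>
> >>     /* Header parsing into 'key' is omitted in this sketch. */
> >>
> >>     acts = bpf_map_lookup_elem(&flow_table, &key);
> >>     if (!acts)
> >>         return XDP_PASS;       /* miss: fall back to the slow path */
> >>
> >>     /* Hit: run the offloaded action, here a single output. */
> >>     return bpf_redirect_map(&output_map, acts->output_port, 0);
> >> }
> >>
> >> char _license[] SEC("license") = "GPL";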
> >>
> >> This provider is based on top of William's recently posted patch for
> >> custom XDP program loading.  When a custom XDP program is loaded, the
> >> provider detects whether the program supports the classifier, and if so
> >> it starts offloading flows to the XDP program.
> >>
> >> The patches are derived from xdp_flow [1], which is a similar mechanism
> >> implemented in the kernel.
> >>
> >>
> >> * Motivation
> >>
> >> While the userspace datapath using netdev-afxdp or netdev-dpdk shows good
> >> performance, there are use cases where packets are better processed in the
> >> kernel, for example TCP/IP connections or container-to-container
> >> connections.  The current solution is to use a tap device or af_packet,
> >> with extra kernel-to/from-userspace overhead.  But with XDP, a better
> >> solution is to steer packets earlier, in the XDP program, and decide
> >> whether to send them to the userspace datapath or keep them in the kernel.
> >>
> >> One problem with the current netdev-afxdp is that it forwards all packets
> >> to userspace.  The first patch from William (netdev-afxdp: Enable loading
> >> XDP program.) only provides the interface to load an XDP program; however,
> >> users usually don't know how to write their own XDP program.
> >>
> >> XDP also supports HW offload, so it may be possible to offload flows to
> >> HW through this provider in the future, although not currently.
> >> The reason is that map-in-map is required for our program to support a
> >> classifier with subtables in XDP, but map-in-map is not offloadable.
> >> If map-in-map becomes offloadable, HW offload of our program may become
> >> possible as well.
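> >>
> >> As a rough illustration of the map-in-map layout (a sketch using
> >> BTF-defined maps on a recent libbpf, with the same #includes as the sketch
> >> above; the names and sizes are assumptions, not the definitions in
> >> bpf/flowtable_afxdp.c), the classifier keeps one hash map per subtable
> >> (mask), and an outer array-of-maps selects among them.  This outer map is
> >> the part that HW offload cannot handle today:
> >>
> >> struct subtable {
> >>     __uint(type, BPF_MAP_TYPE_HASH);
> >>     __uint(max_entries, 8192);
> >>     __uint(key_size, 64);          /* masked flow key */
> >>     __uint(value_size, 64);        /* encoded action list */
> >> } subtable_template SEC(".maps");
> >>
> >> struct {
> >>     __uint(type, BPF_MAP_TYPE_ARRAY_OF_MAPS);
> >>     __uint(max_entries, 128);      /* one slot per subtable (mask) */
> >>     __uint(key_size, sizeof(__u32));
> >>     __array(values, struct subtable);
> >> } flow_tables SEC(".maps");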
> >>
> >>
> >> * How to use
> >>
> >> 1. Install clang/llvm >= 9, libbpf >= 0.0.6 (included in kernel 5.5), and
> >>     kernel >= 5.3.
> >>
> >> 2. make with --enable-afxdp --enable-xdp-offload.
> >> --enable-xdp-offload will generate the XDP program "bpf/flowtable_afxdp.o".
> >> Note that the BPF object will not be installed anywhere by "make install"
> >> at this point.
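> >>
> >> E.g. a typical build sequence (just an example):
> >> $ ./boot.sh
> >> $ ./configure --enable-afxdp --enable-xdp-offload
> >> $ make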
> >>
> >> 3. Load custom XDP program
> >> E.g.
> >> $ ovs-vsctl add-port ovsbr0 veth0 -- set int veth0 options:xdp-mode=native \
> >>    options:xdp-obj="/path/to/ovs/bpf/flowtable_afxdp.o"
> >> $ ovs-vsctl add-port ovsbr0 veth1 -- set int veth1 options:xdp-mode=native \
> >>    options:xdp-obj="/path/to/ovs/bpf/flowtable_afxdp.o"
> >>
> >> 4. Enable XDP_REDIRECT
> >> If you use veth devices, make sure to load some (possibly dummy) program
> >> on the peers of the veth devices. This patch set includes a program which
> >> does nothing but return XDP_PASS. You can use it for the veth peer like
> >> this:
> >> $ ip link set veth1 xdpdrv object /path/to/ovs/bpf/xdp_noop.o section xdp
> >>
> >> Some HW NIC drivers require as many queues as there are CPU cores on the
> >> system. Tweak the number of queues using "ethtool -L".
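> >> E.g. to give a hypothetical eth0 one queue per core on an 8-core machine:
> >> $ ethtool -L eth0 combined 8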
> >>
> >> 5. Enable hw-offload
> >> $ ovs-vsctl set Open_vSwitch . other_config:offload-driver=linux_xdp
> >> $ ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
> >> This starts offloading flows to the XDP program.
> >>
> >> You should be able to see some maps installed, including "debug_stats".
> >> $ bpftool map
> >>
> >> If packets are successfully redirected by the XDP program, the
> >> debug_stats[2] counter will increase.
> >> $ bpftool map dump id <ID of debug_stats>
> >>
> >> Currently only a very limited set of keys and output actions is supported.
> >> For example, a NORMAL action entry and IP-based matching work with the
> >> current key support. VLAN actions used by port tags/trunks are also
> >> supported.
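> >> E.g. with the ovsbr0 bridge from above, a NORMAL flow can be added and
> >> the offloaded datapath flows inspected (the exact output depends on which
> >> keys the program supports):
> >> $ ovs-ofctl add-flow ovsbr0 actions=NORMAL
> >> $ ovs-appctl dpctl/dump-flows type=offloaded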
> >>
> >>
> >> * Performance
> >>
> >> Tested 2 cases: 1) i40e to veth, 2) i40e to i40e.
> >> Test 1 measured the drop rate at the veth interface, with a redirect action
> >> from the physical interface (i40e 25G NIC, XXV710) to veth. The CPU is a
> >> Xeon Silver 4114 (2.20 GHz).
> >>                                                                 XDP_DROP
> >>                      +------+                      +-------+    +-------+
> >>   pktgen -- wire --> | eth0 | -- NORMAL ACTION --> | veth0 |----| veth2 |
> >>                      +------+                      +-------+    +-------+
> >>
> >> Test 2 uses i40e instead of veth, and measures the TX packet rate at the
> >> output device.
> >>
> >> Single-flow performance test results:
> >>
> >> 1) i40e-veth
> >>
> >>    a) no-zerocopy in i40e
> >>
> >>      - xdp   3.7 Mpps
> >>      - afxdp 980 Kpps
> >>
> >>    b) zerocopy in i40e (veth does not have zc)
> >>
> >>      - xdp   1.9 Mpps
> >>      - afxdp 980 Kpps
> >>
> >> 2) i40e-i40e
> >>
> >>    a) no-zerocopy
> >>
> >>      - xdp   3.5 Mpps
> >>      - afxdp 1.5 Mpps
> >>
> >>    b) zerocopy
> >>
> >>      - xdp   2.0 Mpps
> >>      - afxdp 4.4 Mpps
> >>
> >> ** xdp is better when zc is disabled. The reason for the poor performance
> >>    with zc is that xdp_frame requires a packet memory allocation and memcpy
> >>    on XDP_REDIRECT to other devices only when zc is enabled.
> >>
> >> ** afxdp with zc is better than xdp without zc, but afxdp uses 2 cores in
> >>    this case: one for the pmd and the other for softirq. When the pmd and
> >>    softirq were running on the same core, the performance was extremely
> >>    poor, as the pmd consumes the cpu. I also tested afxdp-nonpmd to run
> >>    softirq and userspace processing on the same core, but the result was
> >>    lower than (pmd result) / 2.
> >>    With nonpmd, xdp performance was the same as xdp with pmd. This means
> >>    xdp only uses one core (for softirq only). Even with pmd, we need only
> >>    one pmd for xdp, even when we want to use more cores for multi-flow.
> >>
> >>
> >> This patch set is based on top of commit e8bf77748 ("odp-util: Fix clearing
> >> match mask if set action is partially unnecessary.").
> >>
> >> To make review easier, I left the pre-squashed commits from v3 here:
> >> https://github.com/tmakita/ovs/compare/xdp_offload_v3...tmakita:xdp_offload_v4_history?expand=1
> >>
> >> [1] https://lwn.net/Articles/802653/
> >>
> >> v4:
> >> - Fix checkpatch errors.
> >> - Fix duplicate flow api register.
> >> - Don't call unnecessary flow api init callbacks when default flow api
> >>    provider can be used.
> >> - Fix typo in comments.
> >> - Improve bpf Makefile.am to support automatic dependencies.
> >> - Add a dummy XDP program for veth peers.
> >> - Rename netdev_info to netdev_xdp_info.
> >> - Use id-pool for free subtable entry management and devmap indexes.
> >> - Rename --enable-bpf to --enable-xdp-offload.
> >> - Compile xdp flow api provider only with --enable-xdp-offload.
> >> - Tested again and updated performance numbers in the cover letter (got
> >>    slightly better numbers).
> >>
> >> v3:
> >> - Use ".ovs_meta" section to inform vswitchd of metadata like supported
> >>    keys.
> >> - Rewrite action loop logic in bpf to support multiple actions.
> >> - Add missing linux/types.h in acinclude.m4, as per William Tu.
> >> - Fix infinite reconfiguration loop when xsks_map is missing.
> >> - Add vlan-related actions in bpf program.
> >> - Fix CI build error.
> >> - Fix inability to delete subtable entries.
> >>
> >> v2:
> >> - Add uninit callback of netdev-offload-xdp.
> >> - Introduce "offload-driver" other_config to specify offload driver.
> >> - Add --enable-bpf (HAVE_BPF) config option to build bpf programs.
> >> - Workaround incorrect UINTPTR_MAX in x64 clang bpf build.
> >> - Fix boot.sh autoconf warning.
> >>
> >>
> >> Toshiaki Makita (4):
> >>    netdev-offload: Add "offload-driver" other_config to specify offload
> >>      driver
> >>    netdev-offload: Add xdp flow api provider
> >>    bpf: Add reference XDP program implementation for netdev-offload-xdp
> >>    bpf: Add dummy program for veth devices
> >>
> >> William Tu (1):
> >>    netdev-afxdp: Enable loading XDP program.
> >>
> >>   .travis.yml                           |    2 +-
> >>   Documentation/intro/install/afxdp.rst |   59 ++
> >>   Makefile.am                           |    9 +-
> >>   NEWS                                  |    2 +
> >>   acinclude.m4                          |   60 ++
> >>   bpf/.gitignore                        |    4 +
> >>   bpf/Makefile.am                       |   83 ++
> >>   bpf/bpf_compiler.h                    |   25 +
> >>   bpf/bpf_miniflow.h                    |  179 ++++
> >>   bpf/bpf_netlink.h                     |   63 ++
> >>   bpf/bpf_workaround.h                  |   28 +
> >>   bpf/flowtable_afxdp.c                 |  585 ++++++++++++
> >>   bpf/xdp_noop.c                        |   31 +
> >>   configure.ac                          |    2 +
> >>   lib/automake.mk                       |    8 +
> >>   lib/bpf-util.c                        |   38 +
> >>   lib/bpf-util.h                        |   22 +
> >>   lib/netdev-afxdp.c                    |  373 +++++++-
> >>   lib/netdev-afxdp.h                    |    3 +
> >>   lib/netdev-linux-private.h            |    5 +
> >>   lib/netdev-offload-provider.h         |    8 +-
> >>   lib/netdev-offload-xdp.c              | 1213 +++++++++++++++++++++++++
> >>   lib/netdev-offload-xdp.h              |   49 +
> >>   lib/netdev-offload.c                  |   42 +
> >>   24 files changed, 2881 insertions(+), 12 deletions(-)
> >>   create mode 100644 bpf/.gitignore
> >>   create mode 100644 bpf/Makefile.am
> >>   create mode 100644 bpf/bpf_compiler.h
> >>   create mode 100644 bpf/bpf_miniflow.h
> >>   create mode 100644 bpf/bpf_netlink.h
> >>   create mode 100644 bpf/bpf_workaround.h
> >>   create mode 100644 bpf/flowtable_afxdp.c
> >>   create mode 100644 bpf/xdp_noop.c
> >>   create mode 100644 lib/bpf-util.c
> >>   create mode 100644 lib/bpf-util.h
> >>   create mode 100644 lib/netdev-offload-xdp.c
> >>   create mode 100644 lib/netdev-offload-xdp.h
> >>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
