On 08/03/2018 09:58 AM, Toshiaki Makita wrote:
> This patch set introduces driver XDP for veth.
> Basically this is used in conjunction with redirect action of another XDP
> NIC -----------> veth===veth
> (XDP) (redirect) (XDP)
> In this case xdp_frame can be forwarded to the peer veth without
> modification, so we can expect far better performance than generic XDP.
> Envisioned use-cases
> * Container managed XDP program
> Container host redirects frames to containers by XDP redirect action, and
> privileged containers can deploy their own XDP programs.
> * XDP program cascading
> Two or more XDP programs can be called for each packet by redirecting
> xdp frames to veth.
> * Internal interface for an XDP bridge
> When using XDP redirection to create a virtual bridge, veth can be used
> to create an internal interface for the bridge.
> This changeset is making use of NAPI to implement ndo_xdp_xmit and
> XDP_TX/REDIRECT. This is mainly because XDP heavily relies on NAPI
> - patch 1: Export a function needed for veth XDP.
> - patch 2-3: Basic implementation of veth XDP.
> - patch 4-6: Add ndo_xdp_xmit.
> - patch 7-9: Add XDP_TX and XDP_REDIRECT.
> - patch 10: Performance optimization for multi-queue env.
> Tests and performance numbers
> Tested with a simple XDP program which only redirects packets between
> NIC and veth. I used i40e 25G NIC (XXV710) for the physical NIC. The
> server has 20 of Xeon Silver 2.20 GHz cores.
> pktgen --(wire)--> XXV710 (i40e) <--(XDP redirect)--> veth===veth (XDP)
> The rightmost veth loads XDP progs and just does DROP or TX. The number
> of packets is measured in the XDP progs. The leftmost pktgen sends
> packets at 37.1 Mpps (almost 25G wire speed).
> veth XDP action Flows Mpps
> DROP 1 10.6
> DROP 2 21.2
> DROP 100 36.0
> TX 1 5.0
> TX 2 10.0
> TX 100 31.0
> I also measured netperf TCP_STREAM but was not so great performance due
> to lack of tx/rx checksum offload and TSO, etc.
> netperf <--(wire)--> XXV710 (i40e) <--(XDP redirect)--> veth===veth (XDP
> Direction Flows Gbps
> external->veth 1 20.8
> external->veth 2 23.5
> external->veth 100 23.6
> veth->external 1 9.0
> veth->external 2 17.8
> veth->external 100 22.9
> Also tested doing ifup/down or load/unload a XDP program repeatedly
> during processing XDP packets in order to check if enabling/disabling
> NAPI is working as expected, and found no problems.
> - Don't use xdp_frame pointer address to calculate skb->head, headroom,
> and xdp_buff.data_hard_start.
> - Introduce xdp_scrub_frame() to clear kernel pointers in xdp_frame and
> use it instead of memset().
> - Check skb->len only if reallocation is needed.
> - Add __GFP_NOWARN to alloc_page() since it can be triggered by external
> - Fix sparse warning around EXPORT_SYMBOL.
> - Fix broken SOBs.
> - Don't adjust MTU automatically.
> - Skip peer IFF_UP check on .ndo_xdp_xmit() because it is unnecessary.
> Add comments to explain that.
> - Use redirect_info instead of xdp_mem_info for storing no_direct flag
> to avoid per packet copy cost.
> - Drop skb bulk xmit patch since it makes little performance
> difference. The hotspot in TCP skb xmit at this point is checksum
> computation in skb_segment and packet copy on XDP_REDIRECT due to
> cloned/nonlinear skb.
> - Fix race on closing device.
> - Add extack messages in ndo_bpf.
> - Squash NAPI patch with "Add driver XDP" patch.
> - Remove conversion from xdp_frame to skb when NAPI is not enabled.
> - Introduce per-queue XDP ring (patch 8).
> - Introduce bulk skb xmit when XDP is enabled on the peer (patch 9).
> Signed-off-by: Toshiaki Makita <makita.toshi...@lab.ntt.co.jp>
> Toshiaki Makita (10):
> net: Export skb_headers_offset_update
> veth: Add driver XDP
> veth: Avoid drops by oversized packets when XDP is enabled
> xdp: Helper function to clear kernel pointers in xdp_frame
> veth: Handle xdp_frames in xdp napi ring
> veth: Add ndo_xdp_xmit
> bpf: Make redirect_info accessible from modules
> xdp: Helpers for disabling napi_direct of xdp_return_frame
> veth: Add XDP TX and REDIRECT
> veth: Support per queue XDP ring
> drivers/net/veth.c | 750
> include/linux/filter.h | 35 +++
> include/linux/skbuff.h | 1 +
> include/net/xdp.h | 7 +
> net/core/filter.c | 29 +-
> net/core/skbuff.c | 3 +-
> net/core/xdp.c | 6 +-
> 7 files changed, 801 insertions(+), 30 deletions(-)
Applied to bpf-next, thanks Toshiaki!