Hi Ilya, Thanks for all the feedback!
On Fri, Jul 5, 2019 at 8:32 AM Ilya Maximets <i.maxim...@samsung.com> wrote: > > On 01.07.2019 19:08, William Tu wrote: > > The patch introduces experimental AF_XDP support for OVS netdev. > > AF_XDP, the Address Family of the eXpress Data Path, is a new Linux socket > > type built upon the eBPF and XDP technology. It is aims to have comparable > > performance to DPDK but cooperate better with existing kernel's networking > > stack. An AF_XDP socket receives and sends packets from an eBPF/XDP program > > attached to the netdev, by-passing a couple of Linux kernel's subsystems > > As a result, AF_XDP socket shows much better performance than AF_PACKET > > For more details about AF_XDP, please see linux kernel's > > Documentation/networking/af_xdp.rst. Note that by default, this feature is > > not compiled in. > > > > Signed-off-by: William Tu <u9012...@gmail.com> > > --- > > > > v13-v14 > > - Mainly address issue reported by Ilya > > > > https://protect2.fireeye.com/url?k=a0e04e091a64b944.a0e1c546-0896b5863118ce25&u=https://patchwork.ozlabs.org/patch/1118972/ > > when doing 'make check-afxdp' > > - Fix xdp frame headroom issue > > - Fix vlan test cases by disabling txvlan offload > > - Skip cvlan > > - Document TCP limitation (currently all tcp tests fail due to > > kernel veth driver) > > Maybe it's better to skip them too? The easy way could be: > > diff --git a/tests/system-afxdp-macros.at b/tests/system-afxdp-macros.at > index 5ee2ceb1a..f0683c0a9 100644 > --- a/tests/system-afxdp-macros.at > +++ b/tests/system-afxdp-macros.at > @@ -30,3 +30,10 @@ m4_define([CONFIGURE_VETH_OFFLOADS], > AT_CHECK([ethtool -K $1 txvlan off], [0], [ignore], [ignore]) > ] > ) > + > +# OVS_START_L7([namespace], [protocol]) > +# > +# AF_XDP doesn't work with TCP over virtual interfaces for now. > +# > +m4_define([OVS_START_L7], > + [AT_SKIP_IF([:])]) > --- > > BTW, documentation should stay. > OK Will do it! I rebase to master, added the above and run make check-afxdp TESTSUITEFLAGS='-v -x 2' it hangs at ip link del ovs-$1 I have to remove it to make it works --- a/tests/system-afxdp-macros.at +++ b/tests/system-afxdp-macros.at @@ -15,7 +15,6 @@ m4_define([ADD_VETH], if test -n "$6"; then NS_CHECK_EXEC([$2], [ip route add default via $6]) fi - on_exit 'ip link del ovs-$1' ] ) I don't know why, but in the end ip netns del at_ns0 clean up the ovs-p0 device > > - Fix tunnel test cases due to --disable-system (another patch) > > - Switch to use pthread_spin_lock, suggested by Ben > > - Add coverage counter for debugging > > - Fix buffer starvation issue at batch_send reported by Eelco > > when using tap device with type=afxdp > > - TODO: > > 'make check-afxdp' still not all pass > > IP fragmentation expiry test not fix yet, need to implement > > deferral memory free, s.t like dpdk_mp_sweep. Currently hit > > some missing umem descs when reclaiming. > > --- > > Instead of inline comments I'll suggest incremental patch (below) with > following explanations: > > * You still need to return EAGAIN in rxq_recv if socket is not ready. > Otherwise upper layers will count the call as successful. > > * sendto() in SKB mode will send at most 16 packets, but our batch might > be up to 32 packets. If batch size will be larger than 16, TX ring will > be quickly exhausted and transmission will be blocked. We have to kick > the kernel up to n_packets / 16 times to be sure that all packets are > transmitted. This fixes my issues with zero traffic in test with iperf > between 2 namespaces while more than 2 traffic flows involved. > Thanks for getting down to the root cause! > * EAGAIN is very likely case for sendto. > > * Suggesting to reclaim all CONS_NUM_DESCS addresses from the completion > queue each time instead of BATCH_SIZE addresses. This will guarantee > that we'll have enough cq entries for all cases. This is important > because we may have much more than BATCH_SIZE packets in outstanding_tx. Good point, thanks. > > * No need to kick_tx inside afxdp_complete_tx since we'll retry more times > while kicking on transmit. > > * Suggesting warning on exhausted memory pool. This is a bad case and should > be reported. > > * 'concurrent_txq' is a likely case, because we can't configure tx queues > separately from rx queues, so we'll likely have number of tx queues less > than number of threads --> 'concurrent_txq' will likely be true. > I think that we need to just remove 'OVS_UNLIKELY' macro there. > > * The worst case for the mpool size is larger than just size of the rings. > There might be more packets currently under processing in OVS, stored > in output batches in datapath. It's hard to estimate the good size. > Suggesting just to increase two times for now. Thanks, I saw your comments above the code. <snip> > > > Few additional questions/requests: > > * Why we need dev->xdp_flags and dev->xdp_bind_flags? These are not used and > mostly just duplicates the xdpmode. I think you're right, we can combine them into one flag right now. But the reason is AF_XDP can be configured in driver mode + copy (for those drivers having XDP support but not af_xdp support), ex: cfg.bind_flags = XDP_COPY; cfg.xdp_flags = XDP_FLAGS_UPDATE_IF_NOEXIST | XDP_FLAGS_DRV_MODE; Now for OVS netdev-afxdp, I didn't enable this option, so it's either SKB or DRV+ZEROCOPY. > > * Please, don't declare two structure fields at the same line: > > int xdpmode, requested_xdpmode; > int xdp_flags, xdp_bind_flags; > OK Thanks! Regards, William _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev