> Could you tell roughly how many packets were sent in a single test? Was
the latency measured for all the UDP packets in average?
Let me describe my test method more clearly. In fact, we only tested
pod-to-pod performance, *not* pod-to-service, then profiled with a
flame graph and found the load balancer process took about 30% of CPU usage.
We ran two Pods on two different nodes; one ran the qperf server and the
other ran the qperf client to test UDP latency and bandwidth with the command
`qperf {another Pod IP} -ub -oo msg_size:1 -vu udp_lat udp_bw`.
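For completeness, the procedure was roughly the following (10.16.0.5 stands
in for the server Pod IP, which will differ per cluster):

```shell
# In the server Pod: start qperf in server mode; it listens on its
# default control port and waits for clients
qperf

# In the client Pod: measure one-byte UDP latency and bandwidth
# against the server Pod (10.16.0.5 is a placeholder IP)
qperf 10.16.0.5 -ub -oo msg_size:1 -vu udp_lat udp_bw
```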
In the first test, we used the kube-ovn default setup, which uses the OVN
load balancer to replace kube-proxy, and got a latency of 25.7us and a
bandwidth of 2.8Mb/s.
Then we manually deleted all OVN load balancer rules bound to the logical
switch and got a much better result: 18.5us and 6Mb/s.
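For anyone reproducing the second setup, the detaching step can be done with
ovn-nbctl; `ovn-default` below is the logical switch name kube-ovn uses by
default, so adjust it for your deployment:

```shell
# Show the load balancers currently bound to the logical switch
ovn-nbctl ls-lb-list ovn-default

# Detach them all: without a load-balancer argument, ls-lb-del
# removes every load balancer from the switch
ovn-nbctl ls-lb-del ovn-default
```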
> Was it clear why the total datapath cannot be offloaded to HW?
The issue we met with hw-offload is that Mellanox CX5/CX6 do not support
dp_hash and hash at the moment, and these two methods are used by the
group table to select a backend.
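To illustrate, the kind of OVS select group that the lb backend choice boils
down to looks roughly like this (a hand-written sketch with placeholder group
id, table number, and backend IPs, not the exact flows ovn-controller
installs); the `selection_method=dp_hash` part is what the NIC cannot offload:

```shell
# A select group picks one bucket (backend) per flow using dp_hash;
# all ids, table numbers, and IPs below are placeholders.
ovs-ofctl -O OpenFlow15 add-group br-int \
  'group_id=1,type=select,selection_method=dp_hash,bucket=weight:100,actions=ct(nat(dst=10.16.0.5:8080),commit,table=42),bucket=weight:100,actions=ct(nat(dst=10.16.0.6:8080),commit,table=42)'
```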
What makes things worse is that when any lb is bound to a ls, all packets go
through the lb pipeline even if they are not destined for a service. So the
total ls datapath cannot be offloaded.
We have a customized patch to bypass the lb pipeline when traffic is not
destined for a service:
https://github.com/kubeovn/ovn/commit/d26ae4de0ab070f6b602688ba808c8963f69d5c4.patch
> I am sorry that I am confused by OVN "L2" LB. I think you might mean OVN
"L3/L4" LB?
I mean the load balancers added to a ls by `ls-lb-add`; kube-ovn uses them to
replace kube-proxy.
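i.e. configuration of the kind below, one load balancer entry per Kubernetes
Service (the VIP, backend IPs, and names here are placeholders):

```shell
# Create a load balancer mapping a Service VIP to two backend Pods
ovn-nbctl lb-add lb-my-svc 10.96.0.10:80 10.16.0.5:8080,10.16.0.6:8080

# Bind it to the logical switch; after this, packets entering the
# switch traverse the lb pipeline
ovn-nbctl ls-lb-add ovn-default lb-my-svc
```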
> I am asking because if the packets hit mega flows in the kernel cache,
it shouldn't be slower than kube-proxy which also uses conntrack. If it is
HW offloaded it should be faster.
In my previous profiling it seemed unrelated to the megaflow cache. The flame
graph shows extra ovs clone and reprocess operations compared to the flame
graph without lb. I have previously introduced how to profile and optimize
kube-ovn performance, with more detail about the lb performance issue at the
beginning of this video (in Chinese):
https://www.youtube.com/watch?v=eqKHs05NUlg&t=27s
I hope it provides more help.
On Wed, 8 Jun 2022 at 23:53, Han Zhou <[email protected]> wrote:
>
>
> On Wed, Jun 8, 2022 at 8:08 AM Numan Siddique <[email protected]> wrote:
> >
> > On Wed, Jun 8, 2022 at 6:34 AM 刘梦馨 <[email protected]> wrote:
> > >
> > > Just give some input about eBPF/XDP support.
> > >
> > > We used to use OVN L2 LB to replace kube-proxy in Kubernetes, but found
> > > that
> > > the L2 LB will use conntrack and ovs clone which hurts performance
> badly.
> > > The latency
> > > for 1byte udp packet jumps from 18.5us to 25.7us and bandwidth drop
> from
> > > 6Mb/s to 2.8Mb/s.
> > >
> Thanks for the input!
> Could you tell roughly how many packets were sent in a single test? Was
> the latency measured for all the UDP packets in average? I am asking
> because if the packets hit mega flows in the kernel cache, it shouldn't be
> slower than kube-proxy which also uses conntrack. If it is HW offloaded it
> should be faster.
>
> > > Even traffic that does not target the LB VIPs sees the same performance
> > > drop, and it also means the
> > > total datapath cannot be offloaded to hardware.
> > >
>
> Was it clear why the total datapath cannot be offloaded to HW? There might
> be problems of supporting HW offloading in earlier versions of OVN. There
> have been improvements to make it more HW offload friendly.
>
> > > And finally we turn to using Cilium's chaining mode to replace the OVN
> L2
> > > LB to implement kube-proxy to
> > > resolve the above issues. We hope to see the lb optimization by
> eBPF/XDP on
> > > the OVN side.
> > >
> >
> > Thanks for your comments and inputs. I think we should definitely
> > explore optimizing this use case
> > and see if it's possible to leverage eBPF/XDP for this.
> >
>
> I am sorry that I am confused by OVN "L2" LB. I think you might mean OVN
> "L3/L4" LB?
>
> Some general thoughts on this is, OVN is primarily to program OVS (or
> other OpenFlow based datapath) to implement SDN. OVS OpenFlow is a
> data-driven approach (as mentioned by Ben in several talks). The advantage
> is that it uses caches to accelerate datapath, regardless of the number of
> pipeline stages in the forwarding logic; and the disadvantage is of course
> when a packet has a cache miss, it will be slow. So I would think the
> direction of using eBPF/XDP is better to be within OVS itself, instead of
> adding an extra stage that cannot be cached within the OVS framework,
> because even if the extra stage is very fast, it is still extra.
>
> I would consider such an extra eBPF/XDP stage in OVN directly only for the
> cases that we know it is likely to miss the OVS/HW flow caches. One example
> may be DOS attacks that always trigger CT unestablished entries, which is
> not HW offload friendly. (But I don't have concrete use cases/scenarios)
>
> In the case of OVN LB, I don't see a reason why it would miss the cache
> except for the first packets. Adding an extra eBPF/XDP stage on top of the
> OVS cached pipeline doesn't seem to improve the performance.
>
> > > On Wed, 8 Jun 2022 at 14:43, Han Zhou <[email protected]> wrote:
> > >
> > > > On Mon, May 30, 2022 at 5:46 PM <[email protected]> wrote:
> > > > >
> > > > > From: Numan Siddique <[email protected]>
> > > > >
> > > > > XDP program - ovn_xdp.c added in this RFC patch series implements
> basic
> > > > port
> > > > > security and drops any packet if the port security check fails.
> > > > > There are still few TODOs in the port security checks. Like
> > > > > - Make ovn xdp configurable.
> > > > > - Removing the ingress Openflow rules from table 73 and 74
> if ovn
> > > > xdp
> > > > > is enabled.
> > > > > - Add IPv6 support.
> > > > > - Enhance the port security xdp program for ARP/IPv6 ND
> checks.
> > > > >
> > > > > This patch adds a basic XDP support in OVN and in future we can
> > > > > leverage eBPF/XDP features.
> > > > >
> > > > > I'm not sure how much value this RFC patch adds to make use of
> eBPF/XDP
> > > > > just for port security. Submitting as RFC to get some feedback and
> > > > > start some conversation on eBPF/XDP in OVN.
> > > > >
> > > > Hi Numan,
> > > >
> > > > This is really cool. It demonstrates how OVN could leverage eBPF/XDP.
> > > >
> > > > On the other hand, for the port-security feature in XDP, I keep
> thinking
> > > > about the scenarios and it is still not very clear to me. One
> advantage I
> > > > can think of is to prevent DOS attacks from VM/Pod when invalid
> IP/MAC are
> > > > used, XDP may perform better and drop packets with lower CPU cost
> > > > (comparing with OVS kernel datapath). However, I am also wondering
> why
> > > > would an attacker use invalid IP/MAC for DOS attacks? Do you have
> some more
> > > > thoughts about the use cases?
> >
> > My idea was to demonstrate the use of eBPF/XDP and port security
> > checks were easy to do
> > before the packet hits the OVS pipeline.
> >
> Understand. It is indeed a great demonstration.
>
> > If we were to move the port security check to XDP, then the only
> > advantage we would be getting
> > in my opinion is to remove the corresponding ingress port security
> > check related OF rules from ovs-vswitchd, thereby decreasing some
> > lookups during
> > flow translation.
> >
> For slow path, it might reduce the lookups in two tables, but considering
> that we have tens of tables, this cost may be negligible?
> For fast path, there is no impact on the megaflow cache.
>
> > I'm not sure why an attacker would use invalid IP/MAC for DOS attacks.
> > But from what I know, ovn-kubernetes do want to restrict each POD to
> > its assigned IP/MAC.
> >
> Yes, restricting pods to use assigned IP/MAC is for port security, which
> is implemented by the port-security flows. I was talking about DOS attacks
> just to imagine a use case that utilizes the performance advantage of XDP.
> If it is just to detect and drop a regular amount of packets that try to
> use fake IP/MAC to circumvent security policies (ACLs), it doesn't reflect
> the benefit of XDP.
>
> > And do you have any performance results
> > > > comparing with the current OVS implementation?
> >
> > I didn't do any scale/performance related tests.
> >
> > If we were to move port security feature to XDP in OVN, then I think we
> need to
> > - Complete the TODO's like adding IPv6 and ARP/ND related checks
> > - Do some scale testing and see whether its reducing memory
> > footprint of ovs-vswitchd and ovn-controller because of the reduction
> > in OF rules
> >
>
> Maybe I am wrong, but I think port-security flows are only related to
> local LSPs on each node, which doesn't contribute much to the
> OVS/ovn-controller memory footprint, and thanks to your patches that move
> port-security flow generation from northd to ovn-controller, the central
> components are already out of the picture of the port-security related
> costs. So I guess we won't see obvious differences in scale tests.
>
> > > >
> > > > Another question is, would it work with smart NIC HW-offload, where
> VF
> > > > representer ports are added to OVS on the smart NIC? I guess XDP
> doesn't
> > > > support representer port, right?
> >
> > I think so. I don't have much experience/knowledge on this. From what
> > I understand, if datapath flows are offloaded and since XDP is not
> > offloaded, the xdp checks will be totally missed.
> > So if XDP is to be used, then offloading should be disabled.
> >
>
> Agree, although I did hope it could help for HW offload enabled
> environments to mitigate the scenarios when packets would miss the HW flow
> cache.
>
> Thanks,
> Han
>
> > Thanks
> > Numan
> >
> > > >
> > > > Thanks,
> > > > Han
> > > >
> > > > > In order to attach and detach xdp programs, libxdp [1] and libbpf
> is
> > > > used.
> > > > >
> > > > > To test it out locally, please install libxdp-devel and
> libbpf-devel
> > > > > and then compile OVN first and then compile ovn_xdp by running "make
> > > > > bpf". Copy ovn_xdp.o to either /usr/share/ovn/ or
> /usr/local/share/ovn/
> > > > >
> > > > >
> > > > > Numan Siddique (2):
> > > > > RFC: Add basic xdp/eBPF support in OVN.
> > > > > RFC: ovn-controller: Attach XDP progs to the VIFs of the logical
> > > > > ports.
> > > > >
> > > > > Makefile.am | 6 +-
> > > > > bpf/.gitignore | 5 +
> > > > > bpf/automake.mk | 23 +++
> > > > > bpf/ovn_xdp.c | 156 +++++++++++++++
> > > > > configure.ac | 2 +
> > > > > controller/automake.mk | 4 +-
> > > > > controller/binding.c | 45 +++--
> > > > > controller/binding.h | 7 +
> > > > > controller/ovn-controller.c | 79 +++++++-
> > > > > controller/xdp.c | 389
> ++++++++++++++++++++++++++++++++++++
> > > > > controller/xdp.h | 41 ++++
> > > > > m4/ovn.m4 | 20 ++
> > > > > tests/automake.mk | 1 +
> > > > > 13 files changed, 753 insertions(+), 25 deletions(-)
> > > > > create mode 100644 bpf/.gitignore
> > > > > create mode 100644 bpf/automake.mk
> > > > > create mode 100644 bpf/ovn_xdp.c
> > > > > create mode 100644 controller/xdp.c
> > > > > create mode 100644 controller/xdp.h
> > > > >
> > > > > --
> > > > > 2.35.3
> > > > >
> > > > > _______________________________________________
> > > > > dev mailing list
> > > > > [email protected]
> > > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> > > >
> > >
> > >
> > > --
> > > 刘梦馨
> > > Blog: http://oilbeater.com
> > > Weibo: @oilbeater <http://weibo.com/oilbeater>
>