Hi Wentian, Xiaodong,

When testing VM-to-VM iperf (i.e. TCP) throughput like Wentian does, the most 
important factor is whether GSO is turned on: when using Linux as a router it 
is enabled by default, whereas when using VPP as a router it is not.
With GSO, Linux will send 64 KB TCP packets; without GSO, VPP will forward 
MTU-sized packets (probably 1500 bytes).
You can check by using tcpdump on both sides of the iperf connection and 
comparing packet sizes: with Linux routing I'd expect to see GSO-sized packets 
(tens of kilobytes), and non-GSO packets (1500 bytes) with VPP.
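For example, something like this during the iperf run (the interface name is 
just an example, and 5201 is the default iperf3 port):

    # is GSO/TSO enabled on the guest interface?
    ethtool -k eth0 | grep segmentation-offload
    # packet sizes actually seen on the wire
    tcpdump -ni eth0 -c 20 'tcp port 5201'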

With that said, the performance you get with VPP is still really slow, even 
without GSO. Getting high performance in a VM environment is tricky: you must 
take special care with the qemu configuration (virtio queue depth, IO mode, 
etc.) and with cpu-pinning (vCPU threads, vhost threads, VPP workers, etc.).
If your real use case is a traditional router connected through physical 
interfaces and forwarding packets over a network, and this is just a test 
setup, I'd recommend changing the setup to match reality; it will be much 
simpler to set up and optimize correctly.
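As a rough sketch of the kind of tuning I mean (the core numbers, queue count 
and cpu lists below are made up, adjust them to your host):

    <!-- libvirt domain XML: pin vCPUs and the emulator, enable virtio multiqueue -->
    <cputune>
      <vcpupin vcpu='0' cpuset='2'/>
      <vcpupin vcpu='1' cpuset='3'/>
      <emulatorpin cpuset='8'/>
    </cputune>
    <interface type='network'>
      <model type='virtio'/>
      <driver name='vhost' queues='4'/>
    </interface>

    # VPP startup.conf: keep the main thread and workers on dedicated cores
    cpu {
      main-core 1
      corelist-workers 2-3
    }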

Best
ben

> -----Original Message-----
> From: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> On Behalf Of Xiaodong Xu
> Sent: Monday, October 3, 2022 1:18
> To: vpp-dev@lists.fd.io
> Subject: Re: [vpp-dev] Throughput of VPP on KVM is Significantly Worse
> than Linux Kernel
> 
> Hi Wentian,
> 
> I ran a perf test with a similar topology to your setup and got the same
> result. The only difference is that the iperf server is running on the host
> rather than in another VM. The throughput is close to 20 Gbps with the Linux
> kernel data plane but only 1.5 Gbps with the VPP data plane.
> I think we might have run into the same issue as
> https://lists.fd.io/g/vpp-dev/message/9571.
> 
> Before that, I tried TRex and Pktgen-DPDK, and the results were different.
> Usually the throughput would be a bit higher with the VPP data plane than
> with the Linux kernel data plane, but not by much. When I checked the CPU
> usage with the VPP data plane (I changed the rx-mode from polling to
> interrupt), it was pretty low (< 10%), so it seems the bottleneck is not the
> CPU.
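> For reference, I switched the mode from the VPP CLI like this (the interface
> name is just an example):
> 
>       vpp# set interface rx-mode GigabitEthernet0/8/0 interrupt
>       vpp# show interface rx-placement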
> 
> I also ran a similar test with the vhost-user driver, following
> https://www.redhat.com/en/blog/hands-vhost-user-warm-welcome-dpdk, and got
> much better throughput (close to 20 Gbps with Pktgen-DPDK), thanks to the
> shared-memory access that vhost-user provides. There are far fewer memory
> copies than in the vhost-net case above.
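> The QEMU side of that setup looks roughly like this (socket path, IDs and
> memory size are placeholders; the guest memory has to be a shared hugepage
> backend for vhost-user to work):
> 
>       -object memory-backend-file,id=mem0,size=8G,mem-path=/dev/hugepages,share=on \
>       -numa node,memdev=mem0 \
>       -chardev socket,id=char0,path=/tmp/vhost-user1 \
>       -netdev type=vhost-user,id=net0,chardev=char0 \
>       -device virtio-net-pci,netdev=net0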
> 
> I haven't had a chance to run any tests in a cloud environment, but I still
> believe running in a KVM setup is an important use case for VPP. If anyone
> in the VPP community can shed some light on this issue, I believe both
> Wentian and I would appreciate it very much.
> 
> Xiaodong
> 
> On Sun, Oct 2, 2022 at 8:10 AM Bu Wentian <buwent...@outlook.com> wrote:
> 
> 
>       Hi Xiaodong,
> 
>       Thank you for your reply!
> 
>       I am indeed using VPP 22.06, installed through apt from the FD.io
> repo. The linux-cp and linux-nl plugins also came with the VPP package from
> the repo.
> 
>       The virtual NICs on my VMs use virtio (assigned with "model=virtio"
> when installing with virt-install). The VMs are connected through libvirtd
> networks (auto-created bridges). In my experiments, I can ping M2 from M1,
> and the neighbor table and routing table in VPP seem to be correct.
> 
>       I'm not sure which driver VPP is using (maybe vfio-pci?). The
> interface counters look like this:
>       vpp# show int GE1
>                     Name               Idx    State  MTU (L3/IP4/IP6/MPLS)     Counter          Count
>       GE1                               1      up          9000/0/0/0     rx packets               3719027
>                                                                           rx bytes              5630430079
>                                                                           tx packets               1107500
>                                                                           tx bytes                73176221
>                                                                           drops                         76
>                                                                           ip4                      3718961
>                                                                           ip6                           61
>                                                                           tx-error                       1
>       vpp# show int GE2
>                     Name               Idx    State  MTU (L3/IP4/IP6/MPLS)     Counter          Count
>       GE2                               2      up          9000/0/0/0     rx packets               1107520
>                                                                           rx bytes                73177597
>                                                                           tx packets               3718998
>                                                                           tx bytes              5630427889
>                                                                           drops                         63
>                                                                           ip4                      1107455
>                                                                           ip6                           62
>                                                                           tx-error                    1162
> 
>       Could you give me more information about how I can get details of
> the error packets? The main problem is that the VPP forwarding performance
> is much worse than the Linux kernel's (2 Gbps vs 26 Gbps); is there any way
> to improve it, or what can I do to find the cause?
> 
>       Sincerely,
>       Wentian
> 
> 
> 
> 
> 
> 
> ________________________________
> 
>       From: vpp-dev@lists.fd.io <vpp-d...@lists.fd.io> on behalf of
> Xiaodong Xu <stid.s...@gmail.com>
>       Sent: October 2, 2022 1:05
>       To: vpp-dev@lists.fd.io <vpp-d...@lists.fd.io>
>       Subject: Re: [vpp-dev] Throughput of VPP on KVM is Significantly Worse
> than Linux Kernel
> 
>       Which VPP version are you using in your testing? As of VPP 22.06,
> the linux-cp and linux-nl plugins are supported and binary builds are
> available from the FD.io repository
> (https://s3-docs.fd.io/vpp/22.10/gettingstarted/installing/ubuntu.html).
> 
>       Can you install VPP from the FD.io repo and try again? (BTW, you
> might want to disable the ping plugin if linux-cp is used.) I would also
> suggest adding static routes to rule out any issue with FRR (in which case
> you don't actually need the linux-cp plugin), for example as sketched below.
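>       Something along these lines (the prefix, next hop and interface name
> are just placeholders for your topology):
> 
>       # /etc/vpp/startup.conf: disable the ping plugin
>       plugins {
>         plugin ping_plugin.so { disable }
>       }
> 
>       # static route from the VPP CLI
>       vpp# ip route add 10.0.2.0/24 via 10.0.1.2 GE2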
> 
>       In the meantime, I wonder which uio driver you are using on your VPP
> machine (igb_uio, uio_pci_generic, or vfio-pci). I am assuming you are
> running the virtio-net driver in the guests and connecting M1 to R1, and R1
> to M2, with Linux kernel bridges.
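>       You can check which driver each NIC is bound to, for example:
> 
>       # on the VPP VM: kernel driver currently bound to each PCI NIC
>       lspci -k | grep -A 3 -i ethernet
>       # or from the VPP CLI
>       vpp# show pci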
> 
>       If you still run into any issues, you may want to check the neighbor
> table and routing table in the VPP system first, and maybe the interface
> counters as well.
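>       Roughly, from the VPP debug CLI:
> 
>       vpp# show ip neighbors
>       vpp# show ip fib
>       vpp# show interface
>       vpp# show errors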
> 
>       Regards,
>       Xiaodong
> 
>       On Sat, Oct 1, 2022 at 3:55 AM Bu Wentian <buwent...@outlook.com> wrote:
> 
> 
>               Hi everyone,
>               I am a beginner with VPP, and I'm trying to use VPP + FRR on
> KVM VMs as routers. I have installed VPP and FRR on Ubuntu 20.04.5 VMs and
> run them in a separate network namespace. I use the VPP linux-cp plugin to
> synchronize routes from the kernel stack to VPP. VPP and FRR seem to work,
> but when I use iperf3 to test the throughput, I find the performance of VPP
> is not good.
> 
>               I created a very simple topology to test the throughput:
>               M1 ----- R1 (with VPP) ----- M2
>               M1 and M2 are also Ubuntu VMs (without VPP), in different
> subnets. I ran the iperf3 server on M1 and the client on M2, but only got
> about 2.1 Gbps throughput, which is significantly worse than using the
> Linux kernel as a router (about 26.1 Gbps).
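>               The test commands are roughly (M1's address is a placeholder):
> 
>               # on M1
>               iperf3 -s
>               # on M2
>               iperf3 -c 10.0.1.10 -t 10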
> 
>               I ran another experiment with this topology:
>               M1 ------ R1 (with VPP) ---- R2 (with VPP) ------ M2
>               The iperf3 result is even worse (only 1.6 Gbps).
> 
>               I also noticed that many retransmissions happened during the
> iperf3 test. If I use the Linux kernel as the router rather than VPP, no
> retransmissions occur.
>               Part of the iperf3 output:
>               [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
>               [  5]   0.00-1.00   sec   166 MBytes  1.39 Gbits/sec   23    344 KBytes
>               [  5]   1.00-2.00   sec   179 MBytes  1.50 Gbits/sec   49    328 KBytes
>               [  5]   2.00-3.00   sec   203 MBytes  1.70 Gbits/sec   47    352 KBytes
>               [  5]   3.00-4.00   sec   203 MBytes  1.70 Gbits/sec   54    339 KBytes
>               [  5]   4.00-5.00   sec   211 MBytes  1.77 Gbits/sec   59    325 KBytes
> 
> 
>               Another phenomenon I found is that when I ran iperf3 directly
> between R1 and R2, I got essentially zero throughput. The iperf3 output
> looks like this:
>               [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
>               [  5]   0.00-1.00   sec   324 KBytes  2.65 Mbits/sec    4   8.74 KBytes
>               [  5]   1.00-2.00   sec  0.00 Bytes  0.00 bits/sec    1   8.74 KBytes
>               [  5]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec    0   8.74 KBytes
>               [  5]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec    1   8.74 KBytes
>               [  5]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec    0   8.74 KBytes
>               [  5]   5.00-6.00   sec  0.00 Bytes  0.00 bits/sec    0   8.74 KBytes
>               [  5]   6.00-7.00   sec  0.00 Bytes  0.00 bits/sec    1   8.74 KBytes
>               [  5]   7.00-8.00   sec  0.00 Bytes  0.00 bits/sec    0   8.74 KBytes
>               [  5]   8.00-9.00   sec  0.00 Bytes  0.00 bits/sec    0   8.74 KBytes
>               [  5]   9.00-10.00  sec  0.00 Bytes  0.00 bits/sec    0   8.74 KBytes
>               - - - - - - - - - - - - - - - - - - - - - - - - -
>               [ ID] Interval           Transfer     Bitrate         Retr
>               [  5]   0.00-10.00  sec   324 KBytes   266 Kbits/sec    7             sender
>               [  5]   0.00-10.00  sec  0.00 Bytes  0.00 bits/sec                  receiver
> 
> 
>               All my VMs have 4 vCPUs and 8 GB RAM. The host machine has
> 16 cores (32 threads) and 32 GB RAM.
>               The VMs are connected by libvirtd networks.
>               I installed VPP + FRR following this tutorial:
> https://ipng.ch/s/articles/2021/12/23/vpp-playground.html
>               The VPP startup.conf is in the attachment.
> 
>               I want to know why the VPP throughput is worse than the
> Linux kernel's, and what I can do to improve it (I hope it can be better
> than Linux kernel forwarding). I have searched on Google for a solution but
> found nothing helpful. I would appreciate any help. Please contact me if
> more information or logs are needed.
> 
> 
>               Sincerely,
>               Wentian Bu
> 
> 
> 
> 
> 
> 
> 
