Hi,

I tested KNI and compared it with virtio-user. The result was not what I 
expected:

KNI performed better (about +30%) in a simple netperf test with TCP and with 
UDP at different packet sizes. I thought the two would perform similarly, but 
KNI was faster in my test. I am not sure why.

Note that in my test I did not enable checksum/GSO/… offloading or multi-queue, 
since we need to do VXLAN encapsulation in software. I am using OVS 2.8.1 and 
DPDK 17.05.2.

In addition, each queue pair on virtio-user creates one vhost kthread. If we 
have many containers, it seems hard to manage the CPU usage. Is there any 
proposal or practice for limiting the CPU resources of the vhost kthreads?
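
One thing that could be tried is pinning the vhost kthreads to a fixed set of 
cores. Below is a minimal Python sketch, assuming the kernel names each worker 
"vhost-<owner pid>" and exposes it as its own entry under /proc (true for 
kernels of this era as far as I know, but please verify on your system). The 
function names and the proc_root parameter are hypothetical helpers, not part 
of DPDK or OVS.

```python
import os
from pathlib import Path


def find_vhost_kthreads(owner_pid, proc_root="/proc"):
    """Return PIDs of tasks whose comm is 'vhost-<owner_pid>'.

    Assumption: vhost workers are named after the process that owns
    the vhost device (here, the DPDK/virtio-user process).
    """
    wanted = f"vhost-{owner_pid}"
    pids = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # skip non-task entries such as 'self' or 'net'
        try:
            comm = (entry / "comm").read_text().strip()
        except OSError:
            continue  # task exited while we were scanning
        if comm == wanted:
            pids.append(int(entry.name))
    return sorted(pids)


def pin_vhost_kthreads(owner_pid, cpus, proc_root="/proc"):
    """Restrict every matching vhost kthread to the given CPU set.

    Needs sufficient privileges (typically root).
    """
    pids = find_vhost_kthreads(owner_pid, proc_root)
    for pid in pids:
        os.sched_setaffinity(pid, cpus)
    return pids
```

For example, pin_vhost_kthreads(4242, {2, 3}) would confine the vhost workers 
of process 4242 to cores 2 and 3. Note that affinity only controls where the 
kthreads run; it does not cap how much CPU time they use on those cores.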

Br,
Wang Zhike

From: Tan, Jianfeng [mailto:[email protected]]
Sent: Thursday, October 26, 2017 4:53 PM
To: 王志克; [email protected]; [email protected]
Subject: Re: VIRTIO for containers

Hi,


[Wang Zhike] I once saw you mention that something like an mmap solution might 
be used. Is it still on your roadmap? I am not sure whether it is the same as 
“vhost tx zero copy”.
Can I know the expected date for this optimization? Would a Linux kernel 
upstream module be updated, or a DPDK module? I just want to know which 
modules will be touched.

Yes, I was planning to do that, but found that it helps on the user->kernel 
path and is not so easy for the kernel->user path. It’s not the same as “vhost 
tx zero copy” (which has some restrictions, BTW). Packet mmap would share a 
block of memory between user and kernel space, so that we don’t need to copy 
(the effect is the same as with “vhost tx zero copy”). As for the date, it 
still lacks a detailed design and feasibility analysis.



1) Yes, we have done some initial tests internally, with testpmd as the vswitch 
instead of OVS-DPDK, and we were comparing with KNI for the exception path.
[Wang Zhike] Can you please indicate how to configure KNI mode? I would also 
like to compare it.

KNI is a vdev now. You can refer to this link: 
http://dpdk.org/doc/guides/nics/kni.html
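
As a rough illustration of what that guide describes: the KNI PMD is created 
by passing a --vdev=net_kni<N> argument on the EAL command line, with the 
rte_kni kernel module loaded beforehand. The Python sketch below only 
assembles such a testpmd command line; the binary name, core list and device 
count are assumptions to adapt, and kni_testpmd_cmd is a hypothetical helper.

```python
def kni_testpmd_cmd(binary="testpmd", cores="0-1", n_kni=1):
    """Build a testpmd argv that creates n_kni KNI virtual devices.

    Assumptions: 'binary' and 'cores' are placeholders for the local
    setup; the vdev name prefix 'net_kni' follows the KNI PMD guide.
    """
    cmd = [binary, "-l", cores]
    for i in range(n_kni):
        cmd.append(f"--vdev=net_kni{i}")
    # Everything after '--' is passed to testpmd itself; '-i' selects
    # interactive mode.
    cmd += ["--", "-i"]
    return cmd


if __name__ == "__main__":
    print(" ".join(kni_testpmd_cmd(n_kni=2)))
```

The resulting list can be handed to subprocess.run() as-is once the paths 
match the local installation.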




2) We also see a similarly asymmetric result. The user->kernel path not only 
copies data from mbuf to skb, but might also go up into the TCP stack (you can 
check using perf).
[Wang Zhike] Yes, indeed. On the user->kernel path, TCP/IP-related work is done 
by the vhost thread, while on the kernel->user path, TCP/IP-related work is 
done by the app (netperf in my case) in the syscall.


Actually, we might avoid that and put the TCP/IP rx work into the app thread 
with a small change to the tap driver. Currently, we use 
netif_rx/netif_receive_skb() for rx in tap, which can result in going up the 
TCP/IP stack in the vhost kthread. Instead, we could backlog the packets onto 
another CPU (the application thread's CPU?).

Thanks,
Jianfeng


