On 2017-10-19 04:17, Matthew Rosato wrote:
2. It might be useful to shorten the traffic path as a reference. What I am
running is briefly like:
     pktgen(host kernel) -> tap(x) -> guest(DPDK testpmd)

In my experience, the bridge driver (br_forward(), etc.) might impact
performance, so I eventually settled on this simplified testbed, which fully
isolates the traffic from both userspace and the host kernel stack (1 and 50
instances, bridge driver, etc.) and therefore reduces potential interference.

The downside of this is that it needs DPDK support in the guest. Has this ever
been run on an s390x guest? An alternative approach is to run XDP drop directly
on the virtio-net NIC in the guest, though this requires compiling XDP inside
the guest, which needs a newer distro (Fedora 25+ in my case, or Ubuntu 16.10,
not sure).

I made an attempt at DPDK, but it has not been run on s390x as far as
I'm aware and didn't seem trivial to get working.

So instead I took your alternate suggestion & did:
pktgen(host) -> tap(x) -> guest(xdp_drop)

When running this setup, I am not able to reproduce the regression.  As
mentioned previously, I am also unable to reproduce when running one end
of the uperf connection from the host - I have only ever been able to
reproduce when both ends of the uperf connection are running within a guest.


Thanks for the test. Looking at the code, the only obvious difference when
BATCH is 1 is that one spinlock, previously taken via tun_peek_len(), is now
avoided since we can do it locally. I wonder whether this speeds up
handle_rx() a little and then leads to more wakeups at some rates/sizes of
TCP stream. To test this, maybe you can try:

- enable busy polling, using poll-us=1000, and see if we can still get the regression
- measure the pps of pktgen(vm1) -> tap1 -> bridge -> tap2 -> vm2
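
For the pps measurement, one possible way to get a number is to sample an
interface packet counter twice and divide by the elapsed time. This is only a
rough sketch: the helper names are made up for illustration, and which
interface and counter to sample (for example tx_packets on tap2 on the host,
or rx_packets on the NIC inside vm2) is an assumption that depends on where
you measure.

```python
import time
from pathlib import Path


def read_counter(ifname, counter="rx_packets", root="/sys/class/net"):
    """Read one per-interface statistics counter from sysfs (Linux only)."""
    return int(Path(root, ifname, "statistics", counter).read_text())


def packets_per_second(before, after, seconds):
    """Average packet rate between two counter samples `seconds` apart."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return (after - before) / seconds


def sample_pps(ifname, interval=1.0, counter="rx_packets"):
    """Sample the counter twice, `interval` seconds apart, and return pps."""
    start = read_counter(ifname, counter)
    t0 = time.monotonic()
    time.sleep(interval)
    end = read_counter(ifname, counter)
    # Use the measured elapsed time rather than the requested interval,
    # since sleep() can overshoot.
    return packets_per_second(start, end, time.monotonic() - t0)
```

Something like sample_pps("tap2", interval=10, counter="tx_packets") on the
host, run once per kernel while pktgen is active in vm1, should be enough to
compare the two.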

Michael, any other possibilities in mind?

Thanks
