Mark McLoughlin wrote:
On Sun, 2008-11-02 at 11:48 +0200, Avi Kivity wrote:
Mark McLoughlin wrote:
Hey,
The main patch in this series is 5/6 - it just kills off the
virtio_net tx mitigation timer and does all the tx I/O in the
I/O thread.
What will it do to small packet, multi-flow loads (simulated by ping -f
-l 30 $external)?
It should improve the latency - the packets will be flushed more quickly
than the 150us timeout without blocking the guest.
But it will increase overhead, since suddenly we aren't queueing
anymore. One vmexit per small packet.
Where does the benefit come from?
There are two things going on here, I think.
First is that the timer affects latency, removing the timeout helps
that.
If the timer affects latency, then something is very wrong. We're
lacking an adjustable window.
The way I see it, the notification window should be adjusted according
to the current workload. If the link is idle, the window should be one
packet -- notify as soon as something is queued. As the workload
increases, the window increases to (safety_factor * allowable_latency /
packet_rate). The timer is set to allowable_latency to catch changes in
workload.
For example:
- allowable_latency 1ms (implies 1K vmexits/sec desired)
- current packet_rate 20K packets/sec
- safety_factor 0.8
So we request notifications every 0.8 * 20K/s * 1ms = 16 packets, and set
the timer to 1ms. Usually we get a notification every 16 packets, just
before timer expiration. If the workload increases, we get
notifications sooner, so we increase the window. If the workload drops,
the timer fires and we decrease the window.
The timer should never fire on an all-out benchmark, or in a ping test.
Second is that currently when we fill up the ring we block the guest
vcpu and flush. Thus, while we're copying a entire ring full of packets
that guest isn't making progress. Doing the copying in the I/O thread
helps there.
We're hurting our cache, and this won't work well with many NICs. At
the very least this should be done in a dedicated thread. It's also
going to damage latency.
The only real fix is to avoid the copy altogether.
Note - the only net I/O we currently do in the vcpu thread
is when the guest is saturating the link. At any other time, all the I/O
is done in the I/O thread by virtue of the timer.
This is fundamental brokenness, as mentioned above, in my
non-networking-expert opinion.
Is the overhead of managing the timer too high, or does it fire too
late and so we sleep? If the latter, can we tune it dynamically?
For example, if the guest sees it is making a lot of progress without
the host catching up (waiting on the tx timer), it can
kick_I_really_mean_this_now(), to get the host to notice.
It does that already - if the ring fills up, the guest forces a kick
which causes the host to flush the ring in the vcpu thread.
That should happen some time before the ring fills up - especially if we
make the flushing async by offloading it to some other thread.
--
error compiling committee.c: too many arguments to function