* Sasha Levin <[email protected]> wrote:
> On Fri, 2011-04-29 at 15:13 +0800, Asias He wrote:
> > On 04/29/2011 02:52 PM, Pekka Enberg wrote:
> > > Please make that IRQ latency fix a separate patch. Don't we need to do
> > > it for TX path as well, btw?
> > >
> >
> > No. We only need it in RX path. Sasha's threadpool patch breaks this.
> > I'm just moving it back.
> >
>
> I've moved the kvm__irq_line() call out because what would happen
> sometimes is the following:
> - Call kvm__irq_line() to signal completion.
> - virt_queue__available() would return true.
> - readv() call blocks.
>
> I figured it happens because we catch virt_queue while it's being
> updated by the guest and use an erroneous state of it.
How can the vring.avail flag be 'erroneous'?
Can virt_queue__available() really be true while it's not really true? How is
that possible and what are the semantics / expected behavior of interaction
with the virtual ring?
It's this code AFAICS:
	if (!vq->vring.avail)
		return 0;

	return vq->vring.avail->idx != vq->last_avail_idx;
And it's used like this:
static void virtio_net_rx_callback(struct kvm *self, void *param)
{
	struct iovec iov[VIRTIO_NET_QUEUE_SIZE];
	struct virt_queue *vq;
	uint16_t out, in;
	uint16_t head;
	int len;

	vq = param;

	while (virt_queue__available(vq)) {
		head = virt_queue__get_iov(vq, iov, &out, &in, self);
		len = readv(net_device.tap_fd, iov, in);
		virt_queue__set_used_elem(vq, head, len);
	}

	kvm__irq_line(self, VIRTIO_NET_IRQ, 1);
}
The guest kernel has access to this same virt_queue structure and updates it -
possibly in parallel to the virtio-net thread.
( One thing that sticks out is that there are no memory barriers used -
although very likely that is not needed right now and it is not related to
the stalls. )
The rx_callback() is the one where the host receives packets from the TAP
device, right? I.e. the host receives brand new packets here and forwards them
to the guest.
If that is so then indeed the right approach might be to signal the guest every
time we manage to readv() something - there might not come any other packet for
a long time. But the reason is not some 'erroneous state' - all state is
perfectly fine, this is simply a basic property of the event loop that the rx
thread implements ...
And no, the mutex lock is not needed around the guest signalling - why would it
be needed?
The thing is, the queueing and state machine code within virtio-net.c is rather
hard to read - do people actually understand it? It took me several tries until
I convinced myself that with this fix it would work. The whole file does not
have a *single* comment!
So please document it much better, document the state machine, document the
interaction between the rx and tx threads and the rx and tx queue, document the
interaction between guest and host, outline what happens when any of the queues
gets full, etc. etc. Parallel state machines are exceedingly complex and
subtle, not documenting them is one of the seven deadly sins ...
Thanks,
Ingo