On Fri, 16 Mar 2018 16:05:17 +0800 Jason Wang <jasow...@redhat.com> wrote:
> On 2018/03/08 02:57, Greg Kurz wrote:
> > If the backend could not transmit a packet right away for some reason,
> > the packet is queued for asynchronous sending. The corresponding vq
> > element is tracked in the async_tx.elem field of the VirtIONetQueue,
> > for later freeing when the transmission is complete.
> >
> > If a reset happens before completion, virtio_net_tx_complete() will push
> > async_tx.elem back to the guest anyway, and we end up with the inuse flag
> > of the vq being equal to -1. The next call to virtqueue_pop() is then
> > likely to fail with "Virtqueue size exceeded".
> >
> > This can be reproduced easily by starting a guest without a net backend,
> > doing a system reset when it is booted, and finally snapshotting it.
> >
> > The appropriate fix is to ensure that such an asynchronous transmission
> > cannot survive a device reset. So for all queues, we first try to send
> > the packet again, and eventually we purge it if the backend still could
> > not deliver it.
> >
> > Reported-by: R. Nageswara Sastry <nasas...@in.ibm.com>
> > Buglink: https://github.com/open-power-host-os/qemu/issues/37
> > Signed-off-by: Greg Kurz <gr...@kaod.org>
> > ---
> >  hw/net/virtio-net.c | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> >
> > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > index 188744e17d57..eea3cdb2c700 100644
> > --- a/hw/net/virtio-net.c
> > +++ b/hw/net/virtio-net.c
> > @@ -422,6 +422,7 @@ static RxFilterInfo *virtio_net_query_rxfilter(NetClientState *nc)
> >  static void virtio_net_reset(VirtIODevice *vdev)
> >  {
> >      VirtIONet *n = VIRTIO_NET(vdev);
> > +    int i;
> >
> >      /* Reset back to compatibility mode */
> >      n->promisc = 1;
> > @@ -445,6 +446,16 @@ static void virtio_net_reset(VirtIODevice *vdev)
> >      memcpy(&n->mac[0], &n->nic->conf->macaddr, sizeof(n->mac));
> >      qemu_format_nic_info_str(qemu_get_queue(n->nic), n->mac);
> >      memset(n->vlans, 0, MAX_VLAN >> 3);
> > +
> > +    /* Flush any async TX */
> > +    for (i = 0; i < n->max_queues; i++) {
> > +        NetClientState *nc = qemu_get_subqueue(n->nic, i);
> > +
> > +        if (!qemu_net_queue_flush(nc->peer->incoming_queue)) {
> > +            qemu_net_queue_purge(nc->peer->incoming_queue, nc);
> > +        }
>
> Looks like we can use qemu_flush_or_purge_queued_packets(nc->peer) here.
>

It should be made extern first, but we can use it indeed.

> But a question: you said it could be reproduced without a backend; in
> that case nc->peer should be NULL, I believe, or we won't even get here
> since qemu_sendv_packet_async() won't return zero?
>

My bad, I didn't use appropriate wording. The issue is always reproducible
if you only pass '-net nic,model=virtio', without any other -net option
that would provide a functional network, ie:

-device virtio-net-pci,netdev=netdev0 -netdev hubport,id=netdev0,hubid=0

So we do have a hubport peer, but since it isn't connected to the host
network, net_hub_port_can_receive() returns 0, and so does
qemu_sendv_packet_async().

What about the following change to the changelog?

"This can be reproduced easily by starting a guest with a hubport backend
that is not connected to a functional network, eg,

-device virtio-net-pci,netdev=netdev0 -netdev hubport,id=netdev0,hubid=0"

Cheers,

--
Greg

> Thanks
>
> > +        assert(!virtio_net_get_subqueue(nc)->async_tx.elem);
> > +    }
> >  }
> >
> >  static void peer_test_vnet_hdr(VirtIONet *n)