On Thu, Mar 17, 2016 at 03:56:42PM -0000, Laszlo Ersek (Red Hat) wrote:
> Stefan, I too had the same immediate idea upon seeing this bug report.
> But, after I skimmed the DPDK code briefly, I think it does reset the
> virtio-net device correctly, before it tries to use it.
>
> Instead, at least based on the extensive log that Julien pasted, I
> believe the following happens: when the first instance of testpmd is
> killed ungracefully, it gets no chance at resetting the virtio-net
> device at shutdown. The vtpci_reset() call in virtio_dev_close() is
> likely never reached. This leaves the virtio queues alive, as far as
> QEMU is concerned, but in the guest, the memory that used to cover them
> goes away.
>
> So when the second instance of testpmd is started, and a bunch of memory
> is allocated and written to, I think testpmd scribbles over the
> "leftover" live virtio queues that QEMU / KVM are still watching. The
> hypervisor is allowed to notice changes to the virtqueues without
> explicit guest notifications (hence the elaborate barrier stuff in the
> Linux kernel drivers, for example). I suspect things blow up before the
> second testpmd process even thinks about using virtio-net. (It is hard
> to confirm from the log that Julien pasted, because he snipped exactly
> the part that leads up to the failure.)
>
> This failure mode (if my hunch is correct) is special to DPDK, I think.
> In a normal guest kernel scenario, the memory that covers the virtqueues
> is managed by the kernel, and you can't just kill the kernel. You might
> be able to unload the virtio-net driver module, but for that one has to
> tear down the corresponding ethX interfaces first, and I'm quite sure
> the virtio-net devices will be re-set then.
>
> We've seen the exact same problem with iPXE (in UEFI guests) as well,
> when iPXE would transfer control to the kernel or another payload; but
> iPXE got fixed: it now disconnects the virtio-net NIC (and other NICs
> too) in the ExitBootServices() callback. (I'm not perfectly happy with
> that fix for unrelated reasons, but it definitely covers this issue.)
>
> OVMF too resets virtio devices in the ExitBootServices() callbacks of
> its virtio drivers. So this failure mode seems to be special to DPDK,
> where you can kill the testpmd process and deprive it from the chance to
> clean up the virtqueues (by resetting the device).
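
For the case described above, where the process dies before
vtpci_reset() is reached, a userspace driver can at least cover
catchable signals. The following is only a minimal sketch, not DPDK
code: hypothetical_poll_queues() and hypothetical_device_reset() are
made-up stand-ins for the application's packet loop and for whatever
call ends up resetting the device. SIGKILL cannot be caught, so this
narrows the window rather than closing it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Set from the signal handler; sig_atomic_t is safe to write there. */
static volatile sig_atomic_t stop_requested;

/* Counts resets so the sketch can be checked; illustration only. */
static int reset_count;

static void on_signal(int signo)
{
    (void)signo;
    stop_requested = 1;   /* only set a flag; do the real work outside */
}

/* Install the handler for SIGINT and SIGTERM. Returns 0 on success. */
static int install_stop_handlers(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);

    if (sigaction(SIGINT, &sa, NULL) != 0)
        return -1;
    if (sigaction(SIGTERM, &sa, NULL) != 0)
        return -1;
    return 0;
}

/* Hypothetical stand-in for one iteration of the packet loop. */
static void hypothetical_poll_queues(void)
{
}

/* Hypothetical stand-in for the driver's device reset on close. */
static void hypothetical_device_reset(void)
{
    reset_count++;
}

/* Poll until a stop is requested, then reset the device before exiting. */
static void run_until_stopped(void)
{
    while (!stop_requested)
        hypothetical_poll_queues();

    hypothetical_device_reset();
}
```

The handler only sets a flag, so the reset itself runs in normal
process context after the loop exits, not inside the signal handler.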

QEMU can and should help by making this a non-fatal error: treat the
device as broken when an invalid state is reached and stop processing
virtqueues until it is reset. Fatal errors in QEMU device emulation are
a bad thing.

However, it's still a guest code bug because a driver must not abandon
an active device. Depending on the contents of the rings it could cause
spurious I/O leading to data corruption. So this needs to be fixed in
DPDK or the application.

Stefan
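
To illustrate the "treat the device as broken" idea, here is a rough
sketch of a toy device model. It is not QEMU code; struct toy_dev,
toy_dev_process() and the other names are hypothetical, and the single
sanity check (more pending buffers than the queue can hold) is just one
example of an invalid state that untrusted ring contents could produce.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define TOY_QUEUE_SIZE 256u

struct toy_dev {
    bool broken;              /* set on invalid state, cleared by reset */
    uint16_t last_avail_idx;  /* device's private position in the ring */
};

/* Mark the device broken instead of calling exit() or abort(). */
static void toy_dev_error(struct toy_dev *dev, const char *msg)
{
    fprintf(stderr, "toy-dev: %s; device marked broken\n", msg);
    dev->broken = true;
}

/*
 * Consume newly available buffers. avail_idx models a value read from
 * guest memory, so it cannot be trusted. Returns the number of buffers
 * consumed, or 0 if the device is (or just became) broken.
 */
static unsigned toy_dev_process(struct toy_dev *dev, uint16_t avail_idx)
{
    uint16_t pending;

    assert(dev != NULL);

    if (dev->broken)
        return 0;             /* ignore the queues until reset */

    /* 16-bit wraparound is intentional, as with free-running indices. */
    pending = (uint16_t)(avail_idx - dev->last_avail_idx);
    if (pending > TOY_QUEUE_SIZE) {
        toy_dev_error(dev, "avail index moved further than queue size");
        return 0;
    }

    dev->last_avail_idx = avail_idx;
    return pending;
}

/* A guest-initiated reset is the only way out of the broken state. */
static void toy_dev_reset(struct toy_dev *dev)
{
    assert(dev != NULL);
    dev->last_avail_idx = 0;
    dev->broken = false;
}
```

The point of the design is that a nonsensical ring only disables that
one device until the guest resets it, while the rest of the emulator
keeps running.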