On 14.12.2017 17:31, Ilya Maximets wrote:
> One update for the testing scenario:
>
> No need to kill OVS. The issue is reproducible with a simple 'del-port'
> and 'add-port'. The virtio driver in the guest can crash on both
> operations. In my case it most often crashes on 'add-port' after deletion.
>
> Hi Maxime,
> I already saw the patches below and the original Linux kernel virtio issue.
> I just did not have enough time to test them.
> Now I have tested the patches below and they fix the virtio driver crash.
> Thanks for the suggestion.
>
> Michael,
> I tested "[PATCH] virtio_error: don't invoke status callbacks"
> and it fixes the QEMU crash in case of a broken guest index.
> Thanks.
>
> Best regards, Ilya Maximets.
>
> P.S. Previously I mentioned that I could not reproduce the virtio driver
> crash with "[PATCH] virtio_error: don't invoke status callbacks"
It should be "[PATCH dontapply] virtio: rework set_status callbacks".
Sorry again.

> applied. I was wrong. I can reproduce now. System was misconfigured.
> Sorry.
>
>
> On 14.12.2017 12:01, Maxime Coquelin wrote:
>> Hi Ilya,
>>
>> On 12/14/2017 08:06 AM, Ilya Maximets wrote:
>>> On 13.12.2017 22:48, Michael S. Tsirkin wrote:
>>>> On Wed, Dec 13, 2017 at 04:45:20PM +0300, Ilya Maximets wrote:
>>>>>>> That
>>>>>>> looks very strange. Some of the functions get 'old_status', others
>>>>>>> the 'new_status'. I'm a bit confused.
>>>>>>
>>>>>> OK, fair enough. Fixed - let's pass old status everywhere,
>>>>>> users that need the new one can get it from the vdev.
>>>>>>
>>>>>>> And it's not functional in current state:
>>>>>>>
>>>>>>> hw/net/virtio-net.c:264:28: error: ‘status’ undeclared
>>>>>>
>>>>>> Fixed too. new version below.
>>>>>
>>>>> This doesn't fix the segmentation fault.
>>>>
>>>> Hmm you are right. Looking into it.
>>>>
>>>>> I have exactly the same crash stack trace:
>>>>>
>>>>> #0 vhost_memory_unmap hw/virtio/vhost.c:446
>>>>> #1 vhost_virtqueue_stop hw/virtio/vhost.c:1155
>>>>> #2 vhost_dev_stop hw/virtio/vhost.c:1594
>>>>> #3 vhost_net_stop_one hw/net/vhost_net.c:289
>>>>> #4 vhost_net_stop hw/net/vhost_net.c:368
>>>>> #5 virtio_net_vhost_status (old_status=15 '\017', n=0x5625f3901100) at
>>>>> hw/net/virtio-net.c:180
>>>>> #6 virtio_net_set_status (vdev=0x5625f3901100, old_status=<optimized
>>>>> out>) at hw/net/virtio-net.c:254
>>>>> #7 virtio_set_status (vdev=vdev@entry=0x5625f3901100, val=<optimized
>>>>> out>) at hw/virtio/virtio.c:1152
>>>>> #8 virtio_error (vdev=0x5625f3901100, fmt=fmt@entry=0x5625f014f688
>>>>> "Guest says index %u is available") at hw/virtio/virtio.c:2460
>>>>
>>>> BTW what is causing this? Why is guest avail index corrupted?
>>>
>>> My testing environment for the issue:
>>>
>>> * QEMU 2.10.1
>>
>> Could you try to backport the patches below and try killing OVS again?
>>
>> commit 2ae39a113af311cb56a0c35b7f212dafcef15303
>> Author: Maxime Coquelin <maxime.coque...@redhat.com>
>> Date: Thu Nov 16 19:48:35 2017 +0100
>>
>>     vhost: restore avail index from vring used index on disconnection
>>
>>     vhost_virtqueue_stop() gets avail index value from the backend,
>>     except if the backend is not responding.
>>
>>     It happens when the backend crashes, and in this case, internal
>>     state of the virtio queue is inconsistent, making packets
>>     to corrupt the vring state.
>>
>>     With a Linux guest, it results in following error message on
>>     backend reconnection:
>>
>>     [ 22.444905] virtio_net virtio0: output.0:id 0 is not a head!
>>     [ 22.446746] net enp0s3: Unexpected TXQ (0) queue failure: -5
>>     [ 22.476360] net enp0s3: Unexpected TXQ (0) queue failure: -5
>>
>>     Fixes: 283e2c2adcb8 ("net: virtio-net discards TX data after link down")
>>     Cc: qemu-sta...@nongnu.org
>>     Signed-off-by: Maxime Coquelin <maxime.coque...@redhat.com>
>>     Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
>>     Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
>>
>> commit 2d4ba6cc741df15df6fbb4feaa706a02e103083a
>> Author: Maxime Coquelin <maxime.coque...@redhat.com>
>> Date: Thu Nov 16 19:48:34 2017 +0100
>>
>>     virtio: Add queue interface to restore avail index from vring used index
>>
>>     In case of backend crash, it is not possible to restore internal
>>     avail index from the backend value as vhost_get_vring_base
>>     callback fails.
>>
>>     This patch provides a new interface to restore internal avail index
>>     from the vring used index, as done by some vhost-user backend on
>>     reconnection.
>>
>>     Signed-off-by: Maxime Coquelin <maxime.coque...@redhat.com>
>>     Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
>>     Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
>>
>>
>> Cheers,
>> Maxime