Re: [Qemu-devel] [PATCH] vhost: fix crash on virtio_error while device stop

Ilya Maximets Thu, 14 Dec 2017 07:00:06 -0800


On 14.12.2017 17:31, Ilya Maximets wrote:
> One update for the testing scenario:
> 
> No need to kill OVS. The issue is reproducible with a simple 'del-port'
> and 'add-port'. The virtio driver in the guest can crash on either
> operation. In my case it usually crashes on 'add-port' after the deletion.
> 
> Hi Maxime,
> I had already seen the patches below and the original Linux kernel virtio
> issue. I just did not have enough time to test them.
> I have now tested the patches below, and they fix the virtio driver crash.
> Thanks for the suggestion.
> 
> Michael,
> I tested "[PATCH] virtio_error: don't invoke status callbacks"
> and it fixes the QEMU crash in the case of a broken guest index.
> Thanks.
> 
> Best regards, Ilya Maximets.
> 
> P.S. Previously I mentioned that I could not reproduce the virtio driver
>      crash with "[PATCH] virtio_error: don't invoke status callbacks"


It should be "[PATCH dontapply] virtio: rework set_status callbacks".
Sorry again.

>      applied. I was wrong: I can reproduce it now. The system was
>      misconfigured. Sorry.
> 
> 
> On 14.12.2017 12:01, Maxime Coquelin wrote:
>> Hi Ilya,
>>
>> On 12/14/2017 08:06 AM, Ilya Maximets wrote:
>>> On 13.12.2017 22:48, Michael S. Tsirkin wrote:
>>>> On Wed, Dec 13, 2017 at 04:45:20PM +0300, Ilya Maximets wrote:
>>>>>>> That
>>>>>>> looks very strange. Some of the functions get 'old_status', others
>>>>>>> get 'new_status'. I'm a bit confused.
>>>>>>
>>>>>> OK, fair enough. Fixed - let's pass old status everywhere,
>>>>>> users that need the new one can get it from the vdev.
>>>>>>
>>>>>>> And it's not functional in its current state:
>>>>>>>
>>>>>>> hw/net/virtio-net.c:264:28: error: ‘status’ undeclared
>>>>>>
>>>>>> Fixed too. new version below.
>>>>>
>>>>> This doesn't fix the segmentation fault.
>>>>
>>>> Hmm you are right. Looking into it.
>>>>
>>>>> I get exactly the same crash stack trace:
>>>>>
>>>>> #0  vhost_memory_unmap hw/virtio/vhost.c:446
>>>>> #1  vhost_virtqueue_stop hw/virtio/vhost.c:1155
>>>>> #2  vhost_dev_stop hw/virtio/vhost.c:1594
>>>>> #3  vhost_net_stop_one hw/net/vhost_net.c:289
>>>>> #4  vhost_net_stop hw/net/vhost_net.c:368
>>>>> #5  virtio_net_vhost_status (old_status=15 '\017', n=0x5625f3901100) at hw/net/virtio-net.c:180
>>>>> #6  virtio_net_set_status (vdev=0x5625f3901100, old_status=<optimized out>) at hw/net/virtio-net.c:254
>>>>> #7  virtio_set_status (vdev=vdev@entry=0x5625f3901100, val=<optimized out>) at hw/virtio/virtio.c:1152
>>>>> #8  virtio_error (vdev=0x5625f3901100, fmt=fmt@entry=0x5625f014f688 "Guest says index %u is available") at hw/virtio/virtio.c:2460
>>>>
>>>> BTW what is causing this? Why is guest avail index corrupted?
>>>
>>> My testing environment for the issue:
>>>
>>> * QEMU 2.10.1
>>
>> Could you try to backport below patch and try again killing OVS?
>>
>> commit 2ae39a113af311cb56a0c35b7f212dafcef15303
>> Author: Maxime Coquelin <maxime.coque...@redhat.com>
>> Date:   Thu Nov 16 19:48:35 2017 +0100
>>
>>     vhost: restore avail index from vring used index on disconnection
>>
>>     vhost_virtqueue_stop() gets avail index value from the backend,
>>     except if the backend is not responding.
>>
>>     It happens when the backend crashes, and in this case, internal
>>     state of the virtio queue is inconsistent, making packets
>>     to corrupt the vring state.
>>
>>     With a Linux guest, it results in following error message on
>>     backend reconnection:
>>
>>     [   22.444905] virtio_net virtio0: output.0:id 0 is not a head!
>>     [   22.446746] net enp0s3: Unexpected TXQ (0) queue failure: -5
>>     [   22.476360] net enp0s3: Unexpected TXQ (0) queue failure: -5
>>
>>     Fixes: 283e2c2adcb8 ("net: virtio-net discards TX data after link down")
>>     Cc: qemu-sta...@nongnu.org
>>     Signed-off-by: Maxime Coquelin <maxime.coque...@redhat.com>
>>     Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
>>     Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
>>
>> commit 2d4ba6cc741df15df6fbb4feaa706a02e103083a
>> Author: Maxime Coquelin <maxime.coque...@redhat.com>
>> Date:   Thu Nov 16 19:48:34 2017 +0100
>>
>>     virtio: Add queue interface to restore avail index from vring used index
>>
>>     In case of backend crash, it is not possible to restore internal
>>     avail index from the backend value as vhost_get_vring_base
>>     callback fails.
>>
>>     This patch provides a new interface to restore internal avail index
>>     from the vring used index, as done by some vhost-user backend on
>>     reconnection.
>>
>>     Signed-off-by: Maxime Coquelin <maxime.coque...@redhat.com>
>>     Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
>>     Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
>>
>>
>> Cheers,
>> Maxime
>>
>>
>>
> 
>
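
For anyone following the thread, the idea described in the two quoted
commit messages (take the avail index from the backend when it answers,
otherwise fall back to the used index stored in the vring) can be sketched
roughly as below. This is only a stand-alone illustration under my own
assumptions, not the QEMU code: struct toy_queue, toy_backend_get_base()
and toy_queue_stop() are hypothetical names, and the real implementation
works on QEMU's own virtqueue structures.

```c
#include <stdint.h>

/* Hypothetical, simplified queue state; NOT QEMU's VirtQueue. */
struct toy_queue {
    uint16_t used_idx;        /* used index as stored in the vring */
    uint16_t last_avail_idx;  /* device-side internal avail index */
};

/*
 * Hypothetical stand-in for asking the backend for its ring base.
 * Returns 0 and fills *out on success, -1 if the backend is gone.
 */
int toy_backend_get_base(int backend_alive, uint16_t backend_idx,
                         uint16_t *out)
{
    if (!backend_alive) {
        return -1;
    }
    *out = backend_idx;
    return 0;
}

/*
 * On stop, prefer the index reported by the backend. If the backend
 * does not respond, restore the internal avail index from the used
 * index so that the internal state stays consistent with the vring.
 */
void toy_queue_stop(struct toy_queue *q, int backend_alive,
                    uint16_t backend_idx)
{
    uint16_t base;

    if (toy_backend_get_base(backend_alive, backend_idx, &base) == 0) {
        q->last_avail_idx = base;
    } else {
        q->last_avail_idx = q->used_idx;
    }
}
```

With a responsive backend, toy_queue_stop() simply stores the value the
backend reports. With a dead backend it copies used_idx instead, which is
the fallback the commit messages describe for the disconnection case.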
