On 6/24/26 17:00, Denis V. Lunev wrote:
> On 6/24/26 16:55, David Hildenbrand (Arm) wrote:
>> On 6/24/26 16:08, Denis V. Lunev wrote:
>>> Commit 8bd2fa086a04 ("virtio: break and reset virtio devices on
>>> device_shutdown()") added a generic virtio bus .shutdown handler that
>>> breaks and resets every virtio device during device_shutdown(), i.e. on
>>> reboot and kexec.
>>>
>>> virtio_balloon provides no .shutdown of its own, so that generic path
>>> runs while the balloon's asynchronous work is still armed. Once the
>>> device has been broken, virtqueue_add_inbuf() in
>>> virtballoon_free_page_report() returns -EIO and trips its
>>> WARN_ON_ONCE(). On a kernel booted with panic_on_warn that turns an
>>> ordinary reboot, for example a kexec based upgrade, into a fatal panic
>>> in the middle of device_shutdown(), so the machine never reaches the
>>> new kernel.
>>>
>>> Relaxing that single WARN_ON_ONCE() would only hide the symptom: the
>>> inflate/deflate and OOM paths do not warn, they call
>>> wait_event(vb->acked, ...) and would instead block forever on a broken
>>> queue that can no longer complete. The device has to be quiesced, not
>>> just kept quiet.
>>>
>>> Add a .shutdown handler that quiesces the balloon via the shared
>>> virtballoon_quiesce() helper while the device is still alive, and only
>>> then breaks and resets it via virtio_device_shutdown(). Unlike
>>> virtballoon_remove() the balloon workqueue is not destroyed, as shutdown
>>> does not free the device and cancel_work_sync() together with stop_update
>>> already prevent any further work from being queued.
>>>
>>> Fixes: 8bd2fa086a04 ("virtio: break and reset virtio devices on device_shutdown()")
>>> Signed-off-by: Denis V. Lunev <[email protected]>
>>> ---
>>>  drivers/virtio/virtio_balloon.c | 7 +++++++
>>>  1 file changed, 7 insertions(+)
>>>
>>> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
>>> index 5b02d9191ac6..26fc3c40d5b2 100644
>>> --- a/drivers/virtio/virtio_balloon.c
>>> +++ b/drivers/virtio/virtio_balloon.c
>>> @@ -1137,6 +1137,12 @@ static void virtballoon_remove(struct virtio_device *vdev)
>>>     kfree(vb);
>>>  }
>>>  
>>> +static void virtballoon_shutdown(struct virtio_device *vdev)
>>> +{
>>> +   virtballoon_quiesce(vdev->priv);
>>> +   virtio_device_shutdown(vdev);
>>> +}
>> I'm curious why virtio_gpu_shutdown() doesn't need that (did not look into
>> the details).
>>
>> Reviewed-by: David Hildenbrand (Arm) <[email protected]>
>>
> I will spend more time on other drivers once we are done
> with this. I have a strong candidate: virtio-mem.

Heh, I briefly checked and it should handle it better I think.

If virtqueue_add_sgs() fails, it propagates the error (-EIO?) back to the main
loop where we end up in

	switch (rc) {
	...
	default:
		/* Unknown error, mark as broken */
		dev_err(&vm->vdev->dev, ...
		vm->broken = true;
	}

And just stop.

But I didn't actually look into the details.
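
To make that pattern concrete, here is a minimal user-space sketch of it. This is not the virtio-mem code: struct fake_dev, fake_send_request() and fake_run_once() are hypothetical names made up for illustration, and the only assumption taken from the above is that a failed submission (e.g. -EIO) falls into the default case, sets a "broken" flag, and stops further work instead of warning or waiting.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for driver state; not the real virtio-mem struct. */
struct fake_dev {
	bool transport_broken;	/* queue was broken, e.g. during shutdown */
	bool broken;		/* driver-side flag: stop issuing requests */
	int requests_sent;	/* number of successful submissions */
};

/* Stand-in for a submission that fails with -EIO once the queue is broken. */
static int fake_send_request(struct fake_dev *d)
{
	if (d->transport_broken)
		return -EIO;
	d->requests_sent++;
	return 0;
}

/* One pass of the main loop; returns false when no further work should run. */
static bool fake_run_once(struct fake_dev *d)
{
	int rc;

	if (d->broken)
		return false;

	rc = fake_send_request(d);
	switch (rc) {
	case 0:
		return true;
	default:
		/* Unknown error, mark as broken and stop; no WARN, no waiting. */
		fprintf(stderr, "fake_dev: unknown error %d, marking as broken\n", rc);
		d->broken = true;
		return false;
	}
}
```

With that shape, the first failure after the queue breaks is logged once and every later pass returns early, so nothing trips a warning or blocks on a completion that can never arrive.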

-- 
Cheers,

David
