On 10/23/25 02:33, Raphael Norwitz wrote:
On Wed, Oct 22, 2025 at 3:59 AM Alexandr Moshkov
<[email protected]> wrote:
Hi!
On 10/21/25 23:54, Raphael Norwitz wrote:
The logic looks ok from the vhost-user-blk side but some comments inline.
On Mon, Oct 20, 2025 at 1:47 AM Alexandr Moshkov
<[email protected]> wrote:
Hi!
During inter-host migration, waiting for disk requests to be drained
in the vhost-user backend can incur significant downtime.
This can be avoided if QEMU migrates the inflight region in vhost-user-blk.
That way, during the QEMU migration, the vhost-user backend can cancel all
inflight requests, and after migration they will be re-executed on the
destination host.
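Roughly, the idea is to put the inflight buffer into the migration stream
and restore it on the destination. A simplified sketch (not the exact patch
code; error handling and buffer allocation are omitted):

/* Simplified sketch, not the exact patch code: serialize the inflight
 * buffer (struct vhost_inflight from include/hw/virtio/vhost.h) into
 * the migration stream and read it back on the destination. */
void vhost_dev_save_inflight(struct vhost_inflight *inflight, QEMUFile *f)
{
    qemu_put_be64(f, inflight->size);
    qemu_put_be16(f, inflight->queue_size);
    qemu_put_buffer(f, inflight->addr, inflight->size);
}

int vhost_dev_load_inflight(struct vhost_inflight *inflight, QEMUFile *f)
{
    uint64_t size = qemu_get_be64(f);

    inflight->queue_size = qemu_get_be16(f);
    /* the shared inflight buffer of 'size' bytes has to be (re)allocated
     * here, e.g. with qemu_memfd_alloc(), so the backend can map it again */
    qemu_get_buffer(f, inflight->addr, size);
    inflight->size = size;
    return 0;
}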
At first, I tried to implement migration for all vhost-user devices that
support inflight at once, but this would require a lot of changes both in
vhost-user-blk (to move the logic into the base class) and in the
vhost-user-base base class (an inflight implementation and remodeling, plus
a large refactor).
Even if it's a more significant change, I'd rather generalize as much
logic as possible and expose it as a vhost-user protocol feature. IMO
too much device-agnostic vhost-user code is being pushed into
vhost-user-blk.
As far as I understand (correct me if I'm wrong), this feature must be
implemented in the device itself (along with the location of the inflight
field) or in some base class for vhost-user devices. For now, vhost-user-blk
doesn't have one.
The closest base class I could find is vhost-user-base, but to use it, it
would have to implement all the device-agnostic code from vhost-user-blk
without breaking the existing vhost-user-base derived devices. For example,
to support inflight migration in the base class, we would first need to
implement the inflight field in it, along with the reconnect feature. Then,
with a big refactor, vhost-user-blk would somehow have to inherit from it,
along with other devices such as vhost-user-fs.
So, IMO, this looks like a whole separate track that will need to be tackled.
I get that inflight is inside struct VhostUserBlk, not in struct
vhost_user. Even if not all device types support inflight, I think that
could probably be changed? If we do keep things as you have them here,
patch 1 should probably move vhost_dev_load_inflight() and
vhost_dev_save_inflight() into vhost-user.c rather than vhost.c, since
only vhost-user devices seem to use inflight.
At first, I tried to implement inflight in the vhost_user structure, but
that structure is initialized on every vhost_user_blk_connect() and
destroyed on every vhost_user_blk_disconnect(), while the inflight region
must survive a device disconnect.
During an active migration, the vhost_user struct would be destroyed before
the inflight region is saved.
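Roughly, the lifecycle looks like this (simplified, not the exact code):

/* Rough illustration, not the exact code: the backend-private
 * struct vhost_user is allocated in vhost_user_blk_connect() (via
 * vhost_dev_init()) and freed again in vhost_user_blk_disconnect()
 * (via vhost_dev_cleanup()), so anything stored there is lost on every
 * backend reconnect.  The inflight region therefore lives in
 * VHostUserBlk, which exists for the whole lifetime of the device. */
static void vhost_user_blk_disconnect(DeviceState *dev)
{
    VHostUserBlk *s = VHOST_USER_BLK(dev);

    vhost_dev_cleanup(&s->dev);   /* this frees the struct vhost_user */
    /* s->inflight is intentionally left untouched here */
}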
And you are right about the location of the functions in patch 1. I will
move them to the vhost-user.c file, which will be more appropriate. Thanks!
As Markus noted this also conflicts significantly with Vladimir's
series so I'd suggest waiting until those are in, or possibly
attempting to generalize on top of his changes.
Yeah, but I think I just need to perform the inflight migration only when
some non-local migration flag is set.
I'm not sure about that. While it may be completely safe in your case
to avoid GET_VRING_BASE/quiescing the backend, it could introduce
corruption cases on other vhost-user-blk storage backends which work
differently. I think this would require an "opt-in" migration
parameter over and above checking that it's not a local migration.
In patch 2 I introduce a migration capability for that. Or do you think
we need something like a per-device parameter?
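Just to illustrate the shape of the check (the capability name below is a
placeholder, not necessarily what patch 2 uses):

/* Sketch only: the capability name is a placeholder. */
bool migrate_vhost_user_inflight(void)
{
    MigrationState *s = migrate_get_current();

    return s->capabilities[MIGRATION_CAPABILITY_VHOST_USER_INFLIGHT];
}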
Ideally there would be some way to have the vhost-user backends expose
support per device with a protocol feature and, if the source and
destination both advertise support, we perform the inflight migration
skipping GET_VRING_BASE. I don't know enough about the migration code
to know how that would be done or if it is practical.
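Something along these lines, roughly (the feature bit below is made up,
just to show the shape of the negotiation):

/* Made-up feature bit, only to illustrate: if both source and destination
 * backends advertise it, QEMU could skip GET_VRING_BASE and migrate the
 * inflight region instead. */
#define VHOST_USER_PROTOCOL_F_INFLIGHT_MIGRATION 42 /* hypothetical */

static bool vhost_user_inflight_migration_supported(struct vhost_dev *dev)
{
    return virtio_has_feature(dev->protocol_features,
                              VHOST_USER_PROTOCOL_F_INFLIGHT_MIGRATION);
}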
Also (and I think this also applies to Vladimir's series), it should
be possible to add multiple vhost-user-blk devices to the same QEMU
instance but have the devices backed by very different backend
implementations. What happens if some of those backend implementations
support things like lightweight local migration and/or inflight
migration but others don't? Ideally you would want the API to be able
to select target vhost-user-blk devices for accelerated migration
rather than assuming they will all want to use the same features. I
don't know if anyone is doing that or if we care about such cases but
I just want to throw the concern out there.
Hmm, it looks like we need some per-device parameter in order to turn
this feature on for them.
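Something like a bool property on the device, just to illustrate (the
property and field names here are made up):

/* Hypothetical per-device knob for vhost-user-blk: */
DEFINE_PROP_BOOL("x-migrate-inflight", VHostUserBlk, migrate_inflight, false),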
I guess we need an opinion from the migration maintainer about such cases.