On Thu, 19 Jan 2023 at 11:58, Anton Kuchin <antonkuc...@yandex-team.ru> wrote:
>
> On 19/01/2023 18:02, Stefan Hajnoczi wrote:
> > On Thu, 19 Jan 2023 at 10:29, Anton Kuchin <antonkuc...@yandex-team.ru> 
> > wrote:
> >> On 19/01/2023 16:30, Stefan Hajnoczi wrote:
> >>> On Thu, 19 Jan 2023 at 07:43, Anton Kuchin <antonkuc...@yandex-team.ru> 
> >>> wrote:
> >>>> On 18/01/2023 17:52, Stefan Hajnoczi wrote:
> >>>>> On Sun, 15 Jan 2023 at 12:21, Anton Kuchin <antonkuc...@yandex-team.ru> 
> >>>>> wrote:
> >>>>>> Now any vhost-user-fs device makes VM unmigratable, that also prevents
> >>>>>> qemu update without stopping the VM. In most cases that makes sense
> >>>>>> because qemu has no way to transfer FUSE session state.
> >>>>>>
> >>>>>> But we can give an option to orchestrator to override this if it can
> >>>>>> guarantee that state will be preserved (e.g. it uses migration to
> >>>>>> update qemu and dst will run on the same host as src and use the same
> >>>>>> socket endpoints).
> >>>>>>
> >>>>>> This patch keeps default behavior that prevents migration with such 
> >>>>>> devices
> >>>>>> but adds migration capability 'vhost-user-fs' to explicitly allow 
> >>>>>> migration.
> >>>>>>
> >>>>>> Signed-off-by: Anton Kuchin <antonkuc...@yandex-team.ru>
> >>>>>> ---
> >>>>>>     hw/virtio/vhost-user-fs.c | 25 ++++++++++++++++++++++++-
> >>>>>>     qapi/migration.json       |  7 ++++++-
> >>>>>>     2 files changed, 30 insertions(+), 2 deletions(-)
> >>>>>>
> >>>>>> diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c
> >>>>>> index f5049735ac..13d920423e 100644
> >>>>>> --- a/hw/virtio/vhost-user-fs.c
> >>>>>> +++ b/hw/virtio/vhost-user-fs.c
> >>>>>> @@ -24,6 +24,7 @@
> >>>>>>     #include "hw/virtio/vhost-user-fs.h"
> >>>>>>     #include "monitor/monitor.h"
> >>>>>>     #include "sysemu/sysemu.h"
> >>>>>> +#include "migration/migration.h"
> >>>>>>
> >>>>>>     static const int user_feature_bits[] = {
> >>>>>>         VIRTIO_F_VERSION_1,
> >>>>>> @@ -298,9 +299,31 @@ static struct vhost_dev 
> >>>>>> *vuf_get_vhost(VirtIODevice *vdev)
> >>>>>>         return &fs->vhost_dev;
> >>>>>>     }
> >>>>>>
> >>>>>> +static int vhost_user_fs_pre_save(void *opaque)
> >>>>>> +{
> >>>>>> +    MigrationState *s = migrate_get_current();
> >>>>>> +
> >>>>>> +    if (!s->enabled_capabilities[MIGRATION_CAPABILITY_VHOST_USER_FS]) 
> >>>>>> {
> >>>>>> +        error_report("Migration of vhost-user-fs devices requires 
> >>>>>> internal FUSE "
> >>>>>> +                     "state of backend to be preserved. If 
> >>>>>> orchestrator can "
> >>>>>> +                     "guarantee this (e.g. dst connects to the same 
> >>>>>> backend "
> >>>>>> +                     "instance or backend state is migrated) set 
> >>>>>> 'vhost-user-fs' "
> >>>>>> +                     "migration capability to true to enable 
> >>>>>> migration.");
> >>>>>> +        return -1;
> >>>>>> +    }
> >>>>>> +
> >>>>>> +    return 0;
> >>>>>> +}
> >>>>>> +
> >>>>>>     static const VMStateDescription vuf_vmstate = {
> >>>>>>         .name = "vhost-user-fs",
> >>>>>> -    .unmigratable = 1,
> >>>>>> +    .minimum_version_id = 0,
> >>>>>> +    .version_id = 0,
> >>>>>> +    .fields = (VMStateField[]) {
> >>>>>> +        VMSTATE_VIRTIO_DEVICE,
> >>>>>> +        VMSTATE_END_OF_LIST()
> >>>>>> +    },
> >>>>>> +   .pre_save = vhost_user_fs_pre_save,
> >>>>>>     };
> >>>>> Will it be possible to extend this vmstate when virtiofsd adds support
> >>>>> for stateful migration without breaking migration compatibility?
> >>>>>
> >>>>> If not, then I think a marker field should be added to the vmstate:
> >>>>> 0 - stateless/reconnect migration (the approach you're adding in this 
> >>>>> patch)
> >>>>> 1 - stateful migration (future virtiofsd feature)
> >>>>>
> >>>>> When the field is 0 there are no further vmstate fields and we trust
> >>>>> that the destination vhost-user-fs server already has the necessary
> >>>>> state.
> >>>>>
> >>>>> When the field is 1 there are additional vmstate fields that contain
> >>>>> the virtiofsd state.
> >>>>>
> >>>>> The goal is for QEMU to support 3 migration modes, depending on the
> >>>>> vhost-user-fs server:
> >>>>> 1. No migration support.
> >>>>> 2. Stateless migration.
> >>>>> 3. Stateful migration.
> >>>> Sure. These vmstate fields are very generic and mandatory for any
> >>>> virtio device. If in future more state can be transfer in migration
> >>>> stream the vmstate can be extended with additional fields. This can
> >>>> be done with new subsections and/or bumping version_id.
> >>> My concern is that the vmstate introduced in this patch may be
> >>> unusable when stateful migration is added. So additional compatibility
> >>> code will need to be introduced to make your stateless migration
> >>> continue working with extended vmstate.
> >>>
> >>> By adding a marker field in this patch it should be possible to
> >>> continue using the same vmstate for stateless migration without adding
> >>> extra compatibility code in the future.
> >> I understand, but this fields in vmstate just packs generic virtio
> >> device state that is accessible by qemu. All additional data could be
> >> added later by extra fields. I believe we couldn't pull off any type
> >> of virtio device migration without transferring virtqueues so more
> >> sophisticated types of migration would require adding more data and
> >> not modification to this part of vmstate.
> > What I'm saying is that your patch could define the vmstate such that
> > it that contains a field to differentiate between stateless and
> > stateful migration. That way QEMU versions that only support stateless
> > migration (this patch) will be able to migrate to future QEMU versions
> > that support both stateless and stateful without compatibility issues.
> I double-checked migration documentation for subsections at
> https://www.qemu.org/docs/master/devel/migration.html#subsections
> and believe it perfectly describes our case: virtio device state
> should always be transferred both in stateless or stateful migration.
> With stateful one we would add new subsection with extra data that
> will be transferred only if stateless capability is not set. We can
> connect this subsection to device property and machine type if we
> need to.
> On the receiving side we always need basic virtio device state and
> newer versions will be able to load extra data from subsection if it
> is present, while older versions will be still able to accept the
> migrations that were initiated from new versions with stateless flag
> set and don't have extra subsection.
>
> The only scenario that will fail is older qemu won't be able to load
> new migration stream with additional subsection that it can't
> understand - this is general limitation of migration compatibility.
> So we can't completely protect older versions from future migration
> stream format because we don't know what will be in that stream
> but looks like we have all the tools to maintain compatibility
> reasonably wide.
> > I'm not sure if my suggestion to add a marker field to vuf_vmstate is
> > the best way to do this, but have you thought of how to handle the
> > future addition of stateful migration to the vmstate without breaking
> > stateless vmstates? Maybe David Gilbert has a suggestion for how to do
> > this cleanly.
> >
> > Stefan
>
> I think we'd be better without a new marker because migration code
> has standard generic way of solving such puzzles that I described
> above. So adding new marker would go against existing practice.

That sounds good to me, thanks!

Stefan

Reply via email to