Re: [RFC V2 0/8] Live update: tap and vhost

Peter Xu Wed, 10 Sep 2025 09:59:24 -0700

On Wed, Sep 10, 2025 at 12:35:10AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > > I wished devices could opt-in to provide its own model so that it is
> > > prepared to boot the QEMU without FDs being there and pause itself at that
> > > stage if a load would happen.
> > 
> > So, you suggest to postpone the initialization up to "start" even for
> > "normal start" of QEMU, to avoid these endless "if (we are in our special
> > local-incoming/CPR mode)".
> > 
> > Actually, that's how normal migratable devices live: we don't have
> > "if (incoming)" for every step of initialization/start currently.
> > 
> > I'll see, could I apply the concept to TAP local migration series.
> 
> 
> Hmm, not so simple.
> 
> OK, my current series behaves like this:
> 
> init:  if tap.local_incoming then do nothing else open(/dev/net/tun)
> 
> incoming migration: get fd, and continue initialization
> 
> 
> Assume, we want to avoid extra "if"s, and just postpone the initialization to 
> vm start point, like
> 
> init: do nothing. set fd=-1
> 
> incoming migration: get fd (if cap-fd-passing enabled)
> 
> start: open(), if fd==-1, continue initialization
> 
> 
> But that means that we postpone possible errors up to start as well, when we
> cannot roll back the migration..


Yep, doesn't sound like a good idea.  We also don't want to slow down VM
starts.

> 
> 
> Alternatively, we can postpone open() to post-load.. But what for normal 
> start of vm?
> 
> init: if INMIGRATE then do nothing, else open()
> 
> incoming: get fd (if cap-fd-passing)
> 
> post-load: open(), if fd==-1, continue initialization
> 
> start: if fd is still -1, open(), continue initialization
> 
> that avoids extra tap.local_incoming option, but:
> 
> - seems even more complicated
> - open() and some initialization is done in downtime, when we don't enable 
> cap-fd-passing
> 
> 
> So, now I think, that my current approach with additional "local-incoming" 
> per-device option is better.
> 
> What do you think?
> 
> 
> Probably I'm trying to optimize the wrong "if". As "if local-incoming .." in
> the generic layer is a lot more expensive than checking the options in
> device code.
> 
> But the idea is generic: for non-fd migration, we do as much initialization 
> at start as possible,

AFAIU, non-fd migration works simply because the state that the VMSD loads
is always overwritable.  When it's not, a pre_load() or post_load() hook can
make it work.
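
To illustrate the point about overwritable state, here is a minimal sketch in
plain C.  It is not QEMU code: DemoState, the demo_state_*() helpers and the
14-byte header constant are all made up for illustration.  The device is
fully initialized up front, the load step simply overwrites the migrated
field, and a post_load-style hook repairs the derived state that a plain
overwrite cannot cover.

```c
#include <stdint.h>

/* Hypothetical device state; these names do not exist in QEMU. */
typedef struct DemoState {
    uint32_t mtu;       /* migrated field: safe to overwrite on load */
    uint32_t buf_size;  /* derived from mtu, not part of the stream */
} DemoState;

#define DEMO_HDR_LEN 14u  /* illustrative constant, assumed for the sketch */

/* Normal init: everything is set up early, so errors show up early. */
static void demo_state_init(DemoState *s)
{
    s->mtu = 1500;
    s->buf_size = s->mtu + DEMO_HDR_LEN;
}

/* Stand-in for the VMSD field load: it just overwrites the field. */
static void demo_state_load(DemoState *s, uint32_t mtu_from_stream)
{
    s->mtu = mtu_from_stream;
}

/* Stand-in for post_load(): recompute what a plain overwrite cannot fix. */
static int demo_state_post_load(DemoState *s)
{
    s->buf_size = s->mtu + DEMO_HDR_LEN;
    return 0;
}
```

An FD-backed resource differs because it cannot be "overwritten" this way once
it has been opened, which is why it needs separate handling.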

> to get early errors and to decrease further downtime. For fd migration, we
> postpone fd-initialization up to the post-load stage. So, we have "if"s in
> device code to handle it, and we have "if"s in generic code to support a
> device which still doesn't have a fully initialized backend (no fds during
> init).

What I meant is, IMHO we should try not to use things like
cpr_is_incoming() too deep in the device stack, and we should use it as
infrequently as possible.

In many cases, IIUC it's because the current device emulation code does not
yet separate the FD installation (and whatever else is tied to the FD) from
the realize() process.  Hence a quick way to make it work is to add
cpr_is_incoming() or similar helpers, either to skip some steps or to do
something different with an existing FD.

If device emulation were prepared for such a split, then in an ideal world,
and just to show what I am thinking, it could be:

  - realize()
    - realize_frontend()
    - if migration is incoming, and backend should be postponed (for fd
      loading, or maybe something else)?
      - ... realize_backend() postponed until post_load()...
    - else
      - realize_backend()
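
The outline above could look roughly like the following at the device level.
This is a self-contained sketch in plain C under stated assumptions, not QEMU
code: DemoNetDev, the demo_*() functions, and the placeholder fd value 100
(standing in for a real open() of the backend) are all hypothetical.

```c
#include <stdbool.h>

/* Hypothetical device model; none of these names exist in QEMU. */
typedef struct DemoNetDev {
    bool frontend_ready;
    bool backend_ready;
    int fd;                 /* -1 until opened or received */
} DemoNetDev;

/* Guest-visible setup that never needs the fd. */
static void demo_realize_frontend(DemoNetDev *d)
{
    d->frontend_ready = true;
}

/*
 * Everything that touches the fd.  Uses the inherited fd when one is
 * given; otherwise a real device would open its backend here (the value
 * 100 is only a stand-in for that open()).
 */
static int demo_realize_backend(DemoNetDev *d, int inherited_fd)
{
    d->fd = inherited_fd >= 0 ? inherited_fd : 100;
    d->backend_ready = true;
    return 0;
}

/* realize(): frontend always, backend unless the caller postpones it. */
static int demo_realize(DemoNetDev *d, bool postpone_backend)
{
    d->fd = -1;
    d->backend_ready = false;
    demo_realize_frontend(d);
    if (postpone_backend) {
        return 0;           /* finished later in demo_post_load() */
    }
    return demo_realize_backend(d, -1);
}

/* post_load(): complete the backend with the fd from the stream. */
static int demo_post_load(DemoNetDev *d, int migrated_fd)
{
    if (d->backend_ready) {
        return 0;           /* normal start already did the work */
    }
    return demo_realize_backend(d, migrated_fd);
}
```

The "postpone" decision is a parameter handed in by the caller here, so the
device code itself never has to ask whether a migration is incoming.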

If all devices supported such a split of the realize() process
vs. FDs / backends, _maybe_ we could remove all the cpr_is_incoming() calls
and move the check higher and higher, up to the qdev code, like:

device_set_realized():
        if (migration_incoming_XXX() && dc->realize_prepare) {
            /*
             * This is only part of realize(), rest done in a separate VMSD
             * post_load().
             */
            dc->realize_prepare(dev, &local_err);
            if (local_err != NULL) {
                goto fail;
            }
        } else if (dc->realize) {
            dc->realize(dev, &local_err);
            if (local_err != NULL) {
                goto fail;
            }
        }

In general, the "is this an incoming fd migration" concept would be passed
down from higher up the stack, rather than randomly checked very deep in the
stack.  That should IMHO make the code more maintainable.

But that's only my two cents, so please take it with a grain of salt.  I
don't really know the device code well enough to say.

Thanks,

-- 
Peter Xu
