On Tue, Oct 21, 2025 at 05:49:51PM +0100, Peter Maydell wrote:
> On Tue, 21 Oct 2025 at 17:46, Peter Xu <[email protected]> wrote:
> >
> > On Tue, Oct 21, 2025 at 05:21:44PM +0100, Peter Maydell wrote:
> > > On Tue, 21 Oct 2025 at 17:16, Peter Xu <[email protected]> wrote:
> > > >
> > > > On Tue, Oct 21, 2025 at 04:43:52PM +0100, Peter Maydell wrote:
> > > > > Do you have plans for further cleanup/extension of the
> > > > > use of Error here that would let these functions pass
> > > > > the Error back up the chain ?
> > > >
> > > > It would be non-trivial though as we'll need to change VMStateInfo.get()
> > > > API and that'll be another lot of churns.
> > >
> > > We could at least do it in stages, so we add new fields
> > > .get_err and .put_err that have the new API with Error*;
> > > the calling code in migration/ uses the new functions if
> > > they're non-NULL, otherwise falling back to the old ones.
> > > Then we only need to update the implementations which
> > > want to be able to return an Error. (This is the same sort
> > > of thing we have with MemoryRegionOps and its read/write
> > > vs read_with_attrs/write_with_attrs methods.)
> > >
> > > The downside is we end up with another "there's two ways
> > > you can do this" API.
> >
> > Right, the other thing is get()/put() logically should only be used for
> > primitives...  I wished they're never used in complicated ways.
> >
> > IOW, in an ideal world, get()/put() should only fail because of qemufile
> > channel errors, rather than anything else.
> 
> I suppose so, but that seems like something in practice
> we make a lot of use of. It's really handy to be able to
> say "this is how you obtain the integer you wanted to put
> in the migration stream" -- we do a fair amount of that
> in target/arm/machine.c for instance. But those don't
> need to return a failure, I suppose.

That's exactly what pre_save() / post_load() should do, IMHO.

Taking example of the arm's case here:

static int put_fpscr(QEMUFile *f, void *opaque, size_t size,
                     const VMStateField *field, JSONWriter *vmdesc)
{
    ARMCPU *cpu = opaque;
    CPUARMState *env = &cpu->env;
    uint32_t fpscr = vfp_fpcr_fpsr_needed(opaque) ? 0 : vfp_get_fpscr(env);

    qemu_put_be32(f, fpscr);
    return 0;
}

The .pre_save() should be exactly the logic that generalize whatever we
have had in QEMU's structs into a pure uint32_t, put that into a temp
u32. Then migration should be able to transfer that using whatever way it
prefers.  When using qemufile API, it should use vmstate_info_uint32 so as
to not open-code qemu_put_be32().

Now, we have a major issue with qemufile and all the APIs that bound to it
when we want to move the IO layers from qemufile to iochannels, exactly
because we lack such abstraction layer, where we mixed up "this is how you
obtain the integer" and "send this integer to the wire" operations in one
API.

It would be ideal if the IO part can be left to generic primitives like
vmstate_info_uint32 in this case, then it's easier when we want to convert
to using iochannels rather than qemufile APIs.  iochannels are more
flexible in many ways (e.g., multifd only support iochannels, not
qemufiles).  Mneanwhile qemufile is a known bad API... :(

> 
> > Then, descriptive errors for get()/put() may not even be needed..  OTOH,
> > complicated structs (like virtio_gpu_load...) should really be done in
> > VMSDs already, and anything can fail outside -EIO should be done only in
> > pre_save() / post_load() / ...
> 
> You can't return an Error from a post_load callback either :-)

We can, after this pull landed. :)  Please see:

[PULL 25/45] migration: Add error-parameterized function variants in VMSD struct

-- 
Peter Xu


Reply via email to