> On Jan 25, 2022, at 10:48 AM, Stefan Hajnoczi <stefa...@redhat.com> wrote:
>
> On Wed, Jan 19, 2022 at 04:42:06PM -0500, Jagannathan Raman wrote:
>> + * The client subsequetly asks the remote server for any data that
>
> subsequently
>
>> +static void vfu_mig_state_running(vfu_ctx_t *vfu_ctx)
>> +{
>> +    VfuObject *o = vfu_get_private(vfu_ctx);
>> +    VfuObjectClass *k = VFU_OBJECT_GET_CLASS(OBJECT(o));
>> +    static int migrated_devs;
>> +    Error *local_err = NULL;
>> +    int ret;
>> +
>> +    /**
>> +     * TODO: move to VFU_MIGR_STATE_RESUME handler. Presently, the
>> +     * VMSD data from source is not available at RESUME state.
>> +     * Working on a fix for this.
>> +     */
>> +    if (!o->vfu_mig_file) {
>> +        o->vfu_mig_file = qemu_fopen_ops(o, &vfu_mig_fops_load, false);
>> +    }
>> +
>> +    ret = qemu_remote_loadvm(o->vfu_mig_file);
>> +    if (ret) {
>> +        VFU_OBJECT_ERROR(o, "vfu: failed to restore device state");
>> +        return;
>> +    }
>> +
>> +    qemu_file_shutdown(o->vfu_mig_file);
>> +    o->vfu_mig_file = NULL;
>> +
>> +    /* VFU_MIGR_STATE_RUNNING begins here */
>> +    if (++migrated_devs == k->nr_devs) {
>
> When is this counter reset so migration can be tried again if it
> fails/cancels?
Detecting cancellation is a pending item. We will address it in the next
rev, and will check with you if we get stuck while implementing it.

>
>> +static ssize_t vfu_mig_read_data(vfu_ctx_t *vfu_ctx, void *buf,
>> +                                 uint64_t size, uint64_t offset)
>> +{
>> +    VfuObject *o = vfu_get_private(vfu_ctx);
>> +
>> +    if (offset > o->vfu_mig_buf_size) {
>> +        return -1;
>> +    }
>> +
>> +    if ((offset + size) > o->vfu_mig_buf_size) {
>> +        warn_report("vfu: buffer overflow - check pending_bytes");
>> +        size = o->vfu_mig_buf_size - offset;
>> +    }
>> +
>> +    memcpy(buf, (o->vfu_mig_buf + offset), size);
>> +
>> +    o->vfu_mig_buf_pending -= size;
>
> This assumes that the caller increments offset by size each time. If
> that assumption is okay, then we can just trust offset and don't need to
> do arithmetic on vfu_mig_buf_pending. If that assumption is not correct,
> then the code needs to be extended to safely update vfu_mig_buf_pending
> when offset jumps around arbitrarily between calls.

Going by the definition of vfu_migration_callbacks_t in the library, I
assumed that read_data advances the offset by size bytes. Will add a
comment to explain that. A rough, untested sketch of what that could
look like is appended at the end of this mail.

>
>> +uint64_t vmstate_vmsd_size(PCIDevice *pci_dev)
>> +{
>> +    DeviceClass *dc = DEVICE_GET_CLASS(DEVICE(pci_dev));
>> +    const VMStateField *field = NULL;
>> +    uint64_t size = 0;
>> +
>> +    if (!dc->vmsd) {
>> +        return 0;
>> +    }
>> +
>> +    field = dc->vmsd->fields;
>> +    while (field && field->name) {
>> +        size += vmstate_size(pci_dev, field);
>> +        field++;
>> +    }
>> +
>> +    return size;
>> +}
>
> This function looks incorrect because it ignores subsections as well as
> runtime behavior during save(). Although VMStateDescription is partially
> declarative, there is still a bunch of imperative code that can write to
> the QEMUFile at save() time so there's no way of knowing the size ahead
> of time.

I see your point - this would be a problem for any field that has the
(VMS_BUFFER | VMS_ALLOC) flags set.

>
> I asked this in a previous revision of this series but I'm not sure if
> it was answered: is it really necessary to know the size of the vmstate?
> I thought the VFIO migration interface is designed to support
> streaming reads/writes. We could choose a fixed size like 64KB and
> stream the vmstate in 64KB chunks.

The library exposes the migration data to the client as a device BAR
whose size is fixed at boot time, even when the vfu_migration_callbacks_t
callbacks are used. I don't believe the library supports streaming the
vmstate/migration data - see the following comment in
migration_region_access() defined in the library:

    * Does this mean that partial reads are not allowed?

Thanos or John, could you please clarify this?

Stefan, we attempted to answer the migration cancellation and vmstate
size questions previously, in the following email:
https://lore.kernel.org/all/f48606b1-15a4-4dd2-9d71-2fcafc0e6...@oracle.com/

Thank you very much!
--
Jag
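
For reference, here is the rough sketch mentioned above. It is untested
and may change in the next rev; the names simply follow the code quoted
earlier. The idea is to trust the offset passed by the library (per the
assumption that read_data advances offset by size on each call) and
derive vfu_mig_buf_pending from it, instead of doing cumulative
arithmetic on it:

static ssize_t vfu_mig_read_data(vfu_ctx_t *vfu_ctx, void *buf,
                                 uint64_t size, uint64_t offset)
{
    VfuObject *o = vfu_get_private(vfu_ctx);

    if (offset > o->vfu_mig_buf_size) {
        return -1;
    }

    if ((offset + size) > o->vfu_mig_buf_size) {
        warn_report("vfu: buffer overflow - check pending_bytes");
        size = o->vfu_mig_buf_size - offset;
    }

    memcpy(buf, (o->vfu_mig_buf + offset), size);

    /*
     * Assumes the client reads sequentially, i.e. offset advances by
     * size on each call. The pending count then follows directly from
     * how far the client has read, rather than from decrementing it on
     * every call.
     */
    o->vfu_mig_buf_pending = o->vfu_mig_buf_size - (offset + size);

    return size;
}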