Hello,

The motivation behind these changes is to report more detailed errors to the upper management layer (libvirt), so that it can decide, depending on the reported error, whether to try migration again later. This would be useful in cases where migration fails due to a lack of HW resources on the host. For instance, some adapters can only initiate a limited number of simultaneous dirty tracking requests, and this imposes a limit on the number of VMs that can be migrated simultaneously.

We are not quite ready for such a mechanism, but what we can do first is to clean up the error reporting in the early save_setup sequence. This is what the following changes propose, by adding an Error** argument to various handlers and propagating it to the core migration subsystem.

Thanks,

C.

Changes in v2:

- Removed v1 patches addressing the return-path thread termination as
  they are now superseded by:
  https://lore.kernel.org/qemu-devel/20240226203122.22894-1-faro...@suse.de/
- Documentation updates of handlers
- Removed call to PRECOPY_NOTIFY_SETUP notifiers in case of errors
- Modified routines taking an Error** argument to return a bool when
  possible and made adjustments in callers.
- New MEMORY_LISTENER_CALL_LOG_GLOBAL macro for .log_global*() handlers
- Handled SETUP state when migration terminates
- Modified memory_get_xlat_addr() to take an Error** argument
- Various refinements on error handling

Cédric Le Goater (21):
  migration: Report error when shutdown fails
  migration: Remove SaveStateHandler and LoadStateHandler typedefs
  migration: Add documentation for SaveVMHandlers
  migration: Do not call PRECOPY_NOTIFY_SETUP notifiers in case of error
  migration: Add Error** argument to qemu_savevm_state_setup()
  migration: Add Error** argument to .save_setup() handler
  migration: Add Error** argument to .load_setup() handler
  memory: Add Error** argument to .log_global*() handlers
  memory: Add Error** argument to the global_dirty_log routines
  migration: Modify ram_init_bitmaps() to report dirty tracking errors
  migration: Fix migration termination
  vfio: Add Error** argument to .set_dirty_page_tracking() handler
  vfio: Add Error** argument to vfio_devices_dma_logging_start()
  vfio: Add Error** argument to vfio_devices_dma_logging_stop()
  vfio: Use new Error** argument in vfio_save_setup()
  vfio: Add Error** argument to .vfio_save_config() handler
  vfio: Reverse test on vfio_get_dirty_bitmap()
  memory: Add Error** argument to memory_get_xlat_addr()
  vfio: Add Error** argument to .get_dirty_bitmap() handler
  vfio: Also trace event failures in vfio_save_complete_precopy()
  vfio: Extend vfio_set_migration_error() with Error* argument

 include/exec/memory.h                 |  40 +++-
 include/hw/vfio/vfio-common.h         |  29 ++-
 include/hw/vfio/vfio-container-base.h |  35 +++-
 include/migration/register.h          | 267 +++++++++++++++++++++++---
 include/qemu/typedefs.h               |   2 -
 migration/savevm.h                    |   2 +-
 hw/i386/xen/xen-hvm.c                 |  10 +-
 hw/ppc/spapr.c                        |   2 +-
 hw/s390x/s390-stattrib.c              |   2 +-
 hw/vfio/common.c                      | 160 +++++++++------
 hw/vfio/container-base.c              |   9 +-
 hw/vfio/container.c                   |  19 +-
 hw/vfio/migration.c                   |  89 ++++++---
 hw/vfio/pci.c                         |   5 +-
 hw/virtio/vhost-vdpa.c                |   5 +-
 hw/virtio/vhost.c                     |   6 +-
 migration/block-dirty-bitmap.c        |   2 +-
 migration/block.c                     |   2 +-
 migration/dirtyrate.c                 |  21 +-
 migration/migration.c                 |  24 ++-
 migration/qemu-file.c                 |   5 +-
 migration/ram.c                       |  48 ++++-
 migration/savevm.c                    |  28 +--
 system/memory.c                       |  95 +++++++--
 system/physmem.c                      |   5 +-
 25 files changed, 699 insertions(+), 213 deletions(-)

-- 
2.43.2