On 2/27/25 23:01, Maciej S. Szmigiero wrote:
On 27.02.2025 07:48, Cédric Le Goater wrote:
On 2/19/25 21:34, Maciej S. Szmigiero wrote:
From: "Maciej S. Szmigiero" <maciej.szmigi...@oracle.com>
Allow capping the maximum count of in-flight VFIO device state buffers
queued at the destination; otherwise, a malicious QEMU source could
theoretically cause the target QEMU to allocate unlimited amounts of memory
for buffers-in-flight.
Since this is not expected to be a realistic threat in most VFIO live
migration use cases, and the right value depends on the particular setup,
disable the limit by default by setting it to UINT64_MAX.
Signed-off-by: Maciej S. Szmigiero <maciej.szmigi...@oracle.com>
---
hw/vfio/migration-multifd.c | 14 ++++++++++++++
hw/vfio/pci.c | 2 ++
include/hw/vfio/vfio-common.h | 1 +
3 files changed, 17 insertions(+)
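
For illustration only (the host address and the cap value below are made-up
examples, not from this patch), the new property would then be set per
vfio-pci device on the QEMU command line:

  -device vfio-pci,host=0000:01:00.0,x-migration-max-queued-buffers=64
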
diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c
index 18a5ff964a37..04aa3f4a6596 100644
--- a/hw/vfio/migration-multifd.c
+++ b/hw/vfio/migration-multifd.c
@@ -53,6 +53,7 @@ typedef struct VFIOMultifd {
     QemuMutex load_bufs_mutex; /* Lock order: this lock -> BQL */
     uint32_t load_buf_idx;
     uint32_t load_buf_idx_last;
+    uint32_t load_buf_queued_pending_buffers;
 } VFIOMultifd;
 static void vfio_state_buffer_clear(gpointer data)
@@ -121,6 +122,15 @@ static bool vfio_load_state_buffer_insert(VFIODevice *vbasedev,
     assert(packet->idx >= multifd->load_buf_idx);
+    multifd->load_buf_queued_pending_buffers++;
+    if (multifd->load_buf_queued_pending_buffers >
+        vbasedev->migration_max_queued_buffers) {
+        error_setg(errp,
+                   "queuing state buffer %" PRIu32 " would exceed the max of %" PRIu64,
+                   packet->idx, vbasedev->migration_max_queued_buffers);
+        return false;
+    }
+
     lb->data = g_memdup2(&packet->data, packet_total_size - sizeof(*packet));
     lb->len = packet_total_size - sizeof(*packet);
     lb->is_present = true;
@@ -374,6 +384,9 @@ static bool vfio_load_bufs_thread(void *opaque, bool *should_quit, Error **errp)
             goto ret_signal;
         }
+        assert(multifd->load_buf_queued_pending_buffers > 0);
+        multifd->load_buf_queued_pending_buffers--;
+
         if (multifd->load_buf_idx == multifd->load_buf_idx_last - 1) {
             trace_vfio_load_state_device_buffer_end(vbasedev->name);
         }
@@ -408,6 +421,7 @@ VFIOMultifd *vfio_multifd_new(void)
     multifd->load_buf_idx = 0;
     multifd->load_buf_idx_last = UINT32_MAX;
+    multifd->load_buf_queued_pending_buffers = 0;
     qemu_cond_init(&multifd->load_bufs_buffer_ready_cond);
     multifd->load_bufs_thread_running = false;
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 9111805ae06c..247418f0fce2 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3383,6 +3383,8 @@ static const Property vfio_pci_dev_properties[] = {
                     vbasedev.migration_multifd_transfer,
                     qdev_prop_on_off_auto_mutable, OnOffAuto,
                     .set_default = true, .defval.i = ON_OFF_AUTO_AUTO),
+    DEFINE_PROP_UINT64("x-migration-max-queued-buffers", VFIOPCIDevice,
+                       vbasedev.migration_max_queued_buffers, UINT64_MAX),
UINT64_MAX doesn't make sense to me. What would be a reasonable value?
It's the value that effectively disables this limit.
Have you monitored the max? Should we collect some statistics on this
value and raise a warning if a high-water mark is reached? I think
this would be more useful.
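
A minimal sketch of such a high-water-mark warning in
vfio_load_state_buffer_insert(), assuming a hypothetical peak-tracking field
and threshold (neither exists in this patch), could look like:

    multifd->load_buf_queued_pending_buffers++;
    if (multifd->load_buf_queued_pending_buffers >
        multifd->load_buf_queued_pending_buffers_peak) {
        /* hypothetical field recording the peak queue depth so far */
        multifd->load_buf_queued_pending_buffers_peak =
            multifd->load_buf_queued_pending_buffers;
        if (multifd->load_buf_queued_pending_buffers_peak >
            LOAD_BUF_QUEUED_WARN_WATERMARK /* hypothetical threshold */) {
            warn_report_once("%s: %u VFIO state buffers queued in flight",
                             vbasedev->name,
                             multifd->load_buf_queued_pending_buffers_peak);
        }
    }
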
It's an additional mechanism, which is not expected to be necessary
in most real-world setups, hence it's disabled by default:
Since this is not expected to be a realistic threat in most VFIO live
migration use cases, and the right value depends on the particular setup,
disable the limit by default by setting it to UINT64_MAX.
The minimum value that works with a particular setup depends on the number
of multifd channels, probably also the number of NIC queues, etc., so it's
not something we should propose a hard default for - unless it's a very
high default like 100 buffers, but then why have it set by default?
IMHO setting it to UINT64_MAX clearly shows that it is disabled by
default since it obviously couldn't be set higher.
This doesn't convince me that we should take this patch in QEMU 10.0.
Please keep for now. We will decide in v6.
DEFINE_PROP_BOOL("migration-events", VFIOPCIDevice,
vbasedev.migration_events, false),
DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false),
Please add property documentation in vfio_pci_dev_class_init()
I'm not sure what you mean by that; vfio_pci_dev_class_init() doesn't
contain any documentation or even references to either
x-migration-max-queued-buffers or x-migration-multifd-transfer:
Indeed :/ I am trying to fix the documentation here:
https://lore.kernel.org/qemu-devel/20250217173455.449983-1-...@redhat.com/
Please do something similar. I will polish the edges when merging
if necessary.
Overall, we should improve the VFIO documentation; migration is one
sub-feature among many.
Thanks,
C.
static void vfio_pci_dev_class_init(ObjectClass *klass, void *data)
{
    DeviceClass *dc = DEVICE_CLASS(klass);
    PCIDeviceClass *pdc = PCI_DEVICE_CLASS(klass);

    device_class_set_legacy_reset(dc, vfio_pci_reset);
    device_class_set_props(dc, vfio_pci_dev_properties);
#ifdef CONFIG_IOMMUFD
    object_class_property_add_str(klass, "fd", NULL, vfio_pci_set_fd);
#endif
    dc->desc = "VFIO-based PCI device assignment";
    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
    pdc->realize = vfio_realize;
    pdc->exit = vfio_exitfn;
    pdc->config_read = vfio_pci_read_config;
    pdc->config_write = vfio_pci_write_config;
}
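
A sketch of the kind of per-property description being asked for (only the
property name comes from this patch; the wording and placement are
assumptions modeled on the series linked above) would be a call like this in
vfio_pci_dev_class_init(), after device_class_set_props():

    /* QOM API for documenting a class property */
    object_class_property_set_description(klass,
            "x-migration-max-queued-buffers",
            "Maximum count of in-flight VFIO device state buffers queued at "
            "the destination during multifd live migration "
            "(UINT64_MAX means no limit)");
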
Thanks,
C.
Thanks,
Maciej