On Thu, Jan 29, 2026 at 09:24:56PM +0000, David Matlack wrote: > Stash a pointer to a device's incoming Live Updated state in struct > vfio_pci_core_device. This will enable subsequent commits to use the > preserved state when initializing the device. > > To enable VFIO to safely access this pointer during device enablement, > require that the device is fully enabled before returning true from > can_finish(). This is synchronized by vfio_pci_core.c setting > vdev->liveupdate_incoming_state to NULL under dev_set lock once it's > done using it. > > Signed-off-by: David Matlack <[email protected]> > --- > drivers/vfio/pci/vfio_pci_core.c | 2 +- > drivers/vfio/pci/vfio_pci_liveupdate.c | 17 ++++++++++++++++- > include/linux/vfio_pci_core.h | 1 + > 3 files changed, 18 insertions(+), 2 deletions(-) > > diff --git a/drivers/vfio/pci/vfio_pci_core.c > b/drivers/vfio/pci/vfio_pci_core.c > index 3a11e6f450f7..b01b94d81e28 100644 > --- a/drivers/vfio/pci/vfio_pci_core.c > +++ b/drivers/vfio/pci/vfio_pci_core.c > @@ -569,7 +569,7 @@ int vfio_pci_core_enable(struct vfio_pci_core_device > *vdev) > if (!vfio_vga_disabled() && vfio_pci_is_vga(pdev)) > vdev->has_vga = true; > > - > + vdev->liveupdate_incoming_state = NULL; > return 0; > > out_free_zdev: > diff --git a/drivers/vfio/pci/vfio_pci_liveupdate.c > b/drivers/vfio/pci/vfio_pci_liveupdate.c > index ad915352303f..1ad7379c70c4 100644 > --- a/drivers/vfio/pci/vfio_pci_liveupdate.c > +++ b/drivers/vfio/pci/vfio_pci_liveupdate.c > @@ -131,6 +131,7 @@ static int match_device(struct device *dev, const void > *arg) > static int vfio_pci_liveupdate_retrieve(struct liveupdate_file_op_args *args) > { > struct vfio_pci_core_device_ser *ser; > + struct vfio_pci_core_device *vdev; > struct vfio_device *device; > struct file *file; > int ret; > @@ -160,6 +161,9 @@ static int vfio_pci_liveupdate_retrieve(struct > liveupdate_file_op_args *args) > goto out; > } > > + vdev = container_of(device, struct vfio_pci_core_device, vdev); > + vdev->liveupdate_incoming_state = ser; > + > args->file = file; > > out: > @@ -171,7 +175,18 @@ static int vfio_pci_liveupdate_retrieve(struct > liveupdate_file_op_args *args) > > static bool vfio_pci_liveupdate_can_finish(struct liveupdate_file_op_args > *args) > { > - return args->retrieved; > + struct vfio_pci_core_device *vdev; > + struct vfio_device *device; > + > + if (!args->retrieved) > + return false; > + > + device = vfio_device_from_file(args->file); > + vdev = container_of(device, struct vfio_pci_core_device, vdev); > + > + /* Check that vdev->liveupdate_incoming_state is no longer in use. */ > + guard(mutex)(&device->dev_set->lock); > + return !vdev->liveupdate_incoming_state;
Since we set this to NULL in the success path of vfio_pci_core_enable() I'm wondering if a failure in vfio_pci_core_enable could cause a resource leak? Because vfio_pci_liveupdate_can_finish() returns false as long as that pointer is valid, a single device failure will perpetually block the LIVEUPDATE_SESSION_FINISH IOCTL for the entire session preventing the LUO from reclaiming KHO memory. Shall we also set vdev->liveupdate_incoming_state = NULL on the error paths of vfio_pci_core_enable() ? Thanks, Praan

