On Thu, Jan 29, 2026 at 09:24:56PM +0000, David Matlack wrote:
> Stash a pointer to a device's incoming Live Updated state in struct
> vfio_pci_core_device. This will enable subsequent commits to use the
> preserved state when initializing the device.
> 
> To enable VFIO to safely access this pointer during device enablement,
> require that the device is fully enabled before returning true from
> can_finish(). This is synchronized by vfio_pci_core.c setting
> vdev->liveupdate_incoming_state to NULL under dev_set lock once it's
> done using it.
> 
> Signed-off-by: David Matlack <[email protected]>
> ---
>  drivers/vfio/pci/vfio_pci_core.c       |  2 +-
>  drivers/vfio/pci/vfio_pci_liveupdate.c | 17 ++++++++++++++++-
>  include/linux/vfio_pci_core.h          |  1 +
>  3 files changed, 18 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci_core.c 
> b/drivers/vfio/pci/vfio_pci_core.c
> index 3a11e6f450f7..b01b94d81e28 100644
> --- a/drivers/vfio/pci/vfio_pci_core.c
> +++ b/drivers/vfio/pci/vfio_pci_core.c
> @@ -569,7 +569,7 @@ int vfio_pci_core_enable(struct vfio_pci_core_device 
> *vdev)
>       if (!vfio_vga_disabled() && vfio_pci_is_vga(pdev))
>               vdev->has_vga = true;
>  
> -
> +     vdev->liveupdate_incoming_state = NULL;
>       return 0;
>  
>  out_free_zdev:
> diff --git a/drivers/vfio/pci/vfio_pci_liveupdate.c 
> b/drivers/vfio/pci/vfio_pci_liveupdate.c
> index ad915352303f..1ad7379c70c4 100644
> --- a/drivers/vfio/pci/vfio_pci_liveupdate.c
> +++ b/drivers/vfio/pci/vfio_pci_liveupdate.c
> @@ -131,6 +131,7 @@ static int match_device(struct device *dev, const void 
> *arg)
>  static int vfio_pci_liveupdate_retrieve(struct liveupdate_file_op_args *args)
>  {
>       struct vfio_pci_core_device_ser *ser;
> +     struct vfio_pci_core_device *vdev;
>       struct vfio_device *device;
>       struct file *file;
>       int ret;
> @@ -160,6 +161,9 @@ static int vfio_pci_liveupdate_retrieve(struct 
> liveupdate_file_op_args *args)
>               goto out;
>       }
>  
> +     vdev = container_of(device, struct vfio_pci_core_device, vdev);
> +     vdev->liveupdate_incoming_state = ser;
> +
>       args->file = file;
>  
>  out:
> @@ -171,7 +175,18 @@ static int vfio_pci_liveupdate_retrieve(struct 
> liveupdate_file_op_args *args)
>  
>  static bool vfio_pci_liveupdate_can_finish(struct liveupdate_file_op_args 
> *args)
>  {
> -     return args->retrieved;
> +     struct vfio_pci_core_device *vdev;
> +     struct vfio_device *device;
> +
> +     if (!args->retrieved)
> +             return false;
> +
> +     device = vfio_device_from_file(args->file);
> +     vdev = container_of(device, struct vfio_pci_core_device, vdev);
> +
> +     /* Check that vdev->liveupdate_incoming_state is no longer in use. */
> +     guard(mutex)(&device->dev_set->lock);
> +     return !vdev->liveupdate_incoming_state;

Since we set this to NULL in the success path of vfio_pci_core_enable()
I'm wondering if a failure in vfio_pci_core_enable could cause a
resource leak? Because vfio_pci_liveupdate_can_finish() returns false
as long as that pointer is valid, a single device failure will 
perpetually block the LIVEUPDATE_SESSION_FINISH IOCTL for the entire 
session preventing the LUO from reclaiming KHO memory.

Shall we also set vdev->liveupdate_incoming_state = NULL on the error
paths of vfio_pci_core_enable() ?

Thanks,
Praan

Reply via email to