On Mon, Mar 16, 2026 at 03:14:18PM -0700, David Matlack wrote:
> On Mon, Mar 16, 2026 at 2:49 PM Vipin Sharma <[email protected]> wrote:
> >
> > On Mon, Mar 16, 2026 at 10:18:22AM -0700, David Matlack wrote:
> > > On Mon, Mar 16, 2026 at 9:22 AM Vipin Sharma <[email protected]> wrote:
> > > >
> > > > On Thu, Mar 12, 2026 at 11:39:45PM +0000, David Matlack wrote:
> > > > > On 2026-03-09 10:32 AM, David Matlack wrote:
> > > > > > > On Fri, Feb 27, 2026 at 9:57 AM Alex Williamson <[email protected]> wrote:
> > > > >
> > > > > > > > Sorry if I don't have the whole model in my head yet, but is exposing
> > > > > > > > the restriction to the vfio user of the device sufficient to manage the
> > > > > > > > liveupdate orchestration?  For example, a VFIO_DEVICE_INFO_CAP pushes
> > > > > > > > the knowledge to QEMU... what does QEMU do with that knowledge?  Who
> > > > > > > > imposes the policy decision to decide what support is sufficient?
> > > > > >
> > > > > > Hm.. good questions. I don't think we want userspace inspecting bits
> > > > > > exposed by the kernel and trying to infer exactly what's being
> > > > > > > preserved and whether it's "good enough" to use. And such a UAPI would
> > > > > > > become tech debt once we finish development, I suspect.
> > > > > >
> > > > > > A better approach would be to hide this support from userspace until
> > > > > > we decide it is ready for production use-cases.
> > > > > >
> > > > > > To enable development and testing, we can add an opt-in mechanism
> > > > >
> > > > > > Here is what I am trending towards sending in v3 as the opt-in mechanism:
> > > > >
> > > > > diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
> > > > > index 1e82b44bda1a..770231554221 100644
> > > > > --- a/drivers/vfio/pci/Kconfig
> > > > > +++ b/drivers/vfio/pci/Kconfig
> > > > > @@ -58,6 +58,27 @@ config VFIO_PCI_ZDEV_KVM
> > > > >  config VFIO_PCI_DMABUF
> > > > >         def_bool y if VFIO_PCI_CORE && PCI_P2PDMA && DMA_SHARED_BUFFER
> > > > >
> > > > > +config VFIO_PCI_LIVEUPDATE
> > > > > +       bool "VFIO PCI support for Live Update (EXPERIMENTAL)"
> > > > > +       depends on LIVEUPDATE && VFIO_PCI
> > > > > +       help
> > > > > +         Support for preserving devices bound to vfio-pci across a 
> > > > > Live
> > > > > +         Update. The eventual goal is that preserved devices can run
> > > > > +         uninterrupted during a Live Update, including DMA to 
> > > > > preserved
> > > > > +         memory buffers and P2P. However there are many steps still 
> > > > > needed to
> > > > > +         achieve this, including:
> > > > > +
> > > > > +          - Preservation of iommufd files
> > > > > +          - Preservation of IOMMU driver state
> > > > > +          - Preservation of PCI state (BAR resources, device state, 
> > > > > ...)
> > > > > +          - Preservation of vfio-pci driver state
> > > > > +
> > > > > +         This option should only be enabled by developers working on
> > > > > +         implementing this support. Once enough support has landed 
> > > > > in the
> > > > > +         kernel, this option will no longer be marked EXPERIMENTAL.
> > > > > +
> > > > > +         If you don't know what to do here, say N.
> > > > > +
> > > >
> > > > To use VFIO liveupdate, user has to do at least two things:
> > > > 1. Enable CONFIG_LIVEUPDATE
> > > > 2. Pass VFIO FD to a live update session.
> > > >
> > > > This means someone using it has to know what live update is and
> > > > intentionally pass the VFIO FDs. Isn't act of doing this itself an
> > > > opt-in mechanism?
> > >
> > > If it is, then I can leave this out. Alex?
> > >
> > > My thinking was: Distros are free to enable LIVEUPDATE and use it. The
> > > support it enables today is all fully functional (albeit new).
> > > vfio-cdev, OTOH, is not. A separate Kconfig can help express that
> > > difference.
> > >
> > > Consider that LIVEUPDATE could be enabled by default in a future
> > > release, but vfio-cdev support might not be ready yet at that point.
> >
> > But that also requires point 2 above i.e. userspace explicitly passing
> > VFIO FD to liveupdate. Unless there is a capability mechanism like KVM
> > then userspace cannot know what is exactly supported.
> 
> Yes that is why I propose not exposing the support to userspace at all
> until it is ready, by compiling it out of kernel via new Kconfig. This
> way it does not get accidentally enabled in distros or downstream
> kernels before it is ready.
> 
> > Also, users who
> > are using these APIs will already be advanced users and have to know
> > many details about what liveupdate supports or not.
> 
> VMMs will be the ones preserving VFIO cdev files. I think you are
> suggesting they should know what versions of Linux support what kind
> of preservation? Like QEMU would know that Linux 7.1-7.4 supports
> partial VFIO preservation and 7.5+ supports fully? That does not sound
> like a good situation to be in.

I agree. For a VMM, it's better to simply assume that preservation is
complete, with the understanding that the kernel code is still experimental.

> 
> I think it's much better to hide the support behind Kconfig until its
> ready. That way the PRESERVE_FD ioctl just fails on kernels that do
> not fully support (because VFIO_PCI_LIVEUPDATE is not enabled), and
> succeeds on kernels that do fully support.
> 
> If someone wants to enable and use VFIO_PCI_LIVEUPDATE while it is
> still marked experimental, they're on their own.
> 
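As a rough illustration of the behavior described above (a userspace
sketch, not kernel code): the preserve path is compiled out unless the
option is set, so the caller sees one clear failure instead of partial
support. All demo_* names, the DEMO_VFIO_PCI_LIVEUPDATE macro, and the
-EOPNOTSUPP return value are assumptions made for this example; the
thread does not say which errno the real PRESERVE_FD ioctl returns.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for CONFIG_VFIO_PCI_LIVEUPDATE. Build with
 * -DDEMO_VFIO_PCI_LIVEUPDATE to model a kernel that has the option set.
 */
#ifdef DEMO_VFIO_PCI_LIVEUPDATE
#define DEMO_LIVEUPDATE_ENABLED 1
#else
#define DEMO_LIVEUPDATE_ENABLED 0
#endif

/* Hypothetical: models the vfio-pci preserve path, not a real kernel function. */
static int demo_vfio_pci_preserve(int fd)
{
	if (!DEMO_LIVEUPDATE_ENABLED)
		return -EOPNOTSUPP;	/* errno choice is an assumption */

	return fd >= 0 ? 0 : -EBADF;
}

/*
 * Hypothetical VMM-side helper: make one attempt and act on the result,
 * with no kernel-version checks. Returns 1 if the fd was preserved.
 */
static int demo_vmm_try_preserve(int fd)
{
	return demo_vfio_pci_preserve(fd) == 0;
}
```

Built without the define, demo_vmm_try_preserve() reports failure for
any fd. Built with -DDEMO_VFIO_PCI_LIVEUPDATE, it reports success for a
valid fd, so the VMM never has to track which kernel versions preserve
what.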

Sounds good. Thanks!
