On Mon, Mar 16, 2026 at 03:14:18PM -0700, David Matlack wrote:
> On Mon, Mar 16, 2026 at 2:49 PM Vipin Sharma <[email protected]> wrote:
> >
> > On Mon, Mar 16, 2026 at 10:18:22AM -0700, David Matlack wrote:
> > > On Mon, Mar 16, 2026 at 9:22 AM Vipin Sharma <[email protected]> wrote:
> > > >
> > > > On Thu, Mar 12, 2026 at 11:39:45PM +0000, David Matlack wrote:
> > > > > On 2026-03-09 10:32 AM, David Matlack wrote:
> > > > > > On Fri, Feb 27, 2026 at 9:57 AM Alex Williamson <[email protected]> wrote:
> > > > > > >
> > > > > > > Sorry if I don't have the whole model in my head yet, but is exposing
> > > > > > > the restriction to the vfio user of the device sufficient to manage the
> > > > > > > liveupdate orchestration? For example, a VFIO_DEVICE_INFO_CAP pushes
> > > > > > > the knowledge to QEMU... what does QEMU do with that knowledge? Who
> > > > > > > imposes the policy decision to decide what support is sufficient?
> > > > > >
> > > > > > Hm.. good questions. I don't think we want userspace inspecting bits
> > > > > > exposed by the kernel and trying to infer exactly what's being
> > > > > > preserved and whether it's "good enough" to use. And such a UAPI would
> > > > > > become tech debt once we finish development, I suspect.
> > > > > >
> > > > > > A better approach would be to hide this support from userspace until
> > > > > > we decide it is ready for production use-cases.
> > > > > >
> > > > > > To enable development and testing, we can add an opt-in mechanism
> > > > >
> > > > > Here is what I am trending towards sending in v3 as the opt-in
> > > > > mechanism:
> > > > >
> > > > > diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
> > > > > index 1e82b44bda1a..770231554221 100644
> > > > > --- a/drivers/vfio/pci/Kconfig
> > > > > +++ b/drivers/vfio/pci/Kconfig
> > > > > @@ -58,6 +58,27 @@ config VFIO_PCI_ZDEV_KVM
> > > > >  config VFIO_PCI_DMABUF
> > > > >  	def_bool y if VFIO_PCI_CORE && PCI_P2PDMA && DMA_SHARED_BUFFER
> > > > >
> > > > > +config VFIO_PCI_LIVEUPDATE
> > > > > +	bool "VFIO PCI support for Live Update (EXPERIMENTAL)"
> > > > > +	depends on LIVEUPDATE && VFIO_PCI
> > > > > +	help
> > > > > +	  Support for preserving devices bound to vfio-pci across a Live
> > > > > +	  Update. The eventual goal is that preserved devices can run
> > > > > +	  uninterrupted during a Live Update, including DMA to preserved
> > > > > +	  memory buffers and P2P. However there are many steps still needed to
> > > > > +	  achieve this, including:
> > > > > +
> > > > > +	  - Preservation of iommufd files
> > > > > +	  - Preservation of IOMMU driver state
> > > > > +	  - Preservation of PCI state (BAR resources, device state, ...)
> > > > > +	  - Preservation of vfio-pci driver state
> > > > > +
> > > > > +	  This option should only be enabled by developers working on
> > > > > +	  implementing this support. Once enough support has landed in the
> > > > > +	  kernel, this option will no longer be marked EXPERIMENTAL.
> > > > > +
> > > > > +	  If you don't know what to do here, say N.
> > > > > +
> > > >
> > > > To use VFIO liveupdate, user has to do at least two things:
> > > > 1. Enable CONFIG_LIVEUPDATE
> > > > 2. Pass VFIO FD to a live update session.
> > > >
> > > > This means someone using it has to know what live update is and
> > > > intentionally pass the VFIO FDs.
> > > > Isn't act of doing this itself an opt-in mechanism?
> > >
> > > If it is, then I can leave this out. Alex?
> > >
> > > My thinking was: Distros are free to enable LIVEUPDATE and use it. The
> > > support it enables today is all fully functional (albeit new).
> > > vfio-cdev, OTOH, is not. A separate Kconfig can help express that
> > > difference.
> > >
> > > Consider that LIVEUPDATE could be enabled by default in a future
> > > release, but vfio-cdev support might not be ready yet at that point.
> >
> > But that also requires point 2 above i.e. userspace explicitly passing
> > VFIO FD to liveupdate. Unless there is a capability mechanism like KVM
> > then userspace cannot know what is exactly supported.
>
> Yes that is why I propose not exposing the support to userspace at all
> until it is ready, by compiling it out of kernel via new Kconfig. This
> way it does not get accidentally enabled in distros or downstream
> kernels before it is ready.
>
> > Also, users who
> > are using these APIs will already be advanced users and have to know
> > many details about what liveupdate supports or not.
>
> VMMs will be the ones preserving VFIO cdev files. I think you are
> suggesting they should know what versions of Linux support what kind
> of preservation? Like QEMU would know that Linux 7.1-7.4 supports
> partial VFIO preservation and 7.5+ supports fully? That does not sound
> like a good situation to be in.

I agree, for a VMM it's better to just assume it is a complete preservation
feature, but it is experimental code in the kernel.

>
> I think it's much better to hide the support behind Kconfig until its
> ready. That way the PRESERVE_FD ioctl just fails on kernels that do
> not fully support (because VFIO_PCI_LIVEUPDATE is not enabled), and
> succeeds on kernels that do fully support.
>
> If someone wants to enable and use VFIO_PCI_LIVEUPDATE while it is
> still marked experimental, they're on their own.
>

Sounds good. Thanks!
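
For illustration only, the behaviour agreed on above could look roughly like
this from the VMM side. This is a minimal userspace sketch, not the real UAPI:
the preserve_fd_fn callback, the two stand-in implementations, and
vmm_try_preserve() are all hypothetical names, and -EOPNOTSUPP is an assumed
error code for a kernel built without VFIO_PCI_LIVEUPDATE. The point is that
the VMM attempts preservation and reacts to success or failure, without
checking kernel versions or capability bits.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical callback standing in for the PRESERVE_FD ioctl.
 * Returns 0 on success or a negative errno on failure.
 */
typedef int (*preserve_fd_fn)(int session_fd, int vfio_fd);

/*
 * Stand-in for a kernel built without VFIO_PCI_LIVEUPDATE.
 * The specific errno is an assumption made for this sketch.
 */
int preserve_fd_unsupported(int session_fd, int vfio_fd)
{
	(void)session_fd;
	(void)vfio_fd;
	return -EOPNOTSUPP;
}

/* Stand-in for a kernel with VFIO_PCI_LIVEUPDATE enabled. */
int preserve_fd_supported(int session_fd, int vfio_fd)
{
	(void)session_fd;
	(void)vfio_fd;
	return 0;
}

/*
 * VMM-side policy: try to preserve the VFIO FD and treat any failure
 * as "this device is not preserved". No version or capability probing.
 */
bool vmm_try_preserve(preserve_fd_fn preserve, int session_fd, int vfio_fd)
{
	return preserve(session_fd, vfio_fd) == 0;
}
```

Calling vmm_try_preserve() with preserve_fd_unsupported returns false, so the
VMM would fall back to whatever it does for non-preserved devices; with
preserve_fd_supported it returns true.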

