On Fri, Feb 27, 2026 at 9:57 AM Alex Williamson <[email protected]> wrote:
>
> On Fri, 27 Feb 2026 09:07:48 -0800
> David Matlack <[email protected]> wrote:
>
> > On Fri, Feb 27, 2026 at 7:47 AM Alex Williamson <[email protected]> wrote:
> > >
> > > On Fri, 27 Feb 2026 00:51:18 +0000
> > > David Matlack <[email protected]> wrote:
> > >
> > > > On 2026-02-26 05:00 PM, Alex Williamson wrote:
> > > > > On Thu, 29 Jan 2026 21:24:57 +0000
> > > > > David Matlack <[email protected]> wrote:
> > > > > >
> > > > > > - vdev->reset_works = !ret;
> > > > > > pci_save_state(pdev);
> > > > > > vdev->pci_saved_state = pci_store_saved_state(pdev);
> > > > >
> > > > > Isn't this a problem too? In the first kernel we store the initial,
> > > > > post reset state of the device, now we're storing some arbitrary
> > > > > state.
> > > > > This is the state we restore when the device is closed.
> > > >
> > > > The previous kernel resets the device and restores it back to its
> > > > post reset state in vfio_pci_liveupdate_freeze() before handing off
> > > > control to the next kernel. So my intention here is that VFIO will
> > > > receive the device in that state, allowing it to call
> > > > pci_store_saved_state() here to capture the post reset state of the
> > > > device again.
> > > >
> > > > Eventually we want to drop the reset in vfio_pci_liveupdate_freeze() and
> > > > preserve vdev->pci_saved_state across the Live Update. But I was hoping
> > > > to add that in a follow up series to avoid this one getting too long.
> > >
> > > I appreciate reviewing this in smaller chunks, but how does userspace
> > > know whether the kernel contains a stub implementation of liveupdate or
> > > behaves according to the end goal?
> >
> > Would a new VFIO_DEVICE_INFO_CAP be a good way to communicate this
> > information to userspace?
>
> Sorry if I don't have the whole model in my head yet, but is exposing
> the restriction to the vfio user of the device sufficient to manage the
> liveupdate orchestration? For example, a VFIO_DEVICE_INFO_CAP pushes
> the knowledge to QEMU... what does QEMU do with that knowledge? Who
> imposes the policy decision to decide what support is sufficient?

Hm.. good questions. I don't think we want userspace inspecting bits
exposed by the kernel and trying to infer exactly what's being
preserved and whether it's "good enough" to use. And such a UAPI would
become tech debt once we finish development, I suspect.

A better approach would be to hide this support from userspace until
we decide it is ready for production use-cases.

To enable development and testing, we can add an opt-in mechanism,
such as CONFIG_EXPERIMENTAL or a kernel parameter. For example, adding
something like this to vfio_pci_liveupdate_preserve():

	if (!IS_ENABLED(CONFIG_EXPERIMENTAL)) {
		pr_warn("vfio-pci file preservation requires CONFIG_EXPERIMENTAL to enable!\n");
		return -EOPNOTSUPP;
	}

Once we feel the support is ready, we can just submit a patch to
delete those lines, and there will be no left-over UAPI.