On 2025-10-19 10:15 AM, Lukas Wunner wrote:
> On Sat, Oct 18, 2025 at 03:36:20PM -0700, Vipin Sharma wrote:
> > On 2025-10-18 09:17:33, Lukas Wunner wrote:
> > > On Fri, Oct 17, 2025 at 05:07:07PM -0700, Vipin Sharma wrote:
> > > > Move struct pci_saved_state{} and struct pci_cap_saved_data{} to
> > > > linux/pci.h so that they are available to code outside of the PCI core.
> > > > 
> > > > These structs will be used in subsequent commits to serialize and
> > > > deserialize PCI state across Live Update.
> > > 
> > > That's not sufficient as a justification to make these public in my view.
> > > 
> > > There are already pci_store_saved_state() and pci_load_saved_state()
> > > helpers to serialize PCI state.  Why do you need anything more?
> > > (Honest question.)
> > 
> > In LUO ecosystem, currently,  we do not have a solid solution to do
> > proper serialization/deserialization of structs along with versioning
> > between different kernel versions. This work is still being discussed.
> > 
> > Here, I created separate structs (exactly same as the original one) to
> > have little bit control on what gets saved in serialized state and
> > correctly gets deserialized after kexec.
> > 
> > For example, if I am using existing structs and not creating my own
> > structs then I cannot just do a blind memcpy() between whole of the PCI 
> > state
> > prior to kexec to PCI state after the kexec. In the new kernel
> > layout might have changed like addition or removal of a field.
> 
> The last time we changed those structs was in 2013 by fd0f7f73ca96.
> So changes are extremely rare.
> 
> What could change in theory is the layout of the individual
> capabilities (the data[] in struct pci_cap_saved_data).
> E.g. maybe we decide that we need to save an additional register.
> But that's also rare.  Normally we add all the mutable registers
> when a new capability is supported and have no need to amend that
> afterwards.

Yeah that has me worried. A totally innocuous commit that adds, removes,
or reorders a register stashed in data[] could lead a broken device when
VFIO does pci_restore_state() after a Live Update.

Turing pci_save_state into an actual ABI would require adding the
registers into the save state probably, rather than assuming their
order.

But... I wonder if we truly need to preserve the PCI save state
across Live Update.

Based on this comment in drivers/vfio/pci/vfio_pci_core.c, the PCI
save/restore stuff in VFIO is for cleaning up devices that do not
support resets:

 648         /*
 649          * If we have saved state, restore it.  If we can reset the device,
 650          * even better.  Resetting with current state seems better than
 651          * nothing, but saving and restoring current state without reset
 652          * is just busy work.
 653          */
 654         if (pci_load_and_free_saved_state(pdev, &vdev->pci_saved_state)) {
 655                 pci_info(pdev, "%s: Couldn't reload saved state\n", 
__func__);
 656
 657                 if (!vdev->reset_works)
 658                         goto out;
 659
 660                 pci_save_state(pdev);
 661         }

So if we just limit Live Update support to devices with reset_works,
then we don't have to deal with preserving the save state.

I will have to double check that reset_works is true for all the devices
we care about supporting for Live Update, but I imagine it will be.
They're all relatively modern PCIe devices.

Reply via email to