On Fri, Nov 21, 2025 at 10:57:24AM -0800, Farhan Ali wrote: > On 11/21/2025 9:31 AM, Lukas Wunner wrote: > > +++ b/Documentation/PCI/pci-error-recovery.rst > > @@ -326,6 +326,21 @@ be recovered, there is nothing more that can be done; > > the platform > > will typically report a "permanent failure" in such a case. The > > device will be considered "dead" in this case. > > +Drivers typically need to call pci_restore_state() after reset to > > +re-initialize the device's config space registers and thereby > > +bring it from D0\ :sub:`uninitialized` into D0\ :sub:`active` state > > +(PCIe r7.0 sec 5.3.1.1). The PCI core invokes pci_save_state() > > +on enumeration after initializing config space to ensure that a > > +saved state is available for subsequent error recovery. > > +Drivers which modify config space on probe may need to invoke > > +pci_save_state() afterwards to record those changes for later > > +error recovery. When going into system suspend, pci_save_state() > > +is called for every PCI device and that state will be restored > > +not only on resume, but also on any subsequent error recovery. > > Nit: Should we clarify in the above sentence on what calls the > pci_save_state() when going into suspend? My assumption is the > pci_save_state() is called by the PCI core and not the drivers?
Per section 3.1.2 of Documentation/power/pci.rst, pci_save_state() may be called by either the driver or the PCI core. Normally it's the PCI core's responsibility, but a driver may choose to call it and bring the device into a low power state itself. The PCI core recognizes that by looking at the state_saved flag in struct pci_dev and will then neither call pci_save_state() nor transition the device to a low power state. That is the (only) purpose of the flag. I could maybe add a cross-reference pointing to Documentation/power/pci.rst. And/or that document could be moved to Documentation/PCI/. > What should the PCI core do if the saved state recorded is bad? should we > continue to restore the device with the recorded bad state? Basically the answer is, it should never happen and if it does, we've got a bug somewhere. > On s390 restoring the device with the bad state can break the device > put into error again. My (limited) understanding is that you may end up with a bad saved state on s390 virtualization scenarios because you're telling the PCI core in the ->error_detected phase() that the device has recovered and then you try to reset and recover the device on your own. I think the solution is to enhance qemu to integrate better with error recovery on the host. Thanks, Lukas
