On Thu, Nov 13, 2025 at 10:15:56AM -0600, Bjorn Helgaas wrote: > It seems like there are two things going on here, and I'm not sure > they're completely compatible: > > 1) Driver calls pci_save_state() to take over device power > management and prevent the PCI core from doing it. > > 2) Driver calls pci_save_state() to capture the device state it > wants to restore when recovering from an error. > > Shouldn't a driver be able to do 2) without also getting 1)?
In general, it can: A number of drivers already call pci_save_state() on probe to capture the state for subsequent error recovery. If the driver has modified config space in its probe hook, then calling pci_save_state() continues to make sense. If the driver has *not* modified config space, then the call becomes obsolete once this patch is accepted. The reason I'm using the "in general" qualifier: I've identified two corner cases where the PCI core neglects to set state_saved = false before commencing the suspend sequence: * If a driver has legacy PCI PM callbacks, pci_legacy_suspend() neglects to set state_saved = false. Yet both pci_legacy_suspend() and pci_legacy_suspend_late() subsequently query the state_saved flag. * If a device is unbound or its driver has no PM callbacks (driver->pm == NULL), pci_pm_freeze() neglects to set state_saved = false. Yet pci_pm_freeze_noirq() subsequently queries the state_saved flag. In these corner cases, pci_legacy_suspend() and pci_pm_freeze() depend on some other part of the PCI core to set state_saved = false. For a freshly enumerated device, the flag is initialized to false by kzalloc() and pci_device_add() also explicitly sets it to false for good measure. On resume (or thaw or restore), the flag is set to false by pci_restore_state(). The latter is preserved as is by my patch and the former is moved to pci_bus_add_device() to retain the current behavior. Clearly, the two corner cases should be fixed and then setting state_saved = false in pci_bus_add_device() becomes unnecessary. I'd prefer doing that in a separate step though. So drivers which use legacy PCI PM callbacks or have no PM callbacks should currently not call pci_save_state() on probe without manually setting state_saved = false afterwards. If they neglect that, then pci_legacy_suspend_late() and pci_pm_freeze_noirq() will not call pci_save_state() on the next suspend cycle and so the state that will be restored on resume is the one recorded on probe, not the one that the device had on suspend. If these two states happen to be identical, there's no problem. > > > > +++ b/drivers/pci/bus.c > > > > @@ -358,6 +358,13 @@ void pci_bus_add_device(struct pci_dev *dev) > > > > pci_bridge_d3_update(dev); > > > > > > > > /* > > > > + * Save config space for error recoverability. Clear > > > > state_saved > > > > + * to detect whether drivers invoked pci_save_state() on suspen [...] > > > Can we expand this a little to explain how this is detected and what > > > drivers *should* be doing? [...] > Yes. I should have proposed some text for the comment, e.g., > > Save config space for error recoverability. Clear state_saved. If > driver calls pci_save_state() again, state_saved will be set and > we'll know that on suspend, the PCI core shouldn't call > pci_save_state() or change the device power state. I'm fine with rewording the code comment like this, as well as splitting the code comment as suggested by Rafael. Would you prefer amending the code comment when applying or should I respin with a reworded comment? Again, clearing state_saved in pci_bus_add_device() is just a temporary measure to retain the existing behavior. It (and an accompanying code comment) can be dropped once pci_legacy_suspend() and pci_pm_freeze() are fixed. > I'm just wishing for a more concrete mention of "pci_save_state()", > since that's where the critical "state_saved" flag is updated. > > And I'm not sure Documentation/ includes anything about the idea of > a driver using pci_save_state() to capture the state it wants to > restore after an error. I guess that's obvious now that I write it > down, but I'm sure learning a lot from this conversation :) Okay, noted that the documentation could be improved. I'd be glad if this could be postponed to a separate step as well though. I can only address problems one at a time. :) Thanks, Lukas
