On Thu, Aug 14, 2025 at 09:56:09AM +0200, Niklas Schnelle wrote: > On Wed, 2025-08-13 at 07:11 +0200, Lukas Wunner wrote: > > +++ b/drivers/pci/pcie/err.c > > @@ -217,15 +217,10 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev, > > pci_walk_bridge(bridge, pci_pm_runtime_get_sync, NULL); > > > > pci_dbg(bridge, "broadcast error_detected message\n"); > > - if (state == pci_channel_io_frozen) { > > + if (state == pci_channel_io_frozen) > > pci_walk_bridge(bridge, report_frozen_detected, &status); > > - if (reset_subordinates(bridge) != PCI_ERS_RESULT_RECOVERED) { > > - pci_warn(bridge, "subordinate device reset failed\n"); > > - goto failed; > > - } > > - } else { > > + else > > pci_walk_bridge(bridge, report_normal_detected, &status); > > - } > > > > if (status == PCI_ERS_RESULT_CAN_RECOVER) { > > status = PCI_ERS_RESULT_RECOVERED; > > On s390 PCI errors leave the device with MMIO blocked until either the > error state is cleared or we reset via the firmware interface. With > this change and the pci_channel_io_frozen case AER would now do the > report_mmio_enabled() before the reset with nothing happening between > report_frozen_detected() and report_mmio_enabled() is MMIO enabled at > this point?
Yes, MMIO access is controlled through the Memory Space Enable and Bus Master Enable bits in the Command Register (PCIe r7.0 sec 7.5.1.1.3). Drivers generally set those bits on probe and they're not automatically cleared by hardware upon an Uncorrectable Error. EEH and s390 blocking MMIO access likely serves the purpose of preventing corrupted data being propagated upstream. AER doesn't support that (or at least doesn't mandate that -- certain implementations might be capable of blocking poisoned data). Thus with AER, MMIO access is usually possible already in ->error_detected(). The only reason why drivers shouldn't be doing that and instead defer MMIO access to ->mmio_enabled() is to be compatible with EEH and s390. There are exceptions to this rule: E.g. if the Uncorrectable Error was "Surprise Down", the link to the device is down and MMIO access isn't possible, neither in ->error_detected() nor ->mmio_enabled(). Drivers should check whether an MMIO read results in an "all ones" response (PCI_POSSIBLE_ERROR()), which is usually what the host bridge fabricates upon a Completion Timeout caused by the link being down and the Endpoint being inaccessible. (There's a list of all the errors with default severity etc in PCIe r7.0 sec 6.2.7.) I believe DPC was added to the PCIe Base Specification to prevent propagating corrupted data upstream, similarly to EEH and s390. DPC brings down the link immediately upon an Uncorrectable Error (the Linux driver confines this to Fatal Errors), but as a side effect this results in a Hot Reset being propagated down the hierarchy, making it impossible to access the device in the faulting state to retrieve debug information etc. After the link has been brought up again, the device is in post-reset state. So DPC doesn't allow for reset-less recovery. With the ordering change introduced by this commit, ->mmio_enabled() will no longer be able to access MMIO space in the DPC case because the link hasn't been brought back up until ->slot_reset(). But as explained in the commit message, I only found two drivers affected by this. One seems to be EEH-specific (drivers/scsi/ipr.c). And the other one (drivers/scsi/sym53c8xx_2/sym_glue.c) dumps debug registers in ->mmio_enabled() and I'm arguing that with this commit it's dumping "all ones", but right now it's dumping the post-reset state of the device, which isn't any more useful. > I think this callback really only makes sense if you have > an equivalent to s390's clearing of the error state that enables MMIO > but doesn't otherwise reset. Similarly EEH has eeh_pci_enable(pe, > EEH_OPT_THAW_MMIO). Right. > > @@ -233,6 +228,14 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev, > > pci_walk_bridge(bridge, report_mmio_enabled, &status); > > } > > > > + if (status == PCI_ERS_RESULT_NEED_RESET || > > + state == pci_channel_io_frozen) { > > + if (reset_subordinates(bridge) != PCI_ERS_RESULT_RECOVERED) { > > + pci_warn(bridge, "subordinate device reset failed\n"); > > + goto failed; > > + } > > + } > > + > > if (status == PCI_ERS_RESULT_NEED_RESET) { > > /* > > * TODO: Should call platform-specific > > I wonder if it might make sense to merge the reset into the above > existing if. There are drivers such as drivers/bus/mhi/host/pci_generic.c which return PCI_ERS_RESULT_RECOVERED from ->error_detected(). So they fall through directly to the ->resume() stage. They're doing this even in the pci_channel_io_frozen case (i.e. for Fatal Errors). But for DPC we must call reset_subordinates() to bring the link back up. And for Fatal Errors, Documentation/PCI/pcieaer-howto.rst suggests that we must likewise call it because the link may be unreliable. Hence the if-condition must use a boolean OR, i.e.: if (status == PCI_ERS_RESULT_NEED_RESET || state == pci_channel_io_frozen) { ... whereas if I would move the invocation of reset_subordinates() inside the existing "if (status == PCI_ERS_RESULT_NEED_RESET)", it would be equivalent to a boolean AND. I would have to amend drivers such as drivers/bus/mhi/host/pci_generic.c to work with the changed logic and I figured that it's simpler to only change pcie_do_recovery() rather than touching all affected drivers. I don't have any of that hardware and so it seems risky touching all those drivers. > I'm a bit confused by that TODO comment and > anyway, it asks for a platform-specific reset to be added bu > reset_subordinate() already seems to allow platform specific behavior, > so maybe the moved reset should replace the TODO? Manivannan has a patch pending which removes the TODO: https://lore.kernel.org/all/20250715-pci-port-reset-v6-1-6f9cce94e...@oss.qualcomm.com/ > Also I think for the > kind of broken case where the state is pci_channel_io_frozen but the > drivers just reports PCI_ERS_RESULT_CAN_RECOVER it looks like there > would be a reset but no calls to ->slot_reset(). Right, but that's what AER is currently doing and drivers such as drivers/bus/mhi/host/pci_generic.c are written to work with the existing flow. Thanks, Lukas