Hi Mani,

I've run into a new issue on top of the Root Port reset patch set. After a few hot-reset operations, the PCIe link starts cycling up and down continuously.
I found that calling pci_reset_secondary_bus() first in pcibios_reset_secondary_bus() appears to resolve this issue. Have you experienced a similar problem?
"
...
[ 141.897701] imx6q-pcie 4c300000.pcie: PCIe(LNK_STS:0x00000700) link up detected
[ 142.086341] imx6q-pcie 4c300000.pcie: PCIe Gen.3 x1 link up
[ 142.092038] imx6q-pcie 4c300000.pcie: PCIe(LNK_STS:0x00000c00) link down detected
...
"

Platform: i.MX95 EVK board, with local Root Port reset support based on patches #1 and #2 of the v7 patch set.

Notes on the logs:
- One Gen3 NVMe device is connected.
- "./memtool 4c341058=0;./memtool 4c341058=1;" toggles the LTSSM_EN bit to trigger the link down.
- "./memtool 4c30003c=004001ff; ./memtool 4c30003c=000001ff;" toggles BIT6 of the Bridge Control Register to trigger a hot reset.
- The Root Port reset patches work correctly at first. However, after several hot-reset triggers, the link enters a repeated down/up cycling state.

Logs:
[ 3.553188] imx6q-pcie 4c300000.pcie: host bridge /soc/pcie@4c300000 ranges:
[ 3.560308] imx6q-pcie 4c300000.pcie: IO 0x006ff00000..0x006fffffff -> 0x0000000000
[ 3.568525] imx6q-pcie 4c300000.pcie: MEM 0x0910000000..0x091fffffff -> 0x0010000000
[ 3.577314] imx6q-pcie 4c300000.pcie: config reg[1] 0x60100000 == cpu 0x60100000
[ 3.796029] imx6q-pcie 4c300000.pcie: iATU: unroll T, 128 ob, 128 ib, align 4K, limit 1024G
[ 4.003746] imx6q-pcie 4c300000.pcie: PCIe Gen.3 x1 link up
[ 4.009553] imx6q-pcie 4c300000.pcie: PCI host bridge to bus 0000:00
root@imx95evk:~#
root@imx95evk:~#
root@imx95evk:~# ./memtool 4c341058=0;./memtool 4c341058=1;
Writing 32-bit value 0x0 to address 0x4C341058
Writing 32-bit v [ 87.265348] imx6q-pcie 4c300000.pcie: PCIe(LNK_STS:0x00000d01) link down detected
alue 0x1 to adder [ 87.273106] imx6q-pcie 4c300000.pcie: Stop root bus and handle link down
ss 0x4C341058
[ 87.281264] pcieport 0000:00:00.0: Recovering Root Port due to Link Down
[ 87.289245] pci 0000:01:00.0: AER: can't recover (no error_detected callback)
root@imx95evk:~# [ 87.514216] imx6q-pcie 4c300000.pcie: PCIe(LNK_STS:0x00000700) link up detected
[ 87.702968] imx6q-pcie 4c300000.pcie: PCIe Gen.3 x1 link up
[ 87.834983] pcieport 0000:00:00.0: Root Port has been reset
[ 87.840714] pcieport 0000:00:00.0: AER: device recovery failed
[ 87.846592] imx6q-pcie 4c300000.pcie: Rescan bus after link up is detected
[ 87.855947] pcieport 0000:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[ 87.864423] pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
root@imx95evk:~#
root@imx95evk:~# cat /proc/interrupts | grep lnk;
273: 2 0 0 0 0 0 GICv3 342 Level PCIe PME, lnk_notify
root@imx95evk:~#
root@imx95evk:~#
root@imx95evk:~# ./memtool 4c30003c=004001ff; ./memtool 4c30003c=000001ff;
Writing 32-bit va [ 107.028086] imx6q-pcie 4c300000.pcie: PCIe(LNK_STS:0x00000d00) link down detected
lue 0x4001FF to a [ 107.037018] imx6q-pcie 4c300000.pcie: Stop root bus and handle link down
ddress 0x4C30003C [ 107.045137] pcieport 0000:00:00.0: Recovering Root Port due to Link Down
Writing 32-bit [ 107.053332] pci 0000:01:00.0: AER: can't recover (no error_detected callback)
value 0x1FF to address 0x4C30003C
root@imx95evk:~# [ 107.282146] imx6q-pcie 4c300000.pcie: PCIe(LNK_STS:0x00000700) link up detected
[ 107.470801] imx6q-pcie 4c300000.pcie: PCIe Gen.3 x1 link up
[ 107.602823] pcieport 0000:00:00.0: Root Port has been reset
[ 107.608601] pcieport 0000:00:00.0: AER: device recovery failed
[ 107.614497] imx6q-pcie 4c300000.pcie: Rescan bus after link up is detected
[ 107.623805] pcieport 0000:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[ 107.632281] pci_bus 0000:01: busn_res: [bus 01] end is updated to 01
root@imx95evk:~#
root@imx95evk:~# cat /proc/interrupts | grep lnk;
273: 4 0 0 0 0 0 GICv3 342 Level PCIe PME, lnk_notify
root@imx95evk:~#
root@imx95evk:~# ./memtool 4c30003c=004001ff; ./memtool 4c30003c=000001ff;
Writing 32-bit va [ 133.424041] imx6q-pcie 4c300000.pcie: PCIe(LNK_STS:0x00000d00) link down detected
lue 0x4001FF to a [ 133.432954] imx6q-pcie 4c300000.pcie: Stop root bus and handle link down
ddress 0x4C30003C [ 133.441106] pcieport 0000:00:00.0: Recovering Root Port due to Link Down
Writing 32-bit [ 133.449309] pci 0000:01:00.0: AER: can't recover (no error_detected callback)
value 0x1FF to address 0x4C30003C
root@imx95evk:~# [ 133.677824] imx6q-pcie 4c300000.pcie: PCIe(LNK_STS:0x00000700) link up detected
[ 133.870414] imx6q-pcie 4c300000.pcie: PCIe Gen.3 x1 link up
[ 134.002534] pcieport 0000:00:00.0: Root Port has been reset
[ 134.008307] pcieport 0000:00:00.0: AER: device recovery failed
[ 134.014193] imx6q-pcie 4c300000.pcie: Rescan bus after link up is detected
[ 134.023418] pcieport 0000:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[ 134.031881] pci_bus 0000:01: busn_res: [bus 01] end is updated to 01
root@imx95evk:~# ./memtool 4c30003c=004001ff; ./memtool 4c30003c=000001ff;
Writing 32-bit va [ 140.149713] imx6q-pcie 4c300000.pcie: PCIe(LNK_STS:0x00000d00) link down detected
lue 0x4001FF to a [ 140.158614] imx6q-pcie 4c300000.pcie: Stop root bus and handle link down
ddress 0x4C30003C [ 140.166779] pcieport 0000:00:00.0: Recovering Root Port due to Link Down
[ 140.174981] pci 0000:01:00.0: AER: can't recover (no error_detected callback)
Writing 32-bit value 0x1FF to address 0x4C30003C
root@imx95evk:~# [ 140.401605] imx6q-pcie 4c300000.pcie: PCIe(LNK_STS:0x00000700) link up detected
[ 140.590491] imx6q-pcie 4c300000.pcie: PCIe Gen.3 x1 link up
[ 140.596206] imx6q-pcie 4c300000.pcie: PCIe(LNK_STS:0x00000c00) link down detected
root@imx95evk:~# [ 141.630311] pcieport 0000:00:00.0: Data Link Layer Link Active not set in 100 msec
[ 141.637950] pcieport 0000:00:00.0: Failed to reset Root Port: -25
[ 141.644095] pcieport 0000:00:00.0: AER: subordinate device reset failed
[ 141.650883] pcieport 0000:00:00.0: AER: device recovery failed
[ 141.656784] imx6q-pcie 4c300000.pcie: Stop root bus and handle link down
[ 141.663520] pcieport 0000:00:00.0: Recovering Root Port due to Link Down
[ 141.670271] pci 0000:01:00.0: AER: can't recover (no error_detected callback)
[ 141.897701] imx6q-pcie 4c300000.pcie: PCIe(LNK_STS:0x00000700) link up detected
[ 142.086341] imx6q-pcie 4c300000.pcie: PCIe Gen.3 x1 link up
[ 142.092038] imx6q-pcie 4c300000.pcie: PCIe(LNK_STS:0x00000c00) link down detected
[ 143.126273] pcieport 0000:00:00.0: Data Link Layer Link Active not set in 100 msec
[ 143.133919] pcieport 0000:00:00.0: Failed to reset Root Port: -25
[ 143.140052] pcieport 0000:00:00.0: AER: subordinate device reset failed
[ 143.146747] pcieport 0000:00:00.0: AER: device recovery failed
[ 143.152604] imx6q-pcie 4c300000.pcie: Stop root bus and handle link down
[ 143.159314] pcieport 0000:00:00.0: Recovering Root Port due to Link Down
[ 143.166022] pci 0000:01:00.0: AER: can't recover (no error_detected callback)
[ 143.389723] imx6q-pcie 4c300000.pcie: PCIe(LNK_STS:0x00000700) link up detected
[ 143.582294] imx6q-pcie 4c300000.pcie: PCIe Gen.3 x1 link up
[ 143.587996] imx6q-pcie 4c300000.pcie: PCIe(LNK_STS:0x00000c00) link down detected

Thanks.
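
A side note on the hot-reset memtool values used in the notes: a 32-bit write at config offset 0x3c covers Interrupt Line (0x3c), Interrupt Pin (0x3d) and the 16-bit Bridge Control register (0x3e), so 0x004001ff is Bridge Control bit 6 (Secondary Bus Reset) set on top of the existing 0x01ff, and 0x000001ff clears it again. Below is a minimal Python sketch of that arithmetic only. The helper name dword_at_0x3c is made up for illustration, and the 0x01ff low half is taken from the memtool commands rather than read back from hardware.

```python
# Offsets and bit follow the standard PCI type 1 (bridge) header layout.
PCI_INTERRUPT_LINE = 0x3C        # low 16 bits of the dword: Interrupt Line + Pin
PCI_BRIDGE_CONTROL = 0x3E        # high 16 bits of the same dword
PCI_BRIDGE_CTL_BUS_RESET = 0x40  # bit 6: Secondary Bus Reset


def dword_at_0x3c(low_half: int, bridge_ctl: int) -> int:
    """Hypothetical helper: compose the 32-bit value written at offset 0x3c."""
    shift = (PCI_BRIDGE_CONTROL - PCI_INTERRUPT_LINE) * 8  # 2 bytes -> 16 bits
    return ((bridge_ctl & 0xFFFF) << shift) | (low_half & 0xFFFF)


LOW_HALF = 0x01FF  # assumption: unchanged low half, as used in the memtool values

assert_reset = dword_at_0x3c(LOW_HALF, PCI_BRIDGE_CTL_BUS_RESET)
deassert_reset = dword_at_0x3c(LOW_HALF, 0)

print(f"assert SBR:   0x{assert_reset:08x}")    # 0x004001ff
print(f"deassert SBR: 0x{deassert_reset:08x}")  # 0x000001ff
```

The two printed values match the pair of memtool writes, which is why each hot reset in the logs shows up as one link down followed by one link up.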
Best Regards
Richard Zhu

> -----Original Message-----
> From: Manivannan Sadhasivam via B4 Relay
> <[email protected]>
> Sent: March 10, 2026 22:02
> To: Bjorn Helgaas <[email protected]>; Mahesh J Salgaonkar
> <[email protected]>; Oliver O'Halloran <[email protected]>; Will
> Deacon <[email protected]>; Lorenzo Pieralisi <[email protected]>;
> Krzysztof Wilczyński <[email protected]>; Manivannan Sadhasivam
> <[email protected]>; Rob Herring <[email protected]>; Heiko Stuebner
> <[email protected]>; Philipp Zabel <[email protected]>
> Cc: [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]; Niklas
> Cassel <[email protected]>; Wilfred Mallawa <[email protected]>;
> Krishna Chaitanya Chundru <[email protected]>;
> [email protected]; Lukas Wunner <[email protected]>; Hongxing Zhu
> <[email protected]>; Brian Norris <[email protected]>;
> Wilson Ding <[email protected]>; Manivannan Sadhasivam
> <[email protected]>; Frank Li
> <[email protected]>; Manivannan Sadhasivam <[email protected]>
> Subject: [PATCH v7 0/4] PCI: Add support for resetting the Root Ports in a
> platform specific way
>
> Hi,
>
> Currently, in the event of AER/DPC, PCI core will try to reset the slot (Root
> Port) and its subordinate devices by invoking bridge control reset and FLR.
> But in some cases like AER Fatal error, it might be necessary to reset the
> Root Ports using the PCI host bridge drivers in a platform specific way (as
> indicated by the TODO in the pcie_do_recovery() function in
> drivers/pci/pcie/err.c). Otherwise, the PCI link won't be recovered
> successfully.
>
> So this series adds a new callback 'pci_host_bridge::reset_root_port' for the
> host bridge drivers to reset the Root Port when a fatal error happens.
>
> Also, this series allows the host bridge drivers to handle PCI link down event
> by resetting the Root Ports and recovering the bus. This is accomplished by
> the help of the new 'pci_host_handle_link_down()' API. Host bridge drivers
> are expected to call this API (preferably from a threaded IRQ handler) with
> relevant Root Port 'pci_dev' when a link down event is detected for the port.
> The API will reuse the pcie_do_recovery() function to recover the link if AER
> support is enabled, otherwise it will directly call the reset_root_port()
> callback of the host bridge driver (if exists).
>
> For reference, I've modified the pcie-qcom driver to call
> pci_host_handle_link_down() API with Root Port 'pci_dev' after receiving the
> LDn global_irq event and populated 'pci_host_bridge::reset_root_port()'
> callback to reset the Root Ports.
>
> Testing
> -------
>
> Tested on Qcom Lemans AU Ride platform with Host and EP SoCs connected
> over PCIe link. Simulated the LDn by disabling LTSSM_EN on the EP and I
> could verify that the link was getting recovered successfully.
>
> Changes in v7:
> - Dropped Rockchip Root port reset patch due to reported issues. But the
>   series works on other platforms as tested by others.
> - Added pci_{lock/unlock}_rescan_remove() to guard pci_bus_error_reset()
>   as the device could be removed in-between due to Native hotplug interrupt.
> - Rebased on top of v7.0-rc1
> - Link to v6:
> https://lore.kernel.org/r/20250715-pci-port-reset-v6-0-6f9cce94e7bb@oss.qualcomm.com
>
> Changes in v6:
> - Incorporated the patch:
> https://lore.kernel.org/all/20250524185304.26698-2-manivannan.sadhasivam@linaro.org/
> - Link to v5:
> https://lore.kernel.org/r/20250715-pci-port-reset-v5-0-26a5d278db40@oss.qualcomm.com
>
> Changes in v5:
> * Reworked the pci_host_handle_link_down() to accept Root Port instead of
>   resetting all Root Ports in the event of link down.
> * Renamed 'reset_slot' to 'reset_root_port' to avoid confusion as both terms
>   were used interchangeably and the series is intended to reset Root Port
>   only.
> * Added the Rockchip driver change to this series.
> * Dropped the applied patches and review/tested tags due to rework.
> * Rebased on top of v6.16-rc1.
>
> Changes in v4:
> - Handled link down first in the irq handler
> - Updated ICC & OPP bandwidth after link up in reset_slot() callback
> - Link to v3:
> https://lore.kernel.org/r/20250417-pcie-reset-slot-v3-0-59a10811c962@linaro.org
>
> Changes in v3:
> - Made the pci-host-common driver as a common library for host controller
>   drivers
> - Moved the reset slot code to pci-host-common library
> - Link to v2:
> https://lore.kernel.org/r/20250416-pcie-reset-slot-v2-0-efe76b278c10@linaro.org
>
> Changes in v2:
> - Moved calling reset_slot() callback from pcie_do_recovery() to
>   pcibios_reset_secondary_bus()
> - Link to v1:
> https://lore.kernel.org/r/20250404-pcie-reset-slot-v1-0-98952918bf90@linaro.org
>
> Signed-off-by: Manivannan Sadhasivam
> <[email protected]>
> ---
> Manivannan Sadhasivam (4):
>       PCI/ERR: Add support for resetting the Root Ports in a platform specific way
>       PCI: host-common: Add link down handling for Root Ports
>       PCI: qcom: Add support for resetting the Root Port due to link down event
>       misc: pci_endpoint_test: Add AER error handlers
>
>  drivers/misc/pci_endpoint_test.c         |  20 +++++
>  drivers/pci/controller/dwc/pcie-qcom.c   | 143 ++++++++++++++++++++++++++++++-
>  drivers/pci/controller/pci-host-common.c |  35 ++++++++
>  drivers/pci/controller/pci-host-common.h |   1 +
>  drivers/pci/pci.c                        |  21 +++++
>  drivers/pci/pcie/err.c                   |   6 +-
>  include/linux/pci.h                      |   1 +
>  7 files changed, 221 insertions(+), 6 deletions(-)
> ---
> base-commit: 6de23f81a5e08be8fbf5e8d7e9febc72a5b5f27f
> change-id: 20250715-pci-port-reset-4d9519570123
>
> Best regards,
> --
> Manivannan Sadhasivam <[email protected]>
>
