Re: [PATCH v8 2/3] PCI/AER: Disable AER service on suspend
On Fri, Apr 19, 2024 at 4:35 AM Bjorn Helgaas wrote: > > On Tue, Apr 16, 2024 at 12:32:24PM +0800, Kai-Heng Feng wrote: > > When the power rail gets cut off, the hardware can create some electric > > noise on the link that triggers AER. If IRQ is shared between AER with > > PME, such AER noise will cause a spurious wakeup on system suspend. > > > > When the power rail gets back, the firmware of the device resets itself > > and can create unexpected behavior like sending PTM messages. For this > > case, the driver will always be too late to toggle off features should > > be disabled. > > > > As Per PCIe Base Spec 5.0, section 5.2, titled "Link State Power > > Management", TLP and DLLP transmission are disabled for a Link in L2/L3 > > Ready (D3hot), L2 (D3cold with aux power) and L3 (D3cold) states. So if > > the power will be turned off during suspend process, disable AER service > > and re-enable it during the resume process. This should not affect the > > basic functionality. > > > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=209149 > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=216295 > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=218090 > > Signed-off-by: Kai-Heng Feng > > Thanks for reviving this series. I tried follow the history about > this, but there are at least two series that were very similar and I > can't put it all together. > > > --- > > v8: > > - Add more bug reports. > > > > v7: > > - Wording > > - Disable AER completely (again) if power will be turned off > > > > v6: > > v5: > > - Wording. > > > > v4: > > v3: > > - No change. > > > > v2: > > - Only disable AER IRQ. > > - No more check on PME IRQ#. > > - Use helper. > > > > drivers/pci/pcie/aer.c | 25 + > > 1 file changed, 25 insertions(+) > > > > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c > > index ac6293c24976..bea7818c2d1b 100644 > > --- a/drivers/pci/pcie/aer.c > > +++ b/drivers/pci/pcie/aer.c > > @@ -28,6 +28,7 @@ > > #include > > #include > > #include > > +#include > > #include > > #include > > #include > > @@ -1497,6 +1498,28 @@ static int aer_probe(struct pcie_device *dev) > > return 0; > > } > > > > +static int aer_suspend(struct pcie_device *dev) > > +{ > > + struct aer_rpc *rpc = get_service_data(dev); > > + struct pci_dev *pdev = rpc->rpd; > > + > > + if (pci_ancestor_pr3_present(pdev) || pm_suspend_via_firmware()) > > + aer_disable_rootport(rpc); > > Why do we check pci_ancestor_pr3_present(pdev) and > pm_suspend_via_firmware()? I'm getting pretty convinced that we need > to disable AER interrupts on suspend in general. I think it will be > better if we do that consistently on all platforms, not special cases > based on details of how we suspend. Sure. Will change in next revision. > > Also, why do we use aer_disable_rootport() instead of just > aer_disable_irq()? I think it's the interrupt that causes issues on > suspend. I see that there *were* some versions that used > aer_disable_irq(), but I can't find the reason it changed. Interrupt can cause system wakeup, if it's shared with PME. The reason why aer_disable_rootport() is used over aer_disable_irq() is that when the latter is used the error still gets logged during sleep cycle. Once the pcieport driver resumes, it invokes aer_root_reset() to reset the hierarchy, while the hierarchy hasn't resumed yet. So use aer_disable_rootport() to prevent such issue from happening. Kai-Heng > > > + > > + return 0; > > +} > > + > > +static int aer_resume(struct pcie_device *dev) > > +{ > > + struct aer_rpc *rpc = get_service_data(dev); > > + struct pci_dev *pdev = rpc->rpd; > > + > > + if (pci_ancestor_pr3_present(pdev) || pm_resume_via_firmware()) > > + aer_enable_rootport(rpc); > > + > > + return 0; > > +} > > + > > /** > > * aer_root_reset - reset Root Port hierarchy, RCEC, or RCiEP > > * @dev: pointer to Root Port, RCEC, or RCiEP > > @@ -1561,6 +1584,8 @@ static struct pcie_port_service_driver aerdriver = { > > .service= PCIE_PORT_SERVICE_AER, > > > > .probe = aer_probe, > > + .suspend= aer_suspend, > > + .resume = aer_resume, > > .remove = aer_remove, > > }; > > > > -- > > 2.34.1 > >
Re: [PATCH v8 2/3] PCI/AER: Disable AER service on suspend
On Tue, Apr 16, 2024 at 12:32:24PM +0800, Kai-Heng Feng wrote: > When the power rail gets cut off, the hardware can create some electric > noise on the link that triggers AER. If IRQ is shared between AER with > PME, such AER noise will cause a spurious wakeup on system suspend. > > When the power rail gets back, the firmware of the device resets itself > and can create unexpected behavior like sending PTM messages. For this > case, the driver will always be too late to toggle off features should > be disabled. > > As Per PCIe Base Spec 5.0, section 5.2, titled "Link State Power > Management", TLP and DLLP transmission are disabled for a Link in L2/L3 > Ready (D3hot), L2 (D3cold with aux power) and L3 (D3cold) states. So if > the power will be turned off during suspend process, disable AER service > and re-enable it during the resume process. This should not affect the > basic functionality. > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=209149 > Link: https://bugzilla.kernel.org/show_bug.cgi?id=216295 > Link: https://bugzilla.kernel.org/show_bug.cgi?id=218090 > Signed-off-by: Kai-Heng Feng Thanks for reviving this series. I tried follow the history about this, but there are at least two series that were very similar and I can't put it all together. > --- > v8: > - Add more bug reports. > > v7: > - Wording > - Disable AER completely (again) if power will be turned off > > v6: > v5: > - Wording. > > v4: > v3: > - No change. > > v2: > - Only disable AER IRQ. > - No more check on PME IRQ#. > - Use helper. > > drivers/pci/pcie/aer.c | 25 + > 1 file changed, 25 insertions(+) > > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c > index ac6293c24976..bea7818c2d1b 100644 > --- a/drivers/pci/pcie/aer.c > +++ b/drivers/pci/pcie/aer.c > @@ -28,6 +28,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -1497,6 +1498,28 @@ static int aer_probe(struct pcie_device *dev) > return 0; > } > > +static int aer_suspend(struct pcie_device *dev) > +{ > + struct aer_rpc *rpc = get_service_data(dev); > + struct pci_dev *pdev = rpc->rpd; > + > + if (pci_ancestor_pr3_present(pdev) || pm_suspend_via_firmware()) > + aer_disable_rootport(rpc); Why do we check pci_ancestor_pr3_present(pdev) and pm_suspend_via_firmware()? I'm getting pretty convinced that we need to disable AER interrupts on suspend in general. I think it will be better if we do that consistently on all platforms, not special cases based on details of how we suspend. Also, why do we use aer_disable_rootport() instead of just aer_disable_irq()? I think it's the interrupt that causes issues on suspend. I see that there *were* some versions that used aer_disable_irq(), but I can't find the reason it changed. > + > + return 0; > +} > + > +static int aer_resume(struct pcie_device *dev) > +{ > + struct aer_rpc *rpc = get_service_data(dev); > + struct pci_dev *pdev = rpc->rpd; > + > + if (pci_ancestor_pr3_present(pdev) || pm_resume_via_firmware()) > + aer_enable_rootport(rpc); > + > + return 0; > +} > + > /** > * aer_root_reset - reset Root Port hierarchy, RCEC, or RCiEP > * @dev: pointer to Root Port, RCEC, or RCiEP > @@ -1561,6 +1584,8 @@ static struct pcie_port_service_driver aerdriver = { > .service= PCIE_PORT_SERVICE_AER, > > .probe = aer_probe, > + .suspend= aer_suspend, > + .resume = aer_resume, > .remove = aer_remove, > }; > > -- > 2.34.1 >
Re: [PATCH v8 2/3] PCI/AER: Disable AER service on suspend
On 4/15/24 9:32 PM, Kai-Heng Feng wrote: > When the power rail gets cut off, the hardware can create some electric > noise on the link that triggers AER. If IRQ is shared between AER with > PME, such AER noise will cause a spurious wakeup on system suspend. > > When the power rail gets back, the firmware of the device resets itself > and can create unexpected behavior like sending PTM messages. For this > case, the driver will always be too late to toggle off features should > be disabled. > > As Per PCIe Base Spec 5.0, section 5.2, titled "Link State Power > Management", TLP and DLLP transmission are disabled for a Link in L2/L3 > Ready (D3hot), L2 (D3cold with aux power) and L3 (D3cold) states. So if > the power will be turned off during suspend process, disable AER service > and re-enable it during the resume process. This should not affect the > basic functionality. > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=209149 > Link: https://bugzilla.kernel.org/show_bug.cgi?id=216295 > Link: https://bugzilla.kernel.org/show_bug.cgi?id=218090 > Signed-off-by: Kai-Heng Feng > --- Looks good to me. Reviewed-by: Kuppuswamy Sathyanarayanan > v8: > - Add more bug reports. > > v7: > - Wording > - Disable AER completely (again) if power will be turned off > > v6: > v5: > - Wording. > > v4: > v3: > - No change. > > v2: > - Only disable AER IRQ. > - No more check on PME IRQ#. > - Use helper. > > drivers/pci/pcie/aer.c | 25 + > 1 file changed, 25 insertions(+) > > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c > index ac6293c24976..bea7818c2d1b 100644 > --- a/drivers/pci/pcie/aer.c > +++ b/drivers/pci/pcie/aer.c > @@ -28,6 +28,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -1497,6 +1498,28 @@ static int aer_probe(struct pcie_device *dev) > return 0; > } > > +static int aer_suspend(struct pcie_device *dev) > +{ > + struct aer_rpc *rpc = get_service_data(dev); > + struct pci_dev *pdev = rpc->rpd; > + > + if (pci_ancestor_pr3_present(pdev) || pm_suspend_via_firmware()) > + aer_disable_rootport(rpc); > + > + return 0; > +} > + > +static int aer_resume(struct pcie_device *dev) > +{ > + struct aer_rpc *rpc = get_service_data(dev); > + struct pci_dev *pdev = rpc->rpd; > + > + if (pci_ancestor_pr3_present(pdev) || pm_resume_via_firmware()) > + aer_enable_rootport(rpc); > + > + return 0; > +} > + > /** > * aer_root_reset - reset Root Port hierarchy, RCEC, or RCiEP > * @dev: pointer to Root Port, RCEC, or RCiEP > @@ -1561,6 +1584,8 @@ static struct pcie_port_service_driver aerdriver = { > .service= PCIE_PORT_SERVICE_AER, > > .probe = aer_probe, > + .suspend= aer_suspend, > + .resume = aer_resume, > .remove = aer_remove, > }; > -- Sathyanarayanan Kuppuswamy Linux Kernel Developer