The final discussion and the final version of the patch are here:
https://bugzilla.kernel.org/show_bug.cgi?id=216877
https://patchwork.kernel.org/project/linux-pci/patch/[email protected]/

And koba has backported the old version of that commit to the OEM kernel:

commit 16e5386dabd18bd9c507867b0df6c414783af4a8
Author: Mika Westerberg <[email protected]>
Date:   Mon Oct 2 10:00:44 2023 +0300

    UBUNTU: SAUCE: PCI/ASPM: Add back L1 PM Substate save and restore

    BugLink: https://bugs.launchpad.net/bugs/2042500

    Commit a7152be79b62 ("Revert "PCI/ASPM: Save L1 PM Substates
    Capability for suspend/resume"") reverted saving and restoring of
    ASPM L1 Substates due to a regression that caused resume from
    suspend to fail on certain systems. However, we never added this
    capability back and this is now causing systems fail to enter low
    power CPU states, drawing more power from the battery.

    The original revert mentioned that we restore L1 PM substate
    configuration even though ASPM L1 may already be enabled. This is
    due the fact that the pci_restore_aspm_l1ss_state() was called
    before pci_restore_pcie_state().

    Try to enable this functionality again following PCIe r6.0.1,
    sec 5.5.4 more closely by:

      1) Do not restore ASPM configuration in pci_restore_pcie_state()
         but do that after PCIe capability is restored in
         pci_restore_aspm_state() following PCIe r6.0, sec 5.5.4.

      2) ASPM is first enabled on the upstream component and then
         downstream (this is already forced by the parent-child
         ordering of Linux Device Power Management framework).

      3) Program ASPM L1 PM substate configuration before L1 enables.

      4) Program ASPM L1 PM substate enables last after rest of the
         fields in the capability are programmed.

      5) Add denylist that skips restoring on the ASUS and TUXEDO
         systems where these regressions happened, just in case. For
         the TUXEDO case we only skip restore if the BIOS is involved
         in system suspend (that's forcing "mem_sleep=deep" in the
         command line). This is to avoid possible power regression
         when the default suspend to idle is used, and at the same
         time make sure the devices continue working after resume
         when the BIOS is involved.

    Reported-by: Koba Ko <[email protected]>
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217321
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=216782
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=216877
    Cc: Tasev Nikola <[email protected]>
    Cc: Mark Enriquez <[email protected]>
    Cc: Thomas Witt <[email protected]>
    Cc: Werner Sembach <[email protected]>
    Tested-by: Kai-Heng Feng <[email protected]>
    Signed-off-by: Mika Westerberg <[email protected]>
    Reviewed-by: Ilpo Järvinen <[email protected]>
    (backported from https://lore.kernel.org/all/[email protected]/)
    Signed-off-by: Koba Ko <[email protected]>
    Signed-off-by: Timo Aaltonen <[email protected]>

** Bug watch added: Linux Kernel Bug Tracker #216877
   https://bugzilla.kernel.org/show_bug.cgi?id=216877

** Bug watch added: Linux Kernel Bug Tracker #217321
   https://bugzilla.kernel.org/show_bug.cgi?id=217321

** Bug watch added: Linux Kernel Bug Tracker #216782
   https://bugzilla.kernel.org/show_bug.cgi?id=216782

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-oem-5.14 in Ubuntu.
https://bugs.launchpad.net/bugs/1980829

Title:
  System freeze after resuming from suspend due to PCI ASPM settings

Status in HWE Next:
  Fix Released
Status in linux package in Ubuntu:
  Fix Released
Status in linux-oem-5.14 package in Ubuntu:
  Invalid
Status in linux-oem-5.17 package in Ubuntu:
  Invalid
Status in linux-oem-6.0 package in Ubuntu:
  Invalid
Status in linux-oem-6.1 package in Ubuntu:
  Invalid
Status in linux-oem-6.5 package in Ubuntu:
  Invalid
Status in linux source package in Focal:
  Invalid
Status in linux-oem-5.14 source package in Focal:
  Fix Released
Status in linux-oem-5.17 source package in Focal:
  Invalid
Status in linux-oem-6.0 source package in Focal:
  Invalid
Status in linux-oem-6.1 source package in Focal:
  Invalid
Status in linux-oem-6.5 source package in Focal:
  Invalid
Status in linux source package in Jammy:
  Fix Released
Status in linux-oem-5.14 source package in Jammy:
  Invalid
Status in linux-oem-5.17 source package in Jammy:
  Fix Released
Status in linux-oem-6.0 source package in Jammy:
  Fix Released
Status in linux-oem-6.1 source package in Jammy:
  Fix Released
Status in linux-oem-6.5 source package in Jammy:
  In Progress
Status in linux source package in Kinetic:
  Fix Released
Status in linux-oem-5.14 source package in Kinetic:
  Invalid
Status in linux-oem-5.17 source package in Kinetic:
  Invalid
Status in linux-oem-6.0 source package in Kinetic:
  Invalid
Status in linux-oem-6.1 source package in Kinetic:
  Invalid
Status in linux-oem-6.5 source package in Kinetic:
  Invalid

Bug description:
  For OEM-6.1

  [Impact]
  While doing some tests such as suspend/resume or CPU stress tests,
  the system would hang.

  [Fix]
  The commit below fixes the issue, but it is not going to be merged
  into mainline. The patch is still under discussion and has other
  variants. The original patch has already been in oem-6.0 and
  5.15/5.19 for a year, so we consider it the safer option.
  https://patchwork.ozlabs.org/project/linux-pci/patch/[email protected]/
  I also created a DMI quirk so that the patches only affect the
  listed platforms.

  [Test]
  The affected machines can suspend and resume well.

  [Where problems could occur]
  The patches only affect the listed platforms and won't affect other
  platforms.

  ======================================================================
  For Jammy/Kinetic SRU

  [Impact]
  While doing some tests such as suspend/resume or CPU stress tests,
  the system would hang.

  [Fix]
  The 2 commits fix the issue, but they have not been accepted yet.
  https://patchwork.ozlabs.org/project/linux-pci/patch/[email protected]/
  https://patchwork.ozlabs.org/project/linux-pci/patch/[email protected]/
  So I created a DMI quirk so that the patches only affect the listed
  platforms.

  [Test]
  Verified on the failed machines, and the ODM also verified on their
  side.

  [Where problems could occur]
  The patches only affect the listed platforms and won't affect other
  platforms.

  ======================================================================
  For OEM-6.0

  [Impact]
  While doing some tests such as suspend/resume or CPU stress tests,
  the system would hang.

  [Fix]
  The 2 commits fix the issue, but they have not been accepted yet.
  https://patchwork.ozlabs.org/project/linux-pci/patch/[email protected]/
  https://patchwork.ozlabs.org/project/linux-pci/patch/[email protected]/

  [Test]
  Verified on the failed machines, and the ODM also verified on their
  side.

  [Where problems could occur]
  The 2 patches look pretty safe to me; they try to preserve the ASPM
  state of devices.

To manage notifications about this bug go to:
https://bugs.launchpad.net/hwe-next/+bug/1980829/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
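
For readers who want the restore ordering from the quoted commit message
in a more concrete form, here is a rough user-space model in Python. It is
only a sketch of one reading of steps 1, 3, 4 and 5, not the kernel code:
LinkRegs, restore_link, should_skip_restore and DENYLIST are hypothetical
names made up for illustration, the register values are invented, and the
denylist keys are plain labels rather than the DMI matches the real patch
uses. Step 2 (upstream component before downstream) is left to the caller,
which would invoke restore_link() on the parent first.

```python
from dataclasses import dataclass

ASPM_CTL_MASK = 0x0003      # ASPM Control field (bits 1:0) of Link Control
L1SS_ENABLE_MASK = 0x000F   # four enable bits (3:0) of L1 PM Substates Control 1

# Illustrative denylist only: label -> "skip only when suspend is deep?".
# The real patch matches specific DMI entries that are not reproduced here.
DENYLIST = {"ASUS": False, "TUXEDO": True}


@dataclass
class LinkRegs:
    """Hypothetical stand-in for one device's registers, not a kernel structure."""
    lnkctl: int = 0
    l1ss_ctl1: int = 0
    l1ss_ctl2: int = 0


def should_skip_restore(vendor, mem_sleep):
    """Step 5: skip on denylisted systems; TUXEDO only when the BIOS is involved."""
    if vendor not in DENYLIST:
        return False
    only_if_deep = DENYLIST[vendor]
    return mem_sleep == "deep" if only_if_deep else True


def restore_link(hw, saved):
    """Copy saved registers into hw in the assumed order; return the step trace."""
    trace = []

    # Step 1: restore the PCIe capability first, with the ASPM bits held clear.
    hw.lnkctl = saved.lnkctl & ~ASPM_CTL_MASK
    trace.append("lnkctl without ASPM")

    # Step 3: program the L1 substate configuration while enables are still off.
    hw.l1ss_ctl2 = saved.l1ss_ctl2
    hw.l1ss_ctl1 = saved.l1ss_ctl1 & ~L1SS_ENABLE_MASK
    trace.append("l1ss config")

    # Step 4: the L1 substate enable bits go in last within the capability.
    hw.l1ss_ctl1 = saved.l1ss_ctl1
    trace.append("l1ss enables")

    # Only now restore the ASPM Control bits of Link Control.
    hw.lnkctl = saved.lnkctl
    trace.append("lnkctl ASPM")
    return trace


if __name__ == "__main__":
    saved = LinkRegs(lnkctl=0x0042, l1ss_ctl1=0x4001000F, l1ss_ctl2=0x28)  # made-up values
    hw = LinkRegs()
    if not should_skip_restore("TUXEDO", "s2idle"):
        for number, step in enumerate(restore_link(hw, saved), start=1):
            print(number, step)
```

The trace list exists only so the ordering can be inspected; the point of
the model is that the enable bits are written after everything they depend
on, and that the TUXEDO entry skips the restore only for "deep" suspend.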

