**[RESOLVED] Root cause identified and fixed via ACPI SSDT override**

---

## Update on original report

The original report described the symptom as a D3cold → D0 wake failure
with a broken 64-bit BAR allocation above 4 GB. Deeper investigation
showed that the BARs were in fact correctly assigned below 4 GB and that
the D3cold state was a consequence rather than the cause. The actual
root cause is an ACPI firmware bug in the HP ENVY x360 BIOS that
prevents the GPU from completing its hardware initialization sequence at
boot.

---

## Root cause

The NVIDIA GP108M (MX250) ACPI device at `\_SB.PCI0.RP05.PXSX` has its
`_PS0` power-on method defined as **completely empty** in the HP
firmware (`ssdt4.dsl`):

```asl
Method (_PS0, 0, Serialized) {}
```

The actual hardware initialization sequence lives in a separate method
called `HGON`, which performs the full bring-up: GPIO reset signals,
Embedded Controller power pin (`EC0.DGPO`), PCIe L2/L3 exit (`L23D()`),
and link training configuration. However, `HGON` is guarded by a check
function `CCHK` that reads an internal flag `ONOF`. This flag is
initialized to `One` ("GPU already on") at boot:

```asl
Name (ONOF, One)
```

Because `ONOF` starts as `One`, `CCHK(1)` returns `Zero` and `HGON`
exits immediately without doing anything. The GPU receives bus power via
the `PC01` power resource, but the chip never completes initialization.
That is why `lspci` reports `Unknown header type 7f`: the chip is
unresponsive, so reads of its PCI config space return all ones instead
of a valid header.

Meanwhile, `runtime_status` reports `active` because the kernel
evaluated `_PS0` successfully and received no error; it had no way to
know the method was a no-op.
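
The mismatch between the two readings can be reproduced directly from sysfs. The sketch below is illustrative and not part of the original diagnosis: the helper names are made up here, and the device address `0000:01:00.0` in the usage comment is an assumption that should be confirmed with `lspci -D`. It reads the header-type byte at config-space offset `0x0e` and masks off bit 7 (the multi-function flag), which is how an all-ones read of `ff` ends up displayed as `7f`.

```bash
# Classify a PCI header-type byte (config-space offset 0x0e), given as two hex digits.
classify_header_type() {
    layout=$(( 0x$1 & 0x7f ))   # bit 7 is the multi-function flag; mask it off
    case "$layout" in
        0)   echo "type 00: endpoint header, device is responding" ;;
        1)   echo "type 01: PCI-to-PCI bridge header" ;;
        127) echo "type 7f: reads return all ones, device is not responding" ;;
        *)   echo "unexpected header layout $layout" ;;
    esac
}

# Read the raw header-type byte of one device from sysfs (argument: domain:bus:dev.fn).
read_header_type() {
    od -An -tx1 -j14 -N1 "/sys/bus/pci/devices/$1/config" | tr -d ' \n'
}

# Typical use (the address below is an assumption, check `lspci -D | grep -i nvidia`):
#   dev=0000:01:00.0
#   classify_header_type "$(read_header_type "$dev")"
#   cat "/sys/bus/pci/devices/$dev/power/runtime_status"
```

On an affected machine, one would expect the first command to report `type 7f` while the `runtime_status` file still reads `active`.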

This explains why Windows works: the Windows NVIDIA driver performs an
implicit power cycle during initialization, calling the `HGOF` path
first (which sets `ONOF = Zero`), then re-enabling the GPU. This causes
`CCHK` to pass and `HGON` to execute the full hardware sequence. Linux
does not perform this cycle, so the GPU is never actually initialized.
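
To check whether another machine shows the same pattern, the firmware tables can be dumped and disassembled with `acpidump` and `iasl` from the `acpica-tools` package. The following is a minimal sketch: the helper name `show_gpu_power_objects` is hypothetical, and the table numbering (`ssdt4` on this laptop) will likely differ on other models or BIOS versions, so the usage comment searches every SSDT.

```bash
# Hypothetical helper: print each _PS0 method and the ONOF declaration found in
# the given disassembled tables, with three lines of context, so that an empty
# method body or an initial value of One stands out.
show_gpu_power_objects() {
    grep -n -A 3 -e 'Method (_PS0' -e 'Name (ONOF' "$@"
}

# Typical use (writes *.dat and *.dsl files into the current directory):
#   sudo acpidump -b        # dump every ACPI table to a binary .dat file
#   iasl -d ssdt*.dat       # disassemble the SSDTs into .dsl source
#   show_gpu_power_objects ssdt*.dsl
```

Other devices usually define their own `_PS0`, so the output needs to be matched against the GPU's ACPI path by hand.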

---

## What was attempted (expanding on original report)

- `pci=nocrs` → caused NVMe controller resource conflicts, system failed to boot
- `acpi_osi="Windows 2015"` and `acpi_osi="Windows 2009"` → BIOS executed the
  Windows firmware path, but the driver-side initialization cycle was still
  missing; no effect
- D3cold → D3hot forcing via custom kernel build → gave the GPU bus power but
  `HGON` still never ran; `header type 7f` persisted
- `pcie_port_pm=off`, `pci=realloc`, `NVreg_DynamicPowerManagement` → none
  address the missing `HGON` call
- `prime-select intel` → stable workaround but GPU completely unused

---

## Solution: ACPI SSDT override

A minimal SSDT override replaces the empty `_PS0` with one that
correctly initializes the GPU:

```asl
DefinitionBlock ("", "SSDT", 2, "HACK", "MX250FIX", 0x00000001)
{
    External (_SB_.PCI0.RP05.PXSX, DeviceObj)
    External (_SB_.PCI0.RP05.PXSX.HGON, MethodObj)
    External (_SB_.PCI0.RP05.PXSX.ONOF, IntObj)

    Scope (\_SB.PCI0.RP05.PXSX)
    {
        Method (_PS0, 0, Serialized)
        {
            ONOF = Zero   // clear flag so CCHK(1) allows HGON to proceed
            HGON ()       // execute full GPIO + EC + PCIe L23D init sequence
        }
    }
}
```

**Installation steps:**

```bash
# 1. Install tools
sudo apt install acpica-tools

# 2. Save the above as ssdt-mx250-fix.asl, then compile
iasl ssdt-mx250-fix.asl

# 3. Install
sudo mkdir -p /boot/acpi
sudo cp ssdt-mx250-fix.aml /boot/acpi/

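# 4. Create GRUB loader script
#    Note: GRUB resolves the path against its root device. /acpi/... assumes
#    /boot is a separate partition; if /boot lives on the root filesystem,
#    the path would likely need to be /boot/acpi/ssdt-mx250-fix.aml instead.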
echo '#!/bin/sh
echo "acpi /acpi/ssdt-mx250-fix.aml"' | sudo tee /etc/grub.d/01_acpi
sudo chmod +x /etc/grub.d/01_acpi
sudo update-grub

# 5. Blacklist nouveau so the proprietary driver can load
echo 'blacklist nouveau
options nouveau modeset=0' | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
sudo update-initramfs -u
sudo reboot
```
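
After the reboot, it is worth confirming that the override was actually picked up before testing the GPU. The kernel normally lists each ACPI table it finds during early boot, including the OEM ID and table ID from the `DefinitionBlock` header. A small sketch, assuming that log format; the helper name `override_loaded` is hypothetical:

```bash
# Succeeds if the kernel log on stdin lists an SSDT carrying the override's
# OEM ID ("HACK") and table ID ("MX250FIX"). The exact log layout is assumed.
override_loaded() {
    grep -q 'ACPI: SSDT .*HACK  *MX250FIX'
}

# Typical use after reboot:
#   sudo dmesg | override_loaded && echo "override present" || echo "override missing"
```

If the table is missing, the GRUB script and the path it passes to `acpi` are the first things to recheck.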

**Results after applying the fix:**
- `lspci` reports `type 00` (valid PCI header); the GPU is fully responsive on the PCI bus
- `nouveau` loaded: detected GP108 (138000a1), 4096 MiB GDDR5
- NVIDIA proprietary driver 580.159.03 loaded successfully
- `nvidia-smi` functional, GPU at 50°C idle in P8 power state
- `glmark2` confirmed rendering on `NVIDIA GeForce MX250/PCIe/SSE2` via PRIME
  offload
- Tested on kernel 6.17.0-35-generic
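
For anyone reproducing the PRIME offload result, a small wrapper along these lines may help. The function name `offload` is made up for this sketch; the two environment variables are the ones the NVIDIA driver documents for PRIME render offload with GLX applications.

```bash
# Hypothetical wrapper: run a single command with NVIDIA PRIME render offload
# enabled, leaving the rest of the session on the Intel iGPU.
offload() {
    env __NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia "$@"
}

# Typical use:
#   offload glxinfo | grep "OpenGL renderer"   # glxinfo comes from mesa-utils
#   offload glmark2
#   nvidia-smi                                  # driver, temperature, power state
```

Using `env` keeps the variables scoped to the one command instead of exporting them into the shell.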

---

## Affected hardware

- **Laptop:** HP ENVY x360 15-dr0003np (Subsystem ID 103c:85e7)
- **GPU:** NVIDIA GP108M GeForce MX250, PCI ID `10de:1d13`, rev a1
- **iGPU:** Intel UHD 620 (WhiskeyLake-U, `8086:3ea0`)
- **ACPI path:** `\_SB.PCI0.RP05.PXSX`
- **GPU VBIOS:** 86.08.3b.00.27
- **OS:** Linux Mint 22.3 (Noble/Ubuntu 24.04 base)
- **Kernels affected:** 6.11.0-29-generic, 6.17.0-35-generic

---

## Recommended kernel-level fix

The correct long-term fix would be a quirk entry for PCI subsystem ID
`103c:85e7` (and potentially other HP platforms with GP108M) that
ensures `ONOF` is reset to `Zero` before the `PC01._ON` power resource
is evaluated, or that calls `HGON` directly as part of the device
power-on sequence. This would eliminate the need for a per-user SSDT
override on affected HP hardware.

The SSDT override above serves as a fully working workaround in the
meantime.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2158075

Title:
  PCI/PM Regression: NVIDIA MX250 / PCI Bridge unable to change power
  state from D3cold to D0 on kernels >= 6.8 (HP Envy)
