https://bugzilla.kernel.org/show_bug.cgi?id=221543
Bug ID: 221543
Summary: [nouveau] RTX 5080 (GB203): Xid:13 WIDTH CT Violation on kernels 6.16.7+ causing system hang
Product: Drivers
Version: 2.5
Hardware: Other
OS: Linux
Status: NEW
Severity: high
Priority: P3
Component: Video(DRI - non Intel)
Assignee: [email protected]
Reporter: [email protected]
Regression: No
Hardware: Lenovo ThinkStation P8 (30HJS53F00)
CPU: AMD Threadripper PRO 7975WX (32C/64T)
GPU: NVIDIA GeForce RTX 5080 (GB203, rev a1) [PCI 81:00.0]
RAM: 128GB
OS: Fedora 43 (Workstation)
Affected kernels: 6.16.7 through 7.0.8 (latest tested)
Working kernel: 6.13.6-200.fc41
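A minimal sketch for checking whether a running kernel falls inside the range tested in this report. The bounds come from the lines above; the function names are illustrative, and kernels between 6.13.6 and 6.16.7 were not tested, so a False result only means "outside the reported range", not "known good".

```python
import re

# Bounds taken from this report. Versions outside them are untested.
FIRST_AFFECTED = (6, 16, 7)
LAST_TESTED = (7, 0, 8)


def parse_release(release):
    """Extract (major, minor, patch) from a string such as '6.13.6-200.fc41'."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if m is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    # A missing patch component is treated as 0.
    return tuple(int(g or 0) for g in m.groups())


def in_reported_range(release):
    """True if the release lies within the affected range tested in this report."""
    return FIRST_AFFECTED <= parse_release(release) <= LAST_TESTED
```

On a live system the input could come from `platform.release()` in the Python standard library, which returns the same string as `uname -r`.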
Problem:
On kernels 6.16.7 and later, the nouveau driver triggers repeated Xid:13
(WIDTH CT Violation) errors on all 7 GPCs of the RTX 5080. The errors
appear only while Chrome is running, and they occur even with hardware
acceleration disabled in Chrome.
The error cascade is:
1. Chrome triggers Xid:13 on nouveau
2. Chrome GPU channel is killed
3. Xid:13 repeats every ~20 minutes
4. Xwayland encounters an MMU fault
5. System hangs completely (B0E1/S202 diagnostic codes on chassis)
Reproducer:
- Boot any kernel >= 6.16.7 with nouveau on RTX 5080
- Open Chrome (or Chromium)
- Xid:13 errors appear within seconds in dmesg
- System hangs within hours (sometimes minutes)
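To confirm the third step of the reproducer, the kernel log can be scanned for the error. This is a rough sketch that assumes the log lines contain the literal text "Xid:13" as quoted in this report; the exact nouveau message format is not shown here, so the pattern may need adjusting. The function name is illustrative.

```python
import re

# Assumption: matching lines contain "Xid:13" (optionally with a space after
# the colon), as quoted in this report. Real log output may differ.
XID13 = re.compile(r"Xid:\s*13\b")


def count_xid13(lines):
    """Count the lines in an iterable of log lines that mention Xid:13."""
    return sum(1 for line in lines if XID13.search(line))
```

The function accepts any iterable of strings, for example an open file holding saved `dmesg` output.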
What does NOT help:
- Switching to X11 (hangs before reaching graphical mode)
- Disabling Chrome hardware acceleration
- Disabling C-States in BIOS
- glxgears/GPU stress tests do NOT reproduce (only Chrome triggers it)
Workaround:
Downgrade to kernel 6.13.6-200.fc41. On that kernel, nouveau reports
"unknown chipset" for RTX 5080 (expected, as GB203 support was added
later), so the proprietary NVIDIA driver is required.
Because nouveau did not drive this GPU on 6.13.6, this suggests the fault
lies in the GB203/RTX 5080 support path added to nouveau between 6.13.6
and 6.16.7.