** Changed in: linux-firmware (Ubuntu)
Assignee: Seth Forshee (sforshee) => (unassigned)
--
You received this bug notification because you are a member of Desktop
Packages, which is subscribed to mesa in Ubuntu.
https://bugs.launchpad.net/bugs/1928393
Title:
linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0]
retry page fault"
Status in amd:
New
Status in linux-firmware package in Ubuntu:
Invalid
Status in mesa package in Ubuntu:
Invalid
Status in linux-firmware source package in Hirsute:
Confirmed
Bug description:
After upgrading linux-firmware from 1.190.5 to 1.197 (as part of the
upgrade from Ubuntu 20.10 to 21.04), I started experiencing frequent
and severe GPU instability. When this happens, I see this error in
dmesg:
[20061.061069] amdgpu 0000:03:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 1141 thread Xorg:cs0 pid 1236)
[20061.061103] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x800000401000 from client 27
[20061.061135] amdgpu 0000:03:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
[20061.061147] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8)
[20061.061157] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x1
[20061.061167] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[20061.061174] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x3
[20061.061183] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[20061.061189] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
I'll attach a couple of full dmesgs that I collected.
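For anyone comparing fault reports, the decoded fields in the log can be
reproduced from the raw VM_L2_PROTECTION_FAULT_STATUS value. The following
is a minimal sketch, not the driver's code: the bit layout in FIELDS is an
assumption, checked only against the values printed in the log above, and
the names FIELDS and decode_fault_status are hypothetical helpers
introduced for illustration.

```python
# Assumed bit layout of VM_L2_PROTECTION_FAULT_STATUS as (name, shift, width).
# This layout is an assumption; it is only cross-checked against the
# decoded values in the dmesg excerpt from this report.
FIELDS = [
    ("MORE_FAULTS", 0, 1),
    ("WALKER_ERROR", 1, 3),
    ("PERMISSION_FAULTS", 4, 4),
    ("MAPPING_ERROR", 8, 1),
    ("CID", 9, 9),   # UTCL2 client ID, reported as "TCP (0x8)" in the log
    ("RW", 18, 1),
    ("VMID", 20, 4),
]


def decode_fault_status(status):
    """Split a raw status word into named bit fields."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, shift, width in FIELDS
    }


if __name__ == "__main__":
    # Raw value taken from the dmesg excerpt in this report.
    for name, value in decode_fault_status(0x00101031).items():
        print(f"{name}: {hex(value)}")
```

Under this assumed layout, the raw value 0x00101031 yields the same
MORE_FAULTS, WALKER_ERROR, PERMISSION_FAULTS, MAPPING_ERROR, RW, client ID
and vmid values that the kernel printed.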
Often when this happens, the screen and keyboard freeze irreversibly
(I tried waiting for more than 30 minutes, but it didn't help). I can
still log in via ssh though. When there's no freeze, I can continue
using the computer normally, but the laptop fans run constantly and
the battery depletes fast. There's probably something stuck in a
permanent loop, either in the kernel or in the GPU.
This bug happens several times a day, rendering the machine so
unstable as to be almost unusable. It is a severe regression and I'm
aghast that it passed AMD's Quality Assurance.
After downgrading back to linux-firmware 1.190.5, the machine is back
to its previous, mostly reliable state. That is, this bug is gone and
I'm left only with the other amdgpu suspend bug I've learned to live
with since I bought this computer.
Please revert the amdgpu firmware in this package as soon as possible.
This is unbearable.
Relevant information:
Ubuntu version: 21.04
Linux kernel: 5.11.0-17-generic x86_64
CPU model: AMD Ryzen 7 3700U with Radeon Vega Mobile Gfx
GPU: 03:00.0 VGA compatible controller: Advanced Micro Devices, Inc.
[AMD/ATI] Picasso (rev c1)
Laptop model: Lenovo Ideapad S145
To manage notifications about this bug go to:
https://bugs.launchpad.net/amd/+bug/1928393/+subscriptions
--
Mailing list: https://launchpad.net/~desktop-packages
Post to : [email protected]
Unsubscribe : https://launchpad.net/~desktop-packages
More help : https://help.launchpad.net/ListHelp