Public bug reported:

After upgrading linux-firmware from 1.190.5 to 1.197 (as part of the
upgrade from Ubuntu 20.10 to 21.04), I started experiencing frequent and
severe GPU instability. When this happens, I see this error in dmesg:

[20061.061069] amdgpu 0000:03:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 1141 thread Xorg:cs0 pid 1236)
[20061.061103] amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x800000401000 from client 27
[20061.061135] amdgpu 0000:03:00.0: amdgpu: 
[20061.061147] amdgpu 0000:03:00.0: amdgpu:      Faulty UTCL2 client ID: TCP 
[20061.061157] amdgpu 0000:03:00.0: amdgpu:      MORE_FAULTS: 0x1
[20061.061167] amdgpu 0000:03:00.0: amdgpu:      WALKER_ERROR: 0x0
[20061.061174] amdgpu 0000:03:00.0: amdgpu:      PERMISSION_FAULTS: 0x3
[20061.061183] amdgpu 0000:03:00.0: amdgpu:      MAPPING_ERROR: 0x0
[20061.061189] amdgpu 0000:03:00.0: amdgpu:      RW: 0x0

I'll attach a couple of full dmesgs that I collected.
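
For anyone triaging the attached logs, here is a minimal Python sketch that picks these fault reports out of a saved dmesg (for example one written with "dmesg > dmesg.txt"). The function name summarize_faults and both regular expressions are my own assumptions, derived only from the lines quoted above; they are not part of the amdgpu driver, and other kernel versions may format the message differently.

```python
import re

# Header line of a fault report, e.g.
# "[20061.061069] amdgpu 0000:03:00.0: amdgpu: [gfxhub0] retry page fault ..."
# (pattern is an assumption based on the dmesg excerpt in this report)
FAULT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+amdgpu\s+(?P<dev>\S+):\s+amdgpu:\s+"
    r"\[(?P<hub>\w+)\]\s+retry page fault"
)

# Follow-up line naming the faulting client, e.g. "Faulty UTCL2 client ID: TCP"
CLIENT_RE = re.compile(r"Faulty UTCL2 client ID:\s*(?P<client>\w+)")


def summarize_faults(dmesg_text):
    """Return one dict per 'retry page fault' header found in dmesg_text."""
    faults = []
    for line in dmesg_text.splitlines():
        header = FAULT_RE.search(line)
        if header:
            faults.append({
                "ts": float(header["ts"]),   # seconds since boot
                "dev": header["dev"],        # PCI address of the GPU
                "hub": header["hub"],        # e.g. gfxhub0
                "client": None,              # filled in by a later line
            })
            continue
        client = CLIENT_RE.search(line)
        # Attach the client ID to the most recent fault that lacks one.
        if client and faults and faults[-1]["client"] is None:
            faults[-1]["client"] = client["client"]
    return faults
```

Calling summarize_faults(open("dmesg.txt").read()) returns a list whose length is the number of fault reports in the log, which makes it easy to compare a boot on 1.197 against a boot on 1.190.5.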

Often when this happens, the screen and keyboard freeze irreversibly (I
tried waiting for more than 30 minutes, but it doesn't help). I can
still log in via ssh though. When there's no freeze, I can continue
using the computer normally, but the laptop fans are always running and
the battery depletes fast. Something is probably stuck in a permanent
loop, either in the kernel or in the GPU.

This bug happens several times a day, rendering the machine so unstable
as to be almost unusable. It is a severe regression and I'm aghast that
it passed AMD's Quality Assurance.

After downgrading back to linux-firmware 1.190.5, the machine is back to
its previous, mostly reliable state. That is, this bug is gone, and I'm
left only with the other amdgpu suspend bug I've learned to live with
since I bought this computer.

Please revert the amdgpu firmware in this package as soon as possible.
This is unbearable.

Relevant information:
Ubuntu version: 21.04
Linux kernel: 5.11.0-17-generic x86_64
CPU model: AMD Ryzen 7 3700U with Radeon Vega Mobile Gfx
GPU: 03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Picasso (rev c1)
Laptop model: Lenovo Ideapad S145

** Affects: linux-firmware (Ubuntu)
     Importance: Undecided
         Status: New

** Attachment added: "This GPU bug happened after the machine resumed from 

  linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0]
  retry page fault"
