[Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2022-04-28 Thread Mario Limonciello
Just to correct a few of the targets on this issue.
* The reverts mentioned in #30 need to be pulled into linux-firmware for focal.
* They're already included in jammy.

** Changed in: amd
   Status: New => Fix Released

** No longer affects: mesa (Ubuntu)

** Also affects: linux-firmware (Ubuntu Focal)
   Importance: Undecided
   Status: New

** Changed in: linux-firmware (Ubuntu Focal)
 Assignee: (unassigned) => Juerg Haefliger (juergh)

** Changed in: linux-firmware (Ubuntu)
   Status: Invalid => Fix Released

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1928393

Title:
  linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0]
  retry page fault"

To manage notifications about this bug go to:
https://bugs.launchpad.net/amd/+bug/1928393/+subscriptions


-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2022-02-15 Thread Juerg Haefliger
Hirsute is EOL so closing this bug. Please open a new one if the problem
still persists with one of the supported series.

** Changed in: linux-firmware (Ubuntu Hirsute)
   Status: Incomplete => Won't Fix


Re: [Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2022-01-29 Thread Thiago Jung Bauermann
Hello Juerg,

On Thursday, 20 January 2022, at 12:32:48 -03, Juerg Haefliger wrote:
> If you want this fixed in Ubuntu I need to know what series are
> affected. Hirsute goes EOL at the end of the month. Are Impish and/or
> Jammy working or affected as well?

I upgraded to Impish a while ago.

I haven’t seen “retry page fault” messages in a long while (I don’t think 
it’s related to the distro upgrade, but not sure) so I’d say this 
particular bug is fixed at least for me (I have a Picasso GPU).

Which is not to say that things are rosy, unfortunately. But the other 
issues I see don’t cause any message to appear in dmesg so it’s hard to 
search for existing bug reports about them or open a new one.

The following is off-topic for this bug report, but I’ll mention anyway, 
hope you’ll bear with me:

One thing I noticed is that things did get rosy when I did two things:

1. Switched from Xorg to Wayland.
2. Switched Firefox to use Wayland as well.

This led me to the conclusion that the bugs that plague my machine are 
triggered by something that Firefox does when it uses X (either “natively” or 
via XWayland). For some reason, when it uses Wayland it doesn’t trigger 
these GPU bugs.

Another thing that might be relevant is that I have tons of tabs open 
(probably more than 200) distributed in 27 open windows. Perhaps I’m 
stressing some kind of resource limit in the driver or firmware?


[Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2022-01-20 Thread Juerg Haefliger
If you want this fixed in Ubuntu I need to know what series are
affected. Hirsute goes EOL at the end of the month. Are Impish and/or
Jammy working or affected as well?

** Changed in: linux-firmware (Ubuntu Hirsute)
   Status: Confirmed => Incomplete


[Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-12-07 Thread Lancillotto
No more crashes with firmware
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/snapshot/linux-firmware-20211027.tar.gz
and kernel 5.15.6.


[Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-11-29 Thread Seth Forshee
** Changed in: linux-firmware (Ubuntu)
 Assignee: Seth Forshee (sforshee) => (unassigned)


[Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-11-29 Thread Juerg Haefliger
@antonio-petricca, What series? What kernel?

I can produce a hirsute linux-firmware package with the reverted sdma
firmware but need someone to verify it on hirsute with the hirsute
kernel. Any takers? Or have you all moved on to impish?


[Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-11-29 Thread Lancillotto
With the latest firmware
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/snapshot/linux-firmware-20211027.tar.gz
it is much more stable.


[Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-11-29 Thread Juerg Haefliger
** Also affects: mesa (Ubuntu Hirsute)
   Importance: Undecided
   Status: New

** Also affects: linux-firmware (Ubuntu Hirsute)
   Importance: Undecided
   Status: New

** No longer affects: mesa (Ubuntu Hirsute)

** Changed in: mesa (Ubuntu)
   Status: Confirmed => Invalid

** Changed in: linux-firmware (Ubuntu Hirsute)
   Status: New => Confirmed

** Changed in: linux-firmware (Ubuntu)
   Status: Confirmed => Invalid

** Changed in: linux-firmware (Ubuntu Hirsute)
 Assignee: (unassigned) => Juerg Haefliger (juergh)


[Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-11-29 Thread Juerg Haefliger
@antonio-petricca, sorry but 5.15.2 is not a supported Ubuntu kernel and
especially not on Bionic with (old) Bionic firmware.


[Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-11-18 Thread Lancillotto
I have the same issue on:

Dell E5495
AMD Ryzen 7 PRO 2700U w/ Radeon Vega Mobile Gfx
16Gb RAM
Linux Mint 19.3 (Ubuntu 18.04)
Kernel 5.15.2
Linux Firmware 1.173.20


[Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-10-27 Thread Alex Deucher
The reverts are in the latest firmware tree:
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/amdgpu?id=d7b50e61669dc137924337d03d09b8986eb752a3
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/amdgpu?id=d843e520a4b0d92b986645548d11ade3b9b239a4
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/amdgpu?id=99d72504bff7ab40c261b8509c0b9d8abf98b296


[Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-10-27 Thread Juerg Haefliger
Hi. I'm picking up this ticket from Seth. Reading through the history it
seems it's still an open issue? My understanding is that upstream
'fixed' this by reverting fw blobs in version 20210818. I can produce a
linux-firmware test package for hirsute (21.04) with these reverts if
necessary. Just let me know.


** Changed in: linux-firmware (Ubuntu)
   Status: Incomplete => Confirmed


Re: [Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-10-06 Thread Thiago Jung Bauermann
Hello,

I’d just like to report that I haven’t seen this problem in a while. The 
last time I saw the “retry page fault” messages in my log was on August 9.

I’ve been using the ‘amdgpu/picasso*’ files from linux-firmware commit 
c46b8c364b82 (“ice: update package file to 1.3.26.0”) so apparently this 
particular problem was recently fixed.

Which isn’t to say that I’m having a trouble-free amdgpu experience, 
unfortunately. Every week or so my laptop comes back from sleep with the 
screen and keyboard frozen (I can still ssh into it), but now the error is:

kernel: [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* 
[CRTC:67:crtc-0] flip_done timed out

But it seems to be a separate problem from the one reported in this 
particular launchpad issue. I’ll see if I can find a more appropriate 
launchpad issue and post the details there.

Thank you all for your help and support with this issue.


[Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-08-29 Thread I-Cat
It is the Bluetooth driver.
I got this too on my Acer Aspire F5-573 series laptop.
There is a sticker that says "Intel i-5 core", so is AMD even a possibility?
I do not think my processor is an AMD.
Also, it set my screen resolution to something like 1377x768 (it's a 6k, 17" screen)
when I was on Windows (yes, I had to switch off Windows because my Windows 11 OS 
failed).
It was installed when I installed VirtualBox or Kodi.
It is like a nightmare.
I lost the function of my USB 1.0 driver.
It installs a PnP driver that does not exist.
The PnP driver is a printer driver that is pointing to your desktop.

I think this is a virus that is affecting other systems like mine.
Thanks -- I-Cat


[Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-08-29 Thread Michal Przybylowicz
I have similar messages in journalctl:

Package: linux-firmware
Version: 1.197.3

Aug 29 16:58:44 dagon kernel: amdgpu :03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:0 vmid:1 pasid:32778, for process vivaldi-bin pid 1673 thread vivaldi-bi:cs0 pid 1699)
Aug 29 16:58:44 dagon kernel: amdgpu :03:00.0: amdgpu:   in page starting at address 0x80010114 from client 0x12 (VMC)
Aug 29 16:58:44 dagon kernel: amdgpu :03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00105631
Aug 29 16:58:44 dagon kernel: amdgpu :03:00.0: amdgpu:  Faulty UTCL2 client ID: VCN0 (0x2b)
Aug 29 16:58:44 dagon kernel: amdgpu :03:00.0: amdgpu:  MORE_FAULTS: 0x1
Aug 29 16:58:44 dagon kernel: amdgpu :03:00.0: amdgpu:  WALKER_ERROR: 0x0
Aug 29 16:58:44 dagon kernel: amdgpu :03:00.0: amdgpu:  PERMISSION_FAULTS: 0x3
Aug 29 16:58:44 dagon kernel: amdgpu :03:00.0: amdgpu:  MAPPING_ERROR: 0x0
Aug 29 16:58:44 dagon kernel: amdgpu :03:00.0: amdgpu:  RW: 0x0
Aug 29 16:58:44 dagon kernel: amdgpu :03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:0 vmid:1 pasid:32778, for process vivaldi-bin pid 1673 thread vivaldi-bi:cs0 pid 1699)
Aug 29 16:58:44 dagon kernel: amdgpu :03:00.0: amdgpu:   in page starting at address 0x800101188000 from client 0x12 (VMC)
Aug 29 16:58:44 dagon kernel: amdgpu :03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00105631
Aug 29 16:58:44 dagon kernel: amdgpu :03:00.0: amdgpu:  Faulty UTCL2 client ID: VCN0 (0x2b)
Aug 29 16:58:44 dagon kernel: amdgpu :03:00.0: amdgpu:  MORE_FAULTS: 0x1
Aug 29 16:58:44 dagon kernel: amdgpu :03:00.0: amdgpu:  WALKER_ERROR: 0x0
Aug 29 16:58:44 dagon kernel: amdgpu :03:00.0: amdgpu:  PERMISSION_FAULTS: 0x3
Aug 29 16:58:44 dagon kernel: amdgpu :03:00.0: amdgpu:  MAPPING_ERROR: 0x0
Aug 29 16:58:44 dagon kernel: amdgpu :03:00.0: amdgpu:  RW: 0x0
Aug 29 16:58:44 dagon kernel: amdgpu :03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:0 vmid:1 pasid:32778, for process vivaldi-bin pid 1673 thread vivaldi-bi:cs0 pid 1699)
Aug 29 16:58:44 dagon kernel: amdgpu :03:00.0: amdgpu:   in page starting at address 0x800101189000 from client 0x12 (VMC)
Aug 29 16:58:44 dagon kernel: amdgpu :03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00105631
Aug 29 16:58:44 dagon kernel: amdgpu :03:00.0: amdgpu:  Faulty UTCL2 client ID: VCN0 (0x2b)
Aug 29 16:58:44 dagon kernel: amdgpu :03:00.0: amdgpu:  MORE_FAULTS: 0x1
Aug 29 16:58:44 dagon kernel: amdgpu :03:00.0: amdgpu:  WALKER_ERROR: 0x0
Aug 29 16:58:44 dagon kernel: amdgpu :03:00.0: amdgpu:  PERMISSION_FAULTS: 0x3
Aug 29 16:58:44 dagon kernel: amdgpu :03:00.0: amdgpu:  MAPPING_ERROR: 0x0
Aug 29 16:58:44 dagon kernel: amdgpu :03:00.0: amdgpu:  RW: 0x0
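
For readers comparing fault reports, the decoded fields in the log above can be reproduced from the raw MMVM_L2_PROTECTION_FAULT_STATUS value. The following is only a sketch: the bit layout in `FIELDS` is an assumption, chosen because it agrees with the values printed in this log (CID 0x2b, PERMISSION_FAULTS 0x3, vmid 1), and should be checked against the amdgpu register headers for the ASIC in question. The names `FIELDS` and `decode_fault_status` are hypothetical.

```python
# Assumed bit layout (shift, width) of the L2 protection fault status word.
# It matches the decoded values in the log above, but it is not taken from
# this thread; verify it against the amdgpu register headers for your ASIC.
FIELDS = {
    "MORE_FAULTS": (0, 1),
    "WALKER_ERROR": (1, 3),
    "PERMISSION_FAULTS": (4, 4),
    "MAPPING_ERROR": (8, 1),
    "CID": (9, 9),    # client ID, printed as "Faulty UTCL2 client ID"
    "RW": (18, 1),
    "VMID": (20, 4),
}


def decode_fault_status(status: int) -> dict:
    """Split a raw status word into its named bit fields."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, (shift, width) in FIELDS.items()
    }


# Raw value copied from the log above.
fields = decode_fault_status(0x00105631)
for name, value in fields.items():
    print(f"{name}: {value:#x}")
```

Under that assumed layout the status word decodes to the same field values the kernel printed, which is a quick way to sanity-check a report where only the raw register value was pasted.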


[Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-08-05 Thread Leandro Scott
It happened to me too

** Attachment added: "Crash log of amdgpu driver"
   
https://bugs.launchpad.net/ubuntu/+source/linux-firmware/+bug/1928393/+attachment/5516220/+files/amdgu_crash.txt


Re: [Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-08-04 Thread Thiago Jung Bauermann
Hello,

For some reason, in the past week or so this bug has been freezing my 
machine every couple of days or so (I’m surprised that AMD wasn’t able 
to reproduce the problem yet¹). You can imagine how “pleasant” it makes 
using this computer.

Today I got an interesting error in dmesg, perhaps it provides some
clue:

[38454.299445] [ cut here ]
[38454.299449] refcount_t: underflow; use-after-free.
[38454.299457] WARNING: CPU: 5 PID: 17577 at lib/refcount.c:28 
refcount_warn_saturate+0xae/0xf0
[38454.299465] Modules linked in: overlay ccm rfcomm xt_CHECKSUM xt_MASQUERADE 
xt_conntrack ipt_REJECT xt_tcpudp nft_compat nft_counter nft_objref 
nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 
nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject 
nft_ct bridge stp llc nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 
nf_defrag_ipv4 ip_set nf_tables nfnetlink cmac algif_hash algif_skcipher af_alg 
bnep binfmt_misc nls_iso8859_1 snd_hda_codec_generic ledtrig_audio 
snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg soundwire_intel 
soundwire_generic_allocation soundwire_cadence snd_hda_codec snd_hda_core 
snd_hwdep soundwire_bus snd_soc_core snd_compress ac97_bus snd_pcm_dmaengine 
intel_rapl_msr intel_rapl_common joydev snd_pcm edac_mce_amd snd_seq_midi 
ath10k_pci ath10k_core snd_seq_midi_event kvm_amd snd_rawmidi ath mac80211 kvm 
uvcvideo snd_seq btusb videobuf2_vmalloc rapl videobuf2_memops videobuf2_v4l2 
videobuf2_common btrtl input_leds
[38454.299510]  serio_raw btbcm videodev btintel wmi_bmof snd_seq_device 
efi_pstore bluetooth snd_timer mc cfg80211 k10temp ecdh_generic snd ecc 
ideapad_laptop ccp libarc4 sparse_keymap soundcore elan_i2c mac_hid 
sch_fq_codel msr parport_pc ppdev lp parport ip_tables x_tables autofs4 btrfs 
blake2b_generic xor raid6_pq libcrc32c dm_crypt zstd zram z3fold amdgpu 
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel iommu_v2 gpu_sched 
aesni_intel i2c_algo_bit drm_ttm_helper ttm crypto_simd cryptd glue_helper 
drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec rc_core drm 
i2c_piix4 nvme xhci_pci i2c_hid xhci_pci_renesas nvme_core wmi video hid
[38454.299550] CPU: 5 PID: 17577 Comm: kworker/u32:18 Not tainted 
5.11.0-25-generic #27-Ubuntu
[38454.299552] Hardware name: LENOVO 81V7/LNVNB161216, BIOS BUCN23WW 11/05/2019
[38454.299554] Workqueue: events_unbound async_run_entry_fn
[38454.299559] RIP: 0010:refcount_warn_saturate+0xae/0xf0
[38454.299562] Code: f8 1c 96 01 01 e8 9f f1 62 00 0f 0b 5d c3 80 3d e5 1c 96 
01 00 75 91 48 c7 c7 e8 c7 60 b9 c6 05 d5 1c 96 01 01 e8 7f f1 62 00 <0f> 0b 5d 
c3 80 3d c3 1c 96 01 00 0f 85 6d ff ff ff 48 c7 c7 40 c8
[38454.299564] RSP: 0018:b60383537b58 EFLAGS: 00010282
[38454.299566] RAX:  RBX:  RCX: 8d4578b58ac8
[38454.299567] RDX: ffd8 RSI: 0027 RDI: 8d4578b58ac0
[38454.299568] RBP: b60383537b58 R08: b9c73540 R09: b60383537af0
[38454.299569] R10: 2d2d2d2d R11: b603835379e8 R12: 8d44cf64d000
[38454.299570] R13:  R14: b6038b8cd000 R15: 0004
[38454.299571] FS:  () GS:8d4578b4() 
knlGS:
[38454.299572] CS:  0010 DS:  ES:  CR0: 80050033
[38454.299574] CR2:  CR3: 00016ae1 CR4: 003506e0
[38454.299575] Call Trace:
[38454.299578]  dc_stream_release+0x78/0x80 [amdgpu]
[38454.299751]  dc_resource_state_destruct+0x58/0x80 [amdgpu]
[38454.299904]  dc_release_state+0x2f/0x60 [amdgpu]
[38454.300055]  dm_atomic_destroy_state+0x21/0x30 [amdgpu]
[38454.300211]  drm_atomic_state_default_clear+0x23d/0x2f0 [drm]
[38454.300236]  __drm_atomic_state_free+0x5e/0xa0 [drm]
[38454.300257]  drm_atomic_helper_resume+0x12b/0x150 [drm_kms_helper]
[38454.300271]  dm_resume+0x2bd/0x540 [amdgpu]
[38454.300427]  amdgpu_device_ip_resume_phase2+0x58/0xc0 [amdgpu]
[38454.300531]  amdgpu_device_resume+0x8d/0x370 [amdgpu]
[38454.300635]  ? native_queued_spin_lock_slowpath+0x2b/0x30
[38454.300638]  ? _raw_spin_lock_irq+0x26/0x2a
[38454.300642]  ? __wait_for_common+0xfb/0x150
[38454.300644]  amdgpu_pmops_resume+0x17/0x20 [amdgpu]
[38454.300748]  pci_pm_resume+0x6b/0xf0
[38454.300751]  ? pci_pm_poweroff_noirq+0x120/0x120
[38454.300752]  dpm_run_callback+0x50/0x110
[38454.300755]  device_resume+0xad/0x200
[38454.300757]  async_resume+0x1e/0x40
[38454.300759]  async_run_entry_fn+0x3c/0x150
[38454.300761]  process_one_work+0x220/0x3c0
[38454.300764]  worker_thread+0x50/0x370
[38454.300765]  kthread+0x12f/0x150
[38454.300767]  ? process_one_work+0x3c0/0x3c0
[38454.300768]  ? __kthread_bind_mask+0x70/0x70
[38454.300770]  ret_from_fork+0x22/0x30
[38454.300775] ---[ end trace 1f54ad57671def2f ]---

Note that immediately before it there’s a page allocation failure during
wake up from suspend. So there’s some refcounting bug in an error path
somewhere.

Much later there’s the familiar 

Re: [Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-07-16 Thread Thiago Jung Bauermann
On Monday, 12 July 2021, at 15:12:19 -03, Alex Deucher wrote:
> Does the latest firmware in the firmware git tree help?
> https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/log/amdgpu

I updated the picasso* files from commit:

d79c26779d45 amdgpu: update vcn firmware for green sardine for 21.20

And I still see the issue. It took a while to reproduce: I updated the 
firmware (and ran `update-initramfs -u -k all` to get it into the 
initramfs) on July 12 and had the laptop turned on since then (closing the 
lid to put it to sleep), and today I saw the problem again.
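
When testing firmware copied by hand like this, it can help to confirm that the installed files really match the checkout before waiting days for a reproduction. The sketch below compares SHA-256 hashes of the picasso* files in two directories. The function names and the example paths are hypothetical, and it assumes the installed firmware is stored uncompressed.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def stale_firmware(checkout: Path, installed: Path, pattern: str = "picasso*") -> list:
    """List file names matching `pattern` in `checkout` that are missing
    from `installed` or whose contents differ there."""
    stale = []
    for src in sorted(checkout.glob(pattern)):
        dst = installed / src.name
        if not dst.is_file() or sha256_of(src) != sha256_of(dst):
            stale.append(src.name)
    return stale


# Example call (both paths are assumptions, adjust to your system):
#   stale_firmware(Path("linux-firmware/amdgpu"), Path("/lib/firmware/amdgpu"))
```

An empty result means every matching file in the checkout is present and identical in the installed directory. It does not show that the initramfs was regenerated, so the `update-initramfs` step is still needed.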

The full dmesg is attached.

** Attachment added: "dmesg.log"
   https://bugs.launchpad.net/bugs/1928393/+attachment/5511525/+files/dmesg.log


[Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-07-15 Thread Heinrich Schuchardt
After updating to linux-firmware commit d79c26779d45906 the problems
persist on Lenovo Thinkpad E585:

amdgpu :05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0
vmid:1 pasid:32769, for process Xorg pid 1336 thread Xorg:cs0 pid 1862)
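
Since several reports in this thread paste the same kind of message, a small parser can make them easier to compare (which hub, which process). This is a minimal sketch that assumes the message format shown above, with the mail-wrapped line joined back into one; `FAULT_RE` and `parse_fault` are hypothetical names, and process names containing spaces are not handled.

```python
import re
from typing import Optional

# Matches both "[gfxhub0] retry page fault (...)" and "[mmhub] page fault (...)".
FAULT_RE = re.compile(
    r"\[(?P<hub>\w+)\] (?P<kind>retry page fault|page fault) "
    r"\(src_id:(?P<src_id>\d+) ring:(?P<ring>\d+) vmid:(?P<vmid>\d+) "
    r"pasid:(?P<pasid>\d+), for process (?P<process>\S+) pid (?P<pid>\d+) "
    r"thread (?P<thread>\S+) pid (?P<tid>\d+)\)"
)


def parse_fault(line: str) -> Optional[dict]:
    """Return the fields of an amdgpu page fault message, or None if the
    line is not such a message."""
    match = FAULT_RE.search(line)
    return match.groupdict() if match else None


# The message quoted above, joined into a single line.
sample = (
    "amdgpu :05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 "
    "vmid:1 pasid:32769, for process Xorg pid 1336 thread Xorg:cs0 pid 1862)"
)
fault = parse_fault(sample)
print(fault["hub"], fault["kind"], fault["process"], fault["pid"])
```

All captured values are returned as strings; convert them with `int()` if numeric comparison is needed.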


[Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-07-13 Thread Serj
@alexander-deucher

CPU model: AMD Ryzen 7 2700U with Radeon Vega Mobile Gfx
Kernel: 5.10.49
Firmware:
VCE feature version: 0, firmware version: 0x
UVD feature version: 0, firmware version: 0x
MC feature version: 0, firmware version: 0x
ME feature version: 52, firmware version: 0x00a4
PFP feature version: 52, firmware version: 0x00bc
CE feature version: 52, firmware version: 0x004f
RLC feature version: 1, firmware version: 0x0213
RLC SRLC feature version: 1, firmware version: 0x0001
RLC SRLG feature version: 1, firmware version: 0x0001
RLC SRLS feature version: 1, firmware version: 0x0001
MEC feature version: 52, firmware version: 0x01c2
MEC2 feature version: 52, firmware version: 0x01c2
SOS feature version: 0, firmware version: 0x
ASD feature version: 0, firmware version: 0x2155
TA RAS feature version: 0x, firmware version: 0x212b
TA XGMI feature version: 0x, firmware version: 0x212b
TA HDCP feature version: 0x1711, firmware version: 0x212b
TA DTM feature version: 0x1203, firmware version: 0x212b
SMC feature version: 0, firmware version: 0x1e49
SDMA0 feature version: 41, firmware version: 0x0028
VCN feature version: 0, firmware version: 0x0210c005
DMCU feature version: 0, firmware version: 0x
DMCUB feature version: 0, firmware version: 0x
VBIOS version: 113-RAVEN-107

Also affected
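
To compare firmware versions between an affected and an unaffected machine, a listing like the one above can be turned into a dictionary and diffed. This is a sketch only: it assumes every line has the form "NAME feature version: X, firmware version: Y" as shown above, keeps the values as strings because some of them appear truncated in this archive (for example a bare "0x"), and uses the hypothetical names `LINE_RE` and `parse_firmware_info`.

```python
import re

LINE_RE = re.compile(
    r"^(?P<name>.+?) feature version: (?P<feature>\S+), "
    r"firmware version: (?P<firmware>\S+)$"
)


def parse_firmware_info(text: str) -> dict:
    """Map each firmware block name to a (feature, firmware) version pair.
    Lines that do not match the expected form are skipped."""
    info = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            info[match["name"]] = (match["feature"], match["firmware"])
    return info


# Three lines copied from the listing above.
listing = """\
SDMA0 feature version: 41, firmware version: 0x0028
TA HDCP feature version: 0x1711, firmware version: 0x212b
VBIOS version: 113-RAVEN-107
"""
info = parse_firmware_info(listing)
print(info["SDMA0"])
```

Two such dictionaries can then be compared key by key to see which blocks differ between firmware packages.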


[Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-07-12 Thread Heinrich Schuchardt
There is an upstream bug report:
https://bugzilla.kernel.org/show_bug.cgi?id=213391
Comment 9 suggests: "downgrade the firmware."
Comment 15 claims: "20210315 seems to work fine here (on an E595)."

** Bug watch added: Linux Kernel Bug Tracker #213391
   https://bugzilla.kernel.org/show_bug.cgi?id=213391


[Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-07-12 Thread Alex Deucher
Does the latest firmware in the firmware git tree help?
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/log/amdgpu


[Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-07-09 Thread Heinrich Schuchardt
Also affected:

Ubuntu version: 21.04
Linux kernel: 5.11.0-22-generic  x86_64
CPU model: AMD Ryzen 5 2500U with Radeon Vega Mobile Gfx
GPU: 05:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] 
Raven Ridge [Radeon Vega Series / Radeon Vega Mobile Series] (rev c4)
Laptop model: Lenovo Thinkpad E585


Re: [Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-06-22 Thread Thiago Jung Bauermann
On Thursday, 17 June 2021, at 00:45:30 -03, Thiago Jung Bauermann wrote:
> > > I think it may be related to a change in mesa.  Specifically mesa
> > > commit
> > > 820dec3f7c7.  For more info see
> > > https://gitlab.freedesktop.org/mesa/mesa/-/issues/4866
> > 
> > I’ll run with Mario’s build of Mesa with that patch backported.
> > Thanks, Mario!
> 
> I’m running with the Mesa build from Mario’s PPA now. If I don’t see any
> issue within two weeks, I think it will be possible to say that the bug
> is gone, or at least much harder to hit.
> 
> I can’t use my reproducer in this case, because I can’t change the Mesa
> version inside the flatpak image.

I just had this bug happen again spontaneously while running with Mesa
from Mario’s PPA.

So this bug isn’t fixed by the patch mentioned by Alex.


[Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-06-21 Thread Launchpad Bug Tracker
Status changed to 'Confirmed' because the bug affects multiple users.

** Changed in: mesa (Ubuntu)
   Status: New => Confirmed


Re: [Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-06-16 Thread Thiago Jung Bauermann
I was finally able to spend a bit of time on this. Unfortunately, there’s 
not much to report back.

On Tuesday, 8 June 2021, at 15:13:36 -03, Thiago Jung Bauermann wrote:
> On Tuesday, 8 June 2021, at 10:30:24 -03, Alex Deucher wrote:
> > Can you narrow down which specific firmware file causes the problem?
> 
> Ok, I will try.

I don’t think I can narrow down which firmware file causes the problem, 
because I don’t have a last known good version. All firmware files that I 
tested (Ubuntu versions 1.190.5, 1.197 and latest linux-firmware.git) 
immediately trigger the bug when I try the only reliable reproducer I know 
(i.e., running flatpak’s com.github.quaternion package).

Since it can take several days for the bug to happen if I just use the 
machine normally, it would take weeks to narrow down which of the picasso_* 
files is more stable relative to the others. And even then, I wouldn’t be 
sure about it.
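
One way to shrink the set of candidate files before any long-running test is to check which picasso_* files actually differ between two firmware versions. The following is only a minimal sketch: the directory layout, the file names in demo() and the helper names are assumptions made for illustration, not something taken from this thread.

```python
import hashlib
import tempfile
from pathlib import Path


def digests(directory, pattern="picasso_*"):
    """Map file name -> SHA-256 hex digest for files in directory matching pattern."""
    return {
        path.name: hashlib.sha256(path.read_bytes()).hexdigest()
        for path in Path(directory).glob(pattern)
        if path.is_file()
    }


def changed_files(old_dir, new_dir, pattern="picasso_*"):
    """Return sorted names whose contents differ or that exist in only one directory."""
    old = digests(old_dir, pattern)
    new = digests(new_dir, pattern)
    return sorted(
        name for name in old.keys() | new.keys() if old.get(name) != new.get(name)
    )


def demo():
    """Build two throwaway directories (placeholder names) and list what differs."""
    with tempfile.TemporaryDirectory() as tmp:
        old_dir = Path(tmp) / "old"
        new_dir = Path(tmp) / "new"
        old_dir.mkdir()
        new_dir.mkdir()
        (old_dir / "picasso_a.bin").write_bytes(b"same")
        (new_dir / "picasso_a.bin").write_bytes(b"same")
        (old_dir / "picasso_b.bin").write_bytes(b"v1")
        (new_dir / "picasso_b.bin").write_bytes(b"v2")
        (new_dir / "picasso_c.bin").write_bytes(b"only in new")
        return changed_files(old_dir, new_dir)
```

Files with identical digests in both versions cannot explain a difference in behaviour, so only the names returned by changed_files() would need to be swapped and tested one at a time.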

> > I think it may be related to a change in mesa.  Specifically mesa
> > commit
> > 820dec3f7c7.  For more info see
> > https://gitlab.freedesktop.org/mesa/mesa/-/issues/4866
> 
> I’ll run with Mario’s build of Mesa with that patch backported.
> Thanks, Mario!

I’m running with the Mesa build from Mario’s PPA now. If I don’t see any 
issue within two weeks, I think it will be possible to say that the bug is 
gone, or at least much harder to hit.

I can’t use my reproducer in this case, because I can’t change the Mesa 
version inside the flatpak image.


[Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-06-09 Thread Timo Aaltonen
Ubuntu 21.04 comes with Mesa 21.0.1, which does not seem to include commit 820dec3f7c7.


[Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-06-08 Thread Thiago Jung Bauermann
** Bug watch added: gitlab.freedesktop.org/drm/amd/-/issues #1598
   https://gitlab.freedesktop.org/drm/amd/-/issues/1598

** Bug watch added: gitlab.freedesktop.org/drm/amd/-/issues #920
   https://gitlab.freedesktop.org/drm/amd/-/issues/920


Re: [Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-06-08 Thread Thiago Jung Bauermann
Thanks for your input.

On Tuesday, 8 June 2021, at 10:30:24 -03, Alex Deucher wrote:
> Can you narrow down which specific firmware file causes the problem?

Ok, I will try.

Also, is it possible and/or worthwhile trying to bisect firmware versions from 
the linux-firmware repo? How coupled is the firmware with the kernel 
driver? E.g., can I try using firmware files from 1 year ago with current 
kernel and Mesa?

> We haven't been able to repro this.

One thing that’s a bit “fishy” about my machine is that it doesn’t seem to 
have a good clock:

[0.211436] TSC synchronization [CPU#0 -> CPU#1]:
[0.211436] Measured 3304683447 cycles TSC warp between CPUs, turning off TSC clock.
[0.211436] tsc: Marking TSC unstable due to check_tsc_sync_source failed
…
[0.252117] hpet0: at MMIO 0xfed0, IRQs 2, 8, 0
[0.252117] hpet0: 3 comparators, 32-bit 14.318180 MHz counter
[0.253970] clocksource: Switched to clocksource hpet
…
[0.580451] Unstable clock detected, switching default tracing clock to "global"
   If you want to keep using the local clock, then add:
 "trace_clock=local"
   on the kernel command line

Could this bug be related to that?

> I think it may be related to a change in mesa.  Specifically mesa commit
> 820dec3f7c7.  For more info see
> https://gitlab.freedesktop.org/mesa/mesa/-/issues/4866

I’ll run with Mario’s build of Mesa with that patch backported.
Thanks, Mario!

> ** Bug watch added: gitlab.freedesktop.org/mesa/mesa/-/issues #4866
>    https://gitlab.freedesktop.org/mesa/mesa/-/issues/4866

Other upstream issues that look similar:

https://gitlab.freedesktop.org/drm/amd/-/issues/1598
https://gitlab.freedesktop.org/drm/amd/-/issues/920


[Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-06-08 Thread Mario Limonciello
Here's a PPA build with the mesa fix Alex mentioned backported:
https://launchpad.net/~superm1/+archive/ubuntu/lp1928393

If you can follow the directions to add that PPA and upgrade to that
mesa package you can see if that indeed fixes it.

** Also affects: mesa (Ubuntu)
   Importance: Undecided
   Status: New


[Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-06-08 Thread Alex Deucher
Can you narrow down which specific firmware file causes the problem?  We
haven't been able to repro this.

I think it may be related to a change in mesa.  Specifically mesa commit
820dec3f7c7.  For more info see
https://gitlab.freedesktop.org/mesa/mesa/-/issues/4866


** Bug watch added: gitlab.freedesktop.org/mesa/mesa/-/issues #4866
   https://gitlab.freedesktop.org/mesa/mesa/-/issues/4866


[Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-06-08 Thread Timo Aaltonen
Maybe folks at AMD can chime in? Picasso FW regressed.

** Also affects: amd
   Importance: Undecided
   Status: New


Re: [Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-05-30 Thread Thiago Jung Bauermann
On Wednesday, 26 May 2021, at 16:46:14 -03, Thiago Jung Bauermann wrote:
> But perhaps the upstream version is not too bad?

I take this back. I've been running with the upstream picasso* files since 
Wednesday, and I just had two freezes in less than one hour.

linux-firmware 1.190.5 is the only version which contains picasso* files 
that are stable.


Re: [Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-05-26 Thread Thiago Jung Bauermann
Over the weekend I was finally able to revert to the previous versions 
of the org.freedesktop.Platform and org.freedesktop.Platform.GL.default 
flatpak runtimes. It turns out that the `flatpak history` command wasn't 
necessary for the rollback.

On Friday, 14 May 2021, at 13:14:22 -03, Seth Forshee wrote:
> Before we revert we should see if newer firmware fixes the issue, and
> make sure we are only changing the specific firmware files for your
> hardware.
> 
> I think your hardware is the "Picasso" series. Can you try the
> following? If you are unsure about any of the following steps, let me
> know and I can provide you with test packages to install instead.
> 
> Save all files matching /lib/firmware/amdgpu/picasso* from linux-
> firmware 1.190.5. Reinstall 1.197, then overwrite the picasso firmware
> files with the ones you saved. Reboot, and confirm that the issues you
> see with 1.197 are fixed. If they are not fixed, then there's no need to
> proceed as we haven't found the correct firmware files which are causing
> your issues.

I did exactly that, and I was able to run for 4 days without any retry 
page fault error. This makes me confident that the 1.190.5 firmware doesn't 
have the bug, and also that the amdgpu/picasso* files are the relevant 
ones.

> Then please download the picasso firmware files from here:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/amdgpu
> 
> Use the "plain" link next to each file to download the file. Overwrite
> the files in /lib/firmware/amdgpu with these files, reboot, and see if
> you continue to have problems.

I also did that, using the files from commit 55d964905a2b. More recent 
commits in that repo didn't touch the amdgpu directory so they're still the 
most recent firmware files for my hardware.

Unfortunately I still saw the retry page fault message in dmesg with it, 
and very soon after boot (IIRC it happened while running the sddm login 
manager, before I logged in). On the bright side, it didn't have any adverse 
effect on my computer, and I only noticed hours later because I specifically 
grepped for it. So perhaps the latest firmware has a less nasty version of 
the bug?
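
The check for the message can be automated so that a quiet occurrence is not missed. This is a rough sketch only: it assumes the kernel log has been saved to a text file or string first, and the sample line below is made up from the wording in the bug title rather than copied from a real log.

```python
import re

# Matches a kernel log prefix such as "[  123.456789]" followed by the rest of the line.
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")


def find_retry_page_faults(dmesg_text):
    """Return (seconds_since_boot, message) for each line mentioning a retry page fault."""
    hits = []
    for line in dmesg_text.splitlines():
        if "retry page fault" not in line:
            continue
        match = LINE_RE.match(line)
        if match:
            hits.append((float(match.group(1)), match.group(2)))
        else:
            # No timestamp prefix: keep the line, but without a time.
            hits.append((None, line.strip()))
    return hits


# Illustrative input only; real messages carry more detail than this.
SAMPLE = (
    "[    0.253970] clocksource: Switched to clocksource hpet\n"
    "[  812.500000] amdgpu: [gfxhub0] retry page fault\n"
)

for seconds, message in find_retry_page_faults(SAMPLE):
    print(seconds, message)
```

The timestamp is seconds since boot, which makes it easy to tell whether a fault happened during login or only after hours of uptime.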

And just to double-check the baseline reference, I also ran with pristine 
linux-firmware 1.197, the version which made my machine so unstable. I had 
a somewhat different experience this time. The bug still happened, but only 
after 20h of uptime. And the symptom was "just" a visual glitch while 
scrolling inside Firefox, not a complete freeze of the display and 
keyboard, as I was experiencing originally. Perhaps if I rebooted and 
insisted on using it again I would experience worse effects. But I thought 
that was enough to confirm that 1.197 is still bad.

So I'm not sure what to make of all this. I still wasn't able to pinpoint 
exactly what triggers the worst manifestation of the bug. But of the three 
versions of linux-firmware I used (1.190.5, 1.197 and upstream), 1.197 is 
still the one where things are worse so IMHO the picasso files need to come 
from one of the other two versions. 1.190.5 is the rock solid one, so I 
think it's the safest bet. But perhaps the upstream version is not too bad?


Re: [Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-05-18 Thread Thiago Jung Bauermann
On Saturday, 15 May 2021, at 11:24:17 -03, Thiago Jung Bauermann wrote:
> Unexpectedly, 1.197 is now reliable too! I have been running it for about
> half a day (which is more than what was possible before) and it is fine.

After 4 days of stability I just had the retry page fault problem again, 
with stock linux-firmware 1.197 and kernel 5.11.0-17-generic.

> The only thing that changed was that flatpak's org.freedesktop.Platform
> and org.freedesktop.Platform.GL.default were updated, not sure if
> yesterday or the day before.
> 
> This is relevant because I use Firefox from flathub.
> 
> I'm suspecting that the instability comes from the combination of linux-
> firmware 1.197 + a particular version of some userspace component (Mesa I
> guess) that was in org.freedesktop.Platform{,.GL.default}.

So apparently the flatpak update made the problem less likely to happen, 
but it still does.

> I'll try reverting the flatpak update to see if I can get back to the
> unstable state to confirm the hypothesis.

Life got in the way and I wasn't able to do this yet.

With it taking 4 days to reproduce the problem with the current stack, I 
think it still makes sense to revert the flatpak update.

Unfortunately "flatpak history" seems to be broken on my system:

$ flatpak history
error: appstream2/x86_64 is not application or runtime

I'll try using ostree directly. I don't think I'll be able to do it today, 
but hopefully within the next couple of days.


Re: [Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-05-15 Thread Thiago Jung Bauermann
The latest upstream firmware is stable, so I reverted to 1.197 so that 
I could test only the picasso* files.

Before doing that, I decided to run for a while with pristine linux-
firmware 1.197 to double-check that the bug happens quickly.

Unexpectedly, 1.197 is now reliable too! I have been running it for about 
half a day (which is more than what was possible before) and it is fine. 
The only thing that changed was that flatpak's org.freedesktop.Platform and 
org.freedesktop.Platform.GL.default were updated, not sure if yesterday or 
the day before.

This is relevant because I use Firefox from flathub.

I'm suspecting that the instability comes from the combination of linux-
firmware 1.197 + a particular version of some userspace component (Mesa I 
guess) that was in org.freedesktop.Platform{,.GL.default}.

I'll try reverting the flatpak update to see if I can get back to the 
unstable state to confirm the hypothesis.


Re: [Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-05-14 Thread Seth Forshee
On Fri, May 14, 2021 at 06:01:50PM -, Thiago Jung Bauermann wrote:
> Ah, ok. This morning I went ahead and overwrote the whole of /lib/firmware/
> amdgpu/ with the files from the latest commit of the upstream linux-
> firmware git repo you mention below. It's been only a little over 2 hours, 
> but my impression is that the latest firmware does solve the problem. If no 
> retry page fault happens by the end of the day, then I think it's safe to 
> say that it fixes the issue.

Sounds good.

> So at the end of the day (or earlier if I get a retry page fault) I'll do 
> the procedure you mention to use firmware 1.197 and only overwrite the 
> picasso* files to confirm if those are the ones that need to be changed.
> 
> I don't know much about GPUs, but from looking at Wikipedia¹ I think my 
> model is a "Vega 10". Should I overwrite the vega10* files as well?
> 
> ¹ https://en.wikipedia.org/wiki/Radeon_RX_Vega_series#Picasso_(2019)_2

I'm basing it on the PCI id I see in dmesg, which is 1002:15d8, which
the driver looks to identify as CHIP_RAVEN and flag with
AMD_APU_IS_PICASSO. Based on that it will pick firmware files which
begin with "picasso".  It would be informative though to have the output
from "lspci -vvnn" on your machine.
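
For anyone checking their own hardware, the numeric ID can be pulled out of a line of "lspci -nn" style output. The sketch below is illustrative only: the lookup table contains just the one mapping stated above (1002:15d8 to "picasso") and does not reproduce the driver's real selection logic, and the sample line in the usage note is invented.

```python
import re

# lspci -nn prints the numeric IDs as "[vendor:device]", four hex digits each.
PCI_ID_RE = re.compile(r"\[([0-9a-fA-F]{4}):([0-9a-fA-F]{4})\]")

# Only the mapping mentioned in this thread; every other ID is reported as unknown.
FIRMWARE_PREFIX = {("1002", "15d8"): "picasso"}


def pci_id(lspci_line):
    """Return (vendor, device) as lowercase hex strings, or None if no ID is present."""
    match = PCI_ID_RE.search(lspci_line)
    if match is None:
        return None
    return (match.group(1).lower(), match.group(2).lower())


def firmware_prefix(lspci_line):
    """Return the firmware file prefix for a known ID, or None when it is not in the table."""
    return FIRMWARE_PREFIX.get(pci_id(lspci_line))
```

Given a made-up line such as "05:00.0 VGA compatible controller [0300]: Example GPU [1002:15d8] (rev c1)", firmware_prefix() returns "picasso"; the class code "[0300]" is skipped because it has no colon.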

Let's try the picasso files first. If that doesn't work you can try the
vega10 files (without the new picasso files), then if that doesn't work,
try both. Then we'll have an idea about what the minimal set of files is
to fix the problem.


Re: [Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-05-14 Thread Thiago Jung Bauermann
Hello Seth,

Thank you for the quick and detailed response.

On Friday, 14 May 2021, at 13:14:22 -03, Seth Forshee wrote:
> Before we revert we should see if newer firmware fixes the issue, and
> make sure we are only changing the specific firmware files for your
> hardware.

Ok, sounds good.

> I think your hardware is the "Picasso" series. Can you try the
> following? If you are unsure about any of the following steps, let me
> know and I can provide you with test packages to install instead.

Thanks for the offer. It's not a problem getting the files from git and 
overwriting them manually.

> Save all files matching /lib/firmware/amdgpu/picasso* from linux-
> firmware 1.190.5. Reinstall 1.197, then overwrite the picasso firmware
> files with the ones you saved. Reboot, and confirm that the issues you
> see with 1.197 are fixed. If they are not fixed, then there's no need to
> proceed as we haven't found the correct firmware files which are causing
> your issues.

Ah, ok. This morning I went ahead and overwrote the whole of /lib/firmware/
amdgpu/ with the files from the latest commit of the upstream linux-
firmware git repo you mention below. It's been only a little over 2 hours, 
but my impression is that the latest firmware does solve the problem. If no 
retry page fault happens by the end of the day, then I think it's safe to 
say that it fixes the issue.

So at the end of the day (or earlier if I get a retry page fault) I'll do 
the procedure you mention to use firmware 1.197 and only overwrite the 
picasso* files to confirm if those are the ones that need to be changed.

I don't know much about GPUs, but from looking at Wikipedia¹ I think my 
model is a "Vega 10". Should I overwrite the vega10* files as well?

¹ https://en.wikipedia.org/wiki/Radeon_RX_Vega_series#Picasso_(2019)_2

> Then please download the picasso firmware files from here:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/amdgpu
> 
> Use the "plain" link next to each file to download the file. Overwrite
> the files in /lib/firmware/amdgpu with these files, reboot, and see if
> you continue to have problems.
> 
> ** Changed in: linux-firmware (Ubuntu)
>Importance: Undecided => High

Thanks!

> 
> ** Changed in: linux-firmware (Ubuntu)
>Status: New => Incomplete
> 
> ** Changed in: linux-firmware (Ubuntu)
>  Assignee: (unassigned) => Seth Forshee (sforshee)


[Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-05-14 Thread Seth Forshee
Before we revert we should see if newer firmware fixes the issue, and
make sure we are only changing the specific firmware files for your
hardware.

I think your hardware is the "Picasso" series. Can you try the
following? If you are unsure about any of the following steps, let me
know and I can provide you with test packages to install instead.

Save all files matching /lib/firmware/amdgpu/picasso* from linux-
firmware 1.190.5. Reinstall 1.197, then overwrite the picasso firmware
files with the ones you saved. Reboot, and confirm that the issues you
see with 1.197 are fixed. If they are not fixed, then there's no need to
proceed as we haven't found the correct firmware files which are causing
your issues.
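
The save-and-overwrite steps above could be scripted roughly as follows. This is a sketch under stated assumptions: the helper name, the backup directory and the file names in demo() are hypothetical, and demo() deliberately works on temporary directories instead of /lib/firmware/amdgpu, where the same copy would need root privileges and a reboot afterwards.

```python
import shutil
import tempfile
from pathlib import Path


def copy_matching(src_dir, dst_dir, pattern="picasso*"):
    """Copy every file in src_dir matching pattern into dst_dir; return the sorted names."""
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src.glob(pattern)):
        if path.is_file():
            shutil.copy2(path, dst / path.name)
            copied.append(path.name)
    return copied


def demo():
    """Walk through save, reinstall and overwrite on throwaway directories."""
    with tempfile.TemporaryDirectory() as tmp:
        firmware = Path(tmp) / "amdgpu"        # stand-in for /lib/firmware/amdgpu
        backup = Path(tmp) / "picasso-backup"  # hypothetical backup location
        firmware.mkdir()
        (firmware / "picasso_me.bin").write_bytes(b"old-me")
        (firmware / "picasso_ce.bin").write_bytes(b"old-ce")
        (firmware / "vega10_me.bin").write_bytes(b"vega")

        # Step 1: save the picasso files from the known-good version.
        saved = copy_matching(firmware, backup)

        # Simulate reinstalling the newer package, which replaces the files.
        (firmware / "picasso_me.bin").write_bytes(b"new-me")
        (firmware / "picasso_ce.bin").write_bytes(b"new-ce")

        # Step 2: overwrite the newer picasso files with the saved ones.
        copy_matching(backup, firmware)
        return saved, (firmware / "picasso_me.bin").read_bytes()
```

Only files matching the pattern are touched, so firmware for other chips stays at the newer version, which is the point of the test.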

Then please download the picasso firmware files from here:

https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/amdgpu

Use the "plain" link next to each file to download the file. Overwrite
the files in /lib/firmware/amdgpu with these files, reboot, and see if
you continue to have problems.

** Changed in: linux-firmware (Ubuntu)
   Importance: Undecided => High

** Changed in: linux-firmware (Ubuntu)
   Status: New => Incomplete

** Changed in: linux-firmware (Ubuntu)
 Assignee: (unassigned) => Seth Forshee (sforshee)


[Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-05-13 Thread Thiago Jung Bauermann
This is the dmesg of an instance where I was able to continue using the
laptop despite the GPU bug (in the case of the dmesg I attached
previously, I had to ssh in to the machine to turn it off).

Notice that there are two instances of the retry page fault, one of them
right within 15 minutes of the machine being turned on. There was no
suspend/resume event this time. The laptop was turned on the whole time.

** Attachment added: "This GPU bug didn't involve suspend/resume and didn't freeze the screen."
   https://bugs.launchpad.net/ubuntu/+source/linux-firmware/+bug/1928393/+attachment/5497260/+files/dmesg-4-didnt-freeze-screen.log
