Posted the following comment to the Mate-desktop issue:
Had another hang with the same configuration while a YouTube video was playing through a USB headset (Jabra40). I was able to recover by killing Firefox, in which the video was playing. The video became choppy and garbled and then stopped. The stderr is below:

ALSA lib conf.c:5187:(snd_config_expand) Unknown parameters 1
ALSA lib control.c:1379:(snd_ctl_open_noupdate) Invalid CTL sysdefault:1
ALSA lib conf.c:5187:(snd_config_expand) Unknown parameters 2
ALSA lib control.c:1379:(snd_ctl_open_noupdate) Invalid CTL sysdefault:2
ALSA lib pcm_dmix.c:1089:(snd_pcm_dmix_open) unable to open slave
ALSA lib pcm_dmix.c:1089:(snd_pcm_dmix_open) unable to open slave
ALSA lib pcm_dmix.c:1089:(snd_pcm_dmix_open) unable to open slave
ALSA lib pcm_dmix.c:1089:(snd_pcm_dmix_open) unable to open slave
ALSA lib pcm_dmix.c:1089:(snd_pcm_dmix_open) unable to open slave

On re-launch of Firefox from a terminal window, the following appeared:

[GFX1-]: More than 1 GPU from same vendor detected via PCI, cannot deduce device

On Thu, 2021-07-29 at 12:04 -0400, Tim Cahill wrote:
> I apologize if the name callout is disconcerting. I was trying to follow instructions for sending bugs and saw your name listed as the owner of this code area.
>
> FYI, I'd done some more troubleshooting and tinkering regarding the crashing and Mate seems to be at the center of all the issues. As a result, I also opened an Issue with the Mate Desktop team (https://github.com/mate-desktop/mate-panel/issues/1242). Mate also has a power management component, which is probably responsible for the excess logging and the confusion over Navi10. However, I have no way to vouch for how accurately the Mate PM applet gathered data for its instantiation. I have no external devices connected that I'm aware would use it since I thought that was via HDMI. I *do* have a Jabra Evolve2 headset that uses the TypeC USB connector, but I assume that's not using the GPU.
>
> The issue documentation I left with Mate notes that if I launch apps from a terminal that is NOT launched from the Mate panel (right-click on desktop instead to open terminal), the parent for all the apps (Firefox, Evolution, etc.) is separate from Mate (at least separate from mate-panel). Everything has worked fine (except for the constant logging of the wake-up action) since I've done that (and turned off the screensaver and screensaver lock). So, I'm not sure what else to do at this point. Please advise if I should do anything on the driver side.
>
> Thanks,
> Tim
>
> On Thu, 2021-07-29 at 11:14 -0400, Felix Kuehling wrote:
> > On 2021-07-28 at 12:10 p.m., Tim Cahill wrote:
> > > Hi Felix,
> > >
> > I'm not sure why you're calling me out by name. I'm not working on anything obviously related to your crashes.
> >
> > Anyway, I took a quick look at the backtraces. They all point at libgdk. Two of them are segfaults, one is an abort. It's not clear how these would be related to the GPU driver. That said, when you boot with nomodeset, the GPU driver and all HW acceleration is completely disabled. If that makes the problem disappear, the GPU driver is clearly involved in the problem in some way.
> >
> > The abort points at a problem while freeing memory. This could be caused by a double-free problem in some unrelated code, possibly related to the GPU driver. This would be a problem in a user mode component (maybe Mesa), not the kernel mode driver.
> >
> > I believe the messages you're seeing when you move the mouse are the result of runtime power management that puts the GPU to sleep when it's idle and reinitializes it when it's needed. You have 2 GPUs in your laptop, an integrated Renoir GPU in the Ryzen CPU, and an external Navi10 GPU for higher gaming performance. The GPU that goes to sleep and wakes up is the external Navi10 GPU.
> >
> > The OpenGL renderer string specifies "RENOIR".
> > Therefore I'm surprised that the Navi10 GPU wakes up when you move the mouse. Ideally it shouldn't be used at all when you're just using the desktop.
> >
> > If you suspect that runtime power management is responsible for your problems, you could disable it with amdgpu.runpm=0 on the kernel command line. That means the Navi10 GPU won't go into the low power mode and drain your battery more quickly. So this is not a permanent solution. Just an experiment to narrow down the problem.
> >
> > Regards,
> > Felix
> >
> > > I'm not sure how to do this as I haven't had to report a bug before. I've looked to a variety of bug reporting sites to see if anyone else is running into the same issues that I'm having (such as the Mate project) and haven't seen anything at all similar to the issue I'm having. Since I had issues with AMD drivers with my distro (info below) and some consistent and high volume dmesg content shows up, I've decided that I should start here with the AMD kernel team.
> > >
> > > I have a fairly new MSI laptop with the following configuration:
> > >
> > > [code]
> > > System: Kernel: 5.11.0-25-generic x86_64 bits: 64 compiler: N/A Desktop: MATE 1.24.0 wm: marco dm: LightDM
> > >   Distro: Linux Mint 20.2 Uma base: Ubuntu 20.04 focal
> > > Machine: Type: Laptop System: Micro-Star product: Alpha 17 A4DEK v: REV:1.0 serial: <filter>
> > >   Chassis: type: 10 serial: <filter>
> > >   Mobo: Micro-Star model: MS-17EK v: REV:1.0 serial: <filter>
> > >   UEFI: American Megatrends v: E17EKAMS.101 date: 10/26/2020
> > > Battery: ID-1: BAT1 charge: 66.2 Wh condition: 67.0/65.7 Wh (102%) volts: 12.4/10.8 model: MSI Corp.
> > >   MS-17EK serial: N/A status: Unknown
> > > CPU: Topology: 8-Core model: AMD Ryzen 7 4800H with Radeon Graphics bits: 64 type: MT MCP arch: Zen rev: 1
> > >   L2 cache: 4096 KiB
> > >   flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm bogomips: 92630
> > >   Speed: 4278 MHz min/max: 1400/2900 MHz
> > >   Core speeds (MHz): 1: 4280 2: 1865 3: 1397 4: 2188 5: 1489 6: 2265 7: 1907 8: 1906 9: 1729 10: 1397 11: 1397 12: 1397 13: 1397 14: 1397 15: 1907 16: 1740
> > > Graphics: Device-1: AMD Navi 10 [Radeon RX 5600 OEM/5600 XT / 5700/5700 XT] vendor: Micro-Star MSI driver: amdgpu v: kernel bus ID: 03:00.0 chip ID: 1002:731f
> > >   Device-2: AMD Renoir vendor: Micro-Star MSI driver: amdgpu v: kernel bus ID: 08:00.0 chip ID: 1002:1636
> > >   Display: x11 server: X.Org 1.20.9 driver: amdgpu,ati unloaded: fbdev,modesetting,radeon,vesa compositor: marco resolution: 1920x1080~144Hz
> > >   OpenGL: renderer: AMD RENOIR (DRM 3.40.0 5.11.0-25-generic LLVM 11.0.0) v: 4.6 Mesa 20.2.6 direct render: Yes
> > > Audio: Device-1: AMD Navi 10 HDMI Audio vendor: Micro-Star MSI driver: snd_hda_intel v: kernel bus ID: 03:00.1 chip ID: 1002:ab38
> > >   Device-2: AMD Raven/Raven2/FireFlight/Renoir Audio Processor vendor: Micro-Star MSI driver: N/A bus ID: 08:00.5 chip ID: 1022:15e2
> > >   Device-3: AMD Family 17h HD Audio vendor: Micro-Star MSI driver: snd_hda_intel v: kernel bus ID: 08:00.6 chip ID: 1022:15e3
> > >   Sound Server: ALSA v: k5.11.0-25-generic
> > > Network: Device-1: Intel Wi-Fi 6 AX200 driver: iwlwifi v: kernel bus ID: 04:00.0 chip ID: 8086:2723
> > >   IF: wlp4s0 state: up mac: <filter>
> > >   Device-2: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet vendor: Micro-Star MSI driver: r8169 v: kernel port: f000 bus ID: 05:00.0 chip ID: 10ec:8168
> > >   IF: eno1 state: down mac: <filter>
> > > Drives: Local Storage: total: 476.94 GiB used: 89.79 GiB (18.8%)
> > >   ID-1: /dev/nvme0n1 vendor:
> > >   Kingston model: OM8PCP3512F-AI1 size: 476.94 GiB speed: 31.6 Gb/s lanes: 4 serial: <filter>
> > > Partition: ID-1: / size: 466.30 GiB used: 89.28 GiB (19.1%) fs: ext4 dev: /dev/dm-1
> > >   ID-2: /boot size: 704.5 MiB used: 519.7 MiB (73.8%) fs: ext4 dev: /dev/nvme0n1p2
> > >   ID-3: swap-1 size: 980.0 MiB used: 0 KiB (0.0%) fs: swap dev: /dev/dm-2
> > > USB: Hub: 1-0:1 info: Full speed (or root) Hub ports: 4 rev: 2.0 chip ID: 1d6b:0002
> > >   Device-1: 1-3:2 info: SteelSeries ApS SteelSeries KLC type: HID driver: hid-generic,usbhid rev: 2.0 chip ID: 1038:1122
> > >   Device-2: 1-4:3 info: Acer HD Webcam type: Video driver: uvcvideo rev: 2.0 chip ID: 5986:211c
> > >   Hub: 2-0:1 info: Full speed (or root) Hub ports: 2 rev: 3.1 chip ID: 1d6b:0003
> > >   Hub: 3-0:1 info: Full speed (or root) Hub ports: 4 rev: 2.0 chip ID: 1d6b:0002
> > >   Device-3: 3-3:2 info: Intel type: Bluetooth driver: btusb rev: 2.0 chip ID: 8087:0029
> > >   Hub: 4-0:1 info: Full speed (or root) Hub ports: 2 rev: 3.1 chip ID: 1d6b:0003
> > > Sensors: System Temperatures: cpu: 46.5 C mobo: N/A
> > >   Fan Speeds (RPM): N/A
> > >   GPU: device: amdgpu temp: 0 C fan: 65535 device: amdgpu temp: 31 C
> > > Repos: No active apt repos in: /etc/apt/sources.list
> > >   Active apt repos in: /etc/apt/sources.list.d/official-package-repositories.list
> > >   1: deb http: //mirrors.seas.harvard.edu/linuxmint-packages uma main upstream import backport
> > >   2: deb http: //mirror.us-ny2.kamatera.com/ubuntu focal main restricted universe multiverse
> > >   3: deb http: //mirror.us-ny2.kamatera.com/ubuntu focal-updates main restricted universe multiverse
> > >   4: deb http: //mirror.us-ny2.kamatera.com/ubuntu focal-backports main restricted universe multiverse
> > >   5: deb http: //security.ubuntu.com/ubuntu/ focal-security main restricted universe multiverse
> > >   6: deb http: //archive.canonical.com/ubuntu/ focal partner
> > > Info: Processes: 372 Uptime: 2h 44m Memory:
> > >   15.10 GiB used: 1.15 GiB (7.6%) Init: systemd v: 245 runlevel: 5 Compilers: gcc: 9.3.0 alt: 9 Client: Unknown python3.8 client inxi: 3.0.38
> > > [/code]
> > >
> > > If I am using it interactively, I get random crashes that seem to hit elements of mate (mate-panel, etc.) consistently - just not predictably. LibreOffice applications, xed, Firefox, and Evolution seem to be more prone to crashing the X session. I can easily move to tty1, log in, and kill services running in tty7, as the crashes don't appear to completely kill tty7. Sometimes, I can kill mate and launch a new instance to salvage the tty7 session. However, I usually end up having to kill the root PID of the X Windows session in order to log in again. But I think this is related to the AMD GPU driver because every time I simply move the mouse in the tty7 session, I get the following in dmesg:
> > >
> > > [13164.399550] [drm] PCIE GART of 512M enabled (table at 0x0000008000000000).
> > > [13164.399579] [drm] PSP is resuming...
> > > [13164.486593] [drm] reserve 0xa00000 from 0x800f400000 for PSP TMR
> > > [13164.678788] amdgpu 0000:03:00.0: amdgpu: RAS: optional ras ta ucode is not available
> > > [13164.702624] amdgpu 0000:03:00.0: amdgpu: RAP: optional rap ta ucode is not available
> > > [13164.702639] amdgpu 0000:03:00.0: amdgpu: SMU is resuming...
> > > [13164.702648] amdgpu 0000:03:00.0: amdgpu: smu driver if version = 0x00000036, smu fw if version = 0x00000037, smu fw version = 0x002a3f00 (42.63.0)
> > > [13164.702664] amdgpu 0000:03:00.0: amdgpu: SMU driver if version not matched
> > > [13164.746143] amdgpu 0000:03:00.0: amdgpu: SMU is resumed successfully!
> > > [13164.768978] [drm] kiq ring mec 2 pipe 1 q 0
> > > [13164.779651] [drm] VCN decode and encode initialized successfully(under DPG Mode).
> > > [13164.779758] [drm] JPEG decode initialized successfully.
> > > [13164.779779] amdgpu 0000:03:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
> > > [13164.779783] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
> > > [13164.779784] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
> > > [13164.779785] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
> > > [13164.779786] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
> > > [13164.779787] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
> > > [13164.779788] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
> > > [13164.779789] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
> > > [13164.779790] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
> > > [13164.779792] amdgpu 0000:03:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
> > > [13164.779793] amdgpu 0000:03:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
> > > [13164.779803] amdgpu 0000:03:00.0: amdgpu: ring sdma1 uses VM inv eng 13 on hub 0
> > > [13164.779804] amdgpu 0000:03:00.0: amdgpu: ring vcn_dec uses VM inv eng 0 on hub 1
> > > [13164.779805] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc0 uses VM inv eng 1 on hub 1
> > > [13164.779806] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc1 uses VM inv eng 4 on hub 1
> > > [13164.779807] amdgpu 0000:03:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 1
> > > [13164.783807] amdgpu 0000:03:00.0: [drm] Cannot find any crtc or sizes
> > > [13170.722306] [drm] free PSP TMR buffer
> > >
> > > If I boot with nomodeset, I can operate fine - just without screen brightness control, etc. It just seems strange that an event is generated like this all the time.
> > >
> > > I only get sporadic crashes, though. Humorously, I've been running only Firefox, crash reporter and Mate Terminal this morning and it's run fine for over 4 hours. There were times when I wouldn't run anything at all and it'd lock up on me.
> > > So I just can't find any common denominator for this (using vi in terminal to type this - going to copy-paste into email client [Evolution] once I'm done this).
> > >
> > > I've attached 3 crash reports that were captured on the system over the last couple days. I apologize in advance - profusely! - if the problem turns out to be somewhere else.
> > >
> > > Thanks,
> > > Tim
> > > _______________________________________________
> > > amd-gfx mailing list
> > > amd-...@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
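
For anyone repeating the amdgpu.runpm=0 experiment suggested in the thread, a small script may make it easier to see how often the Navi10 GPU resumes. The sketch below is only an illustration, not anything posted by the participants: it assumes the "[drm] PSP is resuming..." line quoted above marks one resume cycle, and the helper names `resume_timestamps` and `has_runpm_disabled` are hypothetical.

```python
import re

# Assumption: each "[drm] PSP is resuming..." line marks one GPU resume cycle,
# as in the dmesg excerpt quoted in the thread.
RESUME_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+\[drm\] PSP is resuming")


def resume_timestamps(dmesg_text):
    """Return the kernel timestamps (in seconds) of every resume message."""
    return [float(m.group(1)) for m in RESUME_RE.finditer(dmesg_text)]


def has_runpm_disabled(cmdline):
    """Return True if the kernel command line contains amdgpu.runpm=0."""
    return "amdgpu.runpm=0" in cmdline.split()


# Sample input copied from the dmesg excerpt in the thread.
sample = (
    "[13164.399550] [drm] PCIE GART of 512M enabled (table at 0x0000008000000000).\n"
    "[13164.399579] [drm] PSP is resuming...\n"
    "[13170.722306] [drm] free PSP TMR buffer\n"
)

print(resume_timestamps(sample))
print(has_runpm_disabled("ro quiet splash amdgpu.runpm=0"))
```

On a live system the same functions could be fed the output of `dmesg` and the contents of `/proc/cmdline`. If the resume count stops growing after booting with amdgpu.runpm=0, that would be consistent with runtime power management being the source of the messages.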