Is it possible that your motherboard or power supply is going bad? There look to be a lot of errors here beyond the GPU hang.
Alex

On Mon, Oct 4, 2021 at 5:25 AM Guus Ellenkamp <g...@activediscovery.net> wrote:
> I replaced the video card with a similar but different GPU: AMD JUNIPER.
>
> The same issue occurred just now, after I had presumed it was solved.
>
> I was able to go to a terminal screen and reboot the system. The mouse was still moving, but the rest of the screen was frozen.
>
> Some of the system log:
>
> [44512.708485] CIFS VFS: \\laptopguus.megaheights.net has not responded in 180 seconds. Reconnecting...
> [57093.595216] ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
> [57093.595222] ata8.00: irq_stat 0x40000001
> [57093.595227] ata8.00: failed command: READ DMA EXT
> [57093.595236] ata8.00: cmd 25/00:20:20:7c:b9/00:00:b3:01:00/e0 tag 20 dma 16384 in
> [57093.595236] res 51/40:0f:26:7c:b9/00:00:b3:01:00/e0 Emask 0x9 (media error)
> [57093.595239] ata8.00: status: { DRDY ERR }
> [57093.595242] ata8.00: error: { UNC }
> [57093.621841] ata8.00: configured for UDMA/133
> [57093.621866] sd 7:0:0:0: [sdf] tag#20 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> [57093.621872] sd 7:0:0:0: [sdf] tag#20 Sense Key : Medium Error [current]
> [57093.621876] sd 7:0:0:0: [sdf] tag#20 Add.
Sense: Unrecovered read error > - auto reallocate failed > [57093.621882] sd 7:0:0:0: [sdf] tag#20 CDB: Read(16) 88 00 00 00 00 01 b3 > b9 7c 20 00 00 00 20 00 00 > [57093.621888] blk_update_request: I/O error, dev sdf, sector 7310244896 > op 0x0:(READ) flags 0x1000 phys_seg 2 prio class 0 > [57093.621899] BTRFS error (device sdf3): bdev /dev/sdf3 errs: wr 0, rd 1, > flush 0, corrupt 5, gen 0 > [57093.621946] ata8: EH complete > [57093.704448] BTRFS info (device sdf3): read error corrected: ino 0 off > 10072584896512 (dev /dev/sdf3 sector 3809520672) > [57093.722193] BTRFS info (device sdf3): read error corrected: ino 0 off > 10072584900608 (dev /dev/sdf3 sector 3809520680) > [57093.722324] BTRFS info (device sdf3): read error corrected: ino 0 off > 10072584904704 (dev /dev/sdf3 sector 3809520688) > [57093.722434] BTRFS info (device sdf3): read error corrected: ino 0 off > 10072584908800 (dev /dev/sdf3 sector 3809520696) > [63723.394435] systemd-resolve[80364]: segfault at ed5eda1ba ip > 00007fcf9bfb130d sp 00007ffe9ab99c90 error 4 in > libsystemd-shared-245.so[7fcf9bf89000+174000] > [63723.394454] Code: 8d 15 a2 68 15 00 31 ff 48 8d 35 fa c0 14 00 e8 b9 a2 > 08 00 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 85 ff 74 7f 48 83 ec 18 > <8b> 47 04 85 c0 74 4c 83 e8 01 89 47 04 75 11 f6 47 29 02 74 16 f6 > [71460.546805] systemd-resolve[84672]: segfault at 600000000 ip > 0000000600000000 sp 00007fffdfd83c48 error 14 in > systemd-resolved[562e7fe8b000+a000] > [71460.546822] Code: Bad RIP value. 
> [72353.671080] ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 > [72353.671084] ata8.00: irq_stat 0x40000001 > [72353.671086] ata8.00: failed command: READ DMA EXT > [72353.671091] ata8.00: cmd 25/00:20:80:68:b9/00:00:b3:01:00/e0 tag 12 dma > 16384 in > [72353.671091] res 51/40:00:99:68:b9/00:00:b3:01:00/e0 Emask 0x9 > (media error) > [72353.671093] ata8.00: status: { DRDY ERR } > [72353.671094] ata8.00: error: { UNC } > [72353.694979] ata8.00: configured for UDMA/133 > [72353.694990] sd 7:0:0:0: [sdf] tag#12 FAILED Result: hostbyte=DID_OK > driverbyte=DRIVER_SENSE > [72353.694992] sd 7:0:0:0: [sdf] tag#12 Sense Key : Medium Error [current] > [72353.694993] sd 7:0:0:0: [sdf] tag#12 Add. Sense: Unrecovered read error > - auto reallocate failed > [72353.694995] sd 7:0:0:0: [sdf] tag#12 CDB: Read(16) 88 00 00 00 00 01 b3 > b9 68 80 00 00 00 20 00 00 > [72353.694997] blk_update_request: I/O error, dev sdf, sector 7310239872 > op 0x0:(READ) flags 0x1000 phys_seg 4 prio class 0 > [72353.695002] BTRFS error (device sdf3): bdev /dev/sdf3 errs: wr 0, rd 2, > flush 0, corrupt 5, gen 0 > [72353.695016] ata8: EH complete > [72353.714389] BTRFS info (device sdf3): read error corrected: ino 0 off > 10072582324224 (dev /dev/sdf3 sector 3809515648) > [72353.714533] BTRFS info (device sdf3): read error corrected: ino 0 off > 10072582328320 (dev /dev/sdf3 sector 3809515656) > [72353.714817] BTRFS info (device sdf3): read error corrected: ino 0 off > 10072582332416 (dev /dev/sdf3 sector 3809515664) > [72353.783786] BTRFS info (device sdf3): read error corrected: ino 0 off > 10072582336512 (dev /dev/sdf3 sector 3809515672) > [81101.923913] systemd-resolve[90514]: segfault at 708784e ip > 00007f25381cb30d sp 00007ffd5a4f4580 error 4 in > libsystemd-shared-245.so[7f25381a3000+174000] > [81101.923920] Code: 8d 15 a2 68 15 00 31 ff 48 8d 35 fa c0 14 00 e8 b9 a2 > 08 00 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 85 ff 74 7f 48 83 ec 18 > <8b> 47 04 85 c0 74 4c 83 e8 01 89 47 04 
75 11 f6 47 29 02 74 16 f6 > [88659.280083] systemd-resolve[99760]: segfault at 2642fa92 ip > 00007f0b975db30d sp 00007ffdaed18a40 error 4 in > libsystemd-shared-245.so[7f0b975b3000+174000] > [88659.280092] Code: 8d 15 a2 68 15 00 31 ff 48 8d 35 fa c0 14 00 e8 b9 a2 > 08 00 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 85 ff 74 7f 48 83 ec 18 > <8b> 47 04 85 c0 74 4c 83 e8 01 89 47 04 75 11 f6 47 29 02 74 16 f6 > [89642.694939] radeon 0000:01:00.0: ring 0 stalled for more than 10024msec > [89642.694946] radeon 0000:01:00.0: GPU lockup (current fence id > 0x0000000000180ee1 last fence id 0x0000000000180ef8 on ring 0) > [89642.718629] radeon 0000:01:00.0: Saved 727 dwords of commands on ring 0. > [89642.718640] radeon 0000:01:00.0: GPU softreset: 0x00000008 > [89642.718642] radeon 0000:01:00.0: GRBM_STATUS = > 0xA0003828 > [89642.718643] radeon 0000:01:00.0: GRBM_STATUS_SE0 = > 0x00000007 > [89642.718644] radeon 0000:01:00.0: GRBM_STATUS_SE1 = > 0x00000007 > [89642.718645] radeon 0000:01:00.0: SRBM_STATUS = > 0x200000C0 > [89642.718646] radeon 0000:01:00.0: SRBM_STATUS2 = > 0x00000000 > [89642.718647] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = > 0x00000000 > [89642.718648] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = > 0x00010002 > [89642.718649] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = > 0x00020182 > [89642.718650] radeon 0000:01:00.0: R_008680_CP_STAT = > 0x80038243 > [89642.718652] radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = > 0x44C83D57 > [89642.728147] radeon 0000:01:00.0: GRBM_SOFT_RESET=0x00004001 > [89642.728199] radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00000100 > [89642.729345] radeon 0000:01:00.0: GRBM_STATUS = > 0x00003828 > [89642.729346] radeon 0000:01:00.0: GRBM_STATUS_SE0 = > 0x00000007 > [89642.729347] radeon 0000:01:00.0: GRBM_STATUS_SE1 = > 0x00000007 > [89642.729348] radeon 0000:01:00.0: SRBM_STATUS = > 0x200000C0 > [89642.729348] radeon 0000:01:00.0: SRBM_STATUS2 = > 0x00000000 > [89642.729349] radeon 0000:01:00.0: 
R_008674_CP_STALLED_STAT1 = 0x00000000
> [89642.729350] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00000000
> [89642.729351] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00000000
> [89642.729352] radeon 0000:01:00.0: R_008680_CP_STAT = 0x00000000
> [89642.729353] radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
> [89642.729379] radeon 0000:01:00.0: GPU reset succeeded, trying to resume
> [89642.805590] [drm] PCIE gen 2 link speeds already enabled
> [89642.809936] [drm] PCIE GART of 1024M enabled (table at 0x000000000014C000).
> [89642.810017] radeon 0000:01:00.0: WB enabled
> [89642.810018] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x00000000e0000c00 and cpu addr 0x00000000ff8aea81
> [89642.810019] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x00000000e0000c0c and cpu addr 0x00000000c4d65d70
> [89642.810203] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x000000000005c418 and cpu addr 0x0000000075222c76
> [89642.826333] [drm] ring test on 0 succeeded in 1 usecs
> [89642.826337] [drm] ring test on 3 succeeded in 2 usecs
> [89643.002009] [drm] ring test on 5 succeeded in 1 usecs
> [89643.002013] [drm] UVD initialized successfully.
> [89644.038977] [drm:r600_ib_test [radeon]] *ERROR* radeon: fence wait timed out.
> [89644.039025] [drm:radeon_ib_ring_tests [radeon]] *ERROR* radeon: failed testing IB on GFX ring (-110).
>
>
>
> -------- Forwarded Message --------
> Subject: Fwd: Random display freeze AMD TURKS (DRM 2.50.0 / 5.4.0-77-generic, LLVM 11.0.0)
> Date: Sat, 11 Sep 2021 00:41:30 +0800
> From: Guus Ellenkamp <g...@activediscovery.net>
> To: xorg-driver-ati@lists.x.org
>
> I upgraded Mesa, now: AMD TURKS (DRM 2.50.0-77-generic, LLVM 12.0.1) and it seemed more stable, but just now my screen and keyboard froze again. I was able to access the system through ssh and rebooted normally. The screen stayed locked until a hard reboot.
> > Last part of syslog (I have the whole): > > ... > > [104783.530925] CIFS VFS: SMB signature verification returned error = -13 > [104783.533033] audit: type=1400 audit(1631273391.656:679): > apparmor="ALLOWED" operation="unlink" profile="libreoffice-soffice" > name=2F6E6574776F726B2F636F6D70616E792D6F6E2D6D6F6F6E2F437573746F6D657220646F63756D656E74732F456967656E20486F72656361204D616B656C6161722F2E7E6C6F636B2E54696D6573686565742045484D2076312E30312E786C737823 > pid=131532 comm="soffice.bin" requested_mask="d" denied_mask="d" fsuid=1000 > ouid=1000 > [104783.539575] audit: type=1400 audit(1631273391.660:680): > apparmor="ALLOWED" operation="mknod" profile="libreoffice-soffice" > name=2F6E6574776F726B2F636F6D70616E792D6F6E2D6D6F6F6E2F437573746F6D657220646F63756D656E74732F456967656E20486F72656361204D616B656C6161722F6C75313331353332653872716E772E746D70 > pid=131532 comm="soffice.bin" requested_mask="c" denied_mask="c" fsuid=1000 > ouid=1000 > [104783.543464] audit: type=1400 audit(1631273391.664:681): > apparmor="ALLOWED" operation="open" profile="libreoffice-soffice" > name=2F6E6574776F726B2F636F6D70616E792D6F6E2D6D6F6F6E2F437573746F6D657220646F63756D656E74732F456967656E20486F72656361204D616B656C6161722F6C75313331353332653872716E772E746D70 > pid=131532 comm="soffice.bin" requested_mask="wrc" denied_mask="wrc" > fsuid=1000 ouid=1000 > [104783.545227] audit: type=1400 audit(1631273391.668:682): > apparmor="ALLOWED" operation="open" profile="libreoffice-soffice" > name=2F6E6574776F726B2F636F6D70616E792D6F6E2D6D6F6F6E2F437573746F6D657220646F63756D656E74732F456967656E20486F72656361204D616B656C6161722F6C75313331353332653872716E772E746D70 > pid=131532 comm="soffice.bin" requested_mask="wrc" denied_mask="wrc" > fsuid=1000 ouid=1000 > [104783.547124] audit: type=1400 audit(1631273391.668:683): > apparmor="ALLOWED" operation="open" profile="libreoffice-soffice" > 
name=2F6E6574776F726B2F636F6D70616E792D6F6E2D6D6F6F6E2F437573746F6D657220646F63756D656E74732F456967656E20486F72656361204D616B656C6161722F54696D6573686565742045484D2076312E30312E786C7378 > pid=131532 comm="soffice.bin" requested_mask="wr" denied_mask="wr" > fsuid=1000 ouid=1000 > [104783.548692] audit: type=1400 audit(1631273391.668:684): > apparmor="ALLOWED" operation="file_lock" profile="libreoffice-soffice" > name=2F6E6574776F726B2F636F6D70616E792D6F6E2D6D6F6F6E2F437573746F6D657220646F63756D656E74732F456967656E20486F72656361204D616B656C6161722F54696D6573686565742045484D2076312E30312E786C7378 > pid=131532 comm="soffice.bin" requested_mask="wk" denied_mask="wk" > fsuid=1000 ouid=1000 > [104783.552528] audit: type=1400 audit(1631273391.672:685): > apparmor="ALLOWED" operation="mknod" profile="libreoffice-soffice" > name=2F6E6574776F726B2F636F6D70616E792D6F6E2D6D6F6F6E2F437573746F6D657220646F63756D656E74732F456967656E20486F72656361204D616B656C6161722F2E7E6C6F636B2E54696D6573686565742045484D2076312E30312E786C737823 > pid=131532 comm="soffice.bin" requested_mask="c" denied_mask="c" fsuid=1000 > ouid=1000 > [104783.556089] audit: type=1400 audit(1631273391.676:686): > apparmor="ALLOWED" operation="open" profile="libreoffice-soffice" > name=2F6E6574776F726B2F636F6D70616E792D6F6E2D6D6F6F6E2F437573746F6D657220646F63756D656E74732F456967656E20486F72656361204D616B656C6161722F2E7E6C6F636B2E54696D6573686565742045484D2076312E30312E786C737823 > pid=131532 comm="soffice.bin" requested_mask="wrc" denied_mask="wrc" > fsuid=1000 ouid=1000 > [104783.742044] CIFS VFS: SMB signature verification returned error = -13 > [104783.979172] CIFS VFS: SMB signature verification returned error = -13 > [104784.124761] CIFS VFS: SMB signature verification returned error = -13 > [104784.142969] audit: type=1400 audit(1631273392.264:687): > apparmor="ALLOWED" operation="rename_src" profile="libreoffice-soffice" > 
name=2F6E6574776F726B2F636F6D70616E792D6F6E2D6D6F6F6E2F437573746F6D657220646F63756D656E74732F456967656E20486F72656361204D616B656C6161722F6C75313331353332653872716E772E746D70 > pid=131532 comm="soffice.bin" requested_mask="wrd" denied_mask="wrd" > fsuid=1000 ouid=1000 > [120182.069061] CIFS VFS: \\moon.megaheights.net has not responded in 180 > seconds. Reconnecting... > [120182.073067] CIFS VFS: \\moon.megaheights.net has not responded in 180 > seconds. Reconnecting... > [120182.073099] CIFS VFS: \\moon.megaheights.net has not responded in 180 > seconds. Reconnecting... > [120182.073106] CIFS VFS: \\laptopguus.megaheights.net has not responded > in 180 seconds. Reconnecting... > [121101.456672] kauditd_printk_skb: 7 callbacks suppressed > [121101.456674] audit: type=1400 audit(1631289709.793:695): > apparmor="DENIED" operation="capable" profile="/usr/sbin/cups-browsed" > pid=143181 comm="cups-browsed" capability=23 capname="sys_nice" > [122269.208516] radeon 0000:01:00.0: ring 0 stalled for more than 10024msec > [122269.208525] radeon 0000:01:00.0: GPU lockup (current fence id > 0x00000000001b27d5 last fence id 0x00000000001b281e on ring 0) > [122269.720539] radeon 0000:01:00.0: ring 0 stalled for more than 10536msec > [122269.720543] radeon 0000:01:00.0: GPU lockup (current fence id > 0x00000000001b27d5 last fence id 0x00000000001b281e on ring 0) > [122270.232541] radeon 0000:01:00.0: ring 0 stalled for more than 11048msec > [122270.232550] radeon 0000:01:00.0: GPU lockup (current fence id > 0x00000000001b27d5 last fence id 0x00000000001b281f on ring 0) > [122270.748481] radeon 0000:01:00.0: ring 0 stalled for more than 11564msec > [122270.748485] radeon 0000:01:00.0: GPU lockup (current fence id > 0x00000000001b27d5 last fence id 0x00000000001b2822 on ring 0) > [122271.256502] radeon 0000:01:00.0: ring 0 stalled for more than 12072msec > [122271.256507] radeon 0000:01:00.0: GPU lockup (current fence id > 0x00000000001b27d5 last fence id 0x00000000001b2824 on ring 
0) > [122271.772479] radeon 0000:01:00.0: ring 0 stalled for more than 12588msec > [122271.772487] radeon 0000:01:00.0: GPU lockup (current fence id > 0x00000000001b27d5 last fence id 0x00000000001b2826 on ring 0) > [122272.280507] radeon 0000:01:00.0: ring 0 stalled for more than 13096msec > [122272.280517] radeon 0000:01:00.0: GPU lockup (current fence id > 0x00000000001b27d5 last fence id 0x00000000001b2827 on ring 0) > [122272.796466] radeon 0000:01:00.0: ring 0 stalled for more than 13612msec > [122272.796474] radeon 0000:01:00.0: GPU lockup (current fence id > 0x00000000001b27d5 last fence id 0x00000000001b2827 on ring 0) > [122273.180373] radeon 0000:01:00.0: Saved 2679 dwords of commands on ring > 0. > [122273.180403] radeon 0000:01:00.0: GPU softreset: 0x00000008 > [122273.180418] radeon 0000:01:00.0: GRBM_STATUS = > 0xA0003828 > [122273.180419] radeon 0000:01:00.0: GRBM_STATUS_SE0 = > 0x00000007 > [122273.180420] radeon 0000:01:00.0: GRBM_STATUS_SE1 = > 0x00000007 > [122273.180421] radeon 0000:01:00.0: SRBM_STATUS = > 0x200000C0 > [122273.180422] radeon 0000:01:00.0: SRBM_STATUS2 = > 0x00000000 > [122273.180423] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = > 0x00000000 > [122273.180425] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = > 0x00010002 > [122273.180426] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = > 0x00020182 > [122273.180427] radeon 0000:01:00.0: R_008680_CP_STAT = > 0x80038243 > [122273.180437] radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = > 0x44C83D57 > [122273.195236] radeon 0000:01:00.0: GRBM_SOFT_RESET=0x00004001 > [122273.195288] radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00000100 > [122273.196434] radeon 0000:01:00.0: GRBM_STATUS = > 0x00003828 > [122273.196435] radeon 0000:01:00.0: GRBM_STATUS_SE0 = > 0x00000007 > [122273.196437] radeon 0000:01:00.0: GRBM_STATUS_SE1 = > 0x00000007 > [122273.196448] radeon 0000:01:00.0: SRBM_STATUS = > 0x200000C0 > [122273.196449] radeon 0000:01:00.0: SRBM_STATUS2 = > 0x00000000 > [122273.196450] 
radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = > 0x00000000 > [122273.196452] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = > 0x00000000 > [122273.196453] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = > 0x00000000 > [122273.196454] radeon 0000:01:00.0: R_008680_CP_STAT = > 0x00000000 > [122273.196455] radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = > 0x44C83D57 > [122273.196478] radeon 0000:01:00.0: GPU reset succeeded, trying to resume > [122273.218795] [drm] enabling PCIE gen 2 link speeds, disable with > radeon.pcie_gen2=0 > [122273.223112] [drm] PCIE GART of 1024M enabled (table at > 0x0000000000162000). > [122273.223206] radeon 0000:01:00.0: WB enabled > [122273.223208] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr > 0x00000000e0000c00 and cpu addr 0x00000000eda988f2 > [122273.223209] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr > 0x00000000e0000c0c and cpu addr 0x0000000085c11b99 > [122273.223965] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr > 0x0000000000072118 and cpu addr 0x00000000a7da438f > [122273.240356] [drm] ring test on 0 succeeded in 2 usecs > [122273.240366] [drm] ring test on 3 succeeded in 6 usecs > [122273.416107] [drm] ring test on 5 succeeded in 2 usecs > [122273.416115] [drm] UVD initialized successfully. > [122274.588470] [drm:r600_ib_test [radeon]] *ERROR* radeon: fence wait > timed out. > [122274.588525] [drm:radeon_ib_ring_tests [radeon]] *ERROR* radeon: failed > testing IB on GFX ring (-110). 
> [122286.240611] CIFS VFS: cifs_setlk failed rc=-22
> [122287.016841] traps: lxqt-globalkeys[75604] general protection fault ip:7ff93e77d9f6 sp:7ffe7a1e77e0 error:0 in libc-2.31.so [7ff93e759000+178000]
> [122293.850252] pcmanfm-qt[75603]: segfault at 30 ip 00007f2d348c360a sp 00007ffe44297b90 error 4 in libQt5Core.so.5.12.8[7f2d346c3000+2e0000]
> [122293.850267] Code: 53 48 83 ec 28 8b 05 8d 23 2c 00 83 f8 ff 7c 16 0f b6 05 29 23 2c 00 84 c0 0f 84 c9 00 00 00 4c 8d 35 3a 23 2c 00 4d 8d 7e 30 <41> 8b 2f 41 89 ec 41 89 ed 41 81 e4 ff ff ff 00 41 81 e5 c0 ff ff
> [122415.283180] device enp3s0 left promiscuous mode
> [122439.221503] vboxnetflt: 9 out of 12284920 packets were not sent (directed to host)
>
>
>
>
> -------- Forwarded Message --------
> Subject: Re: Random display freeze AMD TURKS (DRM 2.50.0 / 5.4.0-77-generic, LLVM 11.0.0)
> Date: Sat, 4 Sep 2021 14:17:11 +0800
> From: Guus Ellenkamp <g...@activediscovery.net>
> To: xorg-driver-ati@lists.x.org
>
> I have some more info on it:
>
> Sometimes I can switch to another terminal screen, sometimes I can still log in through ssh, and sometimes the system fully locks up, but I can reboot with sysreq-alt-b.
>
> The whole thing seems to happen randomly, but mostly after the system has been running for one or two days.
>
> OpenGL Renderer: AMD TURKS (DRM 2.50.0 / 5.4.0-77-generic, LLVM 12.0.0)
>
> Last part of dmesg --syslog from the latest time it happened:
>
> ...
> > [21385.583783] audit: type=1400 audit(1630676128.271:208): > apparmor="ALLOWED" operation="rename_src" profile="libreoffice-soffice" > name=2F6E6574776F726B2F636F6D70616E792D6F6E2D6D6F6F6E2F437573746F6D657220646F63756D656E74732F456967656E20486F72656361204D616B656C6161722F6C7531393238303462336C616B2E746D70 > pid=19280 comm="soffice.bin" requested_mask="wrd" denied_mask="wrd" > fsuid=1000 ouid=1000 > [21536.024463] kauditd_printk_skb: 7 callbacks suppressed > [21536.024465] audit: type=1400 audit(1630676278.701:216): > apparmor="ALLOWED" operation="open" profile="libreoffice-soffice" > name=2F6E6574776F726B2F636F6D70616E792D6F6E2D6D6F6F6E2F437573746F6D657220646F63756D656E74732F456967656E20486F72656361204D616B656C6161722F2E7E6C6F636B2E54696D6573686565742045484D2076312E302E786C737823 > pid=19280 comm="soffice.bin" requested_mask="r" denied_mask="r" fsuid=1000 > ouid=1000 > [21536.029431] CIFS VFS: SMB signature verification returned error = -13 > [21536.031847] audit: type=1400 audit(1630676278.709:217): > apparmor="ALLOWED" operation="unlink" profile="libreoffice-soffice" > name=2F6E6574776F726B2F636F6D70616E792D6F6E2D6D6F6F6E2F437573746F6D657220646F63756D656E74732F456967656E20486F72656361204D616B656C6161722F2E7E6C6F636B2E54696D6573686565742045484D2076312E302E786C737823 > pid=19280 comm="soffice.bin" requested_mask="d" denied_mask="d" fsuid=1000 > ouid=1000 > [25333.210567] hrtimer: interrupt took 25127 ns > [30072.138275] audit: type=1400 audit(1630684814.956:218): > apparmor="DENIED" operation="capable" profile="/usr/sbin/cups-browsed" > pid=33554 comm="cups-browsed" capability=23 capname="sys_nice" > [30859.256212] radeon 0000:01:00.0: ring 3 stalled for more than 10224msec > [30859.256220] radeon 0000:01:00.0: GPU lockup (current fence id > 0x0000000000002058 last fence id 0x000000000000205a on ring 3) > [30859.366220] radeon 0000:01:00.0: Saved 1746 dwords of commands on ring > 0. 
> [30859.366234] radeon 0000:01:00.0: GPU softreset: 0x0000008C > [30859.366235] radeon 0000:01:00.0: GRBM_STATUS = > 0xA0003828 > [30859.366236] radeon 0000:01:00.0: GRBM_STATUS_SE0 = > 0x00000007 > [30859.366237] radeon 0000:01:00.0: GRBM_STATUS_SE1 = > 0x00000007 > [30859.366238] radeon 0000:01:00.0: SRBM_STATUS = > 0x200440C0 > [30859.366239] radeon 0000:01:00.0: SRBM_STATUS2 = > 0x00000000 > [30859.366240] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = > 0x00000000 > [30859.366241] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = > 0x00010000 > [30859.366242] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = > 0x00000002 > [30859.366243] radeon 0000:01:00.0: R_008680_CP_STAT = > 0x80010243 > [30859.366244] radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = > 0x44483146 > [30859.379995] radeon 0000:01:00.0: GRBM_SOFT_RESET=0x00004001 > [30859.380047] radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00108100 > [30859.381206] radeon 0000:01:00.0: GRBM_STATUS = > 0x00003828 > [30859.381207] radeon 0000:01:00.0: GRBM_STATUS_SE0 = > 0x00000007 > [30859.381208] radeon 0000:01:00.0: GRBM_STATUS_SE1 = > 0x00000007 > [30859.381209] radeon 0000:01:00.0: SRBM_STATUS = > 0x200400C0 > [30859.381210] radeon 0000:01:00.0: SRBM_STATUS2 = > 0x00000000 > [30859.381211] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = > 0x00000000 > [30859.381212] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = > 0x00000000 > [30859.381213] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = > 0x00000000 > [30859.381214] radeon 0000:01:00.0: R_008680_CP_STAT = > 0x00000000 > [30859.381215] radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = > 0x44C83D57 > [30859.381227] radeon 0000:01:00.0: GPU reset succeeded, trying to resume > [30859.403380] [drm] enabling PCIE gen 2 link speeds, disable with > radeon.pcie_gen2=0 > [30859.407610] [drm] PCIE GART of 1024M enabled (table at > 0x0000000000162000). 
> [30859.407704] radeon 0000:01:00.0: WB enabled
> [30859.407706] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x00000000e0000c00 and cpu addr 0x0000000068b59d95
> [30859.407706] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x00000000e0000c0c and cpu addr 0x000000005a048e55
> [30859.408486] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000000072118 and cpu addr 0x000000003f7c7eb9
> [30859.424829] [drm] ring test on 0 succeeded in 3 usecs
> [30859.424840] [drm] ring test on 3 succeeded in 7 usecs
> [30859.600585] [drm] ring test on 5 succeeded in 2 usecs
> [30859.600594] [drm] UVD initialized successfully.
> [30860.760281] [drm:r600_ib_test [radeon]] *ERROR* radeon: fence wait timed out.
> [30860.760342] [drm:radeon_ib_ring_tests [radeon]] *ERROR* radeon: failed testing IB on GFX ring (-110).
>
> Here the graphical (desktop) display was fully locked, but I was able to access a terminal screen if I remember correctly and reboot the system 'normally'. After reboot the system always runs okay.
>
> Earlier in the log:
>
> ...
> > [ 3053.103898] audit: type=1400 audit(1630657795.570:146): > apparmor="ALLOWED" operation="file_lock" profile="libreoffice-soffice" > name=2F6E6574776F726B2F636F6D70616E792D6F6E2D6D6F6F6E2F437573746F6D657220646F63756D656E74732F456967656E20486F72656361204D616B656C6161722F54696D6573686565742045484D2076312E302E786C7378 > pid=6030 comm="soffice.bin" requested_mask="wk" denied_mask="wk" fsuid=1000 > ouid=1000 > [ 3053.108517] audit: type=1400 audit(1630657795.578:147): > apparmor="ALLOWED" operation="mknod" profile="libreoffice-soffice" > name=2F6E6574776F726B2F636F6D70616E792D6F6E2D6D6F6F6E2F437573746F6D657220646F63756D656E74732F456967656E20486F72656361204D616B656C6161722F2E7E6C6F636B2E54696D6573686565742045484D2076312E302E786C737823 > pid=6030 comm="soffice.bin" requested_mask="c" denied_mask="c" fsuid=1000 > ouid=1000 > [ 3053.112153] audit: type=1400 audit(1630657795.582:148): > apparmor="ALLOWED" operation="open" profile="libreoffice-soffice" > name=2F6E6574776F726B2F636F6D70616E792D6F6E2D6D6F6F6E2F437573746F6D657220646F63756D656E74732F456967656E20486F72656361204D616B656C6161722F2E7E6C6F636B2E54696D6573686565742045484D2076312E302E786C737823 > pid=6030 comm="soffice.bin" requested_mask="wrc" denied_mask="wrc" > fsuid=1000 ouid=1000 > [ 3053.151664] CIFS VFS: SMB signature verification returned error = -13 > [ 3053.303590] CIFS VFS: SMB signature verification returned error = -13 > [ 3053.319941] CIFS VFS: SMB signature verification returned error = -13 > [ 3053.334053] audit: type=1400 audit(1630657795.802:149): > apparmor="ALLOWED" operation="rename_src" profile="libreoffice-soffice" > name=2F6E6574776F726B2F636F6D70616E792D6F6E2D6D6F6F6E2F437573746F6D657220646F63756D656E74732F456967656E20486F72656361204D616B656C6161722F6C75363033303432646D7A362E746D70 > pid=6030 comm="soffice.bin" requested_mask="wrd" denied_mask="wrd" > fsuid=1000 ouid=1000 > [ 7476.284522] perf: interrupt took too long (2508 > 2500), lowering > kernel.perf_event_max_sample_rate to 79500 > [ 
7930.584695] radeon 0000:01:00.0: ring 0 stalled for more than 10120msec > [ 7930.584703] radeon 0000:01:00.0: GPU lockup (current fence id > 0x000000000004351e last fence id 0x000000000004353d on ring 0) > [ 7931.096709] radeon 0000:01:00.0: ring 0 stalled for more than 10632msec > [ 7931.096717] radeon 0000:01:00.0: GPU lockup (current fence id > 0x000000000004351e last fence id 0x0000000000043542 on ring 0) > [ 7931.128660] radeon 0000:01:00.0: ring 3 stalled for more than 10240msec > [ 7931.128664] radeon 0000:01:00.0: GPU lockup (current fence id > 0x0000000000000b16 last fence id 0x0000000000000b18 on ring 3) > [ 7931.608731] radeon 0000:01:00.0: ring 0 stalled for more than 11144msec > [ 7931.608740] radeon 0000:01:00.0: GPU lockup (current fence id > 0x000000000004351e last fence id 0x0000000000043546 on ring 0) > [ 7931.644708] radeon 0000:01:00.0: ring 3 stalled for more than 10756msec > [ 7931.644712] radeon 0000:01:00.0: GPU lockup (current fence id > 0x0000000000000b16 last fence id 0x0000000000000b18 on ring 3) > [ 7932.120676] radeon 0000:01:00.0: ring 0 stalled for more than 11656msec > [ 7932.120680] radeon 0000:01:00.0: GPU lockup (current fence id > 0x000000000004351e last fence id 0x000000000004354c on ring 0) > [ 7932.128800] radeon 0000:01:00.0: failed to get a new IB (-35) > [ 7932.128825] [drm:radeon_cs_ioctl [radeon]] *ERROR* Failed to get ib ! > [ 7932.249318] radeon 0000:01:00.0: Saved 1522 dwords of commands on ring > 0. 
> [ 7932.249329] radeon 0000:01:00.0: GPU softreset: 0x0000000C > [ 7932.249330] radeon 0000:01:00.0: GRBM_STATUS = > 0xA0003828 > [ 7932.249331] radeon 0000:01:00.0: GRBM_STATUS_SE0 = > 0x00000007 > [ 7932.249332] radeon 0000:01:00.0: GRBM_STATUS_SE1 = > 0x00000007 > [ 7932.249333] radeon 0000:01:00.0: SRBM_STATUS = > 0x200000C0 > [ 7932.249334] radeon 0000:01:00.0: SRBM_STATUS2 = > 0x00000000 > [ 7932.249335] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = > 0x00000000 > [ 7932.249336] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = > 0x00010002 > [ 7932.249337] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = > 0x00020186 > [ 7932.249338] radeon 0000:01:00.0: R_008680_CP_STAT = > 0x80038647 > [ 7932.249339] radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = > 0x44483146 > [ 7932.256904] radeon 0000:01:00.0: GRBM_SOFT_RESET=0x00004001 > [ 7932.256955] radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00100100 > [ 7932.258100] radeon 0000:01:00.0: GRBM_STATUS = > 0x00003828 > [ 7932.258101] radeon 0000:01:00.0: GRBM_STATUS_SE0 = > 0x00000007 > [ 7932.258102] radeon 0000:01:00.0: GRBM_STATUS_SE1 = > 0x00000007 > [ 7932.258103] radeon 0000:01:00.0: SRBM_STATUS = > 0x200000C0 > [ 7932.258104] radeon 0000:01:00.0: SRBM_STATUS2 = > 0x00000000 > [ 7932.258105] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = > 0x00000000 > [ 7932.258106] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = > 0x00000000 > [ 7932.258107] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = > 0x00000000 > [ 7932.258108] radeon 0000:01:00.0: R_008680_CP_STAT = > 0x00000000 > [ 7932.258109] radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = > 0x44C83D57 > [ 7932.258122] radeon 0000:01:00.0: GPU reset succeeded, trying to resume > [ 7932.280351] [drm] enabling PCIE gen 2 link speeds, disable with > radeon.pcie_gen2=0 > [ 7932.284667] [drm] PCIE GART of 1024M enabled (table at > 0x0000000000162000). 
> [ 7932.284775] radeon 0000:01:00.0: WB enabled > [ 7932.284777] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr > 0x00000000e0000c00 and cpu addr 0x0000000068b59d95 > [ 7932.284778] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr > 0x00000000e0000c0c and cpu addr 0x000000005a048e55 > [ 7932.285539] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr > 0x0000000000072118 and cpu addr 0x000000003f7c7eb9 > [ 7932.301773] [drm] ring test on 0 succeeded in 2 usecs > [ 7932.301783] [drm] ring test on 3 succeeded in 7 usecs > [ 7932.477565] [drm] ring test on 5 succeeded in 2 usecs > [ 7932.477573] [drm] UVD initialized successfully. > [ 7933.624806] [drm:r600_ib_test [radeon]] *ERROR* radeon: fence wait > timed out. > [ 7933.624864] [drm:radeon_ib_ring_tests [radeon]] *ERROR* radeon: failed > testing IB on GFX ring (-110). > [ 7933.737625] radeon 0000:01:00.0: GPU softreset: 0x00000008 > [ 7933.737627] radeon 0000:01:00.0: GRBM_STATUS = > 0xA0003828 > [ 7933.737628] radeon 0000:01:00.0: GRBM_STATUS_SE0 = > 0x00000007 > [ 7933.737629] radeon 0000:01:00.0: GRBM_STATUS_SE1 = > 0x00000007 > [ 7933.737630] radeon 0000:01:00.0: SRBM_STATUS = > 0x200000C0 > [ 7933.737631] radeon 0000:01:00.0: SRBM_STATUS2 = > 0x00000000 > [ 7933.737631] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = > 0x00000000 > [ 7933.737633] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = > 0x00010100 > [ 7933.737633] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = > 0x00020182 > [ 7933.737634] radeon 0000:01:00.0: R_008680_CP_STAT = > 0x80038243 > [ 7933.737636] radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = > 0x44C83D57 > [ 7933.748431] radeon 0000:01:00.0: GRBM_SOFT_RESET=0x00004001 > [ 7933.748483] radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00000100 > [ 7933.749628] radeon 0000:01:00.0: GRBM_STATUS = > 0x00003828 > [ 7933.749629] radeon 0000:01:00.0: GRBM_STATUS_SE0 = > 0x00000007 > [ 7933.749630] radeon 0000:01:00.0: GRBM_STATUS_SE1 = > 0x00000007 > [ 7933.749631] radeon 
0000:01:00.0: SRBM_STATUS = > 0x200000C0 > [ 7933.749632] radeon 0000:01:00.0: SRBM_STATUS2 = > 0x00000000 > [ 7933.749633] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = > 0x00000000 > [ 7933.749644] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = > 0x00000000 > [ 7933.749645] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = > 0x00000000 > [ 7933.749646] radeon 0000:01:00.0: R_008680_CP_STAT = > 0x00000000 > [ 7933.749647] radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = > 0x44C83D57 > [ 7933.749659] radeon 0000:01:00.0: GPU reset succeeded, trying to resume > [ 7933.771913] [drm] enabling PCIE gen 2 link speeds, disable with > radeon.pcie_gen2=0 > [ 7933.776106] [drm] PCIE GART of 1024M enabled (table at > 0x0000000000162000). > [ 7933.776199] radeon 0000:01:00.0: WB enabled > [ 7933.776201] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr > 0x00000000e0000c00 and cpu addr 0x0000000068b59d95 > [ 7933.776201] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr > 0x00000000e0000c0c and cpu addr 0x000000005a048e55 > [ 7933.776982] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr > 0x0000000000072118 and cpu addr 0x000000003f7c7eb9 > [ 7933.793238] [drm] ring test on 0 succeeded in 3 usecs > [ 7933.793249] [drm] ring test on 3 succeeded in 7 usecs > [ 7933.969032] [drm] ring test on 5 succeeded in 2 usecs > [ 7933.969041] [drm] UVD initialized successfully. 
> [ 7934.106544] [drm] ib test on ring 0 succeeded in 0 usecs > [ 7934.106582] [drm] ib test on ring 3 succeeded in 0 usecs > [ 7935.292759] [drm] ib test on ring 5 succeeded > [ 7970.036738] kauditd_printk_skb: 7 callbacks suppressed > [ 7970.036740] audit: type=1400 audit(1630662712.521:157): > apparmor="ALLOWED" operation="open" profile="libreoffice-soffice" > name=2F6E6574776F726B2F636F6D70616E792D6F6E2D6D6F6F6E2F437573746F6D657220646F63756D656E74732F456967656E20486F72656361204D616B656C6161722F2E7E6C6F636B2E54696D6573686565742045484D2076312E302E786C737823 > pid=6030 comm="soffice.bin" requested_mask="r" denied_mask="r" fsuid=1000 > ouid=1000 > [ 7970.040368] CIFS VFS: SMB signature verification returned error = -13 > [ 7970.042470] audit: type=1400 audit(1630662712.525:158): > apparmor="ALLOWED" operation="unlink" profile="libreoffice-soffice" > name=2F6E6574776F726B2F636F6D70616E792D6F6E2D6D6F6F6E2F437573746F6D657220646F63756D656E74732F456967656E20486F72656361204D616B656C6161722F2E7E6C6F636B2E54696D6573686565742045484D2076312E302E786C737823 > pid=6030 comm="soffice.bin" requested_mask="d" denied_mask="d" fsuid=1000 > ouid=1000 > [ 7970.047342] audit: type=1400 audit(1630662712.529:159): > apparmor="ALLOWED" operation="mknod" profile="libreoffice-soffice" > name=2F6E6574776F726B2F636F6D70616E792D6F6E2D6D6F6F6E2F437573746F6D657220646F63756D656E74732F456967656E20486F72656361204D616B656C6161722F6C75363033303432646D7A672E746D70 > pid=6030 comm="soffice.bin" requested_mask="c" denied_mask="c" fsuid=1000 > ouid=1000 > [ 7970.051025] audit: type=1400 audit(1630662712.533:160): > apparmor="ALLOWED" operation="open" profile="libreoffice-soffice" > name=2F6E6574776F726B2F636F6D70616E792D6F6E2D6D6F6F6E2F437573746F6D657220646F63756D656E74732F456967656E20486F72656361204D616B656C6161722F6C75363033303432646D7A672E746D70 > pid=6030 comm="soffice.bin" requested_mask="wrc" denied_mask="wrc" > fsuid=1000 ouid=1000 > > ... 
>
>
>
> On 8/3/21 12:54 AM, Guus Ellenkamp wrote:
>
> My display freezes randomly on an Ubuntu 20.04 system with an AMD Radeon Turks graphics card.
>
> Before the final freeze I often get a warning: the display suddenly turns black and then comes back on.
>
> I'm not sure whether it's the driver or the (cheap) graphics card. How can I find out, and is there any solution?
>
> Restarting the display manager does not have any effect.
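
For what it's worth, one rough way to see how much of the log is not GPU-related is to tally the kernel messages by subsystem. The sketch below is only an illustration: the category names and regular expressions are my own guesses based on the messages quoted in this thread, not anything the radeon driver or dmesg provides, and `SAMPLE` is a handful of lines copied from the logs above.

```python
import re
from collections import Counter

# Hypothetical categories; the patterns are assumptions derived from the
# messages quoted in this thread and will miss other kinds of errors.
PATTERNS = {
    "disk": re.compile(
        r"ata\d+(\.\d+)?: (exception|failed command|error)"
        r"|I/O error|Medium Error|BTRFS error"
    ),
    "crash": re.compile(r"segfault at|general protection fault"),
    "gpu": re.compile(r"GPU lockup|\*ERROR\* radeon"),
    "cifs": re.compile(r"CIFS VFS: .*(has not responded|returned error|failed)"),
}


def tally(lines):
    """Count log lines per category; a line counts for the first match only."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
                break
    return counts


# A few lines taken from the logs quoted above (wrapped lines rejoined).
SAMPLE = [
    "[57093.595227] ata8.00: failed command: READ DMA EXT",
    "[57093.621888] blk_update_request: I/O error, dev sdf, sector 7310244896 "
    "op 0x0:(READ) flags 0x1000 phys_seg 2 prio class 0",
    "[71460.546805] systemd-resolve[84672]: segfault at 600000000 ip "
    "0000000600000000 sp 00007fffdfd83c48 error 14 in "
    "systemd-resolved[562e7fe8b000+a000]",
    "[89642.694946] radeon 0000:01:00.0: GPU lockup (current fence id "
    "0x0000000000180ee1 last fence id 0x0000000000180ef8 on ring 0)",
    "[89642.810017] radeon 0000:01:00.0: WB enabled",
]

if __name__ == "__main__":
    for name, count in tally(SAMPLE).most_common():
        print(f"{name}: {count}")
```

To run it over a full log, save the dmesg output to a file and pass the open file to `tally()` in place of `SAMPLE`. If the disk and crash counts stay high on boots where the GPU never locks up, that would point more toward the drive, memory, board or power supply than toward the graphics card or driver.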