Re: [Mesa-dev] Screen freeze AMD TURKS/JUNIPER

2021-10-21 Thread Gert Wollny
Hello Guus, 

On Wed, 2021-10-20 at 18:20 +0800, Guus Ellenkamp wrote:
> My screen freezes every now and then (once a day before, now more
> often)  in an Ubuntu 20.04 system. It started using a TURKS AMD
> graphics adapter  and after trying many things I thought the adapter
> might be defective, but replacing it with a JUNIPER type the problem
> still remains.
Did the problems start right after updating to Ubuntu 20.04, or did you
successfully use Ubuntu 20.04 for a while before the problems started?



> It seems the freeze occurs after 'something' goes wrong, like I mostly
> see network errors before the card problem output.
This could also point to a hardware problem with the main board that
happens to affect the slot the graphics card is in. 
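
To check whether other errors really do precede the lockups, the kernel
log can be scanned for what was logged shortly before each "GPU lockup"
line. The following is only a minimal sketch, not an established tool:
the function names, the 600 second default window and the choice to skip
lines that mention "radeon" are all assumptions made for illustration.

```python
import re

# Matches a dmesg line such as "[ 8291.818023] radeon ...: GPU lockup ..."
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")


def parse_dmesg(text):
    """Return a list of (timestamp_seconds, message) tuples.

    Lines without a leading "[timestamp]" are treated as wrapped
    continuations of the previous message (as happens in mail clients).
    """
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = LINE_RE.match(line)
        if match:
            entries.append((float(match.group(1)), match.group(2)))
        elif entries:
            ts, msg = entries[-1]
            entries[-1] = (ts, msg + " " + line)
    return entries


def events_before_lockup(text, window_s=600.0, marker="GPU lockup"):
    """For each message containing `marker`, collect the messages that do
    not mention "radeon" and were logged in the preceding `window_s`
    seconds. Returns a list of (lockup_timestamp, [(ts, msg), ...])."""
    entries = parse_dmesg(text)
    results = []
    for ts, msg in entries:
        if marker in msg:
            before = [
                (t, other) for t, other in entries
                if ts - window_s <= t < ts and "radeon" not in other
            ]
            results.append((ts, before))
    return results
```

Saving the output of dmesg to a file and passing its contents to
events_before_lockup() lists, per lockup, what else the kernel reported
just before it. If unrelated errors show up before most lockups, that
would support the hardware theory; if not, the correlation may be
coincidental.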


Apart from that, you might try a graphics card that uses the newer
radeonsi driver (anything AMD R5+); these are available, e.g. on eBay,
for reasonable prices.

In any case, the right channel to report bugs is here:
https://gitlab.freedesktop.org/mesa/mesa/-/issues

Best regards, 
Gert





[Mesa-dev] Screen freeze AMD TURKS/JUNIPER

2021-10-20 Thread Guus Ellenkamp
My screen freezes every now and then (once a day before, now more often) 
in an Ubuntu 20.04 system. It started using a TURKS AMD graphics adapter 
and after trying many things I thought the adapter might be defective, 
but replacing it with a JUNIPER type the problem still remains.


Mostly I was able to switch to a terminal screen and often I can use ssh 
to reboot properly, but after upgrading to newer drivers (kisak 
repository), the problem seems to get worse.


I just rebooted in 18.04 mode, which seems more stable, but I didn't use 
it that long.


It seems the freeze occurs after 'something' goes wrong, like I mostly 
see network errors before the card problem output.


The problem is pretty urgent, as I am using this computer for work and 
random reboots are very annoying. I thought Linux was supposed to be 
more stable than Windows LOL.


Hope someone can help sort this out.



Some log output follows; the errors I see vary (I have more log output):

[ 4605.729370] CIFS VFS: SMB signature verification returned error = -13
[ 4605.841441] CIFS VFS: SMB signature verification returned error = -13
[ 4606.016338] CIFS VFS: SMB signature verification returned error = -13
[ 4606.033629] audit: type=1400 audit(1630738964.068:89): apparmor="ALLOWED" operation="rename_src" profile="libreoffice-soffice" name=2F6E6574776F726B2F636F6D70616E792D6F6E2D6D6F6F6E2F437573746F6D657220646F63756D656E74732F456967656E20486F72656361204D616B656C6161722F6C7531363634373566646475692E746D70 pid=16647 comm="soffice.bin" requested_mask="wrd" denied_mask="wrd" fsuid=1000 ouid=1000
[ 7384.500862] perf: interrupt took too long (2540 > 2500), lowering kernel.perf_event_max_sample_rate to 78500

[ 8291.818014] radeon :01:00.0: ring 0 stalled for more than 10228msec
[ 8291.818023] radeon :01:00.0: GPU lockup (current fence id 0x000500f9 last fence id 0x0005010a on ring 0)

[ 8291.936996] radeon :01:00.0: Saved 535 dwords of commands on ring 0.
[ 8291.937007] radeon :01:00.0: GPU softreset: 0x0008
[ 8291.937008] radeon :01:00.0:   GRBM_STATUS   = 0xA0003828
[ 8291.937009] radeon :01:00.0:   GRBM_STATUS_SE0   = 0x0007
[ 8291.937010] radeon :01:00.0:   GRBM_STATUS_SE1   = 0x0007
[ 8291.937011] radeon :01:00.0:   SRBM_STATUS   = 0x2AC0
[ 8291.937012] radeon :01:00.0:   SRBM_STATUS2  = 0x
[ 8291.937013] radeon :01:00.0:   R_008674_CP_STALLED_STAT1 = 0x
[ 8291.937014] radeon :01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00010002
[ 8291.937015] radeon :01:00.0:   R_00867C_CP_BUSY_STAT = 0x00020186
[ 8291.937016] radeon :01:00.0:   R_008680_CP_STAT  = 0x80038647
[ 8291.937017] radeon :01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
[ 8291.951250] radeon :01:00.0: GRBM_SOFT_RESET=0x4001
[ 8291.951302] radeon :01:00.0: SRBM_SOFT_RESET=0x0100
[ 8291.952448] radeon :01:00.0:   GRBM_STATUS   = 0x3828
[ 8291.952449] radeon :01:00.0:   GRBM_STATUS_SE0   = 0x0007
[ 8291.952450] radeon :01:00.0:   GRBM_STATUS_SE1   = 0x0007
[ 8291.952451] radeon :01:00.0:   SRBM_STATUS   = 0x20C0
[ 8291.952452] radeon :01:00.0:   SRBM_STATUS2  = 0x
[ 8291.952453] radeon :01:00.0:   R_008674_CP_STALLED_STAT1 = 0x
[ 8291.952454] radeon :01:00.0:   R_008678_CP_STALLED_STAT2 = 0x
[ 8291.952455] radeon :01:00.0:   R_00867C_CP_BUSY_STAT = 0x
[ 8291.952456] radeon :01:00.0:   R_008680_CP_STAT  = 0x
[ 8291.952457] radeon :01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
[ 8291.952470] radeon :01:00.0: GPU reset succeeded, trying to resume
[ 8291.974662] [drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0
[ 8291.978998] [drm] PCIE GART of 1024M enabled (table at 0x00162000).

[ 8291.979111] radeon :01:00.0: WB enabled
[ 8291.979113] radeon :01:00.0: fence driver on ring 0 use gpu addr 0xec00 and cpu addr 0x1ebd2b6b
[ 8291.979114] radeon :01:00.0: fence driver on ring 3 use gpu addr 0xec0c and cpu addr 0xfc505e35
[ 8291.979874] radeon :01:00.0: fence driver on ring 5 use gpu addr 0x00072118 and cpu addr 0x3a66b0fa

[ 8291.996180] [drm] ring test on 0 succeeded in 3 usecs
[ 8291.996191] [drm] ring test on 3 succeeded in 7 usecs
[ 8292.171926] [drm] ring test on 5 succeeded in 2 usecs
[ 8292.171934] [drm] UVD initialized successfully.
[ 8293.322021] [drm:r600_ib_test [radeon]] *ERROR* radeon: fence wait timed out.
[ 8293.322080] [drm:radeon_ib_ring_tests [radeon]] *ERROR* radeon: failed testing IB on GFX ring (-110).

[ 8299.068114] show_signal: 7 callbacks suppressed
[ 8299.068117] traps: Core[4211] general protection fault ip:7f45295949f6 sp:7f451e5d8930 error:0 in libc-2.31.so[7f452957+178000]
[ 8300.172199] pcmanfm-qt[4191]: