[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

Martin Peres changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |MOVED
             Status|NEW                         |RESOLVED

--- Comment #240 from Martin Peres ---
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/892.

--
You are receiving this mail because:
You are the assignee for the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #239 from Shmerl ---
*llvm10 I mean
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #238 from Shmerl ---
(In reply to Tobias Frisch from comment #237)
> - linux 5.3.11-arch1-1
> - mesa 19.2.4-1
>

That's really not a good idea. You'd need 5.4 with that flip patch applied and Mesa 20 (i.e. master) with llvm 20 if you want to avoid as many hangs as possible.
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #237 from Tobias Frisch ---
Hardware:
- Asus ROG Crosshair VI Extreme
- AMD Ryzen 7 2700X
- Sapphire Radeon RX 5700

Software:
- linux 5.3.11-arch1-1
- mesa 19.2.4-1

I just tried to trigger some of the hangs again, which occur fairly randomly on Arch. So I started Steam and ran some benchmarks in Shadow of the Tomb Raider. The benchmark completed on the highest settings with a high FPS score, but it lagged quite hard on screen (it even stuttered once for 1~3 seconds). I hope the wrong/lying FPS counter in SoTR is not related to the amdgpu drivers, is it?

Anyhow, starting Rise of the Tomb Raider afterwards froze my system again.

[14494.683266] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out or interrupted!
[14494.683354] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out or interrupted!
[14499.803441] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=2989148, emitted seq=2989150
[14499.803522] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process RiseOfTheTombRa pid 414233 thread RiseOfTheT:cs0 pid 414239
[14499.803525] [drm] GPU recovery disabled.

I still have one question: how is the communication with AMD on these issues? Because somehow (I would like to know how) their drivers work on my Ubuntu 18.04 LTS without any freezes so far (except when starting Blender). I use it at the moment to get something done without worrying about random freezes (I had one today using Arch with linux 5.4.0-rc7-mainline).

I hope these issues are fixed soon.
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #236 from Alex Deucher ---
(In reply to Marko Popovic from comment #235)
> (In reply to Alex Deucher from comment #233)
> > Does attachment 145971 help?
>
> No, this is for flip hangs that only happen in some games, random SDMA hangs
> are still present, but SDMA is disabled in MESA20 so for the time being it
> should be more stable.

They may be related. If the SDMA is waiting on a fence from the display engine it would time out if that display fence never triggers.
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #235 from Marko Popovic ---
(In reply to Alex Deucher from comment #233)
> Does attachment 145971 help?

No, this is for flip hangs that only happen in some games. Random SDMA hangs are still present, but SDMA is disabled in MESA20, so for the time being it should be more stable.

(In reply to Timur Kristóf from comment #234)
> (In reply to John H from comment #227)
> > However, I have hard freezes when playing games. A
> > specific one I can reproduce EVERY. SINGLE. TIME. was when playing Unreal
> > Tournament 3 via Steam proton.
>
> Sounds like the same, or similar issue as this one:
> https://gitlab.freedesktop.org/mesa/mesa/issues/868
>
> In that case it was caused by an LLVM bug that has been fixed in LLVM 10 for
> a while but hasn't made it into LLVM 9 yet.
> If you use mesa 19.3 can you try if the same issue occurs with ACO?
>

Radv hangs are not related to SDMA hangs, but luckily at least those are fixed in LLVM10, so we can at least have a decently stable experience with AMD_DEBUG=nodma, which is basically enabled by default in MESA 20.
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #234 from Timur Kristóf ---
(In reply to John H from comment #227)
> However, I have hard freezes when playing games. A
> specific one I can reproduce EVERY. SINGLE. TIME. was when playing Unreal
> Tournament 3 via Steam proton.

Sounds like the same, or similar issue as this one:
https://gitlab.freedesktop.org/mesa/mesa/issues/868

In that case it was caused by an LLVM bug that has been fixed in LLVM 10 for a while but hasn't made it into LLVM 9 yet.
If you use mesa 19.3, can you check whether the same issue occurs with ACO?

(In reply to John Smith from comment #225)
> Is this seriously what AMD calls "support"? No offense but this is
> ridiculous, this card has been out for four months and it still can't even
> browse firefox reliably, even after these "workarounds" and "patches".

I can sympathize with your frustration, but I don't think this attitude is helpful. Pierre-Eric and Alex are doing their best to solve this problem. Insulting each other in the bugzilla is not constructive and won't bring us closer to the solution.
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #233 from Alex Deucher ---
Does attachment 145971 help?
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #232 from viste.sylv...@gmail.com ---
(In reply to Sander Lienaerts from comment #231)
> Been following this thread for a while now. Can't believe this has been
> known for 3 months, without a fix released.
>
> Just a moment ago a random freeze occurred running Firefox and other
> applications, no games. Spotify kept playing in the background. Cursor not
> moving and unable to open another shell.
>
> This happened with AMD_DEBUG="nongg,nodma" enabled. Running kernel 5.4rc7
> and Mesa 19.2.4.

I'm currently using kernel 5.4 and mesa-git on Arch (from the lcarlier repo; the package is labelled mesa 20, although there is no mesa 20 in the git repository yet) and I'm not seeing any hangs or freezes. So it seems to be fixed, but maybe I'm just lucky.
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #231 from Sander Lienaerts ---
Been following this thread for a while now. Can't believe this has been known for 3 months, without a fix released.

Just a moment ago a random freeze occurred running Firefox and other applications, no games. Spotify kept playing in the background. Cursor not moving and unable to open another shell.

This happened with AMD_DEBUG="nongg,nodma" enabled. Running kernel 5.4rc7 and Mesa 19.2.4.

Here is an output of the log before reboot:

nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: [gfxhub] page fault (src_id:0 ring:40 vmid:5 pasid:32769, for process Xorg pid 811 thread Xorg:cs0 pid 974)
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: in page starting at address 0x0318c00e7000 from client 27
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: GCVM_L2_PROTECTION_FAULT_STATUS:0x00541C51
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: MORE_FAULTS: 0x1
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: WALKER_ERROR: 0x0
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: PERMISSION_FAULTS: 0x5
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: MAPPING_ERROR: 0x0
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: RW: 0x1
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: [gfxhub] page fault (src_id:0 ring:40 vmid:5 pasid:32769, for process Xorg pid 811 thread Xorg:cs0 pid 974)
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: in page starting at address 0x0318c00e6000 from client 27
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: GCVM_L2_PROTECTION_FAULT_STATUS:0x
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: MORE_FAULTS: 0x0
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: WALKER_ERROR: 0x0
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: PERMISSION_FAULTS: 0x0
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: MAPPING_ERROR: 0x0
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: RW: 0x0
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: [gfxhub] page fault (src_id:0 ring:40 vmid:5 pasid:32769, for process Xorg pid 811 thread Xorg:cs0 pid 974)
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: in page starting at address 0x0318c00e9000 from client 27
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: GCVM_L2_PROTECTION_FAULT_STATUS:0x
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: MORE_FAULTS: 0x0
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: WALKER_ERROR: 0x0
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: PERMISSION_FAULTS: 0x0
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: MAPPING_ERROR: 0x0
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: RW: 0x0
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: [gfxhub] page fault (src_id:0 ring:40 vmid:5 pasid:32769, for process Xorg pid 811 thread Xorg:cs0 pid 974)
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: in page starting at address 0x0318c00e8000 from client 27
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: GCVM_L2_PROTECTION_FAULT_STATUS:0x
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: MORE_FAULTS: 0x0
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: WALKER_ERROR: 0x0
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: PERMISSION_FAULTS: 0x0
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: MAPPING_ERROR: 0x0
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: RW: 0x0
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: [gfxhub] page fault (src_id:0 ring:40 vmid:5 pasid:32769, for process Xorg pid 811 thread Xorg:cs0 pid 974)
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: in page starting at address 0x0318c00ea000 from client 27
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: GCVM_L2_PROTECTION_FAULT_STATUS:0x
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: MORE_FAULTS: 0x0
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: WALKER_ERROR: 0x0
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: PERMISSION_FAULTS: 0x0
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: MAPPING_ERROR: 0x0
nov 15 20:47:58 sander-pc kernel: amdgpu :0a:00.0: RW: 0x0
nov 15 20:48:09 sander-pc kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
nov 15 20:48:09 sander-pc kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1 timeout, signaled seq=6760, emitted seq=6763
nov 15 20:48:09 sander-pc kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 811 thread Xorg:cs0 pid 974
nov 15 20:48:09 sander-pc kernel: [drm] GPU recovery disabled.
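The fields printed after GCVM_L2_PROTECTION_FAULT_STATUS in the log above (MORE_FAULTS, WALKER_ERROR, PERMISSION_FAULTS, MAPPING_ERROR, RW) look like bit slices of that one register value. Here is a minimal Python sketch of the decoding. The bit positions are an assumption, not something stated in this thread, although they do reproduce the five values logged for 0x00541C51. The names `FIELDS` and `decode_fault_status` are hypothetical.

```python
# Assumed bit layout of GCVM_L2_PROTECTION_FAULT_STATUS (not stated in the
# thread): field name -> (shift, width). It reproduces the values that the
# kernel logged for 0x00541C51 in comment #231.
FIELDS = {
    "MORE_FAULTS": (0, 1),
    "WALKER_ERROR": (1, 3),
    "PERMISSION_FAULTS": (4, 4),
    "MAPPING_ERROR": (8, 1),
    "RW": (18, 1),
}


def decode_fault_status(status):
    """Split a raw status value into the named fields listed in FIELDS."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, (shift, width) in FIELDS.items()
    }
```

Calling `decode_fault_status(0x00541C51)` gives MORE_FAULTS 1, WALKER_ERROR 0, PERMISSION_FAULTS 5, MAPPING_ERROR 0 and RW 1, matching the first fault in the log. The later faults print a truncated status value, so they cannot be checked this way.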
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #230 from Daniel Suarez ---
(In reply to John Smith from comment #225)
> (In reply to Pierre-Eric Pelloux-Prayer from comment #141)
>
> > For radeonsi the AMD_DEBUG=nodma environment variable is a workaround until
> > we figure out a proper fix.
>
> Is this seriously what AMD calls "support"? No offense but this is
> ridiculous, this card has been out for four months and it still can't even
> browse firefox reliably, even after these "workarounds" and "patches".
>
> Then we waited two months for the drivers to even get properly released, and
> all this wait was for nothing because the drivers are useless, you can't
> even browse firefox or let alone play any actual games. What is the point of
> having open source drivers if they don't even work? Nvidia's GPUs have had
> day one support, and unlike AMD, "support" actually means the GPU works for
> something that is meaningful.

I wouldn't really call what is happening here "support". Really feels like us Linux users were thrown to the side with little consideration.
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #229 from Marko Popovic ---
(In reply to Shmerl from comment #228)
> (In reply to John H from comment #227)
> >
> > specific one I can reproduce EVERY. SINGLE. TIME. was when playing Unreal
> > Tournament 3 via Steam proton. The "Shangri La" map i encountered lockups
> > anywhere from a few seconds to a few minutes into the game. Forcing me to
> > hit the reset button.
>
> This could be a llvm / Mesa bug, not the kernel one. If you can reproduce
> it, please report it for that game individually to the Mesa bug tracker,
> with an apitrace.

And make sure NOT to report it against a MESA version as old as 19.2.3. Only report the bug if you're running the current 19.3 RC series or the 20 git series, because a lot of those bugs might already have been fixed.

best regards
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #228 from Shmerl ---
(In reply to John H from comment #227)
>
> specific one I can reproduce EVERY. SINGLE. TIME. was when playing Unreal
> Tournament 3 via Steam proton. The "Shangri La" map i encountered lockups
> anywhere from a few seconds to a few minutes into the game. Forcing me to
> hit the reset button.

This could be a llvm / Mesa bug, not the kernel one. If you can reproduce it, please report it for that game individually to the Mesa bug tracker, with an apitrace.
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #227 from John H ---
Hi all.

For the last couple of weeks I have been following this thread and just wanted to report my experiences and findings.

First off, my machine's specs:

AMD Ryzen 3700X
Aorus X570 Pro Wifi motherboard
32 GB (16x2) DDR4 3200 RAM
PowerColor Red Devil 5700XT Graphics
Various SSD / HDD all on SATA.
Windows 10 / Debian Sid

Debian Sid: Kernel 5.3.10, Mesa 19.2.3, LLVM 9 as of writing this.

In the whole time I have had this graphics card (October 21 onwards) I don't think I have had any crashes / freezes on the desktop or while browsing in Chromium. However, I have hard freezes when playing games. A specific one I can reproduce EVERY. SINGLE. TIME. was when playing Unreal Tournament 3 via Steam proton. On the "Shangri La" map I encountered lockups anywhere from a few seconds to a few minutes into the game, forcing me to hit the reset button. I was able to SSH in via my phone before resetting, and dmesg said something about amdgpu GPU recovery having failed.

My 5700XT has a dual BIOS: one overclocked, the other for "silent". By default the switch was in the OC position; earlier today I flipped it to silent, and since then, NO freezes in UT whatsoever! I figure the factory overclock PowerColor implemented on this card was just a touch too high and is therefore unstable.

Forza 6 Apex in Windows 10 also hard freezes my PC, forcing me to reset. That problem has also been eliminated since flipping the switch. A slight performance loss, but I'll take the stability any day.

TL;DR - If your Navi card has dual BIOS, try switching to the lower clocked BIOS if you haven't already. It may just help.

Certainly, I'll report back if I find any other issues in Debian that are linked to this gfx card.
https://bugs.freedesktop.org/show_bug.cgi?id=111481

William Casarin changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|j...@jb55.com               |

--- Comment #226 from William Casarin ---
(In reply to Marko Popovic from comment #222)
> (In reply to William Casarin from comment #221)
> > mesa 19.3.0-rc2 + RADV_PERFTEST=aco fixed this for me
>
> ACO should have no impact on SDMA. Firstly OpenGL still uses LLVM, and
> OpenGL is the only one using SDMA in the first place, radv doesn't. So you
> must be talking about some different kinds of hangs, probably the ring_gfx
> types.

you're right, I wasn't aware that this thread was only for sdma related hangs. The
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #225 from John Smith ---
(In reply to Pierre-Eric Pelloux-Prayer from comment #141)

> For radeonsi the AMD_DEBUG=nodma environment variable is a workaround until
> we figure out a proper fix.

Is this seriously what AMD calls "support"? No offense but this is ridiculous, this card has been out for four months and it still can't even browse firefox reliably, even after these "workarounds" and "patches".

Then we waited two months for the drivers to even get properly released, and all this wait was for nothing because the drivers are useless, you can't even browse firefox or let alone play any actual games. What is the point of having open source drivers if they don't even work? Nvidia's GPUs have had day one support, and unlike AMD, "support" actually means the GPU works for something that is meaningful.
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #224 from Marko Popovic ---
(In reply to lptech1024 from comment #223)
> Followup to #216:
>
> Fedora 31: Kernel 5.3.9, GNOME 3.34, Mesa 19.2.2, linux-firmware 20190923,
> LLVM 9.0.0
>
> The hang is 100% reproducible.
>
> It occurs running the Linux-native (Vulkan) version of Shadow of the Tomb
> Raider (SotTR). I have never run SotTR under Proton/Wine, so that isn't a
> confounding variable.
>
> The (unskippable) cutscene is for the Amazon River in Peru and occurs
> anywhere between 15 seconds before the pilot is struck and the pilot is
> struck. Even when the video hangs, you can usually hear fragments (sound
> effects) of the game for a few seconds afterwards.
>
> I ran SotTR with vktrace and activated the Gnome (Wayland) overview to see
> if there I could catch any relevant terminal output (none that I saw). The
> game still had focus, so it continued playing. After the hang (when I
> rebooted), there wasn't a vktrace file. I would assume this would be either
> it didn't write it out due to the hang or it didn't have content to write.
>
> However, with it running visible in the overview (and a manual kernel
> update), I got both ring gfx and sdma errors:
>
> Nov 07 [SNIP]:24 [SNIP] kernel: [drm] GPU recovery disabled.
> Nov 07 [SNIP]:24 [SNIP] kernel: [drm] GPU recovery disabled.
> Nov 07 [SNIP]:24 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process pid 0 thread pid 0
> Nov 07 [SNIP]:24 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process gnome-shell pid 1722 thread gnome-shel:cs0 pid 1768
> Nov 07 [SNIP]:24 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1 timeout, signaled seq=1049, emitted seq=1053
> Nov 07 [SNIP]:24 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=30017, emitted seq=30020
> Nov 07 [SNIP]:19 [SNIP] kernel: [drm] GPU recovery disabled.
> Nov 07 [SNIP]:19 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process ShadowOfTheTomb pid 3890 thread WebViewRenderer pid 4981
> Nov 07 [SNIP]:19 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=75610, emitted seq=75612
> Nov 07 [SNIP]:19 [SNIP] kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out or interrupted!
>
> As a workaround to proceed in the game, I downloaded the AMDVLK 2019.Q4.2
> .deb, extracted the contents, modified the JSON file (to point to the local
> amdvlk64.so), and ran SotTR with the VK_ICD_FILENAMES variable set to the
> AMDVLK JSON file.
>
> The AMDVLK graphics were terrible (significant percentage of random pixels
> turning random colors, bad rendering of elements, etc), but I did not
> experience any hangs during the cutscene. After reaching a known save point,
> I switched back to mesa/RADV-llvm and haven't experienced a hang since
> (haven't progressed that much further yet, but that's the only hang so far -
> about 13% of the game has been completed).
>
> This would seem to point to a bug at least partially due to mesa/RADV-llvm.

The radv-related hangs got fixed in the Mesa 20 git series; this thread is more concerned with SDMA kernel-driver hangs.
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #223 from lptech1...@gmail.com ---
Followup to #216:

Fedora 31: Kernel 5.3.9, GNOME 3.34, Mesa 19.2.2, linux-firmware 20190923, LLVM 9.0.0

The hang is 100% reproducible.

It occurs running the Linux-native (Vulkan) version of Shadow of the Tomb Raider (SotTR). I have never run SotTR under Proton/Wine, so that isn't a confounding variable.

The hang happens during the (unskippable) cutscene for the Amazon River in Peru, anywhere between 15 seconds before the pilot is struck and the moment the pilot is struck. Even when the video hangs, you can usually hear fragments (sound effects) of the game for a few seconds afterwards.

I ran SotTR with vktrace and activated the Gnome (Wayland) overview to see if I could catch any relevant terminal output (none that I saw). The game still had focus, so it continued playing. After the hang (when I rebooted), there wasn't a vktrace file. I assume that either it wasn't written out due to the hang or there was no content to write.

However, with it running visible in the overview (and a manual kernel update), I got both ring gfx and sdma errors:

Nov 07 [SNIP]:24 [SNIP] kernel: [drm] GPU recovery disabled.
Nov 07 [SNIP]:24 [SNIP] kernel: [drm] GPU recovery disabled.
Nov 07 [SNIP]:24 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process pid 0 thread pid 0
Nov 07 [SNIP]:24 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process gnome-shell pid 1722 thread gnome-shel:cs0 pid 1768
Nov 07 [SNIP]:24 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1 timeout, signaled seq=1049, emitted seq=1053
Nov 07 [SNIP]:24 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=30017, emitted seq=30020
Nov 07 [SNIP]:19 [SNIP] kernel: [drm] GPU recovery disabled.
Nov 07 [SNIP]:19 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process ShadowOfTheTomb pid 3890 thread WebViewRenderer pid 4981
Nov 07 [SNIP]:19 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=75610, emitted seq=75612
Nov 07 [SNIP]:19 [SNIP] kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out or interrupted!

As a workaround to proceed in the game, I downloaded the AMDVLK 2019.Q4.2 .deb, extracted the contents, modified the JSON file (to point to the local amdvlk64.so), and ran SotTR with the VK_ICD_FILENAMES variable set to the AMDVLK JSON file.

The AMDVLK graphics were terrible (significant percentage of random pixels turning random colors, bad rendering of elements, etc), but I did not experience any hangs during the cutscene. After reaching a known save point, I switched back to mesa/RADV-llvm and haven't experienced a hang since (haven't progressed that much further yet, but that's the only hang so far - about 13% of the game has been completed).

This would seem to point to a bug at least partially due to mesa/RADV-llvm.
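The VK_ICD_FILENAMES workaround described above can be scripted. The following is only a rough Python sketch: `env_with_icd` and `run_with_icd` are hypothetical helper names, and the manifest path and game launcher are placeholders that depend on where the AMDVLK package was extracted.

```python
import os
import subprocess


def env_with_icd(icd_json, base_env=None):
    """Return a copy of the environment with VK_ICD_FILENAMES set to icd_json."""
    env = dict(os.environ if base_env is None else base_env)
    # The Vulkan loader reads this variable to decide which driver manifests to load.
    env["VK_ICD_FILENAMES"] = icd_json
    return env


def run_with_icd(command, icd_json):
    """Run command (a list of arguments) with only the given ICD manifest selected."""
    return subprocess.run(command, env=env_with_icd(icd_json), check=False)
```

For example, `run_with_icd(["./game-launcher"], "/path/to/extracted/amd_icd64.json")` would start a (placeholder) launcher against the extracted AMDVLK manifest, while leaving the variable unset for everything else keeps RADV as the default driver.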
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #222 from Marko Popovic ---
(In reply to William Casarin from comment #221)
> mesa 19.3.0-rc2 + RADV_PERFTEST=aco fixed this for me

ACO should have no impact on SDMA. Firstly OpenGL still uses LLVM, and OpenGL is the only one using SDMA in the first place, radv doesn't. So you must be talking about some different kinds of hangs, probably the ring_gfx types.
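Since the thread keeps separating SDMA timeouts from gfx ring timeouts, it can help to sort a saved kernel log by ring before deciding where to report. Below is a small Python sketch, assuming the `amdgpu_job_timedout` message format quoted in this thread; `TIMEOUT_RE` and `classify_timeouts` are hypothetical names.

```python
import re

# Matches the "ring <name> timeout, signaled seq=N, emitted seq=M" part of
# the amdgpu_job_timedout lines quoted in this thread.
TIMEOUT_RE = re.compile(
    r"ring (?P<ring>\S+) timeout, signaled seq=(?P<signaled>\d+), "
    r"emitted seq=(?P<emitted>\d+)"
)


def classify_timeouts(log_text):
    """Return a list of (ring, kind, emitted - signaled) for each ring timeout."""
    results = []
    for match in TIMEOUT_RE.finditer(log_text):
        ring = match.group("ring")
        if ring.startswith("sdma"):
            kind = "sdma"
        elif ring.startswith("gfx"):
            kind = "gfx"
        else:
            kind = "other"
        gap = int(match.group("emitted")) - int(match.group("signaled"))
        results.append((ring, kind, gap))
    return results
```

Feeding it the text of a saved dmesg or journal dump lists every timed-out ring. Entries of kind "sdma" correspond to the hangs this bug tracks, while "gfx" entries are the ones posters here were asked to report to the Mesa tracker instead.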
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #221 from William Casarin ---
mesa 19.3.0-rc2 + RADV_PERFTEST=aco fixed this for me
https://bugs.freedesktop.org/show_bug.cgi?id=111481

Marco Liedtke changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #145904|0                           |1
        is obsolete|                            |

--- Comment #220 from Marco Liedtke ---
Created attachment 145917
  --> https://bugs.freedesktop.org/attachment.cgi?id=145917&action=edit
dmesg of new sdma0 error while watching youtube with firefox, mainline kernel 5.3.9, padoka ppa mesa 19.3

Hi,

After installing and testing some configurations (amdgpu pro with amdvlk and kernel 4.15, which worked), I went back to radv, because only radv has no graphical issues with World of Tanks (wine + dxvk). I have another dmesg output.

By the way, /etc/environment has "export AMD_DEBUG=nodma" included, and with it I can use the PC for 1 or 2 hours, which is much better than before.

The attachment has a lot of information from the hang, including the sdma0 failure. Maybe this helps.
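For anyone who would rather apply the AMD_DEBUG workaround to a single program than system-wide through /etc/environment, here is a minimal Python sketch. It assumes, based on the AMD_DEBUG="nongg,nodma" form quoted elsewhere in this thread, that AMD_DEBUG takes a comma-separated list of flags; `add_amd_debug_flags` is a hypothetical helper name.

```python
def add_amd_debug_flags(env, *flags):
    """Return a copy of env with flags merged into the comma-separated AMD_DEBUG."""
    # Keep whatever flags are already set and append only the missing ones.
    current = [flag for flag in env.get("AMD_DEBUG", "").split(",") if flag]
    for flag in flags:
        if flag not in current:
            current.append(flag)
    merged = dict(env)
    merged["AMD_DEBUG"] = ",".join(current)
    return merged
```

The result can be passed as the `env` argument of `subprocess.run`, for example `subprocess.run(["firefox"], env=add_amd_debug_flags(os.environ, "nodma"))`, so that only that one process tree gets the flag.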
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #219 from Shmerl ---
(In reply to L.S.S. from comment #218)
>
> I'm not an expert of apitrace, but the reporter provided a trace that would
> 100% reproduce the lockup, and he was able to bisect the call that caused
> the lockup which is the last call of that trace file.

There could be multiple reasons for such hangs, so just please report one separately if you can reproduce others.
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #218 from L.S.S. ---
It seems the page fault issue has already been reported here. I also sometimes found similar page faults in the log when the lockup occurred (I think they will definitely show up if I leave the system as is for a prolonged amount of time).

https://gitlab.freedesktop.org/mesa/mesa/issues/2053

I'm not an expert in apitrace, but the reporter provided a trace that reproduces the lockup 100% of the time, and he was able to bisect it down to the call that caused the lockup, which is the last call in that trace file.
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #217 from Shmerl ---
(In reply to lptech1024 from comment #216)
> >Hang occurred during a gaming cutscene.
> >...
> Nov 06 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring
> gfx_0.0.0 timeout, signaled seq=2827901, emitted seq=2827903
> Nov 06 [SNIP] kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR*
> Waiting for fences timed out or interrupted!

If you can reproduce it, please report this to the radeonsi bug tracker (and attach an apitrace please):
https://gitlab.freedesktop.org/mesa/mesa/issues

Also, please add details on what game it is (etc.) here:
https://www.gamingonlinux.com/wiki/Mesa_Broken
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #216 from lptech1...@gmail.com ---
GPU: PowerColor Red Devil Radeon RX 5700 XT using OC BIOS (default)
Stock Fedora 31: Kernel 5.3.8, GNOME 3.34, Mesa 19.2.2, linux-firmware 20190923, LLVM 9.0.0

I experienced frequent hangs using X.org Gnome (Kernel 5.3.7, > Mesa 19.2.0), especially when interacting with graphical file manager-related operations.

Wayland Gnome is much more stable, although I experienced a hang today after being powered on for almost two hours (45 minutes idle, 75 minutes with high GPU load). The hang occurred during a gaming cutscene.

All messages contained an identical timestamp:

Nov 06 [SNIP] kernel: [drm] GPU recovery disabled.
Nov 06 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process ShadowOfTheTomb pid 16893 thread WebViewRenderer pid 16939
Nov 06 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=2827901, emitted seq=2827903
Nov 06 [SNIP] kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out or interrupted!
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #215 from Shmerl ---
(In reply to Pierre-Eric Pelloux-Prayer from comment #141)
> If anyone has a reliable way to trigger the issue, the most helpful thing to
> do for now is an apitrace capture.

Does the trace in comment #199 help to narrow it down?
https://bugs.freedesktop.org/show_bug.cgi?id=111481#c199
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #214 from Marco Liedtke ---
Created attachment 145904
  --> https://bugs.freedesktop.org/attachment.cgi?id=145904&action=edit
dmesg with gpu recovery enabled
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #213 from Marco Liedtke ---
(In reply to wychuchol from comment #212)
> (In reply to Marco Liedtke from comment #211)
> > I have already set AMD_DEBUG=nodam in /etc/environment and in ~/.profile.
> > Last time i played World of Tanks via Wine and DXVK the same freeze occured,
> > again the same error that xorg pid timed out...
>
> Don't know if you made a typo here but do you have AMD_DEBUG="nongg,nodma"
> line in /etc/environment ? Bugs still occur for me but they're far less
> frequent.
> Also since you're running ryzen 3000 try to get kernel 5.4. It won't solve
> your problems but there's a massive performance buff for zen2 in 5.4.

Hi,

I now have kernel 5.4 rc6 installed and the problem didn't change. "nodam" was only a typo in my comment: what I have written in /etc/environment is AMD_DEBUG=nodma (without quotes), NOT AMD_DEBUG="nodma".

Now I have added the amdgpu.gpu_recovery=1 parameter in grub. With it there is a long dmesg output after doing nothing more than clicking "login" in Bugzilla with Firefox.

See the attachment "dmesg with gpu recovery enabled". I hope this helps a bit...
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #212 from wychuchol ---
(In reply to Marco Liedtke from comment #211)
> I have already set AMD_DEBUG=nodam in /etc/environment and in ~/.profile.
> Last time i played World of Tanks via Wine and DXVK the same freeze occured,
> again the same error that xorg pid timed out...

Don't know if you made a typo here, but do you have the AMD_DEBUG="nongg,nodma" line in /etc/environment? Bugs still occur for me but they're far less frequent.

Also, since you're running Ryzen 3000, try to get kernel 5.4. It won't solve your problems but there's a massive performance boost for Zen 2 in 5.4.
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #211 from Marco Liedtke ---
Hi folks,

I am new to bug reporting, but since I have a new system and this bug, I want to contribute something. I see almost the same behavior as stated in comment 1. My Xorg freezes every session no matter what I do. I could only get the last dmesg over ssh before I had to hard reboot, because ssh was the only thing still working.

2 examples:

[ 1184.577790] [drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for fences timed out or interrupted!
[ 1189.697729] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=53043, emitted seq=53045
[ 1189.697797] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 1398 thread Xorg:cs0 pid 1409
[ 1189.697799] [drm] GPU recovery disabled.

[ 708.286318] [drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for fences timed out or interrupted!
[ 713.406528] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=104848, emitted seq=104850
[ 713.406594] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 1402 thread Xorg:cs0 pid 1414
[ 713.406596] [drm] GPU recovery disabled.

I have already set AMD_DEBUG=nodam in /etc/environment and in ~/.profile. The last time I played World of Tanks via Wine and DXVK the same freeze occurred, again with the same error that the Xorg pid timed out. It can happen 1 minute after logging in or after 1 hour.

My system specs are:
- R7 3700x
- Powercolor R5700XT Red Dragon, Silent BIOS enabled
- Gigabyte X570 I Aorus Pro WIFI
- Ubuntu 18.04.3 LTS with kernel 5.3.8 and Padoka unstable PPA (Mesa 19.3)

I have no NVMe SSD and no monitoring applications running.

Tests done:
- With kernel 4.15 (standard Ubuntu kernel) and AMDGPU-PRO installed, everything runs fine without a freeze.
- With kernel 4.18 and Mesa 19.0.8 no freezes occurred, but that kernel does not recognize the RX 5700, so no amdgpu module is loaded.
- Freezes occurred with kernel 5.3.7 and 5.3.8, in combination with the padoka and oibaf PPAs (Mesa 19.3).

If I can help with further information, please guide me on how to dig out the info you need from my system.
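Since most reports in this thread quote the same amdgpu "ring ... timeout" message, a small script can pull out the ring name and sequence numbers so that sdma0 and gfx hangs are easier to tell apart. This is only a sketch: the regular expression is written against the log lines quoted in these comments and is not taken from the kernel source, and the function name is made up for illustration.

```python
import re

# Pattern based only on the log lines quoted in this thread (an assumption,
# not a format guaranteed by the amdgpu driver).
RING_TIMEOUT = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)


def parse_ring_timeouts(log_text):
    """Return one dict per ring timeout found in dmesg or journal text."""
    results = []
    for line in log_text.splitlines():
        match = RING_TIMEOUT.search(line)
        if match is None:
            continue
        signaled = int(match.group("signaled"))
        emitted = int(match.group("emitted"))
        results.append({
            "ring": match.group("ring"),
            "signaled": signaled,
            "emitted": emitted,
            # Jobs that were submitted but had not signaled at the timeout.
            "pending": emitted - signaled,
        })
    return results


# Two lines copied from comments #211 and #216.
SAMPLE = """\
[ 1189.697729] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=53043, emitted seq=53045
[ 1189.697799] [drm] GPU recovery disabled.
Nov 06 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=2827901, emitted seq=2827903
"""

if __name__ == "__main__":
    for hit in parse_ring_timeouts(SAMPLE):
        print(hit["ring"], hit["pending"])
```

Feeding it the output of dmesg or journalctl -k should work the same way, as long as the message wording matches the lines above.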
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481 --- Comment #210 from Lazy --- To clarify, first: it's an Asus reference (blower-style) 5700XT I can't use the overclock utilities without a crash coming within the hour on Windows 10 or any of my Linux installs, no fan profiles, no manual control of fans, no setting it to "high performance" on the dynamic clock or it crashes within the hour. No exceptions, no setting then resetting the setting to default to get around it. Generally speaking, it maxes around 75C, but that's mostly due to the default fan profile only ramping up enough to negate further gains at that point (I'm guessing that's to do with trying to keep the card quiet). If I supply cool air, it'll slow the fan, and the heat still comes eventually. Some things that may or may not be relevant: This card crashes mostly around times that the clock rate adjusts more often; If the card goes from, say, max freq to a step below and back, there's a chance of a crash. (maybe coincidence, maybe not, I don't know to be bluntly honest) This is a constant I've noticed on both OSes. Windows 10 tends to keep things relatively stable in that regard, while Manjaro tends to see a lot of spiking and sudden drops. SteamVR definitely instigates that kind of behavior in my experience on my old Vega 56 as well (Which with nodma set on Navi, is actually not much different tbh). Probably explains why ever since the latest set of patches, the majority of the time it crashes is after an hour or two of gameplay in Manjaro. (also no idea why Manjaro switches more often..) To be blunt, though, in both OSes, seemingly random hangs are also a common occurrence for me. I had Win10 just yesterday, hang completely, no recovery, simply animating a minimizing window as SteamVR first opened. 
Granted, this also coincided with a rapid up-tick in clock speed most likely, as I've observed this massive spike on launching SteamVR via GPU adjusting utilities before I realized they instigate the issue as well. Setting nodma does get rid of some of the more random crashes, but these ones stick around in my experience so far. Maybe 75C is a bit high, but in neither OS can I manage to adjust the fans without the same issue, so.. No idea what to do, here.
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #209 from L.S.S. ---
Really?! Although I haven't really used the card under Windows, if similar behavior happens on Windows as well, then something's really, really wrong here.

I haven't tested gaming on Manjaro yet, but at least with the amdgpu-pro stuff on Manjaro the sdma0 freezes with Nemo stopped happening.

On the other hand, video card recovery has not matured on Linux yet, whereas on Windows it has been available for some time thanks to WDDM. You cannot completely rely on it, though: some apps can still misbehave once the driver has crashed at least once since the system was started, and it may eventually fail to recover at some point later on.

Which brand of Radeon RX 5700/XT are you using? I'm using a 50th Anniversary edition.

How are the thermals when you play games on the card? It's possible the card behaves weirdly if it's under load with temperatures near triple digits (something that I personally would never allow). I have a PCI slot fan set (3 slim fans, around the same length as the card itself) placed beneath the card, blowing upwards, and it seems very effective. With the help of its own blower fan, the card maintains a steady 50 Celsius under load.
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #208 from ousleya...@gmail.com ---
Just making this note at the recommendation of another: I'm reproducing similar behavior across both Linux distributions and Windows 10. The behavior is as follows:

Linux (Manjaro), Linux kernel 5.4.0-1-MANJARO, Mesa 20.0.0-devel (git-dd77bdb34b), and LLVM 10.0.0 (compiled from Git master as I recall):
Boot, launch Overwatch or SteamVR. Usually after a period of 1-2 hours, displays will stutter a few times before a full hang, leaving the last rendered frame on each display.

Windows 10, latest insider build as of 11/5/2019:
Similar behavior in the end, aside from the duration of stability being 3-4 hours, it seems. Launching SteamVR, I can run for 3-4 hours, and then it stutters, hangs for a few seconds, then recovers. Then it'll do the same a few moments later with a longer duration before the recovery. After this repeats a few times, the display either hangs on the last frame, or all displays go black. After this, I have to hard-shutdown the same way as I do for Manjaro.

This may not be the exact same behavior, but I don't know of a way to log this particular behavior in Windows.
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #207 from Shmerl ---
And I just got an sdma Firefox hang with 5.4-rc6. So while rare, it still happens.
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #206 from Shmerl ---
(In reply to wychuchol from comment #205)
> How would I go about verifying if something uses PCIe 4?

To avoid a lengthy off-topic discussion, I answered in the Matrix room (linked above).
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #205 from wychuchol ---
(In reply to Shmerl from comment #198)
> (In reply to wychuchol from comment #197)
> > Despite the 'fix' I posted in comment 193 AER PCI bus errors still happen,
> > and autonomous resets happen as well. I think it's less frequent though.
> > Still it's difficult to say for sure or put in a precise value.
>
> Could be a motherboard issue with PCIe 4.

Perhaps. I've built this system on a Tomahawk B450 MAX, but I thought PCIe 4 isn't even enabled by default since it caused problems. How would I go about verifying if something uses PCIe 4?

Hmm, there's a new BIOS available, it seems. I'm running 7C02v33, and 7C02v34 has some NVMe compatibility updates. I'm gonna try it if I don't see people around the internet wailing that it bricked their PCs.
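The question about verifying PCIe 4 was answered off-list, so the following is not that answer, only one possible approach. On Linux, PCI devices usually expose current_link_speed and max_link_speed attributes in sysfs, and 16 GT/s per lane corresponds to PCIe 4.0. The sketch below assumes those attributes are present and readable; the helper names are invented here, and the exact string format varies between kernel versions, so only the leading number is parsed.

```python
from pathlib import Path

# Per-lane transfer rates in GT/s for PCIe generations 1 to 4.
GEN_BY_RATE = {2.5: 1, 5.0: 2, 8.0: 3, 16.0: 4}


def pcie_generation(link_speed):
    """Map a string such as '8.0 GT/s PCIe' or '16 GT/s' to a generation.

    Returns None when the string does not start with a known rate.
    """
    try:
        rate = float(link_speed.split()[0])
    except (ValueError, IndexError):
        return None
    return GEN_BY_RATE.get(rate)


def report_links(sysfs_root="/sys/bus/pci/devices"):
    """Yield (device, current_gen, max_gen) for devices exposing link speed."""
    for dev in sorted(Path(sysfs_root).glob("*")):
        current = dev / "current_link_speed"
        maximum = dev / "max_link_speed"
        if not (current.is_file() and maximum.is_file()):
            continue
        try:
            yield (
                dev.name,
                pcie_generation(current.read_text().strip()),
                pcie_generation(maximum.read_text().strip()),
            )
        except OSError:
            continue  # attribute exists but could not be read


if __name__ == "__main__":
    for name, current_gen, max_gen in report_links():
        print(name, "current gen:", current_gen, "max gen:", max_gen)
```

The current value may read lower than the maximum while a device is idle, so it is probably worth checking under load before drawing conclusions.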
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #204 from Daniel Suarez ---
(In reply to Marko Popovic from comment #202)
> (In reply to Daniel Suarez from comment #201)
> > AMD has been pretty quiet here lately, has anyone tested with the 6th
> > release candidate for kernel 5.4? AMD was present in the changelogs and they
> > did some SDMA improvements, some mentioning that it fixes some freezes
>
> Last trace that I posted is done on 5.4 RC6 and MESA git...

My bad, I missed that. Shame, AMD really needs to get it together.
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #203 from Shmerl ---
I'm running 5.4-rc6. No more hangs in Firefox at least, but that also could be due to me switching to Firefox nightly (stock, not the custom one I was testing before).
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #202 from Marko Popovic ---
(In reply to Daniel Suarez from comment #201)
> AMD has been pretty quiet here lately, has anyone tested with the 6th
> release candidate for kernel 5.4? AMD was present in the changelogs and they
> did some SDMA improvements, some mentioning that it fixes some freezes

The last trace that I posted was done on 5.4 RC6 and Mesa git...
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #201 from Daniel Suarez ---
AMD has been pretty quiet here lately. Has anyone tested with the 6th release candidate for kernel 5.4? AMD was present in the changelogs and they did some SDMA improvements, with some entries mentioning that they fix some freezes.
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #200 from Marko Popovic ---
(In reply to Marko Popovic from comment #199)
> Created attachment 145882 [details]
> Trace file from Blender SDMA hang
>
> Here is a trace file of the SDMA hang provoked by using blender, happens
> pretty much all the time on the same place, so I guess those hangs look
> random on the surface but are reproducible indeed.

Extra info: it doesn't happen with nodma on, so it's definitely SDMA related, not shaders.
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #199 from Marko Popovic ---
Created attachment 145882
  --> https://bugs.freedesktop.org/attachment.cgi?id=145882&action=edit
Trace file from Blender SDMA hang

Here is a trace file of the SDMA hang provoked by using Blender. It happens pretty much every time at the same place, so I guess those hangs look random on the surface but are in fact reproducible.
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #198 from Shmerl ---
(In reply to wychuchol from comment #197)
> Despite the 'fix' I posted in comment 193 AER PCI bus errors still happen,
> and autonomous resets happen as well. I think it's less frequent though.
> Still it's difficult to say for sure or put in a precise value.

Could be a motherboard issue with PCIe 4.
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #197 from wychuchol ---
Despite the 'fix' I posted in comment 193, AER PCI bus errors still happen, and autonomous resets happen as well. I think they're less frequent though. Still, it's difficult to say for sure or to put a precise number on it.
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #196 from wychuchol ---
(In reply to Shmerl from comment #194)
> It sounds like NVMe problem, so not related to amdgpu?

The thing is, I played hours upon hours of Witcher 3 without any hangs or autonomous resets until I added lines to /etc/environment. The settings I changed to make amdgpu more stable seem to have caused the conflict, so I'd propose it is related.
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #195 from L.S.S. ---
It's possible that the GPU issues might affect other things on the PCIe bus.

With the Radeon RX 5700 XT I'm also encountering some NVMe-related errors, but I don't think my NVMe drives have issues, as they worked just fine before I installed this video card.

I recall that if I don't power cycle the PC (not just press the reset button) when the freeze happens, one of my non-system NVMe drives reports "frozen state error detected, reset controller" errors (the system then attempts to reset its controller, and the drive may still work), and some other NVMe drives might not be detected by the system at all unless I do a power cycle (a quick one is enough).
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #194 from Shmerl ---
It sounds like an NVMe problem, so not related to amdgpu?
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #193 from wychuchol ---
Perhaps this needs its own bug entry, but it's related (since it didn't happen before I tried RADV_PERFTEST=aco and AMD_DEBUG="nongg,nodma"), so I'll post it in case someone has had the same issues as me.

After some time in Witcher 3 GOTY, run with Lutris, the PC restarts on its own. I thought something was overheating (I've noticed graphics card memory in PSensor sometimes reaching 90, so I thought maybe that's what's happening), but I investigated kern.log and this always appeared before the autonomous reset:

Nov 2 22:01:53 pop-os kernel: [ 979.244964] pcieport :00:01.1: AER: Corrected error received: :01:00.0
Nov 2 22:01:53 pop-os kernel: [ 979.244967] nvme :01:00.0: AER: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
Nov 2 22:01:53 pop-os kernel: [ 979.244968] nvme :01:00.0: AER: device [1987:5012] error status/mask=1000/6000
Nov 2 22:01:53 pop-os kernel: [ 979.244968] nvme :01:00.0: AER:[12] Timeout
Nov 2 22:01:53 pop-os kernel: [ 979.262629] Emergency Sync complete

A solution I found is to add pci=nommconf in /etc/default/grub to the line GRUB_CMDLINE_LINUX_DEFAULT="quiet splash", so it looks like this:

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pci=nommconf"
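The /etc/default/grub edit described in comment #193 (and the amdgpu.gpu_recovery=1 parameter mentioned in comment #213) is easy to get wrong by hand, for example by adding the parameter twice or placing it outside the quotes. A minimal sketch of that edit as a text transformation follows. It assumes the file contains a single double-quoted GRUB_CMDLINE_LINUX_DEFAULT line, as in the comment; the function name is hypothetical, and it only returns the new text rather than writing to the file.

```python
import re

# Matches a double-quoted GRUB_CMDLINE_LINUX_DEFAULT assignment on its own line.
CMDLINE = re.compile(r'^(GRUB_CMDLINE_LINUX_DEFAULT=")([^"]*)(")$', re.M)


def add_kernel_param(grub_text, param):
    """Append param to GRUB_CMDLINE_LINUX_DEFAULT unless already present."""

    def patch(match):
        params = match.group(2).split()
        if param not in params:
            params.append(param)
        return match.group(1) + " ".join(params) + match.group(3)

    return CMDLINE.sub(patch, grub_text)


if __name__ == "__main__":
    before = 'GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n'
    print(add_kernel_param(before, "pci=nommconf"), end="")
```

On Debian- or Ubuntu-style systems that boot through GRUB, a change to this file typically takes effect only after regenerating the GRUB configuration (update-grub) and rebooting.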
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #192 from Seba Pe ---
(In reply to L.S.S. from comment #188)
> Not sure where the problem might be.
>
> After installing 5.4-rc5, in addition to amdgpu-pro-libgl (and other
> amdgpu-pro related stuffs), I stopped encountering those dreaded "ring sdma0
> timeout" freezes when using Nemo. I think amdgpu-pro stuffs might be what
> "fixed" it.
>
> I'll test this for the time being. I cannot be confident that it would be
> completely fixed this way, but at least the situation has been improved to
> the point that Nemo is now usable again.

As I said in comment #149 (https://bugs.freedesktop.org/show_bug.cgi?id=111481#c149), amdgpu-pro does not exhibit freezes or timeouts.

This appears to point to a problem in the generated instructions from libgl (or potentially a combination of that plus an underlying issue in the kernel driver).
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #191 from wychuchol ---
Oh, and the music player kept working and played the next track from the playlist, and I managed to reset with REISUB.
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #190 from wychuchol ---
Added AMD_DEBUG="nongg,nodma" to /etc/environment, but it still happened, this time while opening a webm file in a new tab in Pale Moon.

Nov 1 20:10:30 pop-os kernel: [24044.197839] [drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for fences timed out!
Nov 1 20:10:30 pop-os kernel: [24049.317800] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=3673639, emitted seq=3673641
Nov 1 20:10:30 pop-os kernel: [24049.317836] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 2350 thread Xorg:cs0 pid 2351
Nov 1 20:10:30 pop-os kernel: [24049.317838] [drm] GPU recovery disabled.

I'd think it happens less though.
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #189 from wychuchol ---
(In reply to Konstantin Pereiaslov from comment #187)
> As recommended here I added AMD_DEBUG="nongg,nodma" to /etc/environment and
> additionally added export AMD_DEBUG="nongg,nodma" to ~/.profile just to be
> sure and for 5 days since that I only had one system freeze and it had a
> different journalctl message, so it did help me help with sdma0 timeout
> issue!

Thank you very much. I was afraid to try this since someone mentioned performance drops, but I haven't noticed any in the applications I use.
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #188 from L.S.S. ---
Not sure where the problem might be.

After installing 5.4-rc5, in addition to amdgpu-pro-libgl (and other amdgpu-pro related packages), I stopped encountering those dreaded "ring sdma0 timeout" freezes when using Nemo. I think the amdgpu-pro packages might be what "fixed" it.

I'll test this for the time being. I cannot be confident that it is completely fixed this way, but at least the situation has improved to the point that Nemo is now usable again.
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #187 from Konstantin Pereiaslov ---
As recommended here, I added AMD_DEBUG="nongg,nodma" to /etc/environment and additionally added export AMD_DEBUG="nongg,nodma" to ~/.profile just to be sure. In the 5 days since then I have had only one system freeze, and it had a different journalctl message, so it did help me with the sdma0 timeout issue!
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #186 from wychuchol ---
Forgot to add: kernel v5.4-rc5. Sorry for the double post. If someone feels the need to delete that second message, please do; I can't find a way to delete my own posts.
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #185 from wychuchol ---
I wrote a nice long post but for some reason my browser decided to refresh, so it got dunked... Anyway, long story short:

RX 5700 XT, Pop OS 19.10, latest Oibaf mesa. I don't know how to check the llvm version because the search engine gave me no answer, but it's probably whatever got installed using this guide and updated:
https://ubuntuforums.org/showthread.php?t=2425799

DDLC with the Monika's After Story mod, running natively:

Oct 31 11:52:34 pop-os kernel: [ 129.130712] [drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for fences timed out!
Oct 31 11:52:34 pop-os kernel: [ 133.994710] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=17012, emitted seq=17014
Oct 31 11:52:34 pop-os kernel: [ 133.994747] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process DDLC pid 3150 thread DDLC:cs0 pid 3168
Oct 31 11:52:34 pop-os kernel: [ 133.994748] [drm] GPU recovery disabled.
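On the question of how to check the LLVM version: with the radeonsi OpenGL driver, the "OpenGL renderer string" line printed by glxinfo normally names the LLVM version Mesa was built against. The sketch below extracts it from such output. The sample string is illustrative only (it is not taken from this thread), the function name is made up, and the approach assumes the renderer string has the usual "LLVM x.y.z" form.

```python
import re


def llvm_version_from_renderer(glxinfo_text):
    """Return the LLVM version named in the OpenGL renderer string, or None."""
    for line in glxinfo_text.splitlines():
        if "OpenGL renderer string" not in line:
            continue
        match = re.search(r"LLVM (\d+(?:\.\d+)*)", line)
        if match:
            return match.group(1)
    return None


# Illustrative output only; the exact wording differs between Mesa versions.
SAMPLE = (
    "OpenGL vendor string: X.Org\n"
    "OpenGL renderer string: AMD NAVI10 (DRM 3.35.0, 5.4.0-rc5, LLVM 10.0.0)\n"
)

if __name__ == "__main__":
    print(llvm_version_from_renderer(SAMPLE))
```

In practice the input would come from running glxinfo (part of the mesa-utils package on Ubuntu-based systems) and passing its output to the function.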
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #184 from wychuchol ---
Pop OS 19.10, latest Oibaf mesa (as of date), not sure what llvm. I'm kinda new at this and searching "how to check my llvm version" didn't yield any results... Please be patient with me. Anyway, this happens frequently on rx 5700 xt, DDLC Monika's After Story mod (similar things occur with youtube videos, browsing internet - like trying to log in or opening a

Oct 31 11:52:34 pop-os kernel: [ 129.130712] [drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for fences timed out!
Oct 31 11:52:34 pop-os kernel: [ 133.994710] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=17012, emitted seq=17014
Oct 31 11:52:34 pop-os kernel: [ 133.994747] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process DDLC pid 3150 thread DDLC:cs0 pid 3168
Oct 31 11:52:34 pop-os kernel: [ 133.994748] [drm] GPU recovery disabled.

Sometimes it's right away, sometimes it can run for maybe an hour or so, but it does hang: everything besides the mouse pointer stops (but I can't click on anything), I can't switch to a system terminal via Ctrl+Alt+F3, and the power button does not give a signal to shut down. I tried waiting for about 2 minutes (maybe I needed to wait longer), but nothing really helps, and REISUB doesn't seem to be working at all here, or I'm doing it wrong. The only option left is a hard reset.
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #183 from Timur Kristóf ---
(In reply to Jaap Buurman from comment #142)
> How can I set both AMD_DEBUG=nongg and AMD_DEBUG=nodma in the
> /etc/environment file? Do they need to be on two separate lines, or will the
> second line simply overwrite the first one by setting the same environment
> variable? Do they need to be comma separated maybe?

Add the following line to your /etc/environment

export AMD_DEBUG=nongg,nodma

(In reply to Pierre-Eric Pelloux-Prayer from comment #141)
> I don't think radv uses SDMA at all, so they cannot be affected by this
> issue.

Correct, radv doesn't use the SDMA so is not affected by this problem. If you see hangs in Vulkan games, it is currently most likely an LLVM problem. The LLVM devs have fixed most of the problems in their latest master, but haven't backported the fixes to LLVM 9 yet.
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481 --- Comment #182 from Shmerl --- Yep, even with 5.4-rc5 with those two extra patches applied, Firefox hangs randomly sometimes. -- You are receiving this mail because: You are the assignee for the bug.___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #181 from Daniel Suarez ---
(In reply to L.S.S. from comment #180)
> The other two patches do not fix the problem for me (sdma read delay and the
> wip patch). After applying these two patches (along with the mask patch
> which was already included upstream), I still get the same ring sdma0
> timeout (process Xorg) freezes when using Nemo.

Don't feel left out. Those patches don't seem to work for almost anyone. At best they help in some specific scenarios, but they really don't do anything in terms of a proper solution/fix.
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481 --- Comment #180 from L.S.S. --- The other two patches do not fix the problem for me (sdma read delay and the wip patch). After applying these two patches (along with the mask patch which was already included upstream), I still get the same ring sdma0 timeout (process Xorg) freezes when using Nemo. -- You are receiving this mail because: You are the assignee for the bug.___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #179 from Shmerl ---
(In reply to Pierre-Eric Pelloux-Prayer from comment #33)
> Created attachment 145323 [details] [review]
> wip patch
>
> You can give a try to the attached kernel patch which hopefully could
> prevent some sdma timeouts.
>
> I'm still testing it but the more testers the better :)

Of the three patches, the mask patch is already upstreamed. Do you plan to upstream the other two for the 5.4 cycle?
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #178 from L.S.S. ---
Created attachment 145827
  --> https://bugs.freedesktop.org/attachment.cgi?id=145827=edit
Errors captured with amdgpu.gpu_recovery=1

It seems GPU recovery is not yet ready for Navi. I just tried turning that feature on, and when the freeze occurred, the screen turned black for a few seconds, then came back and stayed frozen.

The journalctl log said the GPU recovery failed, followed by snd_hda_intel spamming errors and eventually crashing (which I think might be because the HDMI/DP audio codec lost communication with the video card).
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #177 from L.S.S. ---
I'm still getting freezes when using Nemo with the same sdma0 timeout, on the latest Manjaro 5.4 rc4 kernel built from the latest PKGBUILD (which included the sdma0 fix commits) and after applying the sdma_read_delay patch.

Additionally, I discovered that changing system icon themes on Cinnamon can also trigger the freeze. Error codes are the same (ring sdma0 timeout).

Before this, last night I was also able to generate an sdma1 error when browsing with Chromium. This time it names chromium instead of Xorg as the process that caused the ring timeout:

kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1 timeout, signaled seq=2140606, emitted seq=2140608
kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process chromium pid 39450 thread chromium:cs0 pid 39509

It seems that in all occurrences, the difference between the emitted and signaled values is always 2.

Is there any progress regarding this issue? Or is any more information needed (and how do I enable verbose logs in the system for amdgpu and related parts)?
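The observation in comment #177 that emitted and signaled sequence numbers always differ by 2 can be checked mechanically against saved logs. The following is a minimal sketch, not anything posted in the thread: the function name and the regular expression are illustrative assumptions, and the sample lines are copied from comments in this thread.

```python
import re

# Pattern for the amdgpu ring-timeout line as it appears in the logs quoted
# in this thread (assumption: other kernels may word the message differently).
TIMEOUT_RE = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)

def ring_timeouts(log_text):
    """Return (ring, signaled, emitted, emitted - signaled) per timeout line."""
    results = []
    for match in TIMEOUT_RE.finditer(log_text):
        signaled = int(match.group("signaled"))
        emitted = int(match.group("emitted"))
        results.append((match.group("ring"), signaled, emitted, emitted - signaled))
    return results

# Timeout lines copied from comments #177, #185 and #156.
SAMPLE = """\
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1 timeout, signaled seq=2140606, emitted seq=2140608
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=17012, emitted seq=17014
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=1756792, emitted seq=1756794
"""

for ring, signaled, emitted, gap in ring_timeouts(SAMPLE):
    print(f"{ring}: signaled={signaled} emitted={emitted} gap={gap}")
```

For these three samples the gap is 2 each time, which matches the observation; it says nothing about why the ring stalls.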
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #176 from L.S.S. ---
Unfortunately this still happens with Nemo on the official 5.4-rc4 kernel, after switching to the Manjaro Testing channel. The same ring sdma0 timeout error appears.

An interesting phenomenon: when the screen freezes (the taskbar clock stops changing), the mouse can still move at first, but after a few clicks the mouse stops moving and the screen appears to shift to a previous frame before freezing completely. The contents of the previous folder reappear in Nemo, and the taskbar clock may sometimes move a second backwards.

I've removed AMD_DEBUG=nodma since it apparently doesn't work. If the patches are meant for 5.4-rc4, which patches are needed to address this problem?

For now I'm using nnn (a terminal-based file manager) for browsing files since terminals don't freeze the system... I'm not sure what might be triggering the freeze, as all the lockups I have had so far happened when using Nemo. Other programs (including Firefox and Chromium) haven't triggered the freeze yet.
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481 --- Comment #175 from Marko Popovic --- (In reply to Shmerl from comment #174) > Is it public? I can't join the room. https://matrix.to/#/!XvwReLqAqwRmEzgmVh:matrix.org?via=matrix.org Sorry, this should work -- You are receiving this mail because: You are the assignee for the bug.___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481 --- Comment #174 from Shmerl --- Is it public? I can't join the room. -- You are receiving this mail because: You are the assignee for the bug.___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481 --- Comment #173 from Marko Popovic --- https://matrix.to/#/!UiDmeMlfsLndmzmPhp:matrix.org?via=matrix.org Here is a link to Matrix community, anyone interested should try to join. -- You are receiving this mail because: You are the assignee for the bug.___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #172 from Marko Popovic ---
(In reply to Shmerl from comment #171)
> (In reply to Marko Popovic from comment #170)
> > By the way if anyone is up for it, we can make a dedicated Discord chat room
> > for Navi linux users, so we don't bloat this bugtracker, since a lot of the
> > comments are just random questions etc. Let me know what you think
>
> I'd prefer something on Matrix (FOSS and open protocol after all). Not
> really using Discord.

Sure, I'm up for that!
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481 --- Comment #171 from Shmerl --- (In reply to Marko Popovic from comment #170) > By the way if anyone is up for it, we can make a dedicated Discord chat room > for Navi linux users, so we don't bloat this bugtracker, since a lot of the > comments are just random questions etc. Let me know what you think I'd prefer something on Matrix (FOSS and open protocol after all). Not really using Discord. -- You are receiving this mail because: You are the assignee for the bug.___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481 --- Comment #170 from Marko Popovic --- By the way if anyone is up for it, we can make a dedicated Discord chat room for Navi linux users, so we don't bloat this bugtracker, since a lot of the comments are just random questions etc. Let me know what you think -- You are receiving this mail because: You are the assignee for the bug.___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481 --- Comment #169 from Marko Popovic --- (In reply to L.S.S. from comment #168) > For the 5.4 kernel, I'm running 5.4-rc2 (from official Manjaro repo). Not > sure when Manjaro Stable will receive its next update regarding kernels... You can always compile Kernel-git but Manjaro should be decently fast to provide 5.4+ RC series. -- You are receiving this mail because: You are the assignee for the bug.___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481 --- Comment #168 from L.S.S. --- For the 5.4 kernel, I'm running 5.4-rc2 (from official Manjaro repo). Not sure when Manjaro Stable will receive its next update regarding kernels... -- You are receiving this mail because: You are the assignee for the bug.___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481 --- Comment #167 from Shmerl --- (In reply to Marko Popovic from comment #166) > when has it been accepted upstream? https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7557d2783850eec199cae78dac561e9b7de181be -- You are receiving this mail because: You are the assignee for the bug.___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #166 from Marko Popovic ---
(In reply to Shmerl from comment #165)
> (In reply to L.S.S. from comment #163)
> > This was captured on 5.4(rc)
>
> Just to clarify, do you have all the mentioned patches above applied?
> 5.4-rc4 already includes the mask patch, but not the other two.

Are you sure about that? I'm using 5.4 daily and I still get frequent freezes, which didn't happen even remotely as often with the mask patch applied... When was it accepted upstream?
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481 --- Comment #165 from Shmerl --- (In reply to L.S.S. from comment #163) > This was captured on 5.4(rc) Just to clarify, do you have all the mentioned patches above applied? 5.4-rc4 already includes the mask patch, but not the other two. -- You are receiving this mail because: You are the assignee for the bug.___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #164 from L.S.S. ---
EDIT: Did some analysis myself on the GCVM_L2_PROTECTION_FAULT errors...

The errors last time contained this:

src_id:0 ring:40 vmid:7 pasid:32769
GCVM_L2_PROTECTION_FAULT_STATUS:0x00741A51 (only on first error)

Whereas the errors this time contained this:

src_id:0 ring:40 vmid:1 pasid:32769
GCVM_L2_PROTECTION_FAULT_STATUS:0x00141A51 (only on first error)

vmid became 1 and GCVM_L2_PROTECTION_FAULT_STATUS changed from 0x00741A51 to 0x00141A51. The rest of the first error remained the same:

MORE_FAULTS: 0x1
WALKER_ERROR: 0x0
PERMISSION_FAULTS: 0x5
MAPPING_ERROR: 0x0
RW: 0x1

In subsequent errors those values were all 0. Both times the first error has a starting address of 0x0318c00e7000.

Not sure if these could be of any help, though.
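The comparison of the two GCVM_L2_PROTECTION_FAULT_STATUS values in comment #164 can be made explicit with a bitwise XOR. This is only a sketch using the two values from that comment. The idea that bits 20-23 carry the VMID is an assumption inferred from the numbers themselves (that nibble reads 7 and 1, the same as the logged vmid values); the thread does not state the register layout.

```python
# Values reported in comment #164.
STATUS_PREVIOUS = 0x00741A51  # logged together with vmid:7
STATUS_CURRENT = 0x00141A51   # logged together with vmid:1

def differing_bits(a, b):
    """Return the bit positions (0 = least significant) where a and b differ."""
    diff = a ^ b
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]

def field(value, low_bit, width):
    """Extract `width` bits of `value`, starting at bit `low_bit`."""
    return (value >> low_bit) & ((1 << width) - 1)

# Which bits changed between the two captures?
print(hex(STATUS_PREVIOUS ^ STATUS_CURRENT),
      differing_bits(STATUS_PREVIOUS, STATUS_CURRENT))

# ASSUMPTION (not confirmed in the thread): bits 20-23 hold the VMID.
print(field(STATUS_PREVIOUS, 20, 4), field(STATUS_CURRENT, 20, 4))
```

Only bits 21 and 22 differ, so under that assumption the two status words describe the same kind of fault and differ only in which VMID raised it.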
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

L.S.S. changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #145807|0                           |1
        is obsolete|                            |

--- Comment #163 from L.S.S. ---
Created attachment 145814
  --> https://bugs.freedesktop.org/attachment.cgi?id=145814=edit
Newly captured GCVM_L2_PROTECTION_FAULT errors. This was captured on 5.4(rc) kernel, and with AMD_DEBUG=nodma.

I got a few more freezes when using Nemo, this time with AMD_DEBUG=nodma or AMD_DEBUG="nodma,nongg".

I put AMD_DEBUG in /etc/environment, and I can indeed confirm it from a terminal (echo $AMD_DEBUG). It seems this doesn't work, as the freezes I got this time are also of the sdma0 type, same as before.

I also captured some new GCVM_L2_PROTECTION_FAULT errors. Not sure if they're different from last time. This was captured on the 5.4(rc) kernel with AMD_DEBUG=nodma.

In the end, the sdma0 error doesn't seem to go away and I'm not even sure whether the parameter was set correctly. Where am I supposed to put the AMD_DEBUG parameters on Manjaro?
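A possible way to answer "was the parameter set correctly" in comment #163: echo $AMD_DEBUG only shows the terminal's own environment, while the timeouts in this thread name Xorg (or a browser) as the process. On Linux the environment a process was started with can be read from /proc/<pid>/environ. The sketch below assumes that interface; the function names are illustrative, and reading another user's process usually requires root.

```python
def parse_environ(raw):
    """Parse the NUL-separated KEY=VALUE block found in /proc/<pid>/environ."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" in entry:
            key, _, value = entry.partition(b"=")
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env

def process_env_var(pid, name):
    """Return `name` from the environment process `pid` started with, or None."""
    with open(f"/proc/{pid}/environ", "rb") as handle:
        return parse_environ(handle.read()).get(name)

# Hypothetical usage, with 1838 standing in for the Xorg PID from the log:
#   print(process_env_var(1838, "AMD_DEBUG"))
```

If this prints None for the display server while the terminal shows a value, the variable was set somewhere that the display server did not inherit. That would be one possible explanation for the setting appearing to have no effect, though not the only one.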
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481 --- Comment #162 from Shmerl --- I suppose it's a problem with radeonsi specifically. Hopefully AMD developers can clarify this. -- You are receiving this mail because: You are the assignee for the bug.___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481 --- Comment #161 from b...@thschuetz.de --- > Regarding this issue, is this issue mostly caused by the amdgpu driver > itself, or caused by mesa? I tried to avoid freezes and uninstalled amdgpu, just using the modesetting driver for X11 - and got freezes. So I don't think it's a problem of amdgpu. Maybe somebody could confirm this? -- You are receiving this mail because: You are the assignee for the bug.___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #160 from L.S.S. ---
I'll try AMD_DEBUG="nodma,nongg" when I get back.

Is this issue mostly caused by the amdgpu driver itself, or by mesa? It seems more related to the driver, as I have this same system freeze issue on both mesa from the official Manjaro repo and mesa-aco-git from AUR (which is a bit newer).

Speaking of rendering, how do current web browsers render images/videos nowadays? I haven't gotten a single system freeze that was caused directly by the web browser (Firefox/Chromium) yet, so I'm curious, given the issue might be OpenGL-related.

So far the "ring sdma0 timeout" errors have been mostly caused by Nemo. Opening the file manager, browsing files, or simply leaving the file manager running can all cause the system to freeze at some point later.

By the way (off-topic), how's the issue on Wayland? Does Cinnamon have proper support for Wayland, and does anyone on Manjaro have experience with switching from Xorg to Wayland? I'm still unfamiliar with Wayland as I have never really used it (all the DEs I've been actively using, such as XFCE and Cinnamon, are still on X11/Xorg).
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481 --- Comment #159 from Michael de Lang --- Thank you for making me look twice at the contents of the variable. Although the env variable is incorrect, the quotes don't do anything to the contents of the variable. Rather the error is in that it is not space- but comma-separated. For posterity, this means that I will now be running with AMD_DEBUG="nodma,nongg". Commenters #150 and #151 should also look into this. -- You are receiving this mail because: You are the assignee for the bug.___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
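To make the point of comment #159 concrete (the flags are comma-separated, and the quotes are not the problem), here is a small sketch that rewrites the variants seen in this thread into the comma-separated form. The helper name is hypothetical, and how Mesa itself tokenizes AMD_DEBUG is not verified here; the sketch only encodes what the thread states.

```python
import re

def normalize_amd_debug(value):
    """Return the flags in `value` joined with commas, dropping quotes and blanks."""
    unquoted = value.strip().strip("\"'")
    flags = [flag for flag in re.split(r"[,\s]+", unquoted) if flag]
    return ",".join(flags)

# Variants that appear in the thread.
for candidate in ['"nodma nongg"', "nodma,nongg", '"nongg,nodma"', "nodma"]:
    print(f"{candidate!r:>16} -> AMD_DEBUG={normalize_amd_debug(candidate)}")
```

The space-separated value from comments #151, #152 and #156 comes out as nodma,nongg, which is the form comment #159 settles on.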
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #158 from Konstantin Pereiaslov ---
Also experiencing this with Radeon RX 5700 XT and amdgpu 19.1.0+git1910111930.b467d2~oibaf~b with kernel version 5.3.7-050307-generic running KDE Neon User edition with latest updates. Didn't have any heavy load for the GPU to do.

First I had some artifacts appear on the Plasma Hard Disk Monitor widget and CPU Load Widget (here is a screenshot: https://i.perk11.info/20191024_193152_kernel.png) while the PC was idle and the screen was locked, but everything else continued to work fine. I checked the logs for the period when this could've happened, but the only logs from that period are from KScreen and start like this:

Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: kscreen.xcb.helper: RRNotify_OutputProperty (ignored)
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: kscreen.xcb.helper: Output: 88
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: kscreen.xcb.helper: Property: EDID
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: kscreen.xcb.helper: State (newValue, Deleted): 1
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: kscreen.xcb.helper: RRNotify_OutputProperty (ignored)
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: kscreen.xcb.helper: Output: 88
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: kscreen.xcb.helper: Property: EDID
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: kscreen.xcb.helper: State (newValue, Deleted): 1
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: kscreen.xcb.helper: RRNotify_OutputChange
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: kscreen.xcb.helper: Output: 88
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: kscreen.xcb.helper: CRTC: 81
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: kscreen.xcb.helper: Mode: 97
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: kscreen.xcb.helper: Rotation: "Rotate_0"
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: kscreen.xcb.helper: Connection: "Disconnected"
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: kscreen.xcb.helper: Subpixel Order: 0
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: kscreen.xcb.helper: RRScreenChangeNotify
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: kscreen.xcb.helper: Window: 18874373
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: kscreen.xcb.helper: Root: 1744
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: kscreen.xcb.helper: Rotation: "Rotate_0"
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: kscreen.xcb.helper: Size ID: 65535
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: kscreen.xcb.helper: Size: 7280 1440
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: kscreen.xcb.helper: SizeMM: 1926 381
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: kscreen.xcb.helper: RRNotify_OutputChange
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: kscreen.xcb.helper: Output: 88
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: kscreen.xcb.helper: CRTC: 81
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: kscreen.xcb.helper: Mode: 97
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: kscreen.xcb.helper: Rotation: "Rotate_0"
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: kscreen.xcb.helper: Connection: "Disconnected"
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: kscreen.xcb.helper: Subpixel Order: 0
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: kscreen.xrandr: XRandROutput 88 update
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: m_connected: 0
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: m_crtc XRandRCrtc(0x5655577da9f0)
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: CRTC: 81
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: MODE: 97
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: Connection: 1
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: Primary: false
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: kscreen.xrandr: Output 88 : connected = false , enabled = true
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: kscreen.xrandr: XRandROutput 88 update
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: m_connected: 1
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: m_crtc XRandRCrtc(0x5655577da9f0)
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: CRTC: 81
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: MODE: 97
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: Connection: 1
Oct 24 16:34:58 perk11-home org.kde.KScreen[25804]: Primary: false

90 minutes later, the system became unresponsive while I was typing a message in Skype, but the audio I had playing in Audacity continued to play and the cron jobs continued running normally for a few minutes while I
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #157 from Marko Popovic ---
(In reply to Michael de Lang from comment #156)
> Just had a hang using 5.4.0-rc3, mesa 19.3~git1910171930.4b458b~oibaf~d,
> AMD_DEBUG="nodma nongg" while using firefox:
>
> Oct 24 16:31:26 oipo-X570-AORUS-ELITE kernel: [27386.467009] broken atomic modeset userspace detected, disabling atomic
> Oct 24 21:04:58 oipo-X570-AORUS-ELITE kernel: [43796.470041] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
> Oct 24 21:04:58 oipo-X570-AORUS-ELITE kernel: [43798.773602] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=1756792, emitted seq=1756794
> Oct 24 21:04:58 oipo-X570-AORUS-ELITE kernel: [43798.773683] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process GPU Process pid 17048 thread firefox:cs0 pid 17134
> Oct 24 21:04:58 oipo-X570-AORUS-ELITE kernel: [43798.773685] [drm] GPU recovery disabled.

OK, this doesn't sound right: how can you get an SDMA hang if you disable DMA completely?

The command should be:
AMD_DEBUG=nodma
not
AMD_DEBUG="nodma"
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #156 from Michael de Lang ---
Just had a hang using 5.4.0-rc3, mesa 19.3~git1910171930.4b458b~oibaf~d, AMD_DEBUG="nodma nongg" while using firefox:

Oct 24 16:31:26 oipo-X570-AORUS-ELITE kernel: [27386.467009] broken atomic modeset userspace detected, disabling atomic
Oct 24 21:04:58 oipo-X570-AORUS-ELITE kernel: [43796.470041] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
Oct 24 21:04:58 oipo-X570-AORUS-ELITE kernel: [43798.773602] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=1756792, emitted seq=1756794
Oct 24 21:04:58 oipo-X570-AORUS-ELITE kernel: [43798.773683] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process GPU Process pid 17048 thread firefox:cs0 pid 17134
Oct 24 21:04:58 oipo-X570-AORUS-ELITE kernel: [43798.773685] [drm] GPU recovery disabled.
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #155 from jmsharvey...@gmail.com ---
Some observations from me that may point to this being an OpenGL issue:

* Vulkan applications seem to work (mostly). I've not had a crash with the Dolphin Emulator with the Vulkan backend and Heat Signature, a game that runs through Proton. This doesn't explain Rise of the Tomb Raider though. I've also had freezing issues with Overwatch via Lutris/DXVK.
* Running freezing games in windowed mode stops hangs. In CS:GO, Minecraft, and Team Fortress 2, my system freezes in the menus. When I run them in windowed mode, they seem to run fine.
* OpenGL games freeze after mouse input (for example, selecting a menu item). This is when CS:GO, TF2, and Minecraft freeze up.

I am using Manjaro on kernel 5.4-rc4, mesa 19.2.1-2, vulkan-radeon (radv) 19.2.1-2 and xf86-video-amdgpu 19.1.0-1. I use KDE Plasma 5.17.1.
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #154 from L.S.S. ---
I'm not sure how to pipe the dmesg log to a local file so that the moment when the system freezes can be captured. Interestingly, the GCVM_L2_PROTECTION_FAULT errors that I saw in journalctl when it froze last time went missing somehow... Maybe I was mistaken, but whenever the system froze the following lines were guaranteed to show up (ring sdma0 timeout), so it's most likely the sdma0 type.

[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=151787, emitted seq=151789
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 1838 thread Xorg:cs0 pid 1862

Currently the system is running okay as I haven't opened Nemo yet (which can almost 100% cause the freeze). Web browsers such as Firefox and Chrome currently don't cause the freeze.
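On the question in comment #154 of how to get the kernel log into a local file before the machine locks up, one possible approach is to stream `dmesg --follow` and force every line to disk as it arrives. The sketch below is an assumption-laden illustration, not a method from the thread: the function names and the output file name are made up, and reading the kernel log may require root on some systems.

```python
import os
import subprocess

def persist_lines(lines, path):
    """Append each line to `path` and fsync it before reading the next one."""
    count = 0
    with open(path, "a", encoding="utf-8") as out:
        for line in lines:
            out.write(line if line.endswith("\n") else line + "\n")
            out.flush()             # push Python's buffer to the OS
            os.fsync(out.fileno())  # ask the OS to write it to the disk
            count += 1
    return count

def follow_kernel_log(path):
    """Stream `dmesg --follow` into `path` until the process is interrupted."""
    proc = subprocess.Popen(["dmesg", "--follow"],
                            stdout=subprocess.PIPE, text=True)
    try:
        return persist_lines(proc.stdout, path)
    finally:
        proc.terminate()

# Hypothetical usage (blocks until interrupted with Ctrl+C):
#   follow_kernel_log("amdgpu-freeze.log")
```

Syncing after every line is slow but deliberate: with a hard reset as the only way out, anything still sitting in a buffer is lost. Whether the last messages before a freeze reach the disk still depends on the rest of the system staying alive, which several comments in this thread suggest it does.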
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481 --- Comment #153 from Marko Popovic ---
(In reply to L.S.S. from comment #152)
> UPDATE: I just got another freeze on 5.3.6 kernel. The same
> GCVM_L2_PROTECTION_FAULT error followed by a ring sdma0 timeout.
>
> So it seems AMD_DEBUG="nodma nongg" doesn't really work for me.

Can you at least provide the dmesg log so we can determine what type of hang you're having and direct you to the right bug tracker? There are multiple types, and this also varies greatly from one desktop environment to another, with Wayland or not, etc.

This topic mostly concerns the SDMA-type hangs that happen at random, and AMD_DEBUG=nodma seems to take care of them for almost everyone. I don't think using nongg is necessary, since so far it has only been shown to take care of one specific hang in the Citra emulator. That hang is also of the ring gfx type, so it's a driver bug, probably not related to the kernel driver.
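As a rough illustration of telling the hang types apart from a dmesg log, the sketch below parses the "ring ... timeout" lines quoted in this thread and labels them by ring name. The regular expression is derived only from those quoted lines, and the `classify` helper and its "sdma"/"gfx"/"other" labels are hypothetical conveniences, not an official classification.

```python
import re
from typing import NamedTuple, Optional

# Matches amdgpu ring-timeout lines of the form quoted in this thread.
RING_TIMEOUT = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)


class RingTimeout(NamedTuple):
    ring: str
    signaled: int
    emitted: int
    kind: str  # "sdma", "gfx", or "other"


def classify(line: str) -> Optional[RingTimeout]:
    """Return the parsed timeout for a log line, or None if it is not one."""
    m = RING_TIMEOUT.search(line)
    if m is None:
        return None
    ring = m.group("ring")
    if ring.startswith("sdma"):
        kind = "sdma"
    elif ring.startswith("gfx"):
        kind = "gfx"
    else:
        kind = "other"
    return RingTimeout(ring, int(m.group("signaled")), int(m.group("emitted")), kind)
```

Feeding every line of a saved log through `classify` and keeping the non-`None` results gives a quick summary of which ring timed out.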
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481 --- Comment #152 from L.S.S. --- UPDATE: I just got another freeze on 5.3.6 kernel. The same GCVM_L2_PROTECTION_FAULT error followed by a ring sdma0 timeout. So it seems AMD_DEBUG="nodma nongg" doesn't really work for me.
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481 L.S.S. changed: What|Removed |Added CC||ragnaros39...@yandex.com --- Comment #151 from L.S.S. ---
Created attachment 145807 --> https://bugs.freedesktop.org/attachment.cgi?id=145807=edit
captured GCVM_L2_PROTECTION_FAULT errors in the log. This was captured on 5.4(rc) kernel.

I'm having similar issues with Navi on Manjaro (both 5.3 and 5.4 kernels). Both kernels were from the official Manjaro repos.

It's almost 100% reproducible using Cinnamon's file manager, Nemo. It can happen right after I start it, or after I click something (such as opening a folder). Interestingly, I haven't gotten a freeze from using web browsers (Firefox, Chromium) just yet.

When the system froze, everything else kept running. The freeze happened in the morning, and since I was about to leave for work I left the system as it was until I got back home in the evening. The xmrig (CPU) mining session in the background continued to work as normal, as observed from the pool's dashboard.

It seems the protection fault errors appear only after the system has been frozen long enough: I only saw them the one time I left it frozen for a while, and every other time I reset the system right after it froze. If the system is reset only a short while after the freeze, the log ends at "ring sdma0 timeout".

It seems the "nodma nongg" trick partially worked on 5.3 (5.3.6 to be precise), as the system hasn't frozen for the time being (even when using Nemo). It does not, however, work with the 5.4 (rc) kernel, as I still got a freeze caused by the same "ring sdma0 timeout" error.

Off-topic: On the 5.3 kernel, the mouse cursor feels sluggish, as if my monitor were running at 30Hz (while xrandr reports it's indeed 60Hz), while the mouse cursor works fine on the 5.4(rc) kernel.
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481 --- Comment #150 from Stijn Tintel ---
(In reply to Jaap Buurman from comment #142)
> How can I set both AMD_DEBUG=nongg and AMD_DEBUG=nodma in the
> /etc/environment file? Do they need to be on two separate lines, or will the
> second line simply overwrite the first one by setting the same environment
> variable? Do they need to be comma separated maybe?

AMD_DEBUG="nodma nongg"

I've been running like this since I found this bug report. Current uptime:

11:08:41 up 4 days, 4:12, 11 users, load average: 8,56, 8,33, 8,15

Haven't experienced a single hang, not even a kernel oops. Before that, the system was frustratingly unstable. If you need stability, put this in /etc/environment (or /etc/env.d/99amdgpu or so if your distro supports /etc/env.d).

Running on Gentoo, kernel 5.3.4, mesa 19.2.1, llvm 9.0.0, libdrm 2.4.99, xf86-video-amdgpu git e6fce59a071220967fcd4e2c9e4a262c72870761.
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481 --- Comment #149 from Seba Pe ---
(In reply to Jaap Buurman from comment #136)
> Has anyone tried AMD's closed source OpenGL driver to see if that one is
> stable?

I've been running AMDGPU-PRO without issues while waiting for a fix for this (5700XT). OpenGL apps appear to work fine. With Vulkan I've had a crash to desktop but I haven't tested it that much. No freezes at least.
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481 --- Comment #148 from Shmerl ---
(In reply to Daniel Suarez from comment #147)
>
> You shouldn't be using a 5700 XT in a system that demands 100% uptime, I
> have had mine randomly hang in the night without Firefox even being open,
> only qbittorrent and discord

For reference, common UI toolkits (GTK and Qt) use OpenGL rendering too.
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481 --- Comment #147 from Daniel Suarez ---
(In reply to Jaap Buurman from comment #144)
> I need as close as 100% uptime on this machine as possible, so I don't
> really have the time to add applications over time until the problem is
> fixed. I need stability now. So a systemwide setting is fine for me, even if
> it might result in big performance losses. I'll wait until a proper fix is
> found.
>
> Do you happen to know whether it will require two lines to set both debug
> options, or does the environment variable expect the values to be
> comma-separated?

You shouldn't be using a 5700 XT in a system that demands 100% uptime. I have had mine hang randomly in the night without Firefox even being open, only qbittorrent and Discord.
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481 --- Comment #146 from Shmerl ---
You can also use your $HOME/.profile for setting session-wide variables.
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481 --- Comment #145 from Shmerl ---
According to man environment:

The /etc/environment file specifies the environment variables to be set. The file must consist of simple NAME=VALUE pairs on separate lines.
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481 --- Comment #144 from Jaap Buurman ---
I need as close to 100% uptime on this machine as possible, so I don't really have the time to add applications over time until the problem is fixed. I need stability now. So a systemwide setting is fine for me, even if it might result in big performance losses. I'll wait until a proper fix is found.

Do you happen to know whether it will require two lines to set both debug options, or does the environment variable expect the values to be comma-separated?
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481 --- Comment #143 from Shmerl ---
(In reply to Jaap Buurman from comment #142)
> How can I set both AMD_DEBUG=nongg and AMD_DEBUG=nodma in the
> /etc/environment file? Do they need to be on two separate lines, or will the
> second line simply overwrite the first one by setting the same environment
> variable? Do they need to be comma separated maybe?

It's probably better to avoid a wide setting like that. If you know of an application that hangs (like Firefox or a specific game), just set the variable when launching that application (you can, for example, add it to the .desktop file or a start script).
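A minimal sketch of the per-application approach, written as a small start script, might look like the following. The helper names are hypothetical, and the space-separated value format is taken from the AMD_DEBUG="nodma nongg" setting reported elsewhere in this thread rather than from driver documentation. Only the launched process and its children see the variable; the rest of the session is left untouched.

```python
import os
import subprocess
from typing import Dict, List, Mapping, Optional


def env_with_amd_debug(
    flags: List[str], base: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Return a copy of the environment with AMD_DEBUG set to the given flags."""
    env = dict(os.environ if base is None else base)
    # Space-separated, mirroring AMD_DEBUG="nodma nongg" as used in this thread.
    env["AMD_DEBUG"] = " ".join(flags)
    return env


def launch(command: List[str], flags: List[str]) -> subprocess.Popen:
    """Start `command` with the workaround applied to that process only."""
    return subprocess.Popen(command, env=env_with_amd_debug(flags))
```

For example, `launch(["firefox"], ["nodma"])` would start Firefox with the SDMA workaround while other applications keep the default behaviour.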
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481 --- Comment #142 from Jaap Buurman --- How can I set both AMD_DEBUG=nongg and AMD_DEBUG=nodma in the /etc/environment file? Do they need to be on two separate lines, or will the second line simply overwrite the first one by setting the same environment variable? Do they need to be comma separated maybe?
[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9
https://bugs.freedesktop.org/show_bug.cgi?id=111481 --- Comment #141 from Pierre-Eric Pelloux-Prayer ---
Thanks for the Quake 2 trace, I could reproduce the same hang here.

If anyone has a reliable way to trigger the issue, the most helpful thing to do for now is an apitrace capture. The umr logs were helpful (thanks!) but I don't need more of them at the moment.

I don't think radv uses SDMA at all, so it cannot be affected by this issue. For radeonsi, the AMD_DEBUG=nodma environment variable is a workaround until we figure out a proper fix.