Re: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out or interrupted!
Sure. (I just hope it doesn't eat my data and/or unborn children. Sometimes I really wish one didn't need to switch out the whole kernel to get a different version of a bunch of components ...)

Anyway, I have had it (the new kernel) running since yesterday evening; the first Ori session, ~1.5 h, was uneventful. If and when it crashes, I'll get back to you.

Cheers,
C.

On Fri, 6 Dec 2019 at 16:45, Andrey Grodzovsky wrote:
> The WARN stack trace after GPU reset kicks in points to code that is not
> the latest - can you please try running the same with kernel at the tip of
> https://cgit.freedesktop.org/~agd5f/linux/log/?h=amd-staging-drm-next ?
>
> Andrey
>
> On 12/5/19 6:14 PM, Christian Pernegger wrote:
>
> Hello,
>
> one of my computers has been crashing while gaming rather a lot lately,
> with kernel messages pointing to amdgpu. First line see subject, rest in
> the attached log.
> SSH still works, attempts to shutdown/reboot don't quite finish.
>
> Radeon VII in an Asus Pro WS X570-Ace. Ubuntu 18.04.3 HWE, mesa-aco.
> This one was with kernel 5.3.0-24-generic [hwe-edge], mesa
> 19.3.0+aco+git1575452833-3409c06e26d-1bionic1, vega20_* from
> linux-firmware-20191022, running Ori and the Blind Forest: Definitive
> Edition via Proton/WINED3D11 under Steam Remote Play. I've had similar
> crashes sporadically even with 5.0 [plain hwe] and linux-firmware
> completely stock, and with native games (e.g. Crusader Kings II)
> running locally.
> It used to be maybe once every other week, though, that was tolerable,
> now Ori usually triggers it in under an hour. Turning off ACO via
> RADV_PERFTEST=llvm makes it worse (not bad enough to make it trigger
> quickly and reliably, though), going back to kernel 5.0 helps (as in
> an hour or two might go by without a crash, but the performance impact
> is severe).
>
> All very vague. Which is why this isn't pretending to be a bug report,
> just a "has anyone seen this?" kind of shout-out. If it's worthy of
> following up, I'd be happy to provide further info, just tell me what.
>
> Cheers,
> C.

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out or interrupted!
The WARN stack trace after GPU reset kicks in points to code that is not the latest - can you please try running the same with kernel at the tip of https://cgit.freedesktop.org/~agd5f/linux/log/?h=amd-staging-drm-next ?

Andrey

On 12/5/19 6:14 PM, Christian Pernegger wrote:
> Hello,
>
> one of my computers has been crashing while gaming rather a lot lately,
> with kernel messages pointing to amdgpu. First line see subject, rest in
> the attached log.
> SSH still works, attempts to shutdown/reboot don't quite finish.
>
> Radeon VII in an Asus Pro WS X570-Ace. Ubuntu 18.04.3 HWE, mesa-aco.
> This one was with kernel 5.3.0-24-generic [hwe-edge], mesa
> 19.3.0+aco+git1575452833-3409c06e26d-1bionic1, vega20_* from
> linux-firmware-20191022, running Ori and the Blind Forest: Definitive
> Edition via Proton/WINED3D11 under Steam Remote Play. I've had similar
> crashes sporadically even with 5.0 [plain hwe] and linux-firmware
> completely stock, and with native games (e.g. Crusader Kings II)
> running locally.
> It used to be maybe once every other week, though, that was tolerable,
> now Ori usually triggers it in under an hour. Turning off ACO via
> RADV_PERFTEST=llvm makes it worse (not bad enough to make it trigger
> quickly and reliably, though), going back to kernel 5.0 helps (as in
> an hour or two might go by without a crash, but the performance impact
> is severe).
>
> All very vague. Which is why this isn't pretending to be a bug report,
> just a "has anyone seen this?" kind of shout-out. If it's worthy of
> following up, I'd be happy to provide further info, just tell me what.
>
> Cheers,
> C.
Re: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out or interrupted!
On Fri, 6 Dec 2019 at 02:43, Liu, Zhan wrote:
> I've seen a few people report this issue on Freedesktop/Bugzilla. For
> example: https://bugs.freedesktop.org/show_bug.cgi?id=24.

Yes, there's also https://bugs.freedesktop.org/show_bug.cgi?id=109955. To be honest, I couldn't say if my issue is the same as either of these -- I wouldn't know where to begin to interpret these logs, or even say whether they're essentially showing the same or something completely different. From experience, blindly lumping stuff together because of the first error message that pops out is dangerous. That's why I'm asking here.

> They all experienced this issue while playing games. The higher the GPU
> clock is, the more frequently the issue can be reproduced.

Shouldn't that mean that more demanding games crash more often? Neither Ori nor CK2 is likely to tax a Radeon VII; at least temps stay down, and so do the fans. I remember a single crash from Overcooked [Proton/DXVK], same; but none from Prey (2017) [Proton/DXVK] or Life is Strange - Before the Storm [native Vulkan]. That said, I mostly play games one after the other; just because something didn't crash then, that doesn't mean it wouldn't now.

> Also, some Reddit users pointed out all these games are Vulkan based. It
> could be a Vulkan specific issue.

I'm running Ori using WINED3D (D3D-OpenGL), and CK2 is native OpenGL AFAIK, so that hypothesis doesn't hold.

The various search results that look similar to the layman also offer a plethora of workarounds, a lot of which look, to that same layman, to be of the "sacrifice a chicken at full moon and dance around it counter-clockwise intoning sea shanties at the top of your voice" variety. Understandable, when you don't know if it helped or the bug just takes particularly long to trigger. I'd still rather help identify & fix the root cause.

Thank you for caring.

Cheers,
C.

> > -----Original Message-----
> > From: amd-gfx On Behalf Of Christian Pernegger
> > Sent: 2019/December/05, Thursday 6:15 PM
> > To: amd-gfx@lists.freedesktop.org
> > Subject: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR*
> > Waiting for fences timed out or interrupted!
> >
> > Hello,
> >
> > one of my computers has been crashing while gaming rather a lot lately,
> > with kernel messages pointing to amdgpu. First line see subject, rest in
> > the attached log.
> > SSH still works, attempts to shutdown/reboot don't quite finish.
> >
> > Radeon VII in an Asus Pro WS X570-Ace. Ubuntu 18.04.3 HWE, mesa-aco.
> > This one was with kernel 5.3.0-24-generic [hwe-edge], mesa
> > 19.3.0+aco+git1575452833-3409c06e26d-1bionic1, vega20_* from
> > linux-firmware-20191022, running Ori and the Blind Forest: Definitive
> > Edition via Proton/WINED3D11 under Steam Remote Play. I've had similar
> > crashes sporadically even with 5.0 [plain hwe] and linux-firmware
> > completely stock, and with native games (e.g. Crusader Kings II)
> > running locally.
> > It used to be maybe once every other week, though, that was tolerable,
> > now Ori usually triggers it in under an hour. Turning off ACO via
> > RADV_PERFTEST=llvm makes it worse (not bad enough to make it trigger
> > quickly and reliably, though), going back to kernel 5.0 helps (as in
> > an hour or two might go by without a crash, but the performance impact
> > is severe).
> >
> > All very vague. Which is why this isn't pretending to be a bug report,
> > just a "has anyone seen this?" kind of shout-out. If it's worthy of
> > following up, I'd be happy to provide further info, just tell me what.
> >
> > Cheers,
> > C.
RE: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out or interrupted!
I've seen a few people report this issue on Freedesktop/Bugzilla. For example: https://bugs.freedesktop.org/show_bug.cgi?id=24.

They all experienced this issue while playing games. The higher the GPU clock is, the more frequently the issue can be reproduced.

Also, some Reddit users pointed out all these games are Vulkan based. It could be a Vulkan specific issue.

Thanks,
Zhan

> -----Original Message-----
> From: amd-gfx On Behalf Of Christian Pernegger
> Sent: 2019/December/05, Thursday 6:15 PM
> To: amd-gfx@lists.freedesktop.org
> Subject: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR*
> Waiting for fences timed out or interrupted!
>
> Hello,
>
> one of my computers has been crashing while gaming rather a lot lately,
> with kernel messages pointing to amdgpu. First line see subject, rest in
> the attached log.
> SSH still works, attempts to shutdown/reboot don't quite finish.
>
> Radeon VII in an Asus Pro WS X570-Ace. Ubuntu 18.04.3 HWE, mesa-aco.
> This one was with kernel 5.3.0-24-generic [hwe-edge], mesa
> 19.3.0+aco+git1575452833-3409c06e26d-1bionic1, vega20_* from
> linux-firmware-20191022, running Ori and the Blind Forest: Definitive
> Edition via Proton/WINED3D11 under Steam Remote Play. I've had similar
> crashes sporadically even with 5.0 [plain hwe] and linux-firmware
> completely stock, and with native games (e.g. Crusader Kings II)
> running locally.
> It used to be maybe once every other week, though, that was tolerable,
> now Ori usually triggers it in under an hour. Turning off ACO via
> RADV_PERFTEST=llvm makes it worse (not bad enough to make it trigger
> quickly and reliably, though), going back to kernel 5.0 helps (as in
> an hour or two might go by without a crash, but the performance impact
> is severe).
>
> All very vague. Which is why this isn't pretending to be a bug report,
> just a "has anyone seen this?" kind of shout-out. If it's worthy of
> following up, I'd be happy to provide further info, just tell me what.
>
> Cheers,
> C.
[drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out or interrupted!
Hello,

one of my computers has been crashing while gaming rather a lot lately, with kernel messages pointing to amdgpu. First line see subject, rest in the attached log.
SSH still works, attempts to shutdown/reboot don't quite finish.

Radeon VII in an Asus Pro WS X570-Ace. Ubuntu 18.04.3 HWE, mesa-aco.
This one was with kernel 5.3.0-24-generic [hwe-edge], mesa 19.3.0+aco+git1575452833-3409c06e26d-1bionic1, vega20_* from linux-firmware-20191022, running Ori and the Blind Forest: Definitive Edition via Proton/WINED3D11 under Steam Remote Play. I've had similar crashes sporadically even with 5.0 [plain hwe] and linux-firmware completely stock, and with native games (e.g. Crusader Kings II) running locally.
It used to be maybe once every other week, though, that was tolerable, now Ori usually triggers it in under an hour. Turning off ACO via RADV_PERFTEST=llvm makes it worse (not bad enough to make it trigger quickly and reliably, though), going back to kernel 5.0 helps (as in an hour or two might go by without a crash, but the performance impact is severe).

All very vague. Which is why this isn't pretending to be a bug report, just a "has anyone seen this?" kind of shout-out. If it's worthy of following up, I'd be happy to provide further info, just tell me what.

Cheers,
C.

amdgpu.journal
Description: Binary data
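Since the crash frequency is described only roughly ("once every other week", "in under an hour"), one way to put numbers on it might be to count how often the subject-line message shows up in a saved kernel log. The following is a minimal sketch, assuming the log has been exported as plain text with syslog-style timestamps (for example via `journalctl -k`). The function name, the timestamp pattern and the sample lines are hypothetical and are not taken from the attached log.

```python
import re
from collections import Counter

# Message text taken from the subject line of this thread.
FENCE_TIMEOUT = "Waiting for fences timed out or interrupted!"

# Assumed line prefix: "Dec 05 18:02:11 host kernel: ..." (syslog-style).
TIMESTAMP = re.compile(r"^(\w{3}\s+\d{1,2}) (\d{2}):\d{2}:\d{2}\s")


def fence_timeouts_per_hour(lines):
    """Count fence-timeout messages per (day, hour) bucket."""
    counts = Counter()
    for line in lines:
        if FENCE_TIMEOUT not in line:
            continue
        match = TIMESTAMP.match(line)
        # Lines without a recognizable timestamp are still counted.
        key = f"{match.group(1)} {match.group(2)}h" if match else "unknown"
        counts[key] += 1
    return counts


# Hypothetical sample lines, NOT copied from the attached amdgpu.journal.
sample = [
    "Dec 05 18:02:11 box kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] "
    "*ERROR* Waiting for fences timed out or interrupted!",
    "Dec 05 18:02:16 box kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] "
    "*ERROR* Waiting for fences timed out or interrupted!",
    "Dec 05 18:02:17 box kernel: some unrelated message",
    "Dec 05 19:40:03 box kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] "
    "*ERROR* Waiting for fences timed out or interrupted!",
]
result = fence_timeouts_per_hour(sample)
for bucket, count in sorted(result.items()):
    print(bucket, count)
```

Comparing such counts between kernels (5.0, 5.3, amd-staging-drm-next) or between ACO and RADV_PERFTEST=llvm runs would only show how often the message appears, not why; it says nothing about the root cause.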