[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-11-19 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

Martin Peres  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |MOVED
             Status|NEW                         |RESOLVED

--- Comment #38 from Martin Peres  ---
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been
closed from further activity.

You can subscribe and participate further through the new bug through this link
to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/914.

-- 
You are receiving this mail because:
You are the assignee for the bug.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-11-12 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #37 from Andrew Sheldon  ---
(In reply to Daniel Suarez from comment #33)

> That workaround delays the hangs at best, and I have gotten hangs from
> OpenGL games and also by using amdvlk.
> 

Those hangs shouldn't be SDMA related, however. If you are getting hangs from
specific games, report them on the corresponding bug tracker
(https://gitlab.freedesktop.org/mesa/mesa for OGL and RADV,
https://github.com/GPUOpen-Drivers/AMDVLK/issues for AMDVLK).

I suggest using RADV_PERFTEST=aco with mesa-git for the most stable Vulkan
experience (or try the AMDGPU-PRO Vulkan driver). 
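
As a rough sketch (not an official launcher), the variable can be set for a
single program from a small Python wrapper. The helper names radv_aco_env and
run_with_aco are made up for illustration; only RADV_PERFTEST=aco comes from
the suggestion above. RADV_PERFTEST takes a comma-separated list, so the
sketch keeps any options that are already set.

```python
import os
import subprocess


def radv_aco_env(base_env=None):
    """Return a copy of the environment with "aco" added to RADV_PERFTEST."""
    env = dict(os.environ if base_env is None else base_env)
    # RADV_PERFTEST is a comma-separated list; keep options that are already there.
    options = [opt for opt in env.get("RADV_PERFTEST", "").split(",") if opt]
    if "aco" not in options:
        options.append("aco")
    env["RADV_PERFTEST"] = ",".join(options)
    return env


def run_with_aco(command):
    """Run a command (a list of arguments) with ACO enabled; return its exit code."""
    return subprocess.run(command, env=radv_aco_env()).returncode
```

For example, run_with_aco(["vkcube"]) would start that program with ACO
enabled, assuming it is installed. The same effect is available from a shell
by prefixing the command with RADV_PERFTEST=aco.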

There's also the "divide error" random hang issue, but it shouldn't be related
to SDMA either.


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-11-12 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #36 from John H  ---
Also, for people who have a 5700 XT card, check whether yours has a dual BIOS.

Typically one BIOS runs the card at normal clock speeds and the other runs it
at overclocked values.

My card, the PowerColor Red Devil 5700 XT, is an example of such a card. In OC
mode I have had all sorts of random freezes and crashes in both Windows AND
Linux.

Since switching to the default clocks, sometimes called Silent mode, I haven't
had a single problem. This is just a heads-up for users who have Navi10-based
cards with a selectable BIOS.


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-11-10 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #35 from Daniel Suarez  ---
(In reply to Marko Popovic from comment #34)
> And where exactly did I say soon?

My bad, I read "soon" instead of "too", apologies


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-11-10 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #34 from Marko Popovic  ---
(In reply to Daniel Suarez from comment #33)
> That workaround delays the hangs at best, and I have gotten hangs from
> OpenGL games and also by using amdvlk.
> 
> Don't get me wrong, I'm not saying this bug report shouldn't be closed. I'm
> just saying that you saying "soon" is very misleading. AMD still hasn't
> properly fixed bugs that lead to hangs by just watching Firefox, and it's
> been MONTHS. "soon" for them is months, apparently.

And where exactly did I say soon?


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-11-10 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #33 from Daniel Suarez  ---
(In reply to Marko Popovic from comment #32)
> Mesa 20.0 should fix Vulkan hangs for you, and with nodma SDMA is disabled
> fully so you can't get any hangs that are SDMA related.

That workaround delays the hangs at best, and I have gotten hangs from OpenGL
games and also by using amdvlk.

Don't get me wrong, I'm not saying this bug report shouldn't be closed. I'm just
saying that you saying "soon" is very misleading. AMD still hasn't properly
fixed bugs that lead to hangs by just watching Firefox, and it's been MONTHS.
"soon" for them is months, apparently.


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-11-10 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #32 from Marko Popovic  ---
(In reply to Daniel Suarez from comment #31)
> Still, I can't even play Vulkan titles reliably because the system
> constantly hangs even with the workarounds in the bug report. AMD really
> needs to fix them.

Mesa 20.0 should fix Vulkan hangs for you, and with nodma SDMA is disabled
fully so you can't get any hangs that are SDMA related.


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-11-10 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #31 from Daniel Suarez  ---
(In reply to Marko Popovic from comment #30)
> SDMA hangs have nothing to do with ring_gfx hangs which were mostly radv
> related and are fixed now

Still, I can't even play Vulkan titles reliably because the system constantly
hangs even with the workarounds in the bug report. AMD really needs to fix
them.


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-11-10 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #30 from Marko Popovic  ---
(In reply to Daniel Suarez from comment #29)
> (In reply to Marko Popovic from comment #28)
> > I think this bug report can be closed now, Mesa 20 git basically fixes radv
> > related ring_gfx hangs, there is still hang that happens in Citra emulator
> > (ngg related) but AMD developers are aware of it so will probably get fixed
> > too.
> 
> Yeah.. "soon". Still waiting for them to fix bug 111481

SDMA hangs have nothing to do with ring_gfx hangs which were mostly radv
related and are fixed now


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-11-10 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #29 from Daniel Suarez  ---
(In reply to Marko Popovic from comment #28)
> I think this bug report can be closed now, Mesa 20 git basically fixes radv
> related ring_gfx hangs, there is still hang that happens in Citra emulator
> (ngg related) but AMD developers are aware of it so will probably get fixed
> too.

Yeah.. "soon". Still waiting for them to fix bug 111481


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-11-10 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #28 from Marko Popovic  ---
I think this bug report can be closed now. Mesa 20 git basically fixes the
radv-related ring_gfx hangs. There is still a hang that happens in the Citra
emulator (ngg-related), but AMD developers are aware of it, so it will probably
get fixed too.


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-11-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

James Wood  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |chryseus8...@gmail.com

--- Comment #27 from James Wood  ---
This doesn't seem to be exclusive to Navi GPUs. I've been having instances of
ring gfx timeouts freezing up the system in numerous games, such as Project
Zomboid (recently fixed by the developer) and ArmA 3, with the all too
familiar dmesg output:
[drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out or interrupted!
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered

I'm using:
Radeon RX 590 Series (POLARIS10, DRM 3.33.0, 5.3.8-arch1-1, LLVM 9.0.0)


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-11-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #26 from Marko Popovic  ---
(In reply to Ben Klein from comment #25)
> Created attachment 145918 [details]
> Journal excerpt vega56 ring gfx timeout, then gpu reset
> 
> I think I'm having this problem on a Vega 56, I didn't see anyone else
> mention that card here.
> 
> I attached the relevant log, I think it's this same issue, but someone
> correct me if I'm wrong.
> 
> OpenGL renderer string: Radeon RX Vega (VEGA10, DRM 3.33.0,
> 5.3.0-20-generic, LLVM 9.0.0)
> OpenGL core profile version string: 4.5 (Core Profile) Mesa 19.2.1
> 
> Running Pop!_OS:
> Linux robo-triangulum 5.3.0-20-generic
> #21+system76~1572304854~19.10~8caa3e6-Ubuntu SMP Tue Oct 29 00:4 x86_64
> x86_64 x86_64 GNU/Linux

Could be. There are a few patches in the latest RADV, so try out Mesa 20.0 git
to see if it fixes anything for you. Apparently radv hangs on Navi GPUs stopped
with those patches.


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-11-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #25 from Ben Klein  ---
Created attachment 145918
  --> https://bugs.freedesktop.org/attachment.cgi?id=145918&action=edit
Journal excerpt vega56 ring gfx timeout, then gpu reset

I think I'm having this problem on a Vega 56; I didn't see anyone else mention
that card here.

I attached the relevant log. I think it's the same issue, but someone correct
me if I'm wrong.

OpenGL renderer string: Radeon RX Vega (VEGA10, DRM 3.33.0, 5.3.0-20-generic,
LLVM 9.0.0)
OpenGL core profile version string: 4.5 (Core Profile) Mesa 19.2.1

Running Pop!_OS:
Linux robo-triangulum 5.3.0-20-generic
#21+system76~1572304854~19.10~8caa3e6-Ubuntu SMP Tue Oct 29 00:4 x86_64 x86_64
x86_64 GNU/Linux


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-11-05 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #24 from wychuchol  ---
(In reply to wychuchol from comment #23)
> The thing with those AER errors is that they can go on and on, and the reset
> happens a few minutes after the last logged error.
> This might be overheating. I managed to find out how to output sensors
> readings into a txt log and found that memory went up to 96 °C (or rather, it
> stayed there for about 1m 10s).
> Last reading before reset:
> amdgpu-pci-2800
> Adapter: PCI adapter
> vddgfx:       +1.16 V
> fan1:        1551 RPM  (min =    0 RPM, max = 3200 RPM)
> edge:         +74.0°C  (crit = +118.0°C, hyst = -273.1°C)
>                        (emerg = +99.0°C)
> junction:     +88.0°C  (crit = +99.0°C, hyst = -273.1°C)
>                        (emerg = +99.0°C)
> mem:          +96.0°C  (crit = +99.0°C, hyst = -273.1°C)
>                        (emerg = +99.0°C)
> power1:      162.00 W  (cap = 195.00 W)
> 
> k10temp-pci-00c3
> Adapter: PCI adapter
> Tdie:         +70.5°C  (high = +70.0°C)
> Tctl:         +70.5°C
> 
> Now the weird thing is: if this is in fact overheating, why didn't the fan go
> beyond 1600 rpm even once? The highest was about 1581 rpm, and I don't have
> the silent BIOS switched on (Sapphire Pulse RX 5700 XT, lever facing away
> from the video ports).

Okay, I don't think it's overheating anymore. I found a moment in Anomaly 1.5.0
that I can't get past without the system resetting, just before a psi storm in
Army Warehouses (I can provide a savefile).

Last sensors reading before crash (5 second increments):
amdgpu-pci-2800
Adapter: PCI adapter
vddgfx:       +1.01 V
fan1:        1560 RPM  (min =    0 RPM, max = 3200 RPM)
edge:         +69.0°C  (crit = +118.0°C, hyst = -273.1°C)
                       (emerg = +99.0°C)
junction:     +84.0°C  (crit = +99.0°C, hyst = -273.1°C)
                       (emerg = +99.0°C)
mem:          +80.0°C  (crit = +99.0°C, hyst = -273.1°C)
                       (emerg = +99.0°C)
power1:      227.00 W  (cap = 195.00 W)

k10temp-pci-00c3
Adapter: PCI adapter
Tdie:         +71.8°C  (high = +70.0°C)
Tctl:         +71.8°C
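
To make the "is it overheating?" check less of an eyeball exercise, a logged
sensors dump like the one above could be scanned for temperatures that sit
close to their critical threshold. This is only a sketch: the function name
near_critical and the 5 °C margin are assumptions for illustration, and the
parser only understands lines of the form "label: +NN.N°C  (crit = +NN.N°C".

```python
import re

# Matches temperature lines that carry a critical threshold, for example:
#   mem:          +96.0°C  (crit = +99.0°C, hyst = -273.1°C)
TEMP_LINE = re.compile(
    r"^[ \t]*(?P<label>[\w-]+):[ \t]*\+?(?P<value>-?\d+(?:\.\d+)?)°C"
    r"[ \t]+\(crit = \+?(?P<crit>-?\d+(?:\.\d+)?)°C",
    re.MULTILINE,
)


def near_critical(sensors_text, margin=5.0):
    """Return (label, value, crit) for readings within `margin` degrees of crit."""
    flagged = []
    for match in TEMP_LINE.finditer(sensors_text):
        value = float(match.group("value"))
        crit = float(match.group("crit"))
        if crit - value <= margin:
            flagged.append((match.group("label"), value, crit))
    return flagged
```

With the reading above nothing is flagged, while the earlier reading with
mem at +96.0°C against a +99.0°C limit would be.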


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-11-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #23 from wychuchol  ---
(In reply to wychuchol from comment #19)
> After some time in Witcher 3 GOTY, run with Lutris, the PC restarts on its
> own. I thought something was overheating (I've noticed graphics card memory
> in PSensor sometimes reaching 90, so I thought maybe that's what's
> happening), but I investigated kern.log and this always happened before that
> autonomous reset:
> 
> Nov  2 22:01:53 pop-os kernel: [  979.244964] pcieport :00:01.1: AER:
> Corrected error received: :01:00.0
> Nov  2 22:01:53 pop-os kernel: [  979.244967] nvme :01:00.0: AER: PCIe
> Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
> Nov  2 22:01:53 pop-os kernel: [  979.244968] nvme :01:00.0: AER:  
> device [1987:5012] error status/mask=1000/6000
> Nov  2 22:01:53 pop-os kernel: [  979.244968] nvme :01:00.0: AER:   
> [12] Timeout   
> Nov  2 22:01:53 pop-os kernel: [  979.262629] Emergency Sync complete

The thing with those AER errors is that they can go on and on, and the reset
happens a few minutes after the last logged error.
This might be overheating. I managed to find out how to output sensors readings
into a txt log and found that memory went up to 96 °C (or rather, it stayed
there for about 1m 10s).
Last reading before reset:
amdgpu-pci-2800
Adapter: PCI adapter
vddgfx:       +1.16 V
fan1:        1551 RPM  (min =    0 RPM, max = 3200 RPM)
edge:         +74.0°C  (crit = +118.0°C, hyst = -273.1°C)
                       (emerg = +99.0°C)
junction:     +88.0°C  (crit = +99.0°C, hyst = -273.1°C)
                       (emerg = +99.0°C)
mem:          +96.0°C  (crit = +99.0°C, hyst = -273.1°C)
                       (emerg = +99.0°C)
power1:      162.00 W  (cap = 195.00 W)

k10temp-pci-00c3
Adapter: PCI adapter
Tdie:         +70.5°C  (high = +70.0°C)
Tctl:         +70.5°C

Now the weird thing is: if this is in fact overheating, why didn't the fan go
beyond 1600 rpm even once? The highest was about 1581 rpm, and I don't have the
silent BIOS switched on (Sapphire Pulse RX 5700 XT, lever facing away from the
video ports).


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-11-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #22 from wychuchol  ---
(In reply to Marko Popovic from comment #21)
> What kernel/MESA combo are you using?

DRM 3.35.0, 5.4.0-050400rc5-generic, LLVM 9.0.0
Mesa 19.3.0-devel (git-ff6e148 2019-10-29 eoan-oibaf-ppa)

Or at least that's what I got from glxinfo | grep OpenGL

Stalker hung again after just a few minutes of playtime, so I don't know
whether any of the fixes actually fixed anything or just held things together
a bit more securely.

Nov  4 23:04:16 pop-os kernel: [100672.998576] [drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for fences timed out!
Nov  4 23:04:16 pop-os kernel: [100677.862509] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=23742723, emitted seq=23742725
Nov  4 23:04:16 pop-os kernel: [100677.862545] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process AnomalyDX11.exe pid 3904 thread AnomalyDX11.exe pid 3904
Nov  4 23:04:16 pop-os kernel: [100677.862547] [drm] GPU recovery disabled.
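
For anyone collecting these reports, the ring timeout lines can be pulled out
of a kernel log with a short script. The sketch below is an assumption about
what is useful to extract, not something from the driver: the function name
find_ring_timeouts and the "pending" field (emitted minus signaled sequence
number, i.e. how many fences were still outstanding) are made up here. It
collapses whitespace first because mail clients wrap the long log lines.

```python
import re

# Matches amdgpu job-timeout messages such as:
#   ring gfx_0.0.0 timeout, signaled seq=23742723, emitted seq=23742725
RING_TIMEOUT = re.compile(
    r"ring (?P<ring>\S+) timeout, signaled seq=(?P<signaled>\d+), "
    r"emitted seq=(?P<emitted>\d+)"
)


def find_ring_timeouts(log_text):
    """Return one dict per ring timeout message found in the log text."""
    # Undo line wrapping: treat any run of whitespace as a single space.
    flat = " ".join(log_text.split())
    timeouts = []
    for match in RING_TIMEOUT.finditer(flat):
        signaled = int(match.group("signaled"))
        emitted = int(match.group("emitted"))
        timeouts.append(
            {
                "ring": match.group("ring"),
                "signaled": signaled,
                "emitted": emitted,
                "pending": emitted - signaled,
            }
        )
    return timeouts
```

Run on the excerpt above, this reports the gfx_0.0.0 ring with two fences
emitted but not yet signaled. Messages in the shorter "ring gfx timeout, but
soft recovered" form carry no sequence numbers and are not matched.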


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-11-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #21 from Marko Popovic  ---
(In reply to wychuchol from comment #20)
> Barely started the PC, opened Pale Moon, got a cursor-moves-only hang and
> then dozens of graphical artifacts on screen, like square patches of glitches.
> 
> Nov  3 13:15:10 pop-os kernel: [  133.998883]
> [drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for
> fences timed out!
> Nov  3 13:15:10 pop-os kernel: [  139.118912] [drm:amdgpu_job_timedout
> [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=11145, emitted
> seq=11148
> Nov  3 13:15:10 pop-os kernel: [  139.118956] [drm:amdgpu_job_timedout
> [amdgpu]] *ERROR* Process information: process gnome-shell pid 2588 thread
> gnome-shel:cs0 pid 2606
> Nov  3 13:15:10 pop-os kernel: [  139.118958] [drm] GPU recovery disabled.
> 
> Then sometime later I got ring gfx related crash with Witcher 3 which didn't
> happen before:
> Nov  3 14:08:47 pop-os kernel: [ 3185.175837]
> [drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for
> fences timed out!
> Nov  3 14:08:47 pop-os kernel: [ 3190.039750] [drm:amdgpu_job_timedout
> [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=1448573, emitted
> seq=1448575
> Nov  3 14:08:47 pop-os kernel: [ 3190.039786] [drm:amdgpu_job_timedout
> [amdgpu]] *ERROR* Process information: process witcher3.exe pid 8100 thread
> witcher3.exe pid 10168
> Nov  3 14:08:47 pop-os kernel: [ 3190.039788] [drm] GPU recovery disabled.

What kernel/MESA combo are you using?


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-11-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #20 from wychuchol  ---
Barely started the PC, opened Pale Moon, got a cursor-moves-only hang and then
dozens of graphical artifacts on screen, like square patches of glitches.

Nov  3 13:15:10 pop-os kernel: [  133.998883] [drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for fences timed out!
Nov  3 13:15:10 pop-os kernel: [  139.118912] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=11145, emitted seq=11148
Nov  3 13:15:10 pop-os kernel: [  139.118956] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process gnome-shell pid 2588 thread gnome-shel:cs0 pid 2606
Nov  3 13:15:10 pop-os kernel: [  139.118958] [drm] GPU recovery disabled.

Then some time later I got a ring gfx related crash with Witcher 3, which
didn't happen before:
Nov  3 14:08:47 pop-os kernel: [ 3185.175837] [drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for fences timed out!
Nov  3 14:08:47 pop-os kernel: [ 3190.039750] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=1448573, emitted seq=1448575
Nov  3 14:08:47 pop-os kernel: [ 3190.039786] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process witcher3.exe pid 8100 thread witcher3.exe pid 10168
Nov  3 14:08:47 pop-os kernel: [ 3190.039788] [drm] GPU recovery disabled.


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-11-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #19 from wychuchol  ---
This perhaps needs a separate entry, but it's related (it didn't happen before
I tried RADV_PERFTEST=aco and AMD_DEBUG="nongg,nodma"), so I'll post it in case
someone has had the same issues as me.

After some time in Witcher 3 GOTY, run with Lutris, the PC restarts on its own.
I thought something was overheating (I've noticed graphics card memory in
PSensor sometimes reaching 90, so I thought maybe that's what's happening), but
I investigated kern.log and this always happened before that autonomous reset:

Nov  2 22:01:53 pop-os kernel: [  979.244964] pcieport :00:01.1: AER: Corrected error received: :01:00.0
Nov  2 22:01:53 pop-os kernel: [  979.244967] nvme :01:00.0: AER: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
Nov  2 22:01:53 pop-os kernel: [  979.244968] nvme :01:00.0: AER:   device [1987:5012] error status/mask=1000/6000
Nov  2 22:01:53 pop-os kernel: [  979.244968] nvme :01:00.0: AER:   [12] Timeout
Nov  2 22:01:53 pop-os kernel: [  979.262629] Emergency Sync complete

A solution I found is to add pci=nommconf in /etc/default/grub to the line 
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash" (so it looks like this:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pci=nommconf").
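
For anyone who would rather script that edit than do it by hand, here is a
minimal sketch that works on the text of /etc/default/grub. The function name
add_kernel_param is made up for illustration, and the sketch assumes the
GRUB_CMDLINE_LINUX_DEFAULT value is wrapped in double quotes as in the example
above. It only transforms a string; reading and writing the real file (as
root, ideally after making a backup) is left to the caller.

```python
import re

# Captures the opening of the assignment, the current parameters, and the closing quote.
CMDLINE = re.compile(r'^(GRUB_CMDLINE_LINUX_DEFAULT=")([^"]*)(")', re.MULTILINE)


def add_kernel_param(grub_text, param):
    """Append `param` to GRUB_CMDLINE_LINUX_DEFAULT unless it is already present."""

    def replace(match):
        params = match.group(2).split()
        if param not in params:
            params.append(param)
        return match.group(1) + " ".join(params) + match.group(3)

    return CMDLINE.sub(replace, grub_text)
```

After the file is changed, the bootloader configuration has to be regenerated
(for example with update-grub on Debian/Ubuntu-style systems) and the machine
rebooted before the parameter takes effect.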


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-11-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #18 from wychuchol  ---
It happened again. This time without a game or anything running, barely logged
in and opened a program and boom.

Nov  2 12:42:07 pop-os kernel: [ 1675.883513] [drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for fences timed out!
Nov  2 12:42:07 pop-os kernel: [ 1680.747513] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=2714, emitted seq=2716
Nov  2 12:42:07 pop-os kernel: [ 1680.747549] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 2293 thread Xorg:cs0 pid 2294
Nov  2 12:42:07 pop-os kernel: [ 1680.747551] [drm] GPU recovery disabled.

Only the cursor moved and no clicks registered; I restarted with REISUB.
I tried registering at https://gitlab.freedesktop.org/mesa/mesa/issues but I'm
getting no account confirmation mail, so I can't post it there.


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-11-01 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #17 from wychuchol  ---
(In reply to Andrew Sheldon from comment #16)
> Ring gfx type hangs tend to be in Mesa. Report here:
> https://gitlab.freedesktop.org/mesa/mesa/issues
> 
> Also I'm not sure how up to date the Oibaf repo is, but Mesa git landed ACO
> recently for Navi cards. You can try with RADV_PERFTEST=aco environment
> variable set if your Mesa is new enough, and you might have better luck with
> hangs.

Thank you so very much. There's no way to be sure, since the hangs seemed to
happen at random, but I think I would have experienced at least 2 or 3 hangs in
the time I've tested it, and it has been a smooth ride so far. No performance
impact either, but running this game as I do, I'm supposedly putting most of
the calculations on the CPU, not the GPU.


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-10-31 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #16 from Andrew Sheldon  ---
(In reply to wychuchol from comment #14)
> RX 5700 XT Pop OS 19.10 latest Oibaf mesa not sure what llvm
> Anomaly 1.5.0 update 3 standalone 64 bit mod for S.T.A.L.K.E.R. Call of
> Pripyat running under wine d3dx11_43->dxvk (winetricks dxvk d3dcompiler_43
> d3dx11_43)
> 
> Oct 30 02:49:30 pop-os kernel: [ 4864.627343]
> [drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for
> fences timed out!
> Oct 30 02:49:30 pop-os kernel: [ 4869.231450] [drm:amdgpu_job_timedout
> [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=2626284, emitted
> seq=2626286
> Oct 30 02:49:30 pop-os kernel: [ 4869.231486] [drm:amdgpu_job_timedout
> [amdgpu]] *ERROR* Process information: process AnomalyDX11.exe pid 5791
> thread AnomalyDX11.exe pid 5791
> Oct 30 02:49:30 pop-os kernel: [ 4869.231487] [drm] GPU recovery disabled.
> 
> Happens at random. Sometimes hangs straight away, sometimes can go over an
> hour without crash. Complete crash, no option available besides hard reset.
> Not even mouse pointer would move (as with sdma0 hang).
> 
> I'm sorry if it's not the right place to report this, I'm somewhat new to
> all of this.

Ring gfx type hangs tend to be in Mesa. Report here:
https://gitlab.freedesktop.org/mesa/mesa/issues

Also I'm not sure how up to date the Oibaf repo is, but Mesa git landed ACO
recently for Navi cards. You can try with RADV_PERFTEST=aco environment
variable set if your Mesa is new enough, and you might have better luck with
hangs.
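
For anyone who launches games from a script, here is a minimal Python sketch of
setting that variable for a single process only. The helper names env_with_aco
and launch are made up for illustration; the only things taken from the advice
above are the RADV_PERFTEST variable and its aco value. It assumes
RADV_PERFTEST holds a comma-separated list of flags, so an existing value is
extended rather than overwritten.

```python
import os
import subprocess


def env_with_aco(base_env=None):
    """Return a copy of the environment with 'aco' present in RADV_PERFTEST."""
    env = dict(os.environ if base_env is None else base_env)
    # Keep any flags that are already set; RADV_PERFTEST is comma-separated.
    flags = [f for f in env.get("RADV_PERFTEST", "").split(",") if f]
    if "aco" not in flags:
        flags.append("aco")
    env["RADV_PERFTEST"] = ",".join(flags)
    return env


def launch(command, base_env=None):
    """Start command (a list of arguments) with ACO requested for it only."""
    return subprocess.Popen(command, env=env_with_aco(base_env))


# Hypothetical usage; the command line is a placeholder, not a tested setup:
# launch(["wine", "AnomalyDX11.exe"])
```

Setting the variable per process keeps the rest of the desktop on the default
shader compiler, which makes it easier to tell whether ACO changed anything.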


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-10-31 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #15 from wychuchol  ---
Forgot to add, Kernel v5.4-rc5.


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-10-31 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #14 from wychuchol  ---
RX 5700 XT, Pop OS 19.10, latest Oibaf Mesa (not sure which LLVM version).
Anomaly 1.5.0 update 3, a standalone 64-bit mod for S.T.A.L.K.E.R. Call of
Pripyat, running under wine with d3dx11_43->dxvk (winetricks dxvk
d3dcompiler_43 d3dx11_43).

Oct 30 02:49:30 pop-os kernel: [ 4864.627343]
[drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for fences
timed out!
Oct 30 02:49:30 pop-os kernel: [ 4869.231450] [drm:amdgpu_job_timedout
[amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=2626284, emitted
seq=2626286
Oct 30 02:49:30 pop-os kernel: [ 4869.231486] [drm:amdgpu_job_timedout
[amdgpu]] *ERROR* Process information: process AnomalyDX11.exe pid 5791 thread
AnomalyDX11.exe pid 5791
Oct 30 02:49:30 pop-os kernel: [ 4869.231487] [drm] GPU recovery disabled.

Happens at random. Sometimes hangs straight away, sometimes can go over an hour
without crash. Complete crash, no option available besides hard reset. Not even
mouse pointer would move (as with sdma0 hang).

I'm sorry if it's not the right place to report this, I'm somewhat new to all
of this.


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-10-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

sambolinux  changed:

       What|Removed |Added
------------------------------
   Priority|not set |medium


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-10-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #13 from Pierre-Eric Pelloux-Prayer  ---
For hangs involving radv the AMD_DEBUG options aren't relevant.
You should use RADV_DEBUG instead (probably doesn't support the same values).

Also opening a bug in https://gitlab.freedesktop.org/mesa/mesa/issues is a good
idea since gfx hangs are most likely a driver issue (radv or radeonsi,
depending on the API used).
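
As a rough illustration of that split, the lookup below maps the API in use to
the Mesa driver and the debug variable that applies to it. The table and the
function name debug_variable_for are only a sketch; it assumes Mesa's radv is
the Vulkan driver in use (not AMDVLK) and says nothing about which values each
variable accepts.

```python
# API -> (Mesa driver, debug environment variable), per the comment above.
# Assumption: Vulkan is served by radv and OpenGL by radeonsi.
DEBUG_VARIABLE_BY_API = {
    "vulkan": ("radv", "RADV_DEBUG"),
    "opengl": ("radeonsi", "AMD_DEBUG"),
}


def debug_variable_for(api):
    """Return (driver, variable) for an API name, ignoring case."""
    try:
        return DEBUG_VARIABLE_BY_API[api.lower()]
    except KeyError:
        raise ValueError(f"unknown API: {api!r}") from None


driver, variable = debug_variable_for("Vulkan")
print(f"{driver}: set {variable}")
```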


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-10-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #12 from shahul  ---

I am working with a Navi10 RX 5700 and I hit the issue below when I run the
unigine-heaven benchmark:

 [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed
out!
 [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled
seq=5075872, emitted seq=5075874
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process
heaven_x64 pid 13723 thread heaven_x64:cs0 pid 13741
 [drm] GPU recovery disabled.

Is there any fix for it?

Thanks in advance.


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-10-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #11 from Marko Popovic  ---
(In reply to takios+fdbugs from comment #10)
> (In reply to Marko Popovic from comment #9)
> > https://cgit.freedesktop.org/mesa/mesa/commit/
> > ?id=a2a68d551c1c2a4f13761ffa8f3f6f13fee7a384
> > 
> > This might actually fix the ring_gfx type hangs or even sdma ones at least
> > for Vulkan API? Not exactly sure but will also be testing the latest MESA
> > builds from Oibaf's PPA in following days and report back on the issue :)
> 
> Sadly, I'm still getting the ring_gfx hangs after a few minutes of playing
> Trackmania 2.

Oh yes I forgot to add a reply here. It didn't solve any of the hangs for me
either.


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-10-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #10 from takios+fdb...@takios.de ---
(In reply to Marko Popovic from comment #9)
> https://cgit.freedesktop.org/mesa/mesa/commit/
> ?id=a2a68d551c1c2a4f13761ffa8f3f6f13fee7a384
> 
> This might actually fix the ring_gfx type hangs or even sdma ones at least
> for Vulkan API? Not exactly sure but will also be testing the latest MESA
> builds from Oibaf's PPA in following days and report back on the issue :)

Sadly, I'm still getting the ring_gfx hangs after a few minutes of playing
Trackmania 2.


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-10-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #9 from Marko Popovic  ---
https://cgit.freedesktop.org/mesa/mesa/commit/?id=a2a68d551c1c2a4f13761ffa8f3f6f13fee7a384

This might actually fix the ring_gfx type hangs, or even the sdma ones, at
least for the Vulkan API? I'm not exactly sure, but I will test the latest Mesa
builds from Oibaf's PPA in the following days and report back on the issue :)


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-09-30 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #8 from Marko Popovic  ---
(In reply to Doug Ty from comment #7)
> (In reply to Marko Popovic from comment #6)
> > (In reply to Doug Ty from comment #5)
> > > I've been getting this too with Minecraft:  
> > > https://bugs.freedesktop.org/show_bug.cgi?id=111669
> > > 
> > > For my particular case at least, AMD_DEBUG=nodma seems to fix it
> > 
> > You are refering to sdma0 / sdma1 type hang which is tracked
> > here:https://bugs.freedesktop.org/show_bug.cgi?id=111481
> > 
> > For ring_gfx hangs they're quite more reproducible and are not affected by
> > AMD_DEBUG=nodma or AMD_DEBUG=nongg which I already mentioned above in the
> > bug description.
> 
> Sorry, but this is incorrect. My Minecraft hang is most definitely a ring
> gfx hang, *not* sdma. I've posted logs and apitraces in the linked thread if
> you'd like to check for yourself.
> 
> I can't explain why nodma isn't working for you, perhaps it doesn't work for
> game? Have you tried putting it in /etc/environment so it's system-wide? I
> don't know what to tell you regarding nodma, but my hang is definitely ring
> gfx as well.

I guess we just have many different types of hangs then... ring_gfx hangs seem
more mysterious than sdma0/1 hangs, since there is no "universal" workaround
for them. nodma stops the global sdma-type hangs for me, and nongg stops the
citra-related ring_gfx hang, but neither variable stops the Starcraft II and
RoTR ring_gfx-type hangs for me, so it's really confusing.


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-09-30 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #7 from Doug Ty  ---
(In reply to Marko Popovic from comment #6)
> (In reply to Doug Ty from comment #5)
> > I've been getting this too with Minecraft:  
> > https://bugs.freedesktop.org/show_bug.cgi?id=111669
> > 
> > For my particular case at least, AMD_DEBUG=nodma seems to fix it
> 
> You are refering to sdma0 / sdma1 type hang which is tracked
> here:https://bugs.freedesktop.org/show_bug.cgi?id=111481
> 
> For ring_gfx hangs they're quite more reproducible and are not affected by
> AMD_DEBUG=nodma or AMD_DEBUG=nongg which I already mentioned above in the
> bug description.

Sorry, but this is incorrect. My Minecraft hang is most definitely a ring gfx
hang, *not* sdma. I've posted logs and apitraces in the linked thread if you'd
like to check for yourself.

I can't explain why nodma isn't working for you; perhaps it doesn't work for
your game? Have you tried putting it in /etc/environment so it's system-wide? I
don't know what to tell you regarding nodma, but my hang is definitely ring gfx
as well.


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-09-30 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #6 from Marko Popovic  ---
(In reply to Doug Ty from comment #5)
> I've been getting this too with Minecraft:  
> https://bugs.freedesktop.org/show_bug.cgi?id=111669
> 
> For my particular case at least, AMD_DEBUG=nodma seems to fix it

(In reply to Marko Popovic from comment #0)
> There is another type of freeze/hang happening when playing Starcraft II via
> D9VK. This one doesn't seem to be related to either ngg or dma because I
> have them both disabled by AMD_DEBUG=nodma and AMD_DEBUG=nongg and the hangs
> occur anyway, on exactly the same place every time.

You are referring to the sdma0 / sdma1 type hang, which is tracked here:
https://bugs.freedesktop.org/show_bug.cgi?id=111481

ring_gfx hangs are considerably more reproducible and are not affected by
AMD_DEBUG=nodma or AMD_DEBUG=nongg, which I already mentioned above in the bug
description.


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-09-30 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #5 from Doug Ty  ---
I've been getting this too with Minecraft:  
https://bugs.freedesktop.org/show_bug.cgi?id=111669

For my particular case at least, AMD_DEBUG=nodma seems to fix it


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-09-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #4 from Daniel Lu  ---
I am seeing a similar hang in Starcraft II. Unlike Marko, I am not using d9vk;
I'm using wine-nine instead. The hang doesn't happen in all games but seems to
be particularly frequent in the coop mission "dead of night".

Using mesa-git 19.3.0_devel.115092.3f5b541fc8b-1.


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-09-22 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #3 from Daniel Lu  ---
Created attachment 145465
  --> https://bugs.freedesktop.org/attachment.cgi?id=145465&action=edit
output of running sudo umr -R gfx_0.0.0


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-09-22 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #2 from Daniel Lu  ---
Created attachment 145464
  --> https://bugs.freedesktop.org/attachment.cgi?id=145464&action=edit
dmesg output


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-09-22 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

--- Comment #1 from Jeremy Attali  ---
Not sure if this will help anyone else, but I found a workaround in my case
with DOOM. I was having the same crashes that Marko described with Starcraft
II, so I tried the following:

- In Steam, I disabled the In Game Steam Overlay
- I switched the Graphics API from OpenGL to Vulkan

I have not had any crashes so far, but I haven't tried to isolate which of the
two changes made the difference.

Packages:
linux 5.3.arch1-1
linux-firmware-agd5f-radeon-navi10 2019.09.13.18.36-1
mesa-git 1:19.3.0_devel.115574.40087ffc5b9-1
vulkan-radeon-git 1:19.3.0_devel.115574.40087ffc5b9-1
libdrm 2.4.99-1
lib32-mesa-git 1:19.3.0_devel.115574.40087ffc5b9-1
lib32-vulkan-radeon-git 1:19.3.0_devel.115574.40087ffc5b9-1
lib32-libdrm 2.4.99-1


[Bug 111763] ring_gfx hangs/freezes on Navi gpus

2019-09-22 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111763

            Bug ID: 111763
           Summary: ring_gfx hangs/freezes on Navi gpus
           Product: DRI
           Version: unspecified
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: major
          Priority: not set
         Component: DRM/AMDgpu
          Assignee: dri-devel@lists.freedesktop.org
          Reporter: popovic.ma...@protonmail.com

I'm opening this as a separate bug to track ring_gfx related hangs, so that
https://bugs.freedesktop.org/show_bug.cgi?id=111481 can stay focused on the
sdma0/1 type freezes, which are the ones that seem to cause random "out of the
blue" hangs on the desktop.

There is another type of freeze/hang that happens when playing Starcraft II via
D9VK. This one doesn't seem to be related to either ngg or dma, because I have
both disabled with AMD_DEBUG=nodma and AMD_DEBUG=nongg and the hangs occur
anyway, in exactly the same place every time.

Error logs:
sep 17 11:48:24 Marko-PC kernel: [drm:amdgpu_dm_commit_planes.constprop.0
[amdgpu]] *ERROR* Waiting for fences timed out or interrupted!
sep 17 11:48:24 Marko-PC kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR*
ring gfx_0.0.0 timeout, signaled seq=2361623, emitted seq=2361625
sep 17 11:48:24 Marko-PC kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR*
Process information: process SC2_x64.exe pid 20236 thread SC2_x64.exe pid 20236
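
To tell the two hang types apart from a log, the ring name in the
amdgpu_job_timedout line is the useful part. Below is a minimal Python sketch
that pulls the ring name and the two sequence numbers out of log text like the
lines above. The function name parse_ring_timeouts and the "gfx" / "sdma" /
"other" labels are made up for illustration, and classifying by the ring-name
prefix is an assumption based on the gfx_0.0.0 and sdma0/sdma1 names mentioned
in these reports.

```python
import re

# Matches the timeout message; \s+ tolerates the line wrapping seen in mail.
TIMEOUT_RE = re.compile(
    r"ring\s+(?P<ring>\S+)\s+timeout,\s+signaled\s+seq=(?P<signaled>\d+),"
    r"\s+emitted\s+seq=(?P<emitted>\d+)"
)


def parse_ring_timeouts(log_text):
    """Return one dict per ring timeout message found in log_text."""
    hits = []
    for match in TIMEOUT_RE.finditer(log_text):
        ring = match.group("ring")
        if ring.startswith("sdma"):
            kind = "sdma"
        elif ring.startswith("gfx"):
            kind = "gfx"
        else:
            kind = "other"
        hits.append({
            "ring": ring,
            "kind": kind,
            "signaled": int(match.group("signaled")),
            "emitted": int(match.group("emitted")),
        })
    return hits


sample = (
    "sep 17 11:48:24 Marko-PC kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR*\n"
    "ring gfx_0.0.0 timeout, signaled seq=2361623, emitted seq=2361625\n"
)
for hit in parse_ring_timeouts(sample):
    # Jobs emitted but not yet signaled when the timeout fired.
    print(hit["ring"], hit["kind"], hit["emitted"] - hit["signaled"])
```

A "gfx" result would belong in this bug (or the Mesa tracker), while "sdma"
would point at bug 111481 instead.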

I will try to provide trace files captured with renderdoc for the described
issues. The hangs also happen in native games like Rise of the Tomb Raider, on
Vulkan, etc. I will provide as much info as possible.

Using Kernel 5.3, MESA 19.2 and llvm9.
