https://bugs.freedesktop.org/show_bug.cgi?id=108493
--- Comment #9 from Timur Kristóf <ven...@msn.com> ---
I think I discovered a possible reason for this issue. If you look at the
DDEBUG dumps, it says in several places: "This slot was corrupted in GPU
memory". So I began to suspect something was wrong with the VRAM.
After looking around a bit, I found that the amdgpu driver does not honor the
voltage settings from the VBIOS and runs the memory at lower voltages instead.
So basically the driver undervolts the VRAM without my asking it to. I guess
this might be considered a feature by some people.
However, when I manually edit pp_od_clk_voltage to increase the OD_MCLK
voltages, the card becomes stable and the GPU hang is gone. (Or at the very
least I haven't seen a hang yet, whereas previously it used to hang in less
than a minute.)
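
For anyone who wants to try the same workaround, here is a rough sketch of the
edit. It assumes the "m <level> <clock> <voltage>" and "c" (commit) syntax
described in the amdgpu kernel documentation for pp_od_clk_voltage. The
function names and the OD_FILE variable are made up for this example. Some
kernels may also need overdrive enabled through amdgpu.ppfeaturemask, or
power_dpm_force_performance_level set to manual first, so treat this as a
starting point rather than a recipe.

```shell
#!/bin/sh
# Sketch: raise the voltage of one OD_MCLK level through the overdrive file.
# OD_FILE (hypothetical name) can be overridden to try this on a scratch file.
OD_FILE="${OD_FILE:-/sys/class/drm/card0/device/pp_od_clk_voltage}"

# od_mclk_cmd LEVEL CLOCK_MHZ VOLTAGE_MV
# Prints the string that edits one memory clock state.
od_mclk_cmd() {
    printf 'm %s %s %s\n' "$1" "$2" "$3"
}

# set_mclk_level LEVEL CLOCK_MHZ VOLTAGE_MV
# Writes the edit, then "c" to commit the modified table.
set_mclk_level() {
    od_mclk_cmd "$1" "$2" "$3" > "$OD_FILE" || return 1
    printf 'c\n' > "$OD_FILE"
}

# Example for the card in this report (run as root):
# set_mclk_level 2 1750 1000
```

Writing "r" to the same file should reset the table to the driver defaults if
the new values cause trouble.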
In my case, the VBIOS wants to set the MCLK voltages to 1000 mV at all
frequencies, while amdgpu sets them to 750 mV, 800 mV, and 900 mV. And it turns
out that 900 mV is just too low for my card at 1750 MHz.
[root@timur-xps ~]# cat /sys/class/drm/card0/device/pp_od_clk_voltage
OD_SCLK:
0: 300MHz 750mV
1: 588MHz 765mV
2: 952MHz 900mV
3: 1041MHz 975mV
4: 1106MHz 1031mV
5: 1168MHz 1093mV
6: 1209MHz 1143mV
7: 1244MHz 1150mV
OD_MCLK:
0: 300MHz 750mV
1: 1000MHz 800mV
2: 1750MHz 900mV
OD_RANGE:
SCLK: 300MHz 2000MHz
MCLK: 300MHz 2250MHz
VDDC: 750mV 1150mV
[root@timur-xps ~]# cat /sys/kernel/debug/dri/0/amdgpu_vbios > mybios.rom
[root@timur-xps ~]# pbec -i mybios.rom -s -r MEMORY_CLOCK
----
[DEFAULT] ATOM_MCLK_ENTRY Array
----
Entry: 0
Frequency: 300 MHz.
Voltage:. 1000 MV
Entry: 1
Frequency: 1000 MHz.
Voltage:. 1000 MV
Entry: 2
Frequency: 1750 MHz.
Voltage:. 1000 MV
----
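
The mismatch between the two dumps above can also be checked mechanically. A
minimal sketch, assuming the pp_od_clk_voltage layout shown earlier: it reads
that text on stdin and lists every OD_MCLK level below an expected voltage. The
function name is hypothetical, and the 1000 mV default is only what pbec
reported for this particular VBIOS.

```shell
#!/bin/sh
# Sketch: list OD_MCLK levels whose voltage is below an expected value.
# undervolted_mclk [EXPECTED_MV]   (reads pp_od_clk_voltage text on stdin)
undervolted_mclk() {
    awk -v want="${1:-1000}" '
        # A section header such as OD_SCLK:, OD_MCLK: or OD_RANGE:
        /^OD_[A-Z]+:/ { in_mclk = ($1 == "OD_MCLK:"); next }
        # A level line inside OD_MCLK, e.g. "2: 1750MHz 900mV"
        in_mclk && NF >= 3 {
            mv = $3
            sub(/mV$/, "", mv)          # strip the unit: "900mV" -> "900"
            if (mv + 0 < want + 0)
                printf "level %s %s %smV (expected %smV)\n", $1, $2, mv, want
        }
    '
}

# Example (hypothetical invocation):
# undervolted_mclk 1000 < /sys/class/drm/card0/device/pp_od_clk_voltage
```

Fed the table from this report, it would flag all three MCLK levels, since
every one of them is below 1000 mV.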
Here is some info about the VBIOS:
[root@timur-xps ~]# cat /sys/class/drm/card0/device/subsystem_device
0xe343
[root@timur-xps ~]# cat /sys/class/drm/card0/device/subsystem_vendor
0x1da2
[root@timur-xps ~]# cat /sys/class/drm/card0/device/vbios_version
113-D00034-S07