https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #172 from line...@xcpp.org ---
I had dpm=2 as a module option. GPU initialization failure does not occur
without dpm=2
--
You are receiving this mail because:
You are the assignee for the bug.
https://bugs.freedesktop.org/show_bug.cgi?id=110674
Alex Deucher changed:
What                        |Removed   |Added
Attachment #146026 mime type|text/x-log|text/plain
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #171 from line...@xcpp.org ---
Created attachment 146026
--> https://bugs.freedesktop.org/attachment.cgi?id=146026&action=edit
5.4.0-arch1-1 GPU initialization fails
With kernel version 5.4.0-arch1-1 the GPU can flat out no longer
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #170 from Peter Hercek ---
Maybe this helps since there is a stack trace. The GUI stopped responding, so I
shut it down over ssh. The kernel crashed during the shutdown on
5.3.6-arch1-1-ARCH even with amdgpu.dpm=0. That is the option which is
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #169 from picar...@live.de ---
I am using a Radeon VII with Arch Linux, a 1440p 144 Hz monitor and a 4K 60 Hz
monitor, and I had crashes similar to the others here if I tried running the
1440p monitor at 144 Hz; at 60 Hz it was stable. This be
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #168 from line...@xcpp.org ---
Created attachment 145784
--> https://bugs.freedesktop.org/attachment.cgi?id=145784&action=edit
5.3.7: Fence fallback timer expired on ring
Here is a freeze which went a bit differently.
This time t
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #167 from Alex Deucher ---
(In reply to Peter Hercek from comment #166)
> I got the crash after 4 days of use. It looks the same as before:
> ring sdma0 timeout, gpu reset (allegedly successful), many skipped IBs, and
> failure to in
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #166 from Peter Hercek ---
I tried 5.3.6-arch1-1 on Arch Linux with 3 DP monitors. It should contain the
patch based on the comment from line...@xcpp.org.
I got the crash after 4 days of use. It looks the same as before:
ring sdma0
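
Several reports in this thread identify a freeze by the same handful of dmesg messages. As a rough aid for comparing logs, here is a minimal Python sketch that counts those messages in a saved log. The signature strings are the ones quoted in the comments here; the dictionary keys and the function name `count_signatures` are made up for illustration, and the list is not claimed to be exhaustive.

```python
import re

# Failure messages quoted in this thread. The key names are hypothetical
# labels chosen for this sketch, not identifiers used by the kernel.
SIGNATURES = {
    "sdma_timeout": re.compile(r"ring sdma0 timeout"),
    "fence_fallback": re.compile(r"Fence fallback timer expired on ring"),
    "uclk_hard_min": re.compile(r"Set hard min uclk failed"),
    "sched_not_ready": re.compile(r"is not ready, skipping"),
}


def count_signatures(log_text):
    """Return a dict mapping each label to the number of matching log lines."""
    counts = {name: 0 for name in SIGNATURES}
    for line in log_text.splitlines():
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

It could be run on a saved log with something like `count_signatures(open("dmesg.txt").read())`, where `dmesg.txt` is a placeholder file name.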
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #165 from Tom B ---
I just tried 5.3.5 (which is the latest in the arch repo) and it's working fine
for me.
I do have an issue on Wayland. If the screen turns off, Wayland crashes and I
have to hard reset. The log shows
Oct 14 17:
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #164 from line...@xcpp.org ---
(In reply to Tom B from comment #163)
> Gargoyle, linedot, can you confirm whether this crash is with both patches
> applied?
>
> I'm still on 5.3.1 patched and haven't had a single crash.
For 5.3.1 I'
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #163 from Tom B ---
Gargoyle, linedot, can you confirm whether this crash is with both patches
applied?
I'm still on 5.3.1 patched and haven't had a single crash.
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #162 from line...@xcpp.org ---
Created attachment 145730
--> https://bugs.freedesktop.org/attachment.cgi?id=145730&action=edit
Freeze/Black screen/Crash on 5.3.6
Apologies, I have been on vacation and thus away from my main system.
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #161 from Gargoyle ---
Hi there. I've been trying to solve some lockups and pauses with my system and
have just read this entire thread.
The good news is that I am another Radeon VII owner having the same problems
and I am willing
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #160 from ReddestDream ---
Well, today I had a hard freeze using more than one display with Radeon VII.
Back to Radeon VII + iGPU . . . :(
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #159 from ReddestDream ---
Oh. Also,
cat /sys/kernel/debug/dri/0/amdgpu_pm_info
Now seems to work on 5.3.4 with more than one monitor in. It doesn't report
nonsense values like 0 watts like it did before. :)
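
For anyone scripting this check, a minimal Python sketch follows. It only reads the debugfs file named above and returns `None` when the file is missing or unreadable (debugfs normally requires root, and another comment in this thread reports the file is absent with amdgpu.dpm=0). The function name `read_pm_info` is made up for this sketch, and no assumption is made about the layout of the file's contents.

```python
from pathlib import Path

# Path taken from the command in the comment above.
PM_INFO = Path("/sys/kernel/debug/dri/0/amdgpu_pm_info")


def read_pm_info(path=PM_INFO):
    """Return the text of the pm_info file, or None if it cannot be read."""
    try:
        return Path(path).read_text()
    except (FileNotFoundError, PermissionError):
        # Missing file or no access to debugfs for this user.
        return None
```

Calling `read_pm_info()` as root would return the same text that `cat` prints.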
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #158 from ReddestDream ---
More good news. It seems that 5.3.4 does work for me and doesn't (at least
immediately since I'm typing this from there right now) fall apart into a
glitchy mess.
I'm still not really sure of the complete
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #157 from ReddestDream ---
@Tom B. Well, some good news. Kernel 5.3.4 should have the patches for Radeon
VII included now. I'll do some more tests on that ...
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #156 from Tom B ---
This is strange because with a patched 5.3.1, I have perfect stability. An
uptime of over a week and no issues. Are you saying that the issue comes back
in 5.4? Hopefully not as Linux 5.4 + Mesa 19.3 looks to have
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #155 from ReddestDream ---
So, I've done some tests with 5.4-rc1 and it seems like I'm getting similar
results to line...@xcpp.org and sehell...@gmail.com. I'm using GNOME with
Wayland (which works fine with only 1 display). Sometime
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #154 from line...@xcpp.org ---
Created attachment 145623
--> https://bugs.freedesktop.org/attachment.cgi?id=145623&action=edit
5.4.0-rc1 hangup
dmesg with 5.4.0-rc1.
System freezes and becomes unresponsive to input like before
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #153 from ReddestDream ---
Just FYI, it appears that kernel 5.3.2 does not have the Vega 20 fix commits
that Alex Deucher mentioned.
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #152 from ReddestDream ---
Kernel 5.4-rc1, the first kernel version that includes the Vega 20 patches
noted by Alex Deucher, is now out and in linux-mainline on Arch Linux AUR. :)
I plan to do some testing of this version over the n
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #151 from line...@xcpp.org ---
Created attachment 145583
--> https://bugs.freedesktop.org/attachment.cgi?id=145583&action=edit
5.3.1 patched, xorg crash
And here is a dmesg of just an X session crashing
https://bugs.freedesktop.org/show_bug.cgi?id=110674
line...@xcpp.org changed:
What                          |Removed|Added
Attachment #145581 is obsolete|0      |1
https://bugs.freedesktop.org/show_bug.cgi?id=110674
line...@xcpp.org changed:
What|Removed |Added
CC||line...@xcpp.org
--- Comment #149 fro
https://bugs.freedesktop.org/show_bug.cgi?id=110674
Anthony Rabbito changed:
What|Removed |Added
Resolution|--- |FIXED
Status|NEW
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #147 from ReddestDream ---
> Already merged to 5.4. I'll take a look at older kernels as well.
@Alex Deucher Thanks so much for all your help! :)
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #146 from Alex Deucher ---
(In reply to tom91136 from comment #145)
> @Alex any plans for the patches to be merged for 5.4 or even backported to
> 5.3 at some point?
Already merged to 5.4. I'll take a look at older kernels as well.
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #145 from tom91...@gmail.com ---
@Alex any plans for the patches to be merged for 5.4 or even backported to 5.3
at some point?
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #144 from sehell...@gmail.com ---
I also think this is strange. Since yesterday, they have turned off and on many
times successfully without any problems. Most likely it's connected with
something else, but I don't know where to look.
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #143 from Tom B ---
I'm not sure how KDE handles monitor power behind the scenes but I have an
uptime of 2 days now since applying the patches and with KDE I've let it turn
off the monitors at least 6 or 7 times and suspend/resume 3
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #142 from sehell...@gmail.com ---
(In reply to Alex Deucher from comment #141)
> (In reply to sehellion from comment #140)
> > Created attachment 145463 [details]
> > 5.3.1 with Alex's patches and dual monitors, crash
>
> That's not
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #141 from Alex Deucher ---
(In reply to sehellion from comment #140)
> Created attachment 145463 [details]
> 5.3.1 with Alex's patches and dual monitors, crash
That's not a crash, it's just a warning.
https://bugs.freedesktop.org/show_bug.cgi?id=110674
Alex Deucher changed:
What                        |Removed   |Added
Attachment #145463 mime type|text/x-log|text/plain
https://bugs.freedesktop.org/show_bug.cgi?id=110674
sehell...@gmail.com changed:
What                          |Removed|Added
Attachment #145461 is obsolete|0      |1
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #139 from sehell...@gmail.com ---
Today, when trying to wake up the monitors, the system crashed again.
WARNING: CPU: 4 PID: 32 at
drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link_dp.c:1720
decide_link_settings+0xe0/0x2a0 [amdg
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #138 from sehell...@gmail.com ---
Created attachment 145461
--> https://bugs.freedesktop.org/attachment.cgi?id=145461&action=edit
5.3.1 with Alex's patches and dual monitors
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #137 from sehell...@gmail.com ---
(In reply to Alex Deucher from comment #128)
> Do these patches help?
> https://cgit.freedesktop.org/~agd5f/linux/commit/?h=drm-
> fixes&id=c46e5df4ac898108da66a880c4e18f69c74f6c1b
> https://cgit.free
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #133 from Anthony Rabbito ---
Created attachment 145459
--> https://bugs.freedesktop.org/attachment.cgi?id=145459&action=edit
dmesg log with Alex's patches
Here's my dmesg with Alex's patches. Going to mess around and see what I c
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #129 from Tom B ---
Thank you Alex! That has fixed it! The card is now correctly setting its
voltages and clocks. I applied the patch to 5.3.1
However, I've noticed a few very minor problems that are probably worth
reporting.
1. I
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #131 from Tom B ---
In addition to my previous comment, the message "[drm] schedsdma0 is not ready,
skipping", which otherwise repeats indefinitely, stops after a suspend/resume.
After the machine is resumed these stop appearing but it does suspend and resume cor
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #132 from Anthony Rabbito ---
Created attachment 145458
--> https://bugs.freedesktop.org/attachment.cgi?id=145458&action=edit
linux-mainline5.3 dmesg without patches
Here's my current dmesg with two out of three monitors running w
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #134 from Anthony Rabbito ---
Wow! All three of my monitors are working again at 2560x1440 @ 144Hz.
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #135 from Adrian Brown ---
@reddestdream Thanks. I don't think the active adapter is the problem as it
works perfectly with my Vega 64. However I will try 18.04 and AMD's driver as
suggested.
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #136 from tom91...@gmail.com ---
Been following this thread for a while now as I just got 3 4k 60Hz monitors
connected to the 3 DP ports on my Radeon VII.
I'm getting the exact same errors discussed in this report with matching dmesg
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #130 from Anthony Rabbito ---
(In reply to Alex Deucher from comment #128)
> Do these patches help?
> https://cgit.freedesktop.org/~agd5f/linux/commit/?h=drm-
> fixes&id=c46e5df4ac898108da66a880c4e18f69c74f6c1b
> https://cgit.freedes
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #128 from Alex Deucher ---
Do these patches help?
https://cgit.freedesktop.org/~agd5f/linux/commit/?h=drm-fixes&id=c46e5df4ac898108da66a880c4e18f69c74f6c1b
https://cgit.freedesktop.org/~agd5f/linux/commit/?h=drm-fixes&id=c02d6a161395
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #127 from Alex Deucher ---
(In reply to Tom B from comment #15)
> Have been running 5.0 since release without issue but upgraded this morning
> and got crashes as described here within a few seconds of boot.
>
Can you bisect betwee
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #126 from ReddestDream ---
@Adrian Brown Your Linux issue is potentially related to the active adapter.
Have you tried w/o it?
On Windows, the flickering on/around login, at least for me, has been mostly
resolved by using the latest
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #125 from Adrian Brown ---
I am also getting frequent crashes with a Radeon VII on Kubuntu 19.10 (kernel
5.0.0-29-generic). I see there is some discussion in this thread about it
possibly being related to multiple monitors. But I don
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #124 from ReddestDream ---
Created attachment 145254
--> https://bugs.freedesktop.org/attachment.cgi?id=145254&action=edit
Dmesg 5.3-rc7 w/ Two monitors
This issue is still not fixed on 5.3-rc7. I guess we will probably have to wa
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #123 from ReddestDream ---
A few interesting fixes that touch vega20_hwmgr.c have rolled in from
drm-fixes:
The first is likely the most interesting for our issues, as it touches
min/maxes (tho only the soft ones it seems). The othe
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #122 from ReddestDream ---
Tested 5.3-rc6. Still has the same issues. Only it's maybe actually worse
because I lose display completely when I use amdgpu.dpm=2 w/Radeon VII
multimonitor on 5.3-rc6, whereas on 5.2.9 I just got same/sim
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #121 from ReddestDream ---
Some observations:
1. Nothing at all seems to be up with cur_speed and cur_width. They get set
several times in a row in both runs, but the values are all the same in both.
2. I can't really see anything
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #120 from ReddestDream ---
Created attachment 145159
--> https://bugs.freedesktop.org/attachment.cgi?id=145159&action=edit
DebugAMDiGPU
Also here is the AMD + iGPU one.
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #119 from ReddestDream ---
Created attachment 145158
--> https://bugs.freedesktop.org/attachment.cgi?id=145158&action=edit
DebugAMD2Monitors
>I don't think I have time to try it today but if anyone is recompiling the
>code adding
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #118 from ReddestDream ---
So, this is a crazy idea, but ironically I think it might be getting closer to
the truth.
Tom B. attempted reverting ad51c46eec739c18be24178a30b47801b10e0357, which was
known to cause some issue with an RX
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #117 from ReddestDream ---
Created attachment 145154
--> https://bugs.freedesktop.org/attachment.cgi?id=145154&action=edit
AMDInteliGPUBoot
Also find my stable Intel iGPU + AMD Graphics config dmesg here.
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #116 from ReddestDream ---
Created attachment 145153
--> https://bugs.freedesktop.org/attachment.cgi?id=145153&action=edit
dmesgAMD2Monitors
I've been doing a few tests. I looked into and compiled 5.3-rc5 along with
these patches,
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #115 from Tom B ---
I should have noted it earlier, but I had already tried reverting both "golden
values" commits. I've no idea what it does but it didn't fix this crash.
One thing that would be insightful would be logging every ca
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #114 from ReddestDream ---
5. Tom B., it is probably worth getting a full dmesg with your two monitors in
on a relatively new 5.2.x kernel using at least: amdgpu.dc_log=1 drm.debug=0x1e
log_buf_len=2M
And anything else you might thi
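
As a side note on the `drm.debug=0x1e` value suggested above: `drm.debug` is a bitmask of debug categories. The short Python sketch below decodes such a mask. The bit assignments are the DRM core's debug categories as I understand them for kernels of this era (an assumption worth checking against the kernel's `drm_print.h`), and the helper name `decode_drm_debug` is made up for illustration.

```python
# drm.debug category bits, assumed from the DRM core headers of the 5.x
# kernels discussed in this thread.
DRM_DEBUG_BITS = [
    (0x01, "CORE"),
    (0x02, "DRIVER"),
    (0x04, "KMS"),
    (0x08, "PRIME"),
    (0x10, "ATOMIC"),
    (0x20, "VBL"),
    (0x40, "STATE"),
    (0x80, "LEASE"),
    (0x100, "DP"),
]


def decode_drm_debug(mask):
    """Return the names of the debug categories enabled in mask."""
    return [name for bit, name in DRM_DEBUG_BITS if mask & bit]


print(decode_drm_debug(0x1E))  # ['DRIVER', 'KMS', 'PRIME', 'ATOMIC']
```

Under that assumption, 0x1e enables driver, KMS, PRIME and atomic messages while leaving the noisier core and vblank categories off.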
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #113 from ReddestDream ---
4.
> Given that two different versions of the code produce the same result, my
> hunch is that the problem is B. The card is not in a state where it's able to
> receive power changes.
Something to cons
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #112 from ReddestDream ---
More ideas:
3. Looking through the crash in sehellion's comment 45:
gfx_v9_0_ring_test_ring+0x19e/0x230 [amdgpu]
amdgpu_ring_test_helper+0x1e/0x90 [amdgpu]
gfx_v9_0_hw_fini+0x299/0x690 [amdgpu]
amdgpu_dev
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #111 from ReddestDream ---
A few other ideas to ponder:
1. Looking into DPM, I found this commit for 5.1-rc1 that looks interesting:
https://github.com/torvalds/linux/commit/7ca881a8651bdeffd99ba8e0010160f9bf60673e
Looks like it e
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #110 from ReddestDream ---
> 1. The functions in vega20_ppt.c are used with this new patch so that answers
> my question from earlier, that's what this file is for and why it contains
> similar/identical functions.
I was hoping th
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #109 from Tom B ---
Created attachment 145080
--> https://bugs.freedesktop.org/attachment.cgi?id=145080&action=edit
dmesg with amdgpu.dpm=2
> Tom B., did you try booting with amdgpu.dpm=1 or amdgpu.dpm=2 (default is
> generally -
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #108 from ReddestDream ---
> Booting with amdgpu.dpm=0 on 5.2.7 works.
Tom B., did you try booting with amdgpu.dpm=1 or amdgpu.dpm=2 (default is
generally -1 for automatic)? Seems like one of those might enable the new
experimental
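
When comparing reports it helps to know which `amdgpu.dpm` value a given boot actually used. Here is a minimal Python sketch that extracts it from a kernel command line string, falling back to -1 (automatic, the default mentioned above) when the parameter is absent. The function name `amdgpu_dpm_setting` is made up, and letting the last valid occurrence win is a choice made for this sketch, not a statement about how the kernel resolves duplicates. It does not cover the `dpm=2` form set as a module option in modprobe configuration.

```python
def amdgpu_dpm_setting(cmdline):
    """Return the amdgpu.dpm value found in cmdline, or -1 if it is not set."""
    value = -1  # -1 means automatic, per the comment above
    for token in cmdline.split():
        if token.startswith("amdgpu.dpm="):
            try:
                value = int(token.split("=", 1)[1])
            except ValueError:
                # Malformed value: keep whatever was seen before.
                pass
    return value
```

On a running system the input would typically come from `open("/proc/cmdline").read()`.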
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #107 from ReddestDream ---
> Booting with amdgpu.dpm=0 on 5.2.7 works.
> It is a DPM issue of some kind so although my earlier tests showed that
> hard_min_level was set correctly, it still could be an issue elsewhere in the
> DPM
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #106 from Tom B ---
Booting with amdgpu.dpm=0 on 5.2.7 works.
Performance is poor and as expected I cannot get any information about power
states because /sys/kernel/debug/dri/0/amdgpu_pm_info doesn't exist. I'm
guessing it runs at
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #105 from Tom B ---
> Also, I considered that both of my monitors have audio out support. I wonder
> if audio initialization might be the missing piece to the puzzle, the thing
> that interrupts/changes the state of the card and pr
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #104 from Tom B ---
I did get very similar crashing when I was running HDMI + DP at different
refresh rates ( see https://bugs.freedesktop.org/show_bug.cgi?id=110510 ). I
switched to DP + DP because HDMI+DP wasn't stable, it could be
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #103 from Peter Hercek ---
I boot in BIOS mode and I'm still getting these errors. Though they are rare in
my case with the "better" kernels (around once a week).
Just a note: There were tearing errors in Windows drivers of Radeon V
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #102 from Tom B ---
> Grasping at straws a bit here, but it occurred to me that maybe Linux kernel
> testing on Radeon VII was done on an early VBIOS that didn't have full UEFI
> support yet. We know that AMD had to issue a VBIOS u
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #101 from ReddestDream ---
Grasping at straws a bit here, but it occurred to me that maybe Linux kernel
testing on Radeon VII was done on an early VBIOS that didn't have full UEFI
support yet. We know that AMD had to issue a VBIOS up
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #100 from Tom B ---
I've been trying to work backwards to find the place where screens get
initialised and eventually call vega20_pre_display_configuration_changed_task.
vega20_pre_display_configuration_changed_task is exported as
p
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #99 from Tom B ---
Created attachment 145062
--> https://bugs.freedesktop.org/attachment.cgi?id=145062&action=edit
a list of commits 5.0.13 - 5.1.0
Attached is a list of all amdgpu and powerplay commits from 5.0.13 - 5.1.0.
I ha
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #98 from Sylvain BERTRAND ---
> The code seems very similar to what we see in
> vega20_notify_smc_display_config_after_ps_adjustment near where we get the "
> [SetHardMinFreq] Set hard min uclk failed!" Maybe this
> smum_send_msg_to_
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #97 from Tom B ---
I've been investigating this:
https://github.com/torvalds/linux/commit/94ed6d0cfdb867be9bf05f03d682980bce5d0036
Because vega20 doesn't export display_configuration_change, it jumps to the
newly added else block a
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #96 from Tom B ---
Created attachment 145047
--> https://bugs.freedesktop.org/attachment.cgi?id=145047&action=edit
logging anywhere the number of screens is set
Again, no closer to a fix but another thing to rule out. In addition
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #95 from Tom B ---
So here's something interesting. In 5.0.13 there is no function
vega20_display_config_changed. This function issues
smu_send_smc_msg_with_param(smu, SMU_MSG_NumOfDisplays, 0);
In fact, in 5.0.13 there is no refer
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #94 from Tom B ---
Reverting d1a3e239a6016f2bb42a91696056e223982e8538 didn't fix it for me. But
that commit may give some insight because it is related to uclk which is the
first error we get.
I also tried globally increasing usec_t
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #93 from Chris Hodapp ---
Note: It might be good for someone else to double-check my conclusion before
too much stock is put into it. Scientific method and all that.
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #92 from ReddestDream ---
>If you follow the callstack:
I've been thinking all this over. The only thing unfortunately that really
sticks out at me still is how Chris Hodapp says that reverting this commit:
https://github.com/torva
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #91 from ReddestDream ---
>It returns 0 on success and -EIO on failure, which is then in turn returned
>from vega20_set_fclk_to_highest_dpm_level. Where did you see the check/retry on
>EINVAL? Perhaps -EIO should be -EINVAL?
I didn
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #90 from Tom B ---
I'm not sure this is helpful but I managed to somewhat test the race condition
theory.
If you follow the callstack:
vega20_set_fclk_to_highest_dpm_level -> smum_send_msg_to_smc_with_parameter ->
vega20_send_msg_t
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #89 from Tom B ---
> It should return -EINVAL instead. Maybe then it would reset and try again
> instead of just ignoring it and continuing with initialization anyway,
> leading to instability.
If you look at vega20_send_msg_to_sm
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #88 from ReddestDream ---
>The question then becomes: Why doesn't the race condition happen with only one
>screen? Perhaps it's a matter of speed. With a single display, the driver
>detect the displays, read/parse the EDID data, in
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #87 from Tom B ---
> Could be we've got a race condition between the powerplay setup and amdgpu
> handing off the card to drm_dev_register to advertise it for normal use.
The question then becomes: Why doesn't the race condition happe
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #86 from ReddestDream ---
>In addition to that, vega20_set_fclk_to_highest_dpm_level is called several
>times before the card is initialized and even on 5.2.7 works. Something
>happens during or just before the initialization stage
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #85 from Tom B ---
> Yeah. I've had boots where I have my 2 4K DP monitors in and I don't get
> powerplay error on boot. In fact, it can go a bit and seem stable.
In addition to that, vega20_set_fclk_to_highest_dpm_level is called
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #84 from ReddestDream ---
>Need to figure out exactly what is generating the line "[drm] Initialized
>amdgpu 3.27.0 20150101 for :44:00.0 on minor 0."
That "Initialized amdgpu" message seems to be coming from here:
https:
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #83 from ReddestDream ---
> Here's what I found: The value of hard_min_level is 1001 in both 5.0.13 and
> 5.2.7 so the issue is not the value from the dpm table. The dpm table is
> probably correct.
Fantastic! Glad you tested thi
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #82 from Tom B ---
In addition, I will note that the file vega20_baco.c has been added in 5.1
details: https://www.phoronix.com/scan.php?page=news_item&px=AMD-Vega-12-BACO
commit:
https://github.com/torvalds/linux/commit/0c5ccf14
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #81 from Tom B ---
Created attachment 145038
--> https://bugs.freedesktop.org/attachment.cgi?id=145038&action=edit
5.2.7 dmesg with hard_min_level logged
As mentioned in the previous post, I started logging the value of
hard_min_l
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #80 from Tom B ---
> I tried something like that before but a huge portion of the commits in that
> range won't build kernels that can boot (at least on my system). I ended up
> resorting to trying reverting individual vega20-affec
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #79 from ReddestDream ---
>I tried something like that before but a huge portion of the commits in that
>range won't build kernels that can boot (at least on my system).
It's interesting that you found d1a3e239a6016f2bb42a91696056e
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #78 from Chris Hodapp ---
> I don't see anywhere else to go but bisection from 5.0.13 to 5.1. That should
> at least find something . . .
I tried something like that before but a huge portion of the commits in that
range won't buil
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #77 from ReddestDream ---
>I guess, you are good for a bisection if you have a "working" kernel.
Thing is, based on everything here, I'm not convinced that 5.0.13 has 0 issues.
Only that it seems to have fewer issues. But yeah. I don
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #76 from Sylvain BERTRAND ---
> Unfortunately, it does look like going through and slowly disabling features
> and/or bisecting might be the only way to find how this issue got started. At
> least if we could narrow it down, we migh
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #75 from ReddestDream ---
>Here's some additional investigation.
>[SetUclkToHightestDpmLevel] Set hard min uclk failed! Appears as one of the
>first errors in dmesg. This is from vega20_hwmgr.c:3354 and triggered by:
I agree that