Hi Luis,

Hmm, this sounds like a timing issue. We have probably made something faster during boot-up in 4.20, and because of that you now see this issue more often.

If the bisection doesn't turn anything up, can you try adding some msleep(10) calls at critical places in the driver code to narrow this down?
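As a rough sketch of how such delay probes could be organized so that they can be switched on one at a time: the code below is a userspace stand-in so it compiles and runs on its own. In the driver itself the sleep would be a plain msleep(10) from <linux/delay.h>. The names msleep_stub, delay_mask, debug_delay and init_sequence, as well as the three probe points, are hypothetical and are not taken from amdgpu.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Userspace stand-in for the kernel's msleep(); driver code would
 * include <linux/delay.h> and call msleep(10) directly. */
static void msleep_stub(unsigned int ms)
{
    struct timespec ts;

    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (long)(ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
}

/* Hypothetical knob: bit N set means "sleep at probe point N". */
static unsigned int delay_mask;
/* Number of delays actually taken, handy for checking the mask. */
static unsigned int delays_taken;

static void debug_delay(unsigned int point)
{
    assert(point < 32u);
    if (delay_mask & (1u << point)) {
        fprintf(stderr, "debug delay at probe point %u\n", point);
        msleep_stub(10);
        delays_taken++;
    }
}

/* Hypothetical init sequence; the steps are placeholders and do not
 * correspond to real amdgpu functions. */
static void init_sequence(void)
{
    debug_delay(0);   /* before step A */
    /* step A would run here */
    debug_delay(1);   /* between step A and step B */
    /* step B would run here */
    debug_delay(2);   /* after step B, before the first submission */
}
```

Enabling one bit at a time and rebooting a few times per setting should show whether a delay at a given point makes the hang disappear or become more frequent, which hints at where the race is.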

Officially we don't test/support ARM with the driver code, but in this particular case we should probably investigate since it sounds like it just doesn't happen on x86 because of different timing.

Thanks,
Christian.

On 28.12.18 at 15:05, Luís Mendes wrote:
Hi Alex,

First of all, happy holidays and a happy new year!

- Okay, so it looks like the driver is sometimes able to enter
graphical mode with the Polaris card, but most of the time it fails
before that with:
[   49.762704] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=2, emitted seq=3

- This also happens sporadically, though less often, on the 4.17, 4.18
and 4.19 kernels, so it is not actually a regression but rather an
existing issue, which the patch
"drm/amdgpu/gfx_v8_0: Reorder the gfx, kiq and kcq ring tests
sequence" may solve. I tried backporting it to 4.20, but saw no
improvement. I still need to try with the git version, or rc1.

- This hang happens after the console is displayed on the screen, but
before switching to graphical mode with X.

- However, once X has started, the driver is stable and can be used
for long periods.

Regards,
Luís Mendes

On Tue, Dec 18, 2018 at 11:16 PM Luís Mendes <luis.p.men...@gmail.com> wrote:
Hi Alex,

I am already running with drm_arch_can_wc_memory() returning false.
I will try to bisect...

Regards,
Luís

On Tue, Dec 18, 2018 at 7:03 PM Alex Deucher <alexdeuc...@gmail.com> wrote:
On Tue, Dec 18, 2018 at 8:58 AM Luís Mendes <luis.p.men...@gmail.com> wrote:
Hi Christian,

I've been using a Sapphire RX 550 and a Sapphire RX 460 on a custom
armhf board that runs well with Linux 4.19.9 at least. Starting with
Linux 4.20, however, I'm getting a GPU hang right after the console is
displayed, but before entering graphical mode, when starting the X
session.
I'm only reporting this now because a PCI commit for mvebu that also
went into linux-4.20 caused a kernel oops during the pci_map_rom call
in the amdgpu initialization code. I've reverted that patch, but now
amdgpu is hanging.
It would be useful if you could bisect.  This is the first I've heard
of amdgpu working on an ARM board without write combining (WC)
disabled.  You might check to see if disabling WC helps.  Return false
in drm_arch_can_wc_memory().

Alex


[   24.801861] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=2, emitted seq=3

02:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Baffin [Polaris11] (rev ff) (prog-if 00 [VGA controller])
     Subsystem: Sapphire Technology Limited Baffin [Radeon RX 560]
     Flags: bus master, fast devsel, latency 0, IRQ 51
     Memory at d0000000 (64-bit, prefetchable) [size=256M]
     Memory at e0000000 (64-bit, prefetchable) [size=2M]
     I/O ports at 10000 [size=256]
     Memory at e0200000 (32-bit, non-prefetchable) [size=256K]
     Expansion ROM at e0240000 [disabled] [size=128K]
     Capabilities: <access denied>
     Kernel driver in use: amdgpu
     Kernel modules: amdgpu

The dmesg output is attached.

Regards,
Luís
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx