[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2019-11-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #56 from kumquad  ---
More tests:

*pci=noacpi
The PC boots, but it either freezes right after booting into GNOME or the
touchpad, USB mouse, keyboard, and touchscreen stop working.

*amdgpu.dpm=0 or acpi=off
Screen turns black and nothing happens anymore

Another thing I noticed:
Sometimes when booting, GRUB shows up on the external monitor hooked up to the
eGPU and not on the laptop's internal one. But as soon as Linux starts, it
switches back to the internal monitor. If I select Windows, the whole boot
continues on the eGPU monitor.

I also made a thread in the manjaro-forum two days ago. There you can also find
some more logs from my machine/system/configuration:
https://forum.manjaro.org/t/help-setting-up-egpu-stuck-on-boot-when-connected/109583/10

-- 
You are receiving this mail because:
You are the assignee for the bug.___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2019-11-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #55 from kumquad  ---
Created attachment 145876
  --> https://bugs.freedesktop.org/attachment.cgi?id=145876=edit
Spectrex360 / Kernel 5.3.7 / Vega56 / normal boot


[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2019-11-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

kumquad  changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|NOTOURBUG   |---

--- Comment #54 from kumquad  ---
Hello, 

I think I have the same problem as all the others here.
My setup - working fine under Windows - is:
* HP Spectre x360 - chxx model with the i7-8705 CPU (so also with a Vega M
  dGPU)
* Razer Core X with AMD Vega56
A normal boot with Kernel 5.3.7-2 gives me this in dmesg:

[   11.672442] [drm:atom_op_jump [amdgpu]] *ERROR* atombios stuck in loop for
more than 5secs aborting
[   11.672469] [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios
stuck executing CF4E (len 1030, WS 8, PS 0) @ 0xD2C5
[   11.672492] [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios
stuck executing C410 (len 114, WS 0, PS 8) @ 0xC41C
[   11.672495] amdgpu :09:00.0: gpu post error!
[   11.672496] amdgpu :09:00.0: Fatal error during GPU init
[   11.672497] [drm] amdgpu: finishing device.



I also found some errors like these. Do they have anything to do with our GPU
problem?
[2.732913] ACPI BIOS Error (bug): AE_AML_BUFFER_LIMIT, Field [D128] at bit
offset/length 128/1024 exceeds size of target Buffer (160 bits)
(20190703/dsopcode-198)
[2.732919] ACPI Error: Aborting method \HWMC due to previous error
(AE_AML_BUFFER_LIMIT) (20190703/psparse-529)
[2.732927] ACPI Error: Aborting method \_SB.WMID.WMAA due to previous error
(AE_AML_BUFFER_LIMIT) (20190703/psparse-529)



I also get some of those BAR errors, but never for the PCI address of the
eGPU, only for other addresses:
[0.779001] pci :02:00.0: BAR 13: assigned [io  0x2000-0x4fff]
[0.779004] pci :03:02.0: BAR 15: no space for [mem size 0x0020
64bit pref]
[0.779005] pci :03:02.0: BAR 15: failed to assign [mem size 0x0020
64bit pref]
[0.779006] pci :03:01.0: BAR 13: assigned [io  0x2000-0x2fff]
[0.779006] pci :03:02.0: BAR 13: assigned [io  0x3000-0x3fff]
[0.779007] pci :03:04.0: BAR 13: assigned [io  0x4000-0x4fff]
[0.779009] pci :03:02.0: BAR 15: no space for [mem size 0x0020
64bit pref]
[0.779010] pci :03:02.0: BAR 15: failed to assign [mem size 0x0020
64bit pref]
[0.779010] pci :03:00.0: PCI bridge to [bus 04]
[0.779016] pci :03:00.0:   bridge window [mem 0xde00-0xde0f]
[0.779025] pci :03:01.0: PCI bridge to [bus 05-37]
[0.779027] pci :03:01.0:   bridge window [io  0x2000-0x2fff]
[0.779032] pci :03:01.0:   bridge window [mem 0xb000-0xc7ef]
[0.779036] pci :03:01.0:   bridge window [mem 0x2f9000-0x2fafff
64bit pref]

Device address 03:02.0 corresponds to my Thunderbolt controller:

03:02.0 PCI bridge: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step)
[Alpine Ridge 4C 2016] (rev 02)
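
To check which devices the BAR failures actually belong to, the dmesg text can be filtered mechanically. The following is only a rough sketch, not a tool anyone in this bug used: the regular expression and the function names are my own, and the sample lines are copied (unwrapped) from the logs in this thread exactly as they appear in the archive.

```python
import re

# Matches kernel lines such as "pci <addr>: BAR 15: no space for [...]".
# The pattern is an assumption based on the log excerpts in this thread.
BAR_FAILURE = re.compile(
    r"(?:pci|pcieport)\s+(?P<dev>[0-9a-fA-F:.]+): BAR (?P<bar>\d+): "
    r"(?P<kind>no space for|failed to assign) \[(?P<res>[^\]]*)\]"
)


def bar_failures(dmesg_text):
    """Return (device, BAR number, failure kind, resource) for each failure line."""
    return [
        (m["dev"], int(m["bar"]), m["kind"], m["res"])
        for m in BAR_FAILURE.finditer(dmesg_text)
    ]


def devices_with_failures(dmesg_text):
    """Return the sorted set of device addresses that had any BAR failure."""
    return sorted({dev for dev, _, _, _ in bar_failures(dmesg_text)})


# Sample lines as archived in this thread (line wrapping removed).
SAMPLE = """\
[0.779001] pci :02:00.0: BAR 13: assigned [io  0x2000-0x4fff]
[0.779004] pci :03:02.0: BAR 15: no space for [mem size 0x0020 64bit pref]
[0.779005] pci :03:02.0: BAR 15: failed to assign [mem size 0x0020 64bit pref]
[13974.780260] pcieport :03:04.0: BAR 13: failed to assign [io  size 0x1000]
"""

if __name__ == "__main__":
    for dev in devices_with_failures(SAMPLE):
        print(dev)
```

Feeding it the full output of `dmesg` instead of `SAMPLE` lists every device with a failed assignment, which can then be compared against the eGPU's address from lspci.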


@Robert Strube: You wanted to buy a Vega and do some research with it. As you
can see, there is still something wrong here. Perhaps you are interested and
want to help me find more information.


[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2019-04-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #53 from Robert Strube  ---
Created attachment 144091
  --> https://bugs.freedesktop.org/attachment.cgi?id=144091=edit
lspci kernel 5.0.x with nvidia eGPU


[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2019-04-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #52 from Robert Strube  ---
Created attachment 144090
  --> https://bugs.freedesktop.org/attachment.cgi?id=144090=edit
dmesg log with kernel 5.0.x with nvidia eGPU


[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2019-04-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #51 from Robert Strube  ---
Hello Everyone,

I realize it's been a long time since I updated this bug report, apologies in
advance.  I decided to give up on eGPUs + Linux (over Thunderbolt 3) for a
while, and didn't get a chance to really tackle the problem again until more
recently.

Since my initial report, I have been able to get an eGPU working with my Dell
XPS 9575, but only with a Nvidia GPU (specifically an RTX 2070).  I did try
another AMD card, but ran into the same problems.

I'll attach my dmesg and lspci information in the hopes that this might shed
some light on why the nvidia GPU works correctly (albeit with the proprietary
driver) and certain AMD GPUs don't.


[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2019-04-19 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #50 from Dimitar Atanasov  ---
There are two windows on this system: a small one below 4GB, which is 2.5GB,
and a bigger one above 4GB, which is 64GB. The address space for Thunderbolt is
only 200 MB. As far as I know, AMDGPU needs 250 MB below 4GB and the rest goes
in the big space. Interestingly, I have an XPS 9570 with an 8750H, and its BIOS
has an option for how to assign MMIO space; there is no such option here.


[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2019-04-01 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #49 from Alex Deucher  ---
(In reply to Dimitar Atanasov from comment #48)
> Maybe the problem is the CPU, because it has only 16 PCIe lanes, so 8 for
> Vega M, 4 for NVMe, and 4 for others. I have seen that the card reader is
> also connected via PCIe.

It's the MMIO space in the CPU's address space.  The CPU (by way of the sbios)
defines a window of address space that is used for device MMIO.  By default
most platforms put a relatively small MMIO window below 4GB for 32 bit OS
compatibility.  Having a small MMIO window limits the amount of space for
devices, and if there is not enough space some device resources can't be mapped,
which is what causes the problem.  There is often a feature in the sbios config
called ">4GB MMIO" or similar which enables a bigger MMIO window.  Some
sbioses also enable it dynamically depending on what OS is booted or conditions
in the system at boot time (legacy vs UEFI boot).  IIRC, it's a requirement for
Windows 10, so there is probably something about the Windows 10 OEM install
which causes it to boot with a larger MMIO window set up.
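
One way to see how large the windows are on a given machine is to add up the top-level "PCI Bus" ranges in /proc/iomem, split at the 4GB boundary. The sketch below is illustrative only: the function name is made up, the sample text is hypothetical (not taken from any system in this bug), and it assumes the usual `start-end : name` layout of /proc/iomem, which shows real addresses only when read as root.

```python
FOUR_GIB = 1 << 32


def pci_window_sizes(iomem_text):
    """Sum top-level 'PCI Bus' ranges, returning (bytes below 4GB, bytes above 4GB).

    Nested (indented) child resources are skipped so nothing is counted twice.
    A range that ends at or above 4GB is counted entirely as 'above'.
    """
    below = above = 0
    for line in iomem_text.splitlines():
        if not line or line[0].isspace():
            continue  # blank line or nested child resource
        span, sep, name = line.partition(" : ")
        if not sep or not name.startswith("PCI Bus"):
            continue
        start_s, _, end_s = span.partition("-")
        size = int(end_s, 16) - int(start_s, 16) + 1
        if int(end_s, 16) < FOUR_GIB:
            below += size
        else:
            above += size
    return below, above


# Hypothetical /proc/iomem excerpt, for illustration only.
SAMPLE = """\
00000000-0009ffff : System RAM
60000000-dfffffff : PCI Bus 0000:00
  b0000000-c7efffff : PCI Bus 0000:05
4000000000-7fffffffff : PCI Bus 0000:00
"""

if __name__ == "__main__":
    below, above = pci_window_sizes(SAMPLE)
    print(f"below 4GB: {below / 2**20:.0f} MiB, above 4GB: {above / 2**30:.0f} GiB")
```

If the "above 4GB" figure comes out as zero, that would be consistent with the firmware not having enabled a large MMIO window for that boot.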


[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2019-03-30 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #48 from Dimitar Atanasov  ---
Maybe the problem is the CPU, because it has only 16 PCIe lanes, so 8 for Vega
M, 4 for NVMe, and 4 for others. I have seen that the card reader is also
connected via PCIe.


[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2019-03-30 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

Rasmus Thomsen  changed:

   What|Removed |Added

 CC||cogi...@exherbo.org

--- Comment #47 from Rasmus Thomsen  ---
Created attachment 143818
  --> https://bugs.freedesktop.org/attachment.cgi?id=143818=edit
dmesg-HP-SpectreX360-8705G

It seems that I also hit the BAR assignment failures:

[13974.780260] pcieport :03:04.0: BAR 13: failed to assign [io  size
0x1000]
[13974.780262] pcieport :03:02.0: BAR 15: no space for [mem size 0x0020
64bit pref]


[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2019-03-30 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #46 from Rasmus Thomsen  ---
FWIW, I'm also hit by this on a HP Spectre X360 with a Vega M configuration
(thought it's notable as the other reports seem to only come from Dell
machines?). Disabling the Vega M by doing `echo 1 >
/sys/bus/pci//devices/:01:00.0/remove` hasn't changed a thing (Checked by
lspci first, that is the right device) and booting with `acpi=off` causes my
touchpad and the thunderbolt controller to not function at all (apparently?),
so there's that.

Am using a RX470 in a Razer Core X, which works flawlessly on Windows. But I
guess the Vega M does the trick on Linux for now, thanks for the great work on
amdgpu and Mesa :)


[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2019-03-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #45 from Alex Deucher  ---
(In reply to Dimitar Atanasov from comment #44)
> Works with 4.19.32 with acpi=off, but CPU is single core

Most likely some other device failed to get enumerated in that case and that
freed up resources for the bridges.


[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2019-03-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #44 from Dimitar Atanasov  ---
Works with 4.19.32 with acpi=off, but CPU is single core


[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2019-03-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #43 from Alex Deucher  ---
(In reply to Dimitar Atanasov from comment #41)
> Created attachment 143805 [details]
> default boot vega 56

Similar issues on your system:
[  168.653171] pci :3d:00.0: BAR 0: no space for [mem size 0x4000]
[  168.653176] pci :3d:00.0: BAR 0: failed to assign [mem size 0x4000]


[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2019-03-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #42 from Dimitar Atanasov  ---
Actually, with acpi=off, the Vega 56 is not found at all during boot, only the enclosure


[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2019-03-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #41 from Dimitar Atanasov  ---
Created attachment 143805
  --> https://bugs.freedesktop.org/attachment.cgi?id=143805=edit
default boot vega 56


[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2019-03-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #40 from Dimitar Atanasov  ---
Created attachment 143804
  --> https://bugs.freedesktop.org/attachment.cgi?id=143804=edit
acpi=off amd.dpm=0

Vega 56


[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2019-03-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #39 from Dimitar Atanasov  ---
I have the same problem. The computer is a Dell Precision 5530 2-in-1 with a
Vega M inside; the eGPU is a Vega 56. The eGPU does not start even with
acpi=off. Kernel 5.0.4


[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2019-01-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #38 from Alex Deucher  ---
(In reply to Robert Strube from comment #34)
> Created attachment 143044 [details]
> dmesg log amdgpu.dpm=0 with 580 as eGPU
> 
> Another user is reporting a similar problem with a different Dell laptop
> (the XPS 9370).  He provided two dmesg log files.  This one has amdgpu=0.

It would appear that this user is not experiencing the same issue as you.  In
your case the driver fails to even post the GPU.  That happens long before dpm
is initialized.  The other user can try adding amdgpu.ppfeaturemask=0xfffd3ffb
to disable pcie power management to see if his issue is related to comment 29.
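
To see which feature bits the suggested mask turns off, the value can be decoded bit by bit. This is only an illustrative sketch: it assumes a 32-bit mask, and the mapping of bit 2 (0x4) to PCIe DPM is my recollection of the driver's feature-mask definitions rather than something stated in this thread, so check it against the kernel source for your version.

```python
def cleared_bits(mask, width=32):
    """Return the positions of the bits that are 0 in a width-bit mask."""
    return [bit for bit in range(width) if not (mask >> bit) & 1]


SUGGESTED_MASK = 0xfffd3ffb  # value suggested in the comment above

# Assumption (not from this thread): bit 2 is the PCIe DPM feature bit.
ASSUMED_PCIE_DPM_BIT = 2

if __name__ == "__main__":
    print("cleared bits:", cleared_bits(SUGGESTED_MASK))
    print("assumed PCIe DPM bit cleared:",
          ASSUMED_PCIE_DPM_BIT in cleared_bits(SUGGESTED_MASK))
```

The remaining cleared bits may simply be features that are already off by default on that kernel; the sketch does not try to name them.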



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2019-01-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #37 from Robert Strube  ---
One additional comment, the user with the XPS 9370 was able to get the RX 580
working as an eGPU flawlessly in Windows 10.  This lends some credibility to
the theory that it might not actually be a BIOS issue - unless the BIOS bug is
worked around in Windows 10 drivers.  Please see here for additional
information:

https://forum.manjaro.org/t/rx-580-in-a-thunderbolt-egpu-dock/58210/30

I'm planning on purchasing an Vega 56 or 64 in the near future so I can
continue to attempt to troubleshoot the issue.

Rob



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2019-01-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #36 from Robert Strube  ---
(In reply to Robert Strube from comment #34)
> Created attachment 143044 [details]
> dmesg log amdgpu.dpm=0 with 580 as eGPU
> 
> Another user is reporting a similar problem with a different Dell laptop
> (the XPS 9370).  He provided two dmesg log files.  This one has amdgpu=0.

Meant to say amdgpu.dpm=0 as a boot parameter.



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2019-01-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

Robert Strube  changed:

   What|Removed |Added

 Attachment #143044|dmesg log amdgpu=0 with 580 |dmesg log amdgpu.dpm=0 with
description|as eGPU |580 as eGPU



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2019-01-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #35 from Robert Strube  ---
Created attachment 143045
  --> https://bugs.freedesktop.org/attachment.cgi?id=143045=edit
dmesg log pci=noacpi with 580 as eGPU

Another user has reported a similar problem with a different laptop (XPS 9370).
He provided two dmesg log files.  This one has pci=noacpi set as a kernel boot
parameter.



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2019-01-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #34 from Robert Strube  ---
Created attachment 143044
  --> https://bugs.freedesktop.org/attachment.cgi?id=143044=edit
dmesg log amdgpu=0 with 580 as eGPU

Another user is reporting a similar problem with a different Dell laptop (the
XPS 9370).  He provided two dmesg log files.  This one has amdgpu=0.



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-11-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #33 from Alex Deucher  ---
With respect to pci(e) devices, acpi=off and pci=noacpi are equivalent I think.



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-11-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #32 from Robert Strube  ---
Hi Alex,

I just tested acpi=off independently and not amdgpu.dpm=0 independently. 

Regarding the other user: yes, he had the exact same PCI resource issues as
me. What I'm curious to find out, though, is whether those same PCI resource
issues are still present when he passes in pci=noacpi and gets the card
initialized.  My hunch is that they are still present AND the card was able to
initialize, but I'd be anxious to see.

I've also asked him to test out amdgpu.dpm=0 independently and report back. 
Hopefully you're onto something here!

Rob



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-11-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #31 from Alex Deucher  ---
(In reply to Robert Strube from comment #30)
> Hi Alex,
> 
> Thanks for the reply.  I wanted to clarify an important point: When I
> disabled PM completely and ACPI completely, I did not see any PCI resource
> issues AND the eGPU initialized successfully.
> 
> acpi=off apm=off amdgpu.dpm=0 amdgpu.aspm=0 amdgpu.runpm=0 amdgpu.bapm=0

The only relevant items here are acpi=off and amdgpu.dpm=0.  Did you test them
independently or just together?  Setting dpm=0 is irrelevant if acpi=off since
there will be no resource restrictions.  You need to test them independently.

> 
> I've been working with another user that has the exact same system (XPS
> 9575) and an RX 580 and is having the same issues.  He was actually able to
> get the eGPU initialized by passing in pci=noacpi rather than completely
> disabling ACPI as a whole.  I'll double check with him to see if he can post
> his dmesg log - because I'm not sure if the PCI resource issues are present
> under those circumstances.

Looks like the same issue as yours.  PCI resources not getting assigned.
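
Testing the two parameters independently means booting each combination separately. As a small aid, the combinations can be enumerated; this is just a sketch with a made-up helper name, and it only builds the strings to append to the kernel command line (it does not touch the bootloader).

```python
from itertools import product


def boot_variants(params):
    """Return every on/off combination of the given kernel parameters."""
    variants = []
    for flags in product([False, True], repeat=len(params)):
        variants.append(" ".join(p for p, on in zip(params, flags) if on))
    return variants


if __name__ == "__main__":
    for cmdline in boot_variants(["acpi=off", "amdgpu.dpm=0"]):
        print(cmdline or "(no extra parameters)")
```

Saving a dmesg log per variant makes it possible to compare whether the PCI resource messages and the GPU post error appear in each case.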



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-11-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #30 from Robert Strube  ---
Hi Alex,

Thanks for the reply.  I wanted to clarify an important point: When I disabled
PM completely and ACPI completely, I did not see any PCI resource issues AND
the eGPU initialized successfully.

However, after testing a little more I was able to keep the PM enabled and only
disabled ACPI.  In this situation I encountered a scenario where the PCI
resource issues were present in the log, BUT the eGPU still initialized.  I
mentioned that briefly in one of my previous comments but didn't really
elaborate.

So under certain situations the eGPU did initialize despite seeing PCI BAR
resource issues.

I've been working with another user who has the exact same system (XPS 9575)
and an RX 580 and is having the same issues.  He was actually able to get the
eGPU initialized by passing in pci=noacpi rather than completely disabling ACPI
as a whole.  I'll double check with him to see if he can post his dmesg log -
because I'm not sure if the PCI resource issues are present under those
circumstances.

Reference:
https://forum.manjaro.org/t/rx-580-in-a-thunderbolt-egpu-dock/58210/13

At this point I've had to return my RX 580 - great card but after about a month
of troubleshooting I was running out of time in the return window - so I'm
unable to do any more testing with that specific card at this time.  Kind of a
bummer... I'll probably pick up a Vega early next year and try again.

Rob



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-11-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #29 from Alex Deucher  ---
(In reply to Robert Strube from comment #28)
> Quick update:
> 
> I heard back from the ACPI BIOS kernel developers (see:
> https://bugzilla.kernel.org/show_bug.cgi?id=201527) and they seem to imply
> that the PCI resource issues showing up in the dmesg log are *not* a
> problem.  Linux is simply trying to allocate more resources, and that the
> failure is OK and it does get the requisite resources required.

It would appear to not be ok as when it gets the resources, the GPU works.

Does disabling dpm make the GPU work?  E.g., append amdgpu.dpm=0 to the kernel
command line in grub.  The driver needs to query the supported pcie speeds from
the pcie bridge it is connected to in order to setup the power management
controller.  Maybe when the resources are not available, the driver is not able
to get that information or it gets garbage.



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-11-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #28 from Robert Strube  ---
Quick update:

I heard back from the ACPI BIOS kernel developers (see:
https://bugzilla.kernel.org/show_bug.cgi?id=201527) and they seem to imply that
the PCI resource issues showing up in the dmesg log are *not* a problem.  Linux
is simply trying to allocate more resources; the failure is OK, and it does get
the resources it requires.

See this comment https://bugzilla.kernel.org/show_bug.cgi?id=201527#c8 from
Mika.

I'm not sure where this leaves us, is it a BIOS / PCI resource issue, or is it
a bug within amdgpu?

I've also been in contact with Dell regarding the possibility that there is a
BIOS bug causing some of these issues.  I'm going to need to conduct some
testing on Windows with eGPUs to see if the problem also exists there.

Thanks!
Rob



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-10-26 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #27 from Robert Strube  ---
Thanks for the response.

I think it was just a coincidence that the eGPU started working with acpi=off. 
Taking a closer look at the issue it really does appear to be a BIOS problem
that prevents the proper PCI resource allocation to one of the TB PCI bridges. 
In fact when I took a closer look at the dmesg with acpi=off I still see the
resource issues present.

I've opened an official bug report with the kernel ACPI BIOS team here:
https://bugzilla.kernel.org/show_bug.cgi?id=201527

I realize the issue should really be solved by the manufacturer, but perhaps
the kernel devs can create a work around and/or have more direct lines of
communication with the Dell engineers.  Thank you both for your suggestions and
comments.

Rob



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-10-26 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #26 from Christian König  ---
I don't want to stop your cheering, but that isn't a perfect solution either.

(In reply to Robert Strube from comment #25)
> I'm wondering where I can report the Thunderbolt Controller / Bridge bug? 
> Perhaps you fine folks can point me in the right direction?

That is unfortunately most likely a bug in the BIOS.

What happens here is that when you specify acpi=off, the internal Vega M gets
disabled and the address space it used is freed up.

This address space is then used for the Thunderbolt Controller to handle the
Polaris.

What you could try is to blacklist amdgpu from automatically loading and then
issue the following commands as root manually:

#Disable the internal Vega M
echo 1 > ./bus/pci/devices/:01:00.0/remove
#Manually load amdgpu to initialize the Polaris
modprobe amdgpu
#Rescan the PCI bus to find the Vega M again
echo 1 > ./bus/pci/devices/:00:00.0/rescan

It's just a shot in the dark, but that might work as well.

Apart from that there isn't much else you could do except to upgrade the BIOS
or use different hardware.
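
The remove / modprobe / rescan sequence suggested above can be written down as a dry run before trying it as root. This is only a sketch: the function name is hypothetical, the addresses in the usage example are placeholders (the archived copy of this thread elides part of each address, so take the real ones from `lspci -D`), and it assumes the standard sysfs `remove` and `rescan` attributes under /sys/bus/pci/devices. It prints the commands and does not execute anything.

```python
from pathlib import PurePosixPath

SYSFS_PCI_DEVICES = PurePosixPath("/sys/bus/pci/devices")


def plan_reprobe(internal_gpu_addr, rescan_addr):
    """Return the ordered shell commands for the remove/load/rescan sequence.

    Nothing is executed; the caller decides whether to run them as root.
    """
    return [
        # Detach the internal GPU so its address space is released.
        f"echo 1 > {SYSFS_PCI_DEVICES / internal_gpu_addr / 'remove'}",
        # Load the driver so that it initializes the external GPU.
        "modprobe amdgpu",
        # Rescan below the given device to rediscover the internal GPU.
        f"echo 1 > {SYSFS_PCI_DEVICES / rescan_addr / 'rescan'}",
    ]


if __name__ == "__main__":
    # Placeholder addresses, not taken from any log in this thread.
    for command in plan_reprobe("0000:01:00.0", "0000:00:00.0"):
        print(command)
```

This only helps if amdgpu is blacklisted from loading automatically, as described above; otherwise the driver has already tried to initialize both GPUs by the time the commands run.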



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-10-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #25 from Robert Strube  ---
acpi=off is the only parameter necessary to get the eGPU up and running.
Setting this parameter allows the Thunderbolt PCI bridge to have its resources
allocated correctly. This incidentally also completely disables the Vega M
(even with a vanilla kernel that does not have the device IDs commented out).

I'm wondering where I can report the Thunderbolt Controller / Bridge bug? 
Perhaps you fine folks can point me in the right direction?



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-10-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #24 from Robert Strube  ---
Created attachment 142209
  --> https://bugs.freedesktop.org/attachment.cgi?id=142209=edit
dmesg log booting system with PM *DISABLED*  and *WITH* eGPU



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-10-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #23 from Robert Strube  ---
Edit: taking a closer look at the dmesg I see that disabling the PM did indeed
eliminate the PCI resource issues.  So for some reason having PM enabled
affects the PCI resource allocation for the Thunderbolt PCI bridges!



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-10-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #22 from Robert Strube  ---
OK! My hunch about the PM was right! The card is fully initialized now, so the
issue doesn't appear to be a PCI resource issue!

I took the brute force approach and compiled my own custom kernel that completely
disabled the Vega M (by commenting out its device IDs). I then passed in the
following kernel boot parameters:

acpi=off apm=off amdgpu.dpm=0 amdgpu.aspm=0 amdgpu.runpm=0 amdgpu.bapm=0

Rebooted the machine and *BAM* the eGPU was initialized!

I'm attaching the new dmesg!

I'm just super excited that I was able to get the eGPU initialized!

xrandr even sees it!

xrandr --listproviders
Providers: number : 2
Provider 0: id: 0x74 cap: 0x9, Source Output, Sink Offload crtcs: 3 outputs: 7
associated providers: 1 name:modesetting
Provider 1: id: 0x4a cap: 0x6, Sink Output, Source Offload crtcs: 6 outputs: 5
associated providers: 1 name:Radeon RX 580 Series @ pci::09:00.0



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-10-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #21 from Robert Strube  ---
Hi guys,

Apologies for the deluge of posts here, I've been trying really hard to
investigate this issue!

So I took a closer look at the PCI resource issues that you mentioned. I've
also been looking at Thunderbolt driver issues in general, and I've noticed
that this type of log message is quite common.  Here's what I'm wondering:

These four devices correspond to the TB to PCI bridges in the system:

:04:00.0
:05:01.0
:05:02.0
:05:04.0

04:00.0 PCI bridge: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016] (rev 02) (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0, IRQ 16
    Bus: primary=04, secondary=05, subordinate=6e, sec-latency=0
    Memory behind bridge: bc00-ea0f
    Prefetchable memory behind bridge: 002fb000-002ff9ff
    Capabilities: [80] Power Management version 3
    Capabilities: [88] MSI: Enable- Count=1/1 Maskable- 64bit+
    Capabilities: [ac] Subsystem: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016]
    Capabilities: [c0] Express Upstream Port, MSI 00
    Capabilities: [100] Device Serial Number b7-de-04-b0-a6-c9-a0-00
    Capabilities: [200] Advanced Error Reporting
    Capabilities: [300] Virtual Channel
    Capabilities: [400] Power Budgeting
    Capabilities: [500] Vendor Specific Information: ID=1234 Rev=1 Len=0d8
    Capabilities: [600] Latency Tolerance Reporting
    Capabilities: [700] #19
    Kernel driver in use: pcieport

05:00.0 PCI bridge: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016] (rev 02) (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0, IRQ 16
    Bus: primary=05, secondary=06, subordinate=06, sec-latency=0
    Memory behind bridge: ea00-ea0f
    Capabilities: [80] Power Management version 3
    Capabilities: [88] MSI: Enable- Count=1/1 Maskable- 64bit+
    Capabilities: [ac] Subsystem: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016]
    Capabilities: [c0] Express Downstream Port (Slot+), MSI 00
    Capabilities: [100] Device Serial Number b7-de-04-b0-a6-c9-a0-00
    Capabilities: [200] Advanced Error Reporting
    Capabilities: [300] Virtual Channel
    Capabilities: [400] Power Budgeting
    Capabilities: [500] Vendor Specific Information: ID=1234 Rev=1 Len=0d8
    Capabilities: [700] #19
    Kernel driver in use: pcieport

05:01.0 PCI bridge: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016] (rev 02) (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0, IRQ 17
    Bus: primary=05, secondary=07, subordinate=39, sec-latency=0
    Memory behind bridge: bc00-d3ef
    Prefetchable memory behind bridge: 002fb000-002fcfff
    Capabilities: [80] Power Management version 3
    Capabilities: [88] MSI: Enable- Count=1/1 Maskable- 64bit+
    Capabilities: [ac] Subsystem: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016]
    Capabilities: [c0] Express Downstream Port (Slot+), MSI 00
    Capabilities: [100] Device Serial Number b7-de-04-b0-a6-c9-a0-00
    Capabilities: [200] Advanced Error Reporting
    Capabilities: [300] Virtual Channel
    Capabilities: [400] Power Budgeting
    Capabilities: [500] Vendor Specific Information: ID=1234 Rev=1 Len=0d8
    Capabilities: [700] #19
    Kernel driver in use: pcieport

05:02.0 PCI bridge: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016] (rev 02) (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0, IRQ 18
    Bus: primary=05, secondary=3a, subordinate=3a, sec-latency=0
    Memory behind bridge: d3f0-d3ff
    Capabilities: [80] Power Management version 3
    Capabilities: [88] MSI: Enable- Count=1/1 Maskable- 64bit+
    Capabilities: [ac] Subsystem: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016]
    Capabilities: [c0] Express Downstream Port (Slot+), MSI 00
    Capabilities: [100] Device Serial Number b7-de-04-b0-a6-c9-a0-00
    Capabilities: [200] Advanced Error Reporting
    Capabilities: [300] Virtual Channel
    Capabilities: [400] Power Budgeting
    Capabilities: [500] Vendor Specific Information: ID=1234 Rev=1 Len=0d8
    Capabilities: [700] #19
    Kernel driver in use: pcieport

05:04.0 PCI bridge: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016] (rev 02) (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0, IRQ 16
    Bus: primary=05, secondary=3b, subordinate=6e, sec-latency=0
    Memory behind bridge: 

[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-10-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #20 from Robert Strube  ---
One more thing I thought of.  Would it help if I posted my dmesg log with the
GTX 1060 connected as an eGPU?

As I mentioned previously this card *is* working with nouveau.  I haven't
tested with the proprietary nvidia drivers.

I'd imagine that the PCI resource issues you pointed out are still there, so I'm
surprised that the nvidia card is able to work.  Perhaps they have some hacks
in their drivers to work around issues like this?

I also have a friend that has an older RX 290, should I give that a shot as
well? It might take me a while to get a hold of that card.

I don't doubt that this is most likely a BIOS bug, but I've noticed people on
the Windows side of the fence getting the XPS 9575 working with eGPUs, and
presumably they have the same BIOS as me.



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-10-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

Robert Strube  changed:

           What    |Removed                     |Added

 Attachment #142205|fresh dmesg log booting     |fresh dmesg log booting
        description|system *wite* eGPU          |system *with* eGPU
                   |connected at boot           |connected at boot



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-10-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #19 from Robert Strube  ---
Created attachment 142205
  --> https://bugs.freedesktop.org/attachment.cgi?id=142205=edit
fresh dmesg log booting system *wite* eGPU connected at boot



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-10-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #18 from Robert Strube  ---
Created attachment 142204
  --> https://bugs.freedesktop.org/attachment.cgi?id=142204=edit
sudo cat /proc/iomem *with* eGPU connected at boot



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-10-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #17 from Robert Strube  ---
Created attachment 142203
  --> https://bugs.freedesktop.org/attachment.cgi?id=142203=edit
lspci *with* eGPU attached at boot



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-10-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #16 from Robert Strube  ---
Created attachment 142202
  --> https://bugs.freedesktop.org/attachment.cgi?id=142202=edit
fresh dmesg log booting system *without* eGPU



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-10-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

Robert Strube  changed:

           What    |Removed                     |Added

 Attachment #142200|lspci (no eGPU)             |lspci with eGPU *not*
        description|                            |connected.



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-10-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #15 from Robert Strube  ---
Created attachment 142201
  --> https://bugs.freedesktop.org/attachment.cgi?id=142201=edit
sudo cat /proc/iomem when eGPU *not* connected




[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-10-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #14 from Robert Strube  ---
Created attachment 142200
  --> https://bugs.freedesktop.org/attachment.cgi?id=142200=edit
lspci (no eGPU)

lspci -t -nn -v output when the eGPU is *not* connected.



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-10-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

Christian König  changed:

           What    |Removed                     |Added

             Status|NEW                         |RESOLVED
         Resolution|---                         |NOTOURBUG

--- Comment #13 from Christian König  ---
This problem isn't related to the GPU in any way; the fact that the amdgpu
driver fails to load is just another symptom.

For some reason, the Thunderbolt bridge doesn't get enough resources assigned
for its devices, even when the GPU isn't present at all.

That could be a problem with the BIOS, with the Linux Thunderbolt driver, or
with the resource allocation in the PCI subsystem.

Please provide the output of "sudo cat /proc/iomem" and "lspci -t -nn -v"
together with an up to date dmesg.
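
When reading the requested /proc/iomem output, it can help to turn each
address range into a size and nesting depth, so that the windows assigned to
the Thunderbolt bridges are easier to compare. The Python sketch below is an
illustration only: parse_iomem is a hypothetical helper, and SAMPLE is made-up
text in the usual /proc/iomem layout, not data from the attachments. Without
root the kernel typically prints the addresses as zeros, which is presumably
why sudo is part of the request.

```python
import re

# "<indent><start>-<end> : <name>", nesting shown by two spaces per level.
LINE = re.compile(
    r"^(?P<indent>\s*)(?P<start>[0-9a-f]+)-(?P<end>[0-9a-f]+) : (?P<name>.*)$"
)


def parse_iomem(text):
    """Return (depth, start, end, size_in_bytes, name) for each range line."""
    entries = []
    for line in text.splitlines():
        match = LINE.match(line)
        if not match:
            continue
        start = int(match["start"], 16)
        end = int(match["end"], 16)
        depth = len(match["indent"]) // 2
        entries.append((depth, start, end, end - start + 1, match["name"]))
    return entries


# Made-up example in /proc/iomem format (NOT taken from the bug attachments).
SAMPLE = """\
a0000000-dfffffff : PCI Bus 0000:00
  b0000000-b00fffff : PCI Bus 0000:04
    b0000000-b003ffff : 0000:05:00.0
"""

for depth, start, end, size, name in parse_iomem(SAMPLE):
    print("  " * depth + f"{name}: {size // 1024} KiB")
```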



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-10-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #12 from Robert Strube  ---
Created attachment 142187
  --> https://bugs.freedesktop.org/attachment.cgi?id=142187=edit
dmesg log booting system *without* eGPU

So I decided to do a sanity check and completely remove the eGPU from the
equation.  I am still getting the BAR errors in dmesg, so perhaps this isn't
the problem after all?!

The one thing I noticed is that the BAR errors are for the pci module and not
the pcieport module.  One other thing worth mentioning is that I tried with a
different GPU (for the eGPU) yesterday.  I had a GTX 1060 available, and this
*did* work correctly.  I haven't double checked if the BAR errors are present
with the other GPU.

Perhaps it is a bug with amdgpu after all?



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-10-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #11 from Robert Strube  ---
I disabled a bunch of devices in the BIOS (sound, SD card reader, etc.) and I
confirmed that they are no longer showing up in lspci, but I'm still getting
the same error.

I also found one suggestion to pass in a kernel parameter of hpbussize=4 to
increase the bus size made available for hot-pluggable devices, but this also
didn't help.

Thanks for all your assistance BTW!

Any other suggestions? I'm starting to run out of options here.

Thanks!



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-10-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #10 from Alex Deucher  ---
(In reply to Robert Strube from comment #9)
> Any suggestion for how I can increase the MMIO space for the BARs on the
> Thunderbolt bridges? Should I try to disable additional devices in the BIOS,
> etc.?  I'm a little out of my element here.

Worth a shot if you can.

Alex



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-10-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #9 from Robert Strube  ---
Any suggestion for how I can increase the MMIO space for the BARs on the
Thunderbolt bridges? Should I try to disable additional devices in the BIOS,
etc.?  I'm a little out of my element here.

Thanks!
Rob



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-10-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #8 from Alex Deucher  ---
There does not seem to be enough MMIO space for the BARs on the thunderbolt
bridges:
[0.436946] pci :04:00.0: BAR 13: no space for [io  size 0x4000]
[0.436947] pci :04:00.0: BAR 13: failed to assign [io  size 0x4000]
[0.436949] pci :04:00.0: BAR 13: assigned [io  0xc000-0xcfff]
[0.436950] pci :04:00.0: BAR 13: [io  0xc000-0xcfff] (failed to expand by 0x3000)
[0.436951] pci :04:00.0: failed to add 3000 res[13]=[io  0xc000-0xcfff]
[0.436955] pci :05:02.0: BAR 15: no space for [mem size 0x0020 64bit pref]
[0.436956] pci :05:02.0: BAR 15: failed to assign [mem size 0x0020 64bit pref]
[0.436957] pci :05:01.0: BAR 13: no space for [io  size 0x2000]
[0.436958] pci :05:01.0: BAR 13: failed to assign [io  size 0x2000]
[0.436959] pci :05:02.0: BAR 13: assigned [io  0xc000-0xcfff]
[0.436960] pci :05:04.0: BAR 13: no space for [io  size 0x1000]
[0.436961] pci :05:04.0: BAR 13: failed to assign [io  size 0x1000]
[0.436963] pci :05:01.0: BAR 13: assigned [io  0xc000-0xcfff]
[0.436964] pci :05:04.0: BAR 13: no space for [io  size 0x1000]
[0.436965] pci :05:04.0: BAR 13: failed to assign [io  size 0x1000]
[0.436967] pci :05:02.0: BAR 15: no space for [mem size 0x0020 64bit pref]
[0.436968] pci :05:02.0: BAR 15: failed to assign [mem size 0x0020 64bit pref]
[0.436969] pci :05:02.0: BAR 13: no space for [io  size 0x1000]
[0.436970] pci :05:02.0: BAR 13: failed to assign [io  size 0x1000]
[0.436971] pci :05:01.0: BAR 13: [io  0xc000-0xcfff] (failed to expand by 0x1000)
[0.436972] pci :05:01.0: failed to add 1000 res[13]=[io  0xc000-0xcfff]
I don't think that should be an issue for the devices behind it, but perhaps it
is?
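
To compare such failures between boots (for example with and without the
eGPU), the "failed to assign" lines can be grouped per bridge. This is a rough
Python sketch rather than an established tool: failed_bars is a hypothetical
helper, the regular expression only targets the message format quoted above,
and SAMPLE reuses a few of the I/O lines exactly as they appear in this
comment (including the shortened device addresses of the archived mail). As
far as I know, resource numbers 13 to 15 on a bridge in kernels of this era
are its forwarding windows rather than ordinary device BARs.

```python
import re
from collections import defaultdict

# Matches e.g. "pci 0000:04:00.0: BAR 13: failed to assign [io  size 0x4000]".
FAILED = re.compile(
    r"pci\s+(?P<dev>[0-9a-f:.]+): BAR (?P<bar>\d+): "
    r"failed to assign \[(?P<kind>io|mem)\s+size (?P<size>0x[0-9a-f]+)"
)


def failed_bars(dmesg_text):
    """Map each PCI device to the sorted (BAR, kind, size) requests that failed."""
    result = defaultdict(set)
    for match in FAILED.finditer(dmesg_text):
        request = (int(match["bar"]), match["kind"], int(match["size"], 16))
        result[match["dev"]].add(request)  # a set drops repeated attempts
    return {dev: sorted(requests) for dev, requests in result.items()}


SAMPLE = """\
[0.436946] pci :04:00.0: BAR 13: no space for [io  size 0x4000]
[0.436947] pci :04:00.0: BAR 13: failed to assign [io  size 0x4000]
[0.436957] pci :05:01.0: BAR 13: no space for [io  size 0x2000]
[0.436958] pci :05:01.0: BAR 13: failed to assign [io  size 0x2000]
[0.436960] pci :05:04.0: BAR 13: no space for [io  size 0x1000]
[0.436961] pci :05:04.0: BAR 13: failed to assign [io  size 0x1000]
[0.436964] pci :05:04.0: BAR 13: no space for [io  size 0x1000]
[0.436965] pci :05:04.0: BAR 13: failed to assign [io  size 0x1000]
"""

for device, requests in failed_bars(SAMPLE).items():
    for bar, kind, size in requests:
        print(f"{device} BAR {bar}: could not place {kind} window of {size:#x}")
```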



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-10-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #7 from Robert Strube  ---
Thanks for the suggestions! I took your advice and commented out the Vega M
device IDs located here: /drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c

These are the lines of code that I commented out.
/* VEGAM */
{0x1002, 0x694C, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGAM},
{0x1002, 0x694E, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGAM},

This did indeed cause my Vega M to not be initialized, *but* the problem I'm
having with the eGPU remains. So it appears you were correct and my hunch that
the Vega M is interfering with the eGPU initialization was incorrect.  I'm back
to square one...

I uploaded a new dmesg log for this kernel, perhaps with the Vega M out of the
equation you might see something new?

Thanks!
Rob



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-10-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #6 from Robert Strube  ---
Created attachment 142182
  --> https://bugs.freedesktop.org/attachment.cgi?id=142182=edit
dmesg log booting system with eGPU (Vega M device IDs removed in kernel)



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-10-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #5 from Alex Deucher  ---
If you can get any of the other methods to work you can remove the vegam device
id from the driver.  That said, I doubt it will make a difference.  Usually the
problem with thunderbolt is that pci BAR resources don't get assigned properly
to the devices and the ones the driver needs are not available.  That doesn't
seem to be the case here, but I might be missing something.



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-10-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #4 from Robert Strube  ---
I decided to apply a hack to 4.19 and see if I could get the eGPU to
initialize.  I noticed that this code in /drivers/gpu/drm/amd/amdgpu/atom.c

if ((jiffies_to_msecs(cjiffies) > 5000)) {
	DRM_ERROR("atombios stuck in loop for more than 5secs aborting\n");
	ctx->abort = true;
}

is where the error is being thrown, so I thought I would try giving the eGPU
more time.  I increased the 5000 value to 15000, recompiled the kernel, and
tried to attach the eGPU.  Unfortunately I received the same error, but this
time after 15 seconds of trying to initialize the GPU.  Should I increase the
time even more?

I'm not sure if the issue is actually related to not having enough time, or if
it's something else entirely.

I'll bump it up to 30 seconds in a last-ditch attempt.
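
The behaviour described above (same error, just later) is what a watchdog of
this kind produces whenever the thing it guards never completes. Below is a
simplified Python analogue, assuming a loop that polls a step until a deadline
passes. It is not the driver's actual logic, and run_with_watchdog, FakeClock
and stuck_step are names invented for this illustration.

```python
class FakeClock:
    """Manually advanced millisecond clock, so the sketch runs instantly."""

    def __init__(self):
        self.ms = 0

    def now(self):
        return self.ms

    def advance(self, ms):
        self.ms += ms


def run_with_watchdog(step, timeout_ms, clock):
    """Call step() until it returns True or more than timeout_ms has elapsed.

    Returns (finished, elapsed_ms).
    """
    start = clock()
    while True:
        if step():
            return True, clock() - start
        if clock() - start > timeout_ms:
            return False, clock() - start


def stuck_step(fake_clock):
    """Stand-in for an init step that never succeeds; each try costs 1 s."""
    fake_clock.advance(1000)
    return False


for timeout_ms in (5000, 15000):
    fake = FakeClock()
    finished, elapsed = run_with_watchdog(lambda: stuck_step(fake), timeout_ms, fake.now)
    print(f"timeout={timeout_ms} ms finished={finished} elapsed={elapsed} ms")
```

With a step that never succeeds, a larger timeout only moves the abort later,
which would be consistent with the 15 second result reported above.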



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-10-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #3 from Robert Strube  ---
Quick question:

Is it possible to completely disable the Vega M using kernel boot parameters?
I did try using pci-stub.ids=: with the PCI hex ID for my Vega M
(1002:694e), but amdgpu was still bound to the device; I'm not sure why.  I also
thought there was explicit PCI device blacklisting support in the kernel, but I
have been unable to find any documentation on this.

Ideally I'd like to see if having the Vega M disabled allows the eGPU to be
correctly initialized.  I took a look at the documentation for amdgpu, but I
didn't see any boot parameters that stood out to me.

Blacklisting the amdgpu module wouldn't work either, as I need that to
correctly support the RX 580 once it's attached.
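
One way to verify which driver ended up bound to a device after trying a boot
parameter is to read the device's "driver" symlink in sysfs. The sketch below
is a minimal Python illustration: bound_driver is a hypothetical helper, and
the example address 0000:01:00.0 assumes PCI domain 0000 together with the
01:00.0 slot that lspci shows for the Vega M in the original report.

```python
import os


def bound_driver(address, sysfs_root="/sys/bus/pci/devices"):
    """Return the name of the driver bound to a PCI device, or None if unbound."""
    link = os.path.join(sysfs_root, address, "driver")
    if not os.path.islink(link):
        return None  # no driver symlink means nothing has claimed the device
    # The symlink points at .../bus/pci/drivers/<name>; keep only <name>.
    return os.path.basename(os.readlink(link))


def report(address="0000:01:00.0"):
    """Print the binding for one device (address is an assumed example)."""
    print(address, "->", bound_driver(address))
```

If report() still prints amdgpu after booting with a stub parameter, the stub
driver did not claim the device before amdgpu did.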

Thanks!



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-10-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

Robert Strube  changed:

           What    |Removed                     |Added

 Attachment #142151|dmesg log booting system    |dmesg log booting system
        description|with eGPU attched and       |with eGPU attached and
                   |powered                     |powered



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-10-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #2 from Robert Strube  ---
Created attachment 142151
  --> https://bugs.freedesktop.org/attachment.cgi?id=142151=edit
dmesg log booting system with eGPU attched and powered

Starting at line:

[   11.192733] ATOM BIOS: 401815-171128-QS1

You can see the failure that occurs when trying to initialize the RX 580 as an
eGPU over Thunderbolt 3.



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-10-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

--- Comment #1 from Alex Deucher  ---
Please attach your dmesg output.



[Bug 108521] RX 580 as eGPU amdgpu: gpu post error!

2018-10-22 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108521

            Bug ID: 108521
           Summary: RX 580 as eGPU amdgpu: gpu post error!
           Product: DRI
           Version: unspecified
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: normal
          Priority: medium
         Component: DRM/AMDgpu
          Assignee: dri-devel@lists.freedesktop.org
          Reporter: rstr...@gmail.com

Hello everyone,

I've been attempting to get my RX 580 working correctly as an eGPU using the
Akitio Node eGPU enclosure (over Thunderbolt 3).

I've confirmed that both the Akitio Node and my laptop's Thunderbolt 3
controller are running the most up-to-date firmware.  I've also been able to
successfully authorize the Thunderbolt eGPU enclosure, and see the RX 580 in
lspci, see below:

00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers (rev 05)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x16) (rev 05)
00:02.0 VGA compatible controller: Intel Corporation Device 591b (rev 04)
00:04.0 Signal processing controller: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Thermal Subsystem (rev 05)
00:13.0 Non-VGA unclassified device: Intel Corporation 100 Series/C230 Series Chipset Family Integrated Sensor Hub (rev 31)
00:14.0 USB controller: Intel Corporation 100 Series/C230 Series Chipset Family USB 3.0 xHCI Controller (rev 31)
00:14.2 Signal processing controller: Intel Corporation 100 Series/C230 Series Chipset Family Thermal Subsystem (rev 31)
00:15.0 Signal processing controller: Intel Corporation 100 Series/C230 Series Chipset Family Serial IO I2C Controller #0 (rev 31)
00:15.1 Signal processing controller: Intel Corporation 100 Series/C230 Series Chipset Family Serial IO I2C Controller #1 (rev 31)
00:16.0 Communication controller: Intel Corporation 100 Series/C230 Series Chipset Family MEI Controller #1 (rev 31)
00:17.0 SATA controller: Intel Corporation HM170/QM170 Chipset SATA Controller [AHCI Mode] (rev 31)
00:1c.0 PCI bridge: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #1 (rev f1)
00:1c.4 PCI bridge: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #5 (rev f1)
00:1d.0 PCI bridge: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #9 (rev f1)
00:1f.0 ISA bridge: Intel Corporation QM175 Chipset LPC/eSPI Controller (rev 31)
00:1f.2 Memory controller: Intel Corporation 100 Series/C230 Series Chipset Family Power Management Controller (rev 31)
00:1f.3 Audio device: Intel Corporation CM238 HD Audio Controller (rev 31)
00:1f.4 SMBus: Intel Corporation 100 Series/C230 Series Chipset Family SMBus (rev 31)
01:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Polaris 22 [Radeon RX Vega M GL] (rev c0)
02:00.0 Network controller: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter (rev 32)
03:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTS525A PCI Express Card Reader (rev 01)
04:00.0 PCI bridge: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016] (rev 02)
05:00.0 PCI bridge: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016] (rev 02)
05:01.0 PCI bridge: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016] (rev 02)
05:02.0 PCI bridge: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016] (rev 02)
05:04.0 PCI bridge: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016] (rev 02)
06:00.0 System peripheral: Intel Corporation JHL6540 Thunderbolt 3 NHI (C step) [Alpine Ridge 4C 2016] (rev 02)
07:00.0 PCI bridge: Intel Corporation DSL6340 Thunderbolt 3 Bridge [Alpine Ridge 2C 2015]
08:01.0 PCI bridge: Intel Corporation DSL6340 Thunderbolt 3 Bridge [Alpine Ridge 2C 2015]
09:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X] (rev e7)
09:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 580]

Looking at just the RX 580 in more detail using lspci -v we have:

09:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI]
Ellesmere [Radeon RX 470/480/570/570X/580/580X] (rev e7) (prog-if 00 [VGA
controller])
Subsystem: XFX Pine Group Inc. Ellesmere [Radeon RX
470/480/570/570X/580/580X]
Flags: fast devsel, IRQ 18
Memory at 2fb000 (64-bit, prefetchable) [size=256M]
Memory at 2fc000 (64-bit, prefetchable) [size=2M]
I/O ports at 2000 [size=256]
Memory at bc00 (32-bit, non-prefetchable) [size=256K]
Expansion ROM at bc04 [disabled] [size=128K]
Capabilities: [48] Vendor Specific Information: Len=08 
Capabilities: [50] Power Management version 3
Capabilities: [58] Express Legacy Endpoint, MSI 00