https://bugzilla.kernel.org/show_bug.cgi?id=201527

            Bug ID: 201527
           Summary: Thunderbolt 3 PCI Bridge Fails to Receive Proper PCI
                    Resources
           Product: ACPI
           Version: 2.5
    Kernel Version: 4.19
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: BIOS
          Assignee: acpi_b...@kernel-bugs.osdl.org
          Reporter: rstr...@gmail.com
        Regression: No

Created attachment 279157
  --> https://bugzilla.kernel.org/attachment.cgi?id=279157&action=edit
dmesg log with ACPI enabled

System: Dell XPS 9575 (2 in 1)
Processor: i7-8705G CPU
Internal iGPU: Intel 630
Discrete GPU: Vega M (Polaris 22)
Kernel: 4.19

Description:

I was working with amdgpu developers to try to get a Polaris 10 (RX 580) eGPU
working over Thunderbolt 3 when we discovered some serious problems with PCI
resource allocation to the Thunderbolt 3 PCI bridges.  These PCI resource
issues prevent the eGPU from becoming initialized.

This issue is not tied directly to eGPUs: I can reproduce it with no eGPU at
all, simply by booting the system with no Thunderbolt devices attached.

After a lot of trial and error, I determined that if I pass the acpi=off
kernel boot parameter, the Vega M GPU becomes disabled, and although the PCI
resource allocation issue is *still* present, the eGPU can be initialized.
This seems more like a coincidence than a proper fix for the problem.
The amdgpu devs seem to think that having the Vega M disabled frees up certain
address ranges, allowing the eGPU to be initialized.

I also tried compiling a custom kernel with the Vega M device IDs commented
out to see if this would help, and it did *not* help.  The Vega M was indeed
not initialized at boot, but the PCI resource issues remained.

I'm attaching dmesg logs for 4.19 with ACPI enabled and disabled.  I'm also
attaching lspci -vv -t -nn reports.

The information in the dmesg (with ACPI enabled) that appears relevant is this:

Problems with ACPI BIOS:

[  152.409452] ACPI BIOS Error (bug): Failure creating [\_GPE.XTBT.SPRT], AE_ALREADY_EXISTS (20180810/dswload2-316)
[  152.409479] No Local Variables are initialized for Method [XTBT]
[  152.409484] Initialized Arguments for Method [XTBT]:  (2 arguments defined for method invocation)
[  152.409486]   Arg0:   00000000280bc52d <Obj>           Integer 0000000000000009
[  152.409500]   Arg1:   00000000464bcc18 <Obj>           Integer 0000000001060002
[  152.409512] ACPI Error: AE_ALREADY_EXISTS, During name lookup/catalog (20180810/psobject-221)
[  152.409522] ACPI Error: Method parse/execution failed \_GPE.XTBT, AE_ALREADY_EXISTS (20180810/psparse-516)
[  152.409537] ACPI Error: Method parse/execution failed \_GPE.XTBT, AE_ALREADY_EXISTS (20180810/psparse-516)
[  152.409555] ACPI Error: Method parse/execution failed \_GPE._E42, AE_ALREADY_EXISTS (20180810/psparse-516)
[  152.409568] ACPI: Marking method _E42 as Serialized because of AE_ALREADY_EXISTS error
[  152.409578] ACPI Error: AE_ALREADY_EXISTS, while evaluating GPE method [_E42] (20180810/evgpe-509)

PCI resource allocation issues:

Note: devices 0000:04:00.0, 0000:05:00.0, 0000:05:01.0, 0000:05:02.0, and
0000:05:04.0 are all Thunderbolt PCI bridges, but device 0000:05:02.0 seems to
be the problematic one.

[  152.673753] pci_bus 0000:05: Allocating resources
[  152.673792] pci 0000:05:01.0: bridge window [io  0x1000-0x0fff] to [bus 07-39] add_size 1000
[  152.673802] pci 0000:05:02.0: bridge window [io  0x1000-0x0fff] to [bus 3a] add_size 1000
[  152.673803] pci 0000:05:02.0: bridge window [mem 0x00100000-0x000fffff 64bit pref] to [bus 3a] add_size 200000 add_align 100000
[  152.673813] pci 0000:05:04.0: bridge window [io  0x1000-0x0fff] to [bus 3b-6e] add_size 1000
[  152.673823] pci 0000:04:00.0: bridge window [io  0x1000-0x0fff] to [bus 05-6e] add_size 3000
[  152.673825] pci 0000:04:00.0: BAR 13: assigned [io  0x2000-0x4fff]
[  152.673829] pci 0000:05:02.0: BAR 15: no space for [mem size 0x00200000 64bit pref]
[  152.673830] pci 0000:05:02.0: BAR 15: failed to assign [mem size 0x00200000 64bit pref]
[  152.673831] pci 0000:05:01.0: BAR 13: assigned [io  0x2000-0x2fff]
[  152.673832] pci 0000:05:02.0: BAR 13: assigned [io  0x3000-0x3fff]
[  152.673832] pci 0000:05:04.0: BAR 13: assigned [io  0x4000-0x4fff]
[  152.673834] pci 0000:05:02.0: BAR 15: no space for [mem size 0x00200000 64bit pref]
[  152.673835] pci 0000:05:02.0: BAR 15: failed to assign [mem size 0x00200000 64bit pref]
[  152.673837] pci 0000:05:00.0: PCI bridge to [bus 06]
[  152.673842] pci 0000:05:00.0:   bridge window [mem 0xea000000-0xea0fffff]
[  152.673852] pci 0000:05:01.0: PCI bridge to [bus 07-39]
[  152.673854] pci 0000:05:01.0:   bridge window [io  0x2000-0x2fff]
[  152.673859] pci 0000:05:01.0:   bridge window [mem 0xbc000000-0xd3efffff]
[  152.673863] pci 0000:05:01.0:   bridge window [mem 0x2fb0000000-0x2fcfffffff 64bit pref]
[  152.673870] pci 0000:05:02.0: PCI bridge to [bus 3a]
[  152.673872] pci 0000:05:02.0:   bridge window [io  0x3000-0x3fff]
[  152.673877] pci 0000:05:02.0:   bridge window [mem 0xd3f00000-0xd3ffffff]
[  152.673887] pci 0000:05:04.0: PCI bridge to [bus 3b-6e]
[  152.673889] pci 0000:05:04.0:   bridge window [io  0x4000-0x4fff]
[  152.673894] pci 0000:05:04.0:   bridge window [mem 0xd4000000-0xe9ffffff]
[  152.673898] pci 0000:05:04.0:   bridge window [mem 0x2fd0000000-0x2ff9ffffff 64bit pref]
[  152.673904] pci 0000:04:00.0: PCI bridge to [bus 05-6e]
[  152.673906] pci 0000:04:00.0:   bridge window [io  0x2000-0x4fff]
[  152.673912] pci 0000:04:00.0:   bridge window [mem 0xbc000000-0xea0fffff]
[  152.673915] pci 0000:04:00.0:   bridge window [mem 0x2fb0000000-0x2ff9ffffff 64bit pref]
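
One possible reading of the "no space" messages, sketched below in Python: the
64bit prefetchable window of the upstream bridge 0000:04:00.0 appears to be
completely covered by the windows already given to 0000:05:01.0 and
0000:05:04.0, leaving no gap for the 0x00200000 bytes that 0000:05:02.0
requests.  The address ranges are copied from the log above; the helper name
free_gaps is hypothetical, the ranges are assumed to be inclusive, and
alignment is ignored, so this is only a rough check and not how the kernel
allocator actually works.

```python
def free_gaps(parent, children):
    """Return the inclusive (start, end) ranges of parent not covered by children."""
    gaps = []
    cursor = parent[0]
    for start, end in sorted(children):
        if start > cursor:
            gaps.append((cursor, start - 1))
        cursor = max(cursor, end + 1)
    if cursor <= parent[1]:
        gaps.append((cursor, parent[1]))
    return gaps


# 64bit pref windows as printed in the dmesg excerpt above.
PREF_PARENT = (0x2FB0000000, 0x2FF9FFFFFF)      # 0000:04:00.0
PREF_CHILDREN = [
    (0x2FB0000000, 0x2FCFFFFFFF),               # 0000:05:01.0
    (0x2FD0000000, 0x2FF9FFFFFF),               # 0000:05:04.0
]
REQUEST = 0x00200000                            # BAR 15 request of 0000:05:02.0

gaps = free_gaps(PREF_PARENT, PREF_CHILDREN)
fits = any(end - start + 1 >= REQUEST for start, end in gaps)
print(gaps, fits)  # prints: [] False
```

The same check on the non-prefetchable windows (0xbc000000-0xea0fffff on
0000:04:00.0) also finds no gap, since the four downstream windows listed
above are contiguous.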

It also appears that pcieport has PCI resource allocation issues:

[  193.946376] thunderbolt 0000:06:00.0: stopping RX ring 0
[  193.946388] thunderbolt 0000:06:00.0: disabling interrupt at register 0x38200 bit 12 (0xffffffff -> 0xffffefff)
[  193.946404] thunderbolt 0000:06:00.0: stopping TX ring 0
[  193.946413] thunderbolt 0000:06:00.0: disabling interrupt at register 0x38200 bit 0 (0xffffffff -> 0xfffffffe)
[  193.946421] thunderbolt 0000:06:00.0: control channel stopped
[  193.946516] thunderbolt 0000:06:00.0: freeing RX ring 0
[  193.946527] thunderbolt 0000:06:00.0: freeing TX ring 0
[  193.946542] thunderbolt 0000:06:00.0: shutdown
[  193.985339] pci_bus 0000:05: Allocating resources
[  193.985415] pcieport 0000:05:02.0: bridge window [mem 0x00100000-0x000fffff 64bit pref] to [bus 3a] add_size 200000 add_align 100000
[  193.985458] pcieport 0000:05:02.0: BAR 15: no space for [mem size 0x00200000 64bit pref]
[  193.985462] pcieport 0000:05:02.0: BAR 15: failed to assign [mem size 0x00200000 64bit pref]
[  193.985470] pcieport 0000:05:02.0: BAR 15: no space for [mem size 0x00200000 64bit pref]
[  193.985473] pcieport 0000:05:02.0: BAR 15: failed to assign [mem size 0x00200000 64bit pref]
[  198.333956] pcieport 0000:05:00.0: Refused to change power state, currently in D3
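
As a reading aid for the ranges in these messages, the Python sketch below
computes the size of a logged window, treating each range as an inclusive
start-end pair (an assumption about the log format, not something taken from
the kernel source).  Under that reading, the 0x00100000-0x000fffff window
reported for 0000:05:02.0 is empty, while the bridge asks for 0x00200000 bytes
(2 MiB).  The helper name window_size is hypothetical.

```python
import re

# Matches the address range in entries such as "[mem 0xd3f00000-0xd3ffffff]".
WINDOW = re.compile(r"\[(io|mem)\s+0x([0-9a-f]+)-0x([0-9a-f]+)")


def window_size(entry):
    """Return the size in bytes of the first [io|mem start-end] range, or None."""
    match = WINDOW.search(entry)
    if match is None:
        return None
    start = int(match.group(2), 16)
    end = int(match.group(3), 16)
    # Inclusive range: end == start - 1 gives 0, i.e. an empty window.
    return max(end - start + 1, 0)


empty = window_size("bridge window [mem 0x00100000-0x000fffff 64bit pref]")
one_mib = window_size("bridge window [mem 0xd3f00000-0xd3ffffff]")
no_range = window_size("BAR 15: no space for [mem size 0x00200000 64bit pref]")
print(empty, hex(one_mib), no_range)  # prints: 0 0x100000 None
```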

Based on all the feedback I have received so far, this does appear to be a
BIOS issue, but I felt it was important to report it in case the kernel
developers can come up with a workaround, or in case there is a more direct
line of communication with the Dell engineers.

I'm very willing and able to test out any patches you throw at me!

Thanks!
Rob
