On Fri, 26 May 2017 01:52:35 +0000
"Cheng, Collins" <collins.ch...@amd.com> wrote:

> Hi Alex W,
> 
> I don't need the kernel patch anymore. However, it looks like the kernel 
> could be improved to handle this more gracefully when PCI resource allocation 
> fails. Do you have a plan to improve it in the kernel PCI code?

I don't have a device capable of reproducing this and I'm currently working
on issues elsewhere.  If you don't plan to continue working on it, I'd
suggest filing a bug at bugzilla.kernel.org so that we can at least
track the problem.  Thanks,

Alex

> -----Original Message-----
> From: Cheng, Collins 
> Sent: Wednesday, May 24, 2017 4:56 PM
> To: 'Alex Williamson' <alex.william...@redhat.com>
> Cc: Alexander Duyck <alexander.du...@gmail.com>; Bjorn Helgaas 
> <bhelg...@google.com>; linux-...@vger.kernel.org; 
> linux-kernel@vger.kernel.org; Deucher, Alexander <alexander.deuc...@amd.com>; 
> Zytaruk, Kelly <kelly.zyta...@amd.com>; Yinghai Lu <ying...@kernel.org>
> Subject: RE: [PATCH] PCI: Make SR-IOV capable GPU working on the SR-IOV 
> incapable platform
> 
> Hi Alex W, Alex D,
> 
> I just tried two options: one is to enable "Above 4G Decoding" in the BIOS 
> setup menu, the other is to add "pci=realloc=off" in grub. Both fix this 
> issue. Please see the attached log files.
> 
> Previously I thought "Above 4G Decoding" was enabled, but it was off when I 
> looked at the CMOS setup today.
> 
> For now I think we have a solution. On a system that supports "Above 4G 
> Decoding", the user should enable it when using an SR-IOV capable device. On 
> a system that doesn't support "Above 4G Decoding", the user needs to add 
> "pci=realloc=off" in grub.
> 
> I think the kernel still needs a way to avoid this issue, such as keeping 
> the resources at their BIOS-assigned values if reallocation of the device's 
> resources fails.
> 
> 
> -Collins Cheng
> 
> 
> -----Original Message-----
> From: Alex Williamson [mailto:alex.william...@redhat.com] 
> Sent: Wednesday, May 24, 2017 2:20 AM
> To: Cheng, Collins <collins.ch...@amd.com>
> Cc: Alexander Duyck <alexander.du...@gmail.com>; Bjorn Helgaas 
> <bhelg...@google.com>; linux-...@vger.kernel.org; 
> linux-kernel@vger.kernel.org; Deucher, Alexander <alexander.deuc...@amd.com>; 
> Zytaruk, Kelly <kelly.zyta...@amd.com>; Yinghai Lu <ying...@kernel.org>
> Subject: Re: [PATCH] PCI: Make SR-IOV capable GPU working on the SR-IOV 
> incapable platform
> 
> On Tue, 23 May 2017 03:41:21 +0000
> "Cheng, Collins" <collins.ch...@amd.com> wrote:
> 
> > Hi Alex,
> > 
> > I owe you a dmesg log. Attached are two log files: 1.txt is without 
> > "pci=earlydump", 2.txt is with "pci=earlydump". The platform is an ASUS 
> > Z170-A motherboard that doesn't support SR-IOV. The graphics card is an AMD 
> > FirePro S7150, which has SR-IOV enabled.
> > 
> > You can find error messages like the ones below in both logs. According to 
> > the log, the kernel failed to reallocate resources for BAR0, which is the 
> > PF's frame buffer BAR (256MB needed), but it did reallocate resources for 
> > BAR9, which is for the VFs. You are right, the real bug is that something 
> > goes wrong with the reallocation, leaving the PF without resources.
> > 
> > [    0.992976] pci 0000:01:00.0: BAR 0: no space for [mem size 0x10000000 
> > 64bit pref]
> > [    0.992976] pci 0000:01:00.0: BAR 0: failed to assign [mem size 
> > 0x10000000 64bit pref]
> > [    0.992977] pci 0000:01:00.0: BAR 7: no space for [mem size 0x40000000 
> > 64bit pref]
> > [    0.992978] pci 0000:01:00.0: BAR 7: failed to assign [mem size 
> > 0x40000000 64bit pref]
> > [    0.992979] pci 0000:01:00.0: BAR 9: assigned [mem 0x88c00000-0x8abfffff 
> > 64bit pref]
> > [    0.992986] pci 0000:01:00.0: BAR 12: no space for [mem size 0x02000000]
> > [    0.992986] pci 0000:01:00.0: BAR 12: failed to assign [mem size 
> > 0x02000000]
> > [    0.992988] pci 0000:01:00.0: BAR 2: assigned [mem 0x8ac00000-0x8adfffff 
> > 64bit pref]
> > [    0.992994] pci 0000:01:00.0: BAR 5: no space for [mem size 0x00040000]
> > [    0.992995] pci 0000:01:00.0: BAR 5: failed to assign [mem size 
> > 0x00040000]
> > [    0.992996] pci 0000:01:00.0: BAR 6: no space for [mem size 0x00020000 
> > pref]
> > [    0.992997] pci 0000:01:00.0: BAR 6: failed to assign [mem size 
> > 0x00020000 pref]  
> 
> I've tried to extract more of the relevant resizing efforts below, perhaps 
> Yinghai or others can make more out of it.  In particular this system offers 
> no 64-bit MMIO and we'll never manage to allocate the necessary SR-IOV 
> resources without it.  AIUI, the PCI core won't try to use anything outside 
> the ACPI _CRS data without the option pci=nocrs.
> This might present a second alternative in addition to the pci=realloc=off, 
> which is actually suggested by the kernel below.  So I think we have at least 
> two potential workarounds in the code as it exists today, one leaving SR-IOV 
> disabled, the other (hopefully) enabling it using 64bit MMIO not described by 
> the system BIOS.
> Certainly an improvement would still be detecting the impossible reallocation 
> problem without nocrs and abandoning the process and of course to revert the 
> process before leaving more BARs unprogrammed than we started with.  Thanks,
> 
> Alex
> 
> [    0.891319] pci_bus 0000:00: root bus resource [io  0x0000-0x0cf7 window]
> [    0.891321] pci_bus 0000:00: root bus resource [io  0x0d00-0xffff window]
> [    0.891322] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff 
> window]
> [    0.891323] pci_bus 0000:00: root bus resource [mem 0x88800000-0xdfffffff 
> window]
> [    0.891324] pci_bus 0000:00: root bus resource [mem 0xfd000000-0xfe7fffff 
> window]
> [    0.891325] pci_bus 0000:00: root bus resource [bus 00-fe]
> ...
> [    0.896481] pci 0000:01:00.0: [1002:6929] type 00 class 0x030000
> [    0.896496] pci 0000:01:00.0: reg 0x10: [mem 0xc0000000-0xcfffffff 64bit 
> pref]
> [    0.896506] pci 0000:01:00.0: reg 0x18: [mem 0xd0000000-0xd01fffff 64bit 
> pref]
> [    0.896513] pci 0000:01:00.0: reg 0x20: [io  0xe000-0xe0ff]
> [    0.896519] pci 0000:01:00.0: reg 0x24: [mem 0xdfe00000-0xdfe3ffff]
> [    0.896526] pci 0000:01:00.0: reg 0x30: [mem 0xdfe40000-0xdfe5ffff pref]
> [    0.896590] pci 0000:01:00.0: supports D1 D2
> [    0.896590] pci 0000:01:00.0: PME# supported from D1 D2 D3hot D3cold
> [    0.896625] pci 0000:01:00.0: reg 0x354: [mem 0x00000000-0x07ffffff 64bit 
> pref]
> [    0.896626] pci 0000:01:00.0: VF(n) BAR0 space: [mem 0x00000000-0x3fffffff 
> 64bit pref] (contains BAR0 for 8 VFs)
> [    0.896634] pci 0000:01:00.0: reg 0x35c: [mem 0x00000000-0x003fffff 64bit 
> pref]
> [    0.896635] pci 0000:01:00.0: VF(n) BAR2 space: [mem 0x00000000-0x01ffffff 
> 64bit pref] (contains BAR2 for 8 VFs)
> [    0.896646] pci 0000:01:00.0: reg 0x368: [mem 0x00000000-0x003fffff]
> [    0.896647] pci 0000:01:00.0: VF(n) BAR5 space: [mem 
> 0x00000000-0x01ffffff] (contains BAR5 for 8 VFs)
> [    0.896700] pci 0000:01:00.0: System wakeup disabled by ACPI
> [    0.906527] pci 0000:00:1b.0: PCI bridge to [bus 01]
> [    0.906544] pci 0000:00:1b.0:   bridge window [io  0xe000-0xefff]
> [    0.906546] pci 0000:00:1b.0:   bridge window [mem 0xdfe00000-0xdfefffff]
> [    0.906549] pci 0000:00:1b.0:   bridge window [mem 0xc0000000-0xd01fffff 
> 64bit pref]
> [    0.906550] pci 0000:00:1b.0: bridge has subordinate 01 but max busn 02
> ...
> [    0.943584] vgaarb: setting as boot device: PCI:0000:01:00.0
> [    0.943585] vgaarb: device added: 
> PCI:0000:01:00.0,decodes=io+mem,owns=io+mem,locks=none
> [    0.943586] vgaarb: loaded
> [    0.943586] vgaarb: bridge control possible 0000:01:00.0
> ...
> [    0.997491] pci 0000:01:00.0: BAR 7: no space for [mem size 0x40000000 
> 64bit pref]
> [    0.997491] pci 0000:01:00.0: BAR 7: failed to assign [mem size 0x40000000 
> 64bit pref]
> [    0.997493] pci 0000:01:00.0: BAR 9: no space for [mem size 0x02000000 
> 64bit pref]
> [    0.997493] pci 0000:01:00.0: BAR 9: failed to assign [mem size 0x02000000 
> 64bit pref]
> [    0.997495] pci 0000:01:00.0: BAR 12: no space for [mem size 0x02000000]
> [    0.997495] pci 0000:01:00.0: BAR 12: failed to assign [mem size 
> 0x02000000]
> [    0.997497] pci 0000:00:1b.0: PCI bridge to [bus 01]
> [    0.997498] pci 0000:00:1b.0:   bridge window [io  0xe000-0xefff]
> [    0.997501] pci 0000:00:1b.0:   bridge window [mem 0xdfe00000-0xdfefffff]
> [    0.997502] pci 0000:00:1b.0:   bridge window [mem 0xc0000000-0xd01fffff 
> 64bit pref]
> ...
> [    0.997540] pci_bus 0000:00: No. 2 try to assign unassigned res
> [    0.997540] release child resource [mem 0xdfe00000-0xdfe3ffff]
> [    0.997540] release child resource [mem 0xdfe40000-0xdfe5ffff pref]
> [    0.997541] pci 0000:00:1b.0: resource 14 [mem 0xdfe00000-0xdfefffff] 
> released
> [    0.997542] pci 0000:00:1b.0: PCI bridge to [bus 01]
> [    0.997543] release child resource [mem 0xc0000000-0xcfffffff 64bit pref]
> [    0.997544] release child resource [mem 0xd0000000-0xd01fffff 64bit pref]
> [    0.997544] pci 0000:00:1b.0: resource 15 [mem 0xc0000000-0xd01fffff 64bit 
> pref] released
> [    0.997545] pci 0000:00:1b.0: PCI bridge to [bus 01]
> [    0.997576] pci 0000:00:1b.0: BAR 15: no space for [mem size 0x58000000 
> 64bit pref]
> [    0.997577] pci 0000:00:1b.0: BAR 15: failed to assign [mem size 
> 0x58000000 64bit pref]
> [    0.997578] pci 0000:00:1b.0: BAR 14: assigned [mem 0x88c00000-0x8adfffff]
> [    0.997583] pci 0000:01:00.0: BAR 0: no space for [mem size 0x10000000 
> 64bit pref]
> [    0.997583] pci 0000:01:00.0: BAR 0: failed to assign [mem size 0x10000000 
> 64bit pref]
> [    0.997585] pci 0000:01:00.0: BAR 7: no space for [mem size 0x40000000 
> 64bit pref]
> [    0.997585] pci 0000:01:00.0: BAR 7: failed to assign [mem size 0x40000000 
> 64bit pref]
> [    0.997587] pci 0000:01:00.0: BAR 9: assigned [mem 0x88c00000-0x8abfffff 
> 64bit pref]
> [    0.997593] pci 0000:01:00.0: BAR 12: no space for [mem size 0x02000000]
> [    0.997594] pci 0000:01:00.0: BAR 12: failed to assign [mem size 
> 0x02000000]
> [    0.997595] pci 0000:01:00.0: BAR 2: assigned [mem 0x8ac00000-0x8adfffff 
> 64bit pref]
> [    0.997602] pci 0000:01:00.0: BAR 5: no space for [mem size 0x00040000]
> [    0.997602] pci 0000:01:00.0: BAR 5: failed to assign [mem size 0x00040000]
> [    0.997603] pci 0000:01:00.0: BAR 6: no space for [mem size 0x00020000 
> pref]
> [    0.997604] pci 0000:01:00.0: BAR 6: failed to assign [mem size 0x00020000 
> pref]
> [    0.997606] pci 0000:00:1b.0: PCI bridge to [bus 01]
> [    0.997607] pci 0000:00:1b.0:   bridge window [io  0xe000-0xefff]
> [    0.997609] pci 0000:00:1b.0:   bridge window [mem 0x88c00000-0x8adfffff]
> ...
> [    0.997647] pci_bus 0000:00: No. 3 try to assign unassigned res
> [    0.997648] release child resource [mem 0x88c00000-0x8abfffff 64bit pref]
> [    0.997648] release child resource [mem 0x8ac00000-0x8adfffff 64bit pref]
> [    0.997649] pci 0000:00:1b.0: resource 14 [mem 0x88c00000-0x8adfffff] 
> released
> [    0.997649] pci 0000:00:1b.0: PCI bridge to [bus 01]
> [    0.997651] release child resource [mem 0xdfd00000-0xdfd07fff 64bit]
> [    0.997651] pci 0000:00:1c.0: resource 14 [mem 0xdfd00000-0xdfdfffff] 
> released
> [    0.997652] pci 0000:00:1c.0: PCI bridge to [bus 02]
> [    0.997654] pci 0000:00:1d.0: resource 15 [mem 0x88a00000-0x88bfffff 64bit 
> pref] released
> [    0.997654] pci 0000:00:1d.0: PCI bridge to [bus 05]
> [    0.997664] pci 0000:00:1b.0: bridge window [mem 0x08000000-0x5fffffff 
> 64bit pref] to [bus 01] add_size 48000000 add_align 8000000
> [    0.997666] pci 0000:00:1b.0: bridge window [mem 0x00100000-0x022fffff] to 
> [bus 01] add_size 2200000 add_align 400000
> [    0.997687] pci 0000:00:1d.0: bridge window [mem 0x00100000-0x002fffff 
> 64bit pref] to [bus 05] add_size 200000 add_align 100000
> [    0.997692] pci 0000:00:1b.0: res[15]=[mem 0x08000000-0x5fffffff 64bit 
> pref] res_to_dev_res add_size 48000000 min_align 8000000
> [    0.997693] pci 0000:00:1b.0: res[15]=[mem 0x08000000-0xa7ffffff 64bit 
> pref] res_to_dev_res add_size 48000000 min_align 8000000
> [    0.997693] pci 0000:00:1b.0: res[14]=[mem 0x00100000-0x022fffff] 
> res_to_dev_res add_size 2200000 min_align 400000
> [    0.997694] pci 0000:00:1b.0: res[14]=[mem 0x00100000-0x044fffff] 
> res_to_dev_res add_size 2200000 min_align 400000
> [    0.997695] pci 0000:00:1d.0: res[15]=[mem 0x00100000-0x002fffff 64bit 
> pref] res_to_dev_res add_size 200000 min_align 100000
> [    0.997696] pci 0000:00:1d.0: res[15]=[mem 0x00100000-0x004fffff 64bit 
> pref] res_to_dev_res add_size 200000 min_align 100000
> [    0.997698] pci 0000:00:1b.0: BAR 15: no space for [mem size 0xa0000000 
> 64bit pref]
> [    0.997699] pci 0000:00:1b.0: BAR 15: failed to assign [mem size 
> 0xa0000000 64bit pref]
> [    0.997700] pci 0000:00:1b.0: BAR 14: assigned [mem 0x88c00000-0x8cffffff]
> [    0.997701] pci 0000:00:1c.0: BAR 14: assigned [mem 0x88a00000-0x88afffff]
> [    0.997702] pci 0000:00:1d.0: BAR 15: assigned [mem 0x8d000000-0x8d3fffff 
> 64bit pref]
> [    0.997705] pci 0000:00:1b.0: BAR 15: no space for [mem size 0x58000000 
> 64bit pref]
> [    0.997706] pci 0000:00:1b.0: BAR 15: failed to assign [mem size 
> 0x58000000 64bit pref]
> [    0.997707] pci 0000:00:1b.0: BAR 14: assigned [mem 0x88a00000-0x8abfffff]
> [    0.997708] pci 0000:00:1c.0: BAR 14: assigned [mem 0x8ac00000-0x8acfffff]
> [    0.997709] pci 0000:00:1d.0: BAR 15: assigned [mem 0x8ad00000-0x8aefffff 
> 64bit pref]
> [    0.997711] pci 0000:00:1d.0: BAR 15: reassigned [mem 
> 0x8ad00000-0x8b0fffff 64bit pref] (expanded by 0x200000)
> [    0.997713] pci 0000:00:1b.0: BAR 14: reassigned [mem 
> 0x8b400000-0x8f7fffff] (expanded by 0x2200000)
> [    0.997719] pci 0000:01:00.0: res[7]=[mem size 0x00000000 64bit pref] 
> res_to_dev_res add_size 40000000 min_align 0
> [    0.997720] pci 0000:01:00.0: res[9]=[mem 0x00000000-0xffffffffffffffff 
> 64bit pref] res_to_dev_res add_size 2000000 min_align 0
> [    0.997721] pci 0000:01:00.0: res[12]=[mem size 0x00000000] res_to_dev_res 
> add_size 2000000 min_align 0
> [    0.997722] pci 0000:01:00.0: BAR 0: no space for [mem size 0x10000000 
> 64bit pref]
> [    0.997722] pci 0000:01:00.0: BAR 0: failed to assign [mem size 0x10000000 
> 64bit pref]
> [    0.997723] pci 0000:01:00.0: BAR 7: no space for [mem size 0x40000000 
> 64bit pref]
> [    0.997724] pci 0000:01:00.0: BAR 7: failed to assign [mem size 0x40000000 
> 64bit pref]
> [    0.997725] pci 0000:01:00.0: BAR 9: assigned [mem 0x8b400000-0x8d3fffff 
> 64bit pref]
> [    0.997731] pci 0000:01:00.0: BAR 12: assigned [mem 0x8d400000-0x8f3fffff]
> [    0.997734] pci 0000:01:00.0: BAR 2: assigned [mem 0x8f400000-0x8f5fffff 
> 64bit pref]
> [    0.997740] pci 0000:01:00.0: BAR 5: assigned [mem 0x8f600000-0x8f63ffff]
> [    0.997744] pci 0000:01:00.0: BAR 0: no space for [mem size 0x10000000 
> 64bit pref]
> [    0.997745] pci 0000:01:00.0: BAR 0: failed to assign [mem size 0x10000000 
> 64bit pref]
> [    0.997746] pci 0000:01:00.0: BAR 2: assigned [mem 0x8b400000-0x8b5fffff 
> 64bit pref]
> [    0.997753] pci 0000:01:00.0: BAR 5: assigned [mem 0x8b600000-0x8b63ffff]
> [    0.997756] pci 0000:01:00.0: BAR 12: assigned [mem 0x8b800000-0x8d7fffff]
> [    0.997758] pci 0000:01:00.0: BAR 9: assigned [mem 0x8d800000-0x8f7fffff 
> 64bit pref]
> [    0.997765] pci 0000:01:00.0: BAR 7: no space for [mem size 0x40000000 
> 64bit pref]
> [    0.997765] pci 0000:01:00.0: BAR 7: failed to assign [mem size 0x40000000 
> 64bit pref]
> [    0.997767] pci 0000:00:1b.0: PCI bridge to [bus 01]
> [    0.997768] pci 0000:00:1b.0:   bridge window [io  0xe000-0xefff]
> [    0.997770] pci 0000:00:1b.0:   bridge window [mem 0x8b400000-0x8f7fffff]
> ...
> [    0.997818] pci_bus 0000:00: Automatically enabled pci realloc, if
> you have problem, try booting with pci=realloc=off
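
For anyone reading along, the sizes in the quoted log can be checked with a
little arithmetic. The following is only a minimal sketch: every constant is
copied by hand from the dmesg lines above, the variable names are made up for
illustration, and nothing here queries real hardware. It compares the largest
memory window the root bus advertises (all of which sit below 4G) with the
prefetchable bridge window requested for 0000:00:1b.0 on the "No. 2" pass.

```python
# All constants are copied by hand from the quoted dmesg output.
# Variable names are illustrative only; no hardware is queried.

MB = 1 << 20


def window_size(start: int, end: int) -> int:
    """Size in bytes of an inclusive [start, end] address range."""
    return end - start + 1


# Largest memory window reported for root bus 0000:00.
root_window = window_size(0x88800000, 0xDFFFFFFF)

# Prefetchable window requested for bridge 00:1b.0 ("BAR 15: no space for
# [mem size 0x58000000 64bit pref]").
bridge_request = 0x58000000

# The two largest device requests the log reports as "no space".
pf_bar0 = 0x10000000  # BAR 0 of 01:00.0, the PF frame buffer
vf_bar7 = 0x40000000  # BAR 7 of 01:00.0, same size as "VF(n) BAR0 space"

print(f"root window:       {root_window // MB} MB")
print(f"bridge request:    {bridge_request // MB} MB")
print(f"shortfall:         {(bridge_request - root_window) // MB} MB")
print(f"PF BAR0 + VF BAR7: {(pf_bar0 + vf_bar7) // MB} MB")
```

With these numbers the request is 8 MB larger than the whole window, so it
could not fit below 4G even if nothing else on the bus used that window. That
is consistent with the observation in the thread that the SR-IOV resources
cannot be allocated on this system without 64-bit MMIO.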
