Re: [Xen-devel] PCIe devices that are hotplugged after MMIO has been setup fail due to _CRS not covering 64-bit area
>>> On 01.11.16 at 15:39,wrote:
> Also I realized that "Range Size and Alignment Requirement" aren't met
> with the code I wrote - as the size (2^n) must be aligned on the
> 2^n boundary, and that is certainly not met.

Yes, this would better be obeyed.

> Anyhow the point here is that with modifications here I will
> still run into the variable MTRR limit if I am to cover most of the
> space. I can do up to a certain value. And that 'value' could
> become the pci_high_mem_end?

Yes - moving the boundary to require fewer MTRRs is certainly an
option. Also remember that we are required to leave a few MTRRs
for OS use.

> Or perhaps revisit a6a822324:
> Author: Keir Fraser
> Date: Wed Apr 16 13:36:44 2008 +0100
>
>     x86, hvm: Lots of MTRR/PAT emulation cleanup.
>
>      - Move MTRR MSR initialisation into hvmloader.
>      - Simplify initialisation logic by overlaying UC on default WB rather
>        than vice versa.
>      - Clean up hypervisor HVM MTRR/PAE code's interface with rest of
>        hypervisor.
>
> As the default MTRR is WB. If that was UC we could just set MTRRs
> for RAM regions and have the type be WB for those regions?
>
> I am not sure though if that is a good direction either?

Actually I think we should pick the variant requiring fewer MTRRs.
I've seen BIOSes of both kinds. Otoh I've never been really
convinced using WB as the default is really that good an idea.

> And that actually worked out nicely. Linux sees the new _CRS regions
> and I got [this includes two extra regions - so that the HT region
> is not touched]:
>
> ...
> pci_bus :00: root bus resource [io 0x-0x0cf7 window]
> pci_bus :00: root bus resource [io 0x0d00-0x window]
> pci_bus :00: root bus resource [mem 0x000a-0x000b window]
> pci_bus :00: root bus resource [mem 0xf000-0xfbff window]
> pci_bus :00: root bus resource [mem 0x10fc0-0xfcfffe window]
> pci_bus :00: root bus resource [mem 0x100-0x window]
> pci_bus :00: root bus resource [bus 00-ff]
>
> from:
> pci_bus :00: root bus resource [io 0x-0x0cf7 window]
> pci_bus :00: root bus resource [io 0x0d00-0x window]
> pci_bus :00: root bus resource [mem 0x000a-0x000b window]
> pci_bus :00: root bus resource [mem 0xe000-0xfbff window]
> pci_bus :00: root bus resource [bus 00-ff]
>
> Except that when I tried this with Windows 2000 I found out that
> its AML interpreter blows up if any of the values are bigger than
> 8GB. With a bit of extra AML duct-tape that got solved, albeit I need
> to verify other Windows platforms. Which reminds me - you had dabbled
> in this - are there any other surprises I should be aware of?

The only thing I remember is the WinXP issue with qword fields (as
mentioned in a comment in dsdt.asl).

Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
Re: [Xen-devel] PCIe devices that are hotplugged after MMIO has been setup fail due to _CRS not covering 64-bit area
. snip..
> I modified it to be subtractive, and got it to start with
> large areas and then smaller and smaller:
>
> (d2) - CPU0 ... 36-bit phys ... fixed MTRRs ... Cover @04344(MB) to 65536(M
> (d2) B) with 7 MTRRs.
> (d2) MTRR 1 @04344(MB) 37112(MB)
> (d2) MTRR 2 @37112(MB) 53496(MB)
> (d2) MTRR 3 @53496(MB) 61688(MB)
> (d2) MTRR 4 @61688(MB) 63736(MB)
> (d2) MTRR 5 @63736(MB) 64760(MB)
> (d2) MTRR 6 @64760(MB) 65272(MB)
> (d2) MTRR 7 @65272(MB) 65528(MB)
> (d2) var MTRRs [8/8] ... done.
>
> But of course on 48-bit hardware, even with this we ran out of MTRRs:
> (d1) - CPU0 ... 48-bit phys ... fixed MTRRs ... Cover @04344(MB) to 0268435456(
> (d1) MB) with 7 MTRRs.
> (d1) MTRR 1 @04344(MB) 0134222072(MB)
> (d1) MTRR 2 @0134222072(MB) 0201330936(MB)
> (d1) MTRR 3 @0201330936(MB) 0234885368(MB)
> (d1) MTRR 4 @0234885368(MB) 0251662584(MB)
> (d1) MTRR 5 @0251662584(MB) 0260051192(MB)
> (d1) MTRR 6 @0260051192(MB) 0264245496(MB)
> (d1) MTRR 7 @0264245496(MB) 0266342648(MB)
> (d1) var MTRRs [8/8] ... done.

For comparison here is what the existing code does (pls ignore the
'MTRR 1'):

(d35) MB) with 7 MTRRs.
(d35) MTRR 1@04344(MB) 04352(MB)[8(MB)]
(d35) MTRR 1@04352(MB) 04608(MB)[00256(MB)]
(d35) MTRR 1@04608(MB) 05120(MB)[00512(MB)]
(d35) MTRR 1@05120(MB) 06144(MB)[01024(MB)]
(d35) MTRR 1@06144(MB) 08192(MB)[02048(MB)]
(d35) MTRR 1@08192(MB) 16384(MB)[08192(MB)]
(d35) MTRR 1@16384(MB) 32768(MB)[16384(MB)]
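For anyone following along: the (d35) boundaries listed for the existing code are what a greedy split into naturally aligned power-of-two blocks would produce. Below is a minimal sketch of such a split. It is not the hvmloader code; the function names are made up for illustration, and the units are MB to match the log.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: largest power-of-two block that starts at `base`,
 * is naturally aligned (base is a multiple of the size) and does not
 * exceed `len`. Units are arbitrary; MB are used in this thread. */
uint64_t largest_aligned_block(uint64_t base, uint64_t len)
{
    uint64_t size = 1;

    assert(len > 0);
    /* Double while the doubled size still fits and base stays aligned. */
    while (size * 2 <= len && (base & (size * 2 - 1)) == 0)
        size *= 2;
    return size;
}

/* Hypothetical helper: cover [start, end) with at most `max` blocks.
 * Returns the number of blocks used and stores in *covered_end how far
 * the coverage reached. */
unsigned int cover_range(uint64_t start, uint64_t end,
                         unsigned int max, uint64_t *covered_end)
{
    unsigned int n = 0;

    while (start < end && n < max) {
        start += largest_aligned_block(start, end - start);
        n++;
    }
    *covered_end = start;
    return n;
}
```

Called as cover_range(4344, 65536, 7, &end), this stops at 32768 MB after seven blocks, the same end point as the (d35) log. Reaching 65536 MB takes an eighth block, and the 48-bit case (end = 268435456 MB) takes twenty, so a 7-MTRR budget cannot cover either range with aligned blocks.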
Re: [Xen-devel] PCIe devices that are hotplugged after MMIO has been setup fail due to _CRS not covering 64-bit area
On Thu, Oct 13, 2016 at 03:20:24AM -0600, Jan Beulich wrote:
> >>> On 12.10.16 at 23:15,wrote:
> > On Wed, Sep 28, 2016 at 03:21:08AM -0600, Jan Beulich wrote:
> >> >>> On 27.09.16 at 16:43, wrote:
> >> > If the guest is booted with 'pci' we nicely expand the MMIO region below
> >> > 4GB and try to fit in the BARs in there. If that fails (not enough
> >> > space) we move it above the memory (64-bit). And throughout all of this
> >> > we also update the _CRS field to cover these ranges.
> >> >
> >> > (Note, I need to check if the 64-bit area is also set, I think it is).
> >> >
> >> > But the situation is different if we hot-plug a device that has too big
> >> > BAR to fit in the MMIO region. We move it in the 64-bit area but we
> >> > don't update the _CRS. Which means that Linux will complain (unless
> >> > booted with pci=nocrs). Not sure about Windows but I would assume so
> >> > too.
> >> >
> >> > I was wondering what would be a good way to solve this? I looked at some
> >> > Dell machines to see how they deal with hotplug PCIe devices and they
> >> > just declared all the memory in the _CRS (including RAM).
> >> >
> >> > We could do a hybrid - during bootup make the _CRS region have entry from
> >> > end of RAM to .. end of memory?
> >>
> >> End of physical address space you mean? Generally yes, but we
> >> need to be a little careful there: For one, on AMD we'd better not
> >> overlap with the HT area. And then there's this MTRR related
> >> comment next to the setting of pci_hi_mem_end (albeit both HT
> >> area start and end of PA space should be aligned well enough).

This got interesting. The existing code that sets the variable MTRR
ran out of MTRRs to cover say 1<<36 of space. The reason is that it
starts at low granularity sizes (4KB) and then builds up from there.
To cover say from 4GB to 64GB we ran out of MTRRs.

I modified it to be subtractive, and got it to start with
large areas and then smaller and smaller:

(d2) - CPU0 ... 36-bit phys ... fixed MTRRs ... Cover @04344(MB) to 65536(M
(d2) B) with 7 MTRRs.
(d2) MTRR 1 @04344(MB) 37112(MB)
(d2) MTRR 2 @37112(MB) 53496(MB)
(d2) MTRR 3 @53496(MB) 61688(MB)
(d2) MTRR 4 @61688(MB) 63736(MB)
(d2) MTRR 5 @63736(MB) 64760(MB)
(d2) MTRR 6 @64760(MB) 65272(MB)
(d2) MTRR 7 @65272(MB) 65528(MB)
(d2) var MTRRs [8/8] ... done.

But of course on 48-bit hardware, even with this we ran out of MTRRs:

(d1) - CPU0 ... 48-bit phys ... fixed MTRRs ... Cover @04344(MB) to 0268435456(
(d1) MB) with 7 MTRRs.
(d1) MTRR 1 @04344(MB) 0134222072(MB)
(d1) MTRR 2 @0134222072(MB) 0201330936(MB)
(d1) MTRR 3 @0201330936(MB) 0234885368(MB)
(d1) MTRR 4 @0234885368(MB) 0251662584(MB)
(d1) MTRR 5 @0251662584(MB) 0260051192(MB)
(d1) MTRR 6 @0260051192(MB) 0264245496(MB)
(d1) MTRR 7 @0264245496(MB) 0266342648(MB)
(d1) var MTRRs [8/8] ... done.

[I figured that it would be OK to set the UC MTRR even for the HT region:
FC -> FF as you surely don't want WB there?]

Then it occurred to me that maybe I am overthinking it and should
just pick the biggest one:

(d32) Multiprocessor initialisation:
(d32) - CPU0 ... 48-bit phys ... fixed MTRRs ... Cover @04344(MB) to 0268435456(
(d32) MB) with 7 MTRRs.
(d32) MTRR 1@04344(MB) 0268439800(MB)
(d32) var MTRRs [1/8] ... done.

Which would cover _past_ the CPU end, but that surely won't be
healthy to the CPU? The Intel SDM doesn't mention what happens then.

Also I realized that "Range Size and Alignment Requirement" aren't met
with the code I wrote - as the size (2^n) must be aligned on the
2^n boundary, and that is certainly not met.

Anyhow the point here is that with modifications here I will
still run into the variable MTRR limit if I am to cover most of the
space. I can do up to a certain value. And that 'value' could
become the pci_high_mem_end?

Or perhaps revisit a6a822324:
Author: Keir Fraser
Date: Wed Apr 16 13:36:44 2008 +0100

    x86, hvm: Lots of MTRR/PAT emulation cleanup.

     - Move MTRR MSR initialisation into hvmloader.
     - Simplify initialisation logic by overlaying UC on default WB rather
       than vice versa.
     - Clean up hypervisor HVM MTRR/PAE code's interface with rest of
       hypervisor.

As the default MTRR is WB. If that was UC we could just set MTRRs
for RAM regions and have the type be WB for those regions?

I am not sure though if that is a good direction either?

> >>
> >> > Or perhaps add some extra logic between QEMU and ACPI AML to expand (or
> >> > perhaps modify the last _CRS entry) when PCIe devices are hotplugged?
> >>
> >> While that would be the most flexible variant, I'd be afraid of this
> >> getting rather complicated. Or have you already got some
> >> reasonable layout of how this would look like?
> >
> > I did this and while
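To make the "Range Size and Alignment Requirement" point concrete, here is a small sketch of the check as it is worded in the message above (size is 2^n, base aligned on a 2^n boundary). The function names are hypothetical, and the 4KB minimum granularity is left out because the values in this thread are in MB.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if `size` is a non-zero power of two. */
bool is_pow2(uint64_t size)
{
    return size != 0 && (size & (size - 1)) == 0;
}

/* Hypothetical check: a variable-range MTRR is acceptable only if its
 * size is 2^n and its base is aligned on that same 2^n boundary. */
bool mtrr_range_ok(uint64_t base, uint64_t size)
{
    return is_pow2(size) && (base & (size - 1)) == 0;
}
```

With values in MB, mtrr_range_ok(4344, 32768) is false (the first entry of the subtractive attempt: the size is a power of two but the base is not aligned to it), and so is mtrr_range_ok(4344, 268435456), the single "biggest one" range. mtrr_range_ok(8192, 8192) is true.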
Re: [Xen-devel] PCIe devices that are hotplugged after MMIO has been setup fail due to _CRS not covering 64-bit area
>>> On 12.10.16 at 23:15,wrote:
> On Wed, Sep 28, 2016 at 03:21:08AM -0600, Jan Beulich wrote:
>> >>> On 27.09.16 at 16:43, wrote:
>> > If the guest is booted with 'pci' we nicely expand the MMIO region below
>> > 4GB and try to fit in the BARs in there. If that fails (not enough
>> > space) we move it above the memory (64-bit). And throughout all of this
>> > we also update the _CRS field to cover these ranges.
>> >
>> > (Note, I need to check if the 64-bit area is also set, I think it is).
>> >
>> > But the situation is different if we hot-plug a device that has too big
>> > BAR to fit in the MMIO region. We move it in the 64-bit area but we
>> > don't update the _CRS. Which means that Linux will complain (unless
>> > booted with pci=nocrs). Not sure about Windows but I would assume so
>> > too.
>> >
>> > I was wondering what would be a good way to solve this? I looked at some
>> > Dell machines to see how they deal with hotplug PCIe devices and they
>> > just declared all the memory in the _CRS (including RAM).
>> >
>> > We could do a hybrid - during bootup make the _CRS region have entry from
>> > end of RAM to .. end of memory?
>>
>> End of physical address space you mean? Generally yes, but we
>> need to be a little careful there: For one, on AMD we'd better not
>> overlap with the HT area. And then there's this MTRR related
>> comment next to the setting of pci_hi_mem_end (albeit both HT
>> area start and end of PA space should be aligned well enough).
>>
>> > Or perhaps add some extra logic between QEMU and ACPI AML to expand (or
>> > perhaps modify the last _CRS entry) when PCIe devices are hotplugged?
>>
>> While that would be the most flexible variant, I'd be afraid of this
>> getting rather complicated. Or have you already got some
>> reasonable layout of how this would look like?
>
> I did this and while all the plumbing works great and I can see that
> the pci_hi_len gets incremented by the size of the 64-bit BARS of the
> new device (and also decremented if hot-unplugged) I hit a snag:
>
> Linux evaluates this only once (actually twice, but only during bootup).

Ah - quite reasonable to expect this won't change.

> For right now let me jump with the "simpler" solution of just
> hardcoding the end of physical address space and see how that works out.

Right.

Jan
Re: [Xen-devel] PCIe devices that are hotplugged after MMIO has been setup fail due to _CRS not covering 64-bit area
On Wed, Sep 28, 2016 at 03:21:08AM -0600, Jan Beulich wrote:
> >>> On 27.09.16 at 16:43,wrote:
> > If the guest is booted with 'pci' we nicely expand the MMIO region below
> > 4GB and try to fit in the BARs in there. If that fails (not enough
> > space) we move it above the memory (64-bit). And throughout all of this
> > we also update the _CRS field to cover these ranges.
> >
> > (Note, I need to check if the 64-bit area is also set, I think it is).
> >
> > But the situation is different if we hot-plug a device that has too big
> > BAR to fit in the MMIO region. We move it in the 64-bit area but we
> > don't update the _CRS. Which means that Linux will complain (unless
> > booted with pci=nocrs). Not sure about Windows but I would assume so
> > too.
> >
> > I was wondering what would be a good way to solve this? I looked at some
> > Dell machines to see how they deal with hotplug PCIe devices and they
> > just declared all the memory in the _CRS (including RAM).
> >
> > We could do a hybrid - during bootup make the _CRS region have entry from
> > end of RAM to .. end of memory?
>
> End of physical address space you mean? Generally yes, but we
> need to be a little careful there: For one, on AMD we'd better not
> overlap with the HT area. And then there's this MTRR related
> comment next to the setting of pci_hi_mem_end (albeit both HT
> area start and end of PA space should be aligned well enough).
>
> > Or perhaps add some extra logic between QEMU and ACPI AML to expand (or
> > perhaps modify the last _CRS entry) when PCIe devices are hotplugged?
>
> While that would be the most flexible variant, I'd be afraid of this
> getting rather complicated. Or have you already got some
> reasonable layout of how this would look like?

I did this and while all the plumbing works great and I can see that
the pci_hi_len gets incremented by the size of the 64-bit BARS of the
new device (and also decremented if hot-unplugged) I hit a snag:

Linux evaluates this only once (actually twice, but only during bootup).

That is, if I do the hotplug while the guest is in GRUB and then boot,
Linux is quite happy. But if I do it after Linux has booted, the
PNP0A03 _CRS is not evaluated again.

The only way I can see it evaluating this is if a new bridge is added
and DMAR hotplug support ("Remapping Hardware Unit Hot Plug") is
exposed to the guest. See in Linux code acpi_pci_root_add and

    if (hotadd && dmar_device_add(handle))

This means:
 - adding in QEMU bridge support for each new hotplugged device,
 - and Intel VT-d in the guest support.

That I think will take a bit of time to get right.

For right now let me jump with the "simpler" solution of just
hardcoding the end of physical address space and see how that works out.

>
> Jan
>
Re: [Xen-devel] PCIe devices that are hotplugged after MMIO has been setup fail due to _CRS not covering 64-bit area
On Wed, Sep 28, 2016 at 03:21:08AM -0600, Jan Beulich wrote:
> >>> On 27.09.16 at 16:43,wrote:
> > If the guest is booted with 'pci' we nicely expand the MMIO region below
> > 4GB and try to fit in the BARs in there. If that fails (not enough
> > space) we move it above the memory (64-bit). And throughout all of this
> > we also update the _CRS field to cover these ranges.
> >
> > (Note, I need to check if the 64-bit area is also set, I think it is).
> >
> > But the situation is different if we hot-plug a device that has too big
> > BAR to fit in the MMIO region. We move it in the 64-bit area but we
> > don't update the _CRS. Which means that Linux will complain (unless
> > booted with pci=nocrs). Not sure about Windows but I would assume so
> > too.
> >
> > I was wondering what would be a good way to solve this? I looked at some
> > Dell machines to see how they deal with hotplug PCIe devices and they
> > just declared all the memory in the _CRS (including RAM).
> >
> > We could do a hybrid - during bootup make the _CRS region have entry from
> > end of RAM to .. end of memory?
>
> End of physical address space you mean? Generally yes, but we

Yes.

> need to be a little careful there: For one, on AMD we'd better not
> overlap with the HT area. And then there's this MTRR related
> comment next to the setting of pci_hi_mem_end (albeit both HT
> area start and end of PA space should be aligned well enough).
>
> > Or perhaps add some extra logic between QEMU and ACPI AML to expand (or
> > perhaps modify the last _CRS entry) when PCIe devices are hotplugged?
>
> While that would be the most flexible variant, I'd be afraid of this
> getting rather complicated. Or have you already got some
> reasonable layout of how this would look like?

Nothing yet sadly, just soliciting input at this point. Thanks again
for the tidbit about HT.

>
> Jan
>
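A rough sketch of the "end of RAM to end of physical address space, but not over the HT area" idea, for illustration only. Nothing here is taken from hvmloader or the DSDT: the names are hypothetical and the hole boundaries are passed in by the caller, since this message does not state the exact HT range.

```c
#include <stddef.h>
#include <stdint.h>

/* Half-open address window: [start, end). */
struct window {
    uint64_t start;
    uint64_t end;
};

/* Hypothetical helper: split [start, end) around the reserved hole
 * [hole_start, hole_end). Writes up to two windows into out[] and
 * returns how many were written. */
size_t split_around_hole(uint64_t start, uint64_t end,
                         uint64_t hole_start, uint64_t hole_end,
                         struct window out[2])
{
    size_t n = 0;

    if (start >= end)
        return 0;

    /* Empty hole or no overlap: the range is usable as one window. */
    if (hole_start >= hole_end || hole_end <= start || hole_start >= end) {
        out[n++] = (struct window){ start, end };
        return n;
    }

    /* Part below the hole, if any. */
    if (start < hole_start)
        out[n++] = (struct window){ start, hole_start };

    /* Part above the hole, if any. */
    if (hole_end < end)
        out[n++] = (struct window){ hole_end, end };

    return n;
}
```

As an example, and assuming (not confirmed in this thread) that the HT area is 0xFD00000000 up to 1TB: with RAM ending at 0x10F800000 and 48-bit physical addressing, the call yields two windows, one below the hole and one from 1TB to 1<<48. With 36-bit addressing the end of PA space lies below the hole, so a single window results.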
Re: [Xen-devel] PCIe devices that are hotplugged after MMIO has been setup fail due to _CRS not covering 64-bit area
>>> On 27.09.16 at 16:43,wrote:
> If the guest is booted with 'pci' we nicely expand the MMIO region below
> 4GB and try to fit in the BARs in there. If that fails (not enough
> space) we move it above the memory (64-bit). And throughout all of this
> we also update the _CRS field to cover these ranges.
>
> (Note, I need to check if the 64-bit area is also set, I think it is).
>
> But the situation is different if we hot-plug a device that has too big
> BAR to fit in the MMIO region. We move it in the 64-bit area but we
> don't update the _CRS. Which means that Linux will complain (unless
> booted with pci=nocrs). Not sure about Windows but I would assume so
> too.
>
> I was wondering what would be a good way to solve this? I looked at some
> Dell machines to see how they deal with hotplug PCIe devices and they
> just declared all the memory in the _CRS (including RAM).
>
> We could do a hybrid - during bootup make the _CRS region have entry from
> end of RAM to .. end of memory?

End of physical address space you mean? Generally yes, but we
need to be a little careful there: For one, on AMD we'd better not
overlap with the HT area. And then there's this MTRR related
comment next to the setting of pci_hi_mem_end (albeit both HT
area start and end of PA space should be aligned well enough).

> Or perhaps add some extra logic between QEMU and ACPI AML to expand (or
> perhaps modify the last _CRS entry) when PCIe devices are hotplugged?

While that would be the most flexible variant, I'd be afraid of this
getting rather complicated. Or have you already got some
reasonable layout of how this would look like?

Jan
[Xen-devel] PCIe devices that are hotplugged after MMIO has been setup fail due to _CRS not covering 64-bit area
Hey!

If the guest is booted with 'pci' we nicely expand the MMIO region below
4GB and try to fit in the BARs in there. If that fails (not enough
space) we move it above the memory (64-bit). And throughout all of this
we also update the _CRS field to cover these ranges.

(Note, I need to check if the 64-bit area is also set, I think it is).

But the situation is different if we hot-plug a device that has too big
BAR to fit in the MMIO region. We move it in the 64-bit area but we
don't update the _CRS. Which means that Linux will complain (unless
booted with pci=nocrs). Not sure about Windows but I would assume so
too.

I was wondering what would be a good way to solve this? I looked at some
Dell machines to see how they deal with hotplug PCIe devices and they
just declared all the memory in the _CRS (including RAM).

We could do a hybrid - during bootup make the _CRS region have entry from
end of RAM to .. end of memory?

Or perhaps add some extra logic between QEMU and ACPI AML to expand (or
perhaps modify the last _CRS entry) when PCIe devices are hotplugged?

I am wondering what folks think is the best way going forward?