On 06/17/20 18:14, Laszlo Ersek wrote:
> On 06/17/20 15:46, Dr. David Alan Gilbert wrote:
>> * Laszlo Ersek (ler...@redhat.com) wrote:
>>> On 06/16/20 19:14, Guilherme Piccoli wrote:
>>>> Thanks Gerd, Dave and Eduardo for the prompt responses!
>>>>
>>>> So, I understand that when we use "-host-phys-bits", we are passing
>>>> the *real* number to the guest, correct? So, in this case we can
>>>> trust that the guest physbits matches the true host physbits.
>>>>
>>>> What if we then have OVMF rely on the physbits *iff*
>>>> "-host-phys-bits" is used (which is the default in RH and a possible
>>>> machine configuration in the libvirt XML on Ubuntu), and have OVMF
>>>> fall back to 36 bits otherwise?
>>>
>>> I've now read the commit message on QEMU commit 258fe08bd341d, and the
>>> complexity is simply stunning.
>>>
>>> Right now, OVMF calculates the guest physical address space size from
>>> various range sizes (such as the hotplug memory area end, and the
>>> default or user-configured PCI64 MMIO aperture), and derives the
>>> minimum suitable guest-phys address width from that address space
>>> size. This width is then exposed to the rest of the firmware with the
>>> CPU HOB (hand-off block), which in turn controls how the GCD (global
>>> coherency domain) memory space map is sized. Etc.
>>>
>>> If QEMU can provide a *reliable* GPA width, over some info channel
>>> (CPUID or even fw_cfg), then the above calculation could be reversed
>>> in OVMF. We could take the width as a given (-> produce the CPU HOB
>>> directly), plus calculate the *remaining* address space between the
>>> GPA space size given by the width, and the end of the memory hotplug
>>> area. If the "remaining size" were negative, then obviously QEMU
>>> would have been misconfigured, so we'd halt the boot. Otherwise, the
>>> remaining area could be used as the PCI64 MMIO aperture (PEI memory
>>> footprint of DXE page tables be darned).
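[Editor's note: the forward calculation (top address -> minimum width) and the proposed reversed calculation (trusted width -> remaining PCI64 aperture) can be sketched as below. This is an illustrative model, not actual OVMF code; the function names and the 36-bit floor are assumptions made for the example.]

```c
#include <stdint.h>

/* Forward calculation: derive the minimum guest-phys address width that
 * covers a given exclusive top guest-physical address. The 36-bit floor
 * mirrors the 36-bit fallback mentioned in the thread (assumption). */
static unsigned min_phys_width(uint64_t top_address)
{
    unsigned width = 36;

    while (width < 64 && ((top_address - 1) >> width) != 0)
        width++;
    return width;
}

/* Reversed calculation: given a trusted width from QEMU and the end of
 * the memory hotplug area, the rest of the GPA space could serve as the
 * PCI64 MMIO aperture. A negative result means QEMU was misconfigured,
 * so the firmware would halt the boot. */
static int64_t remaining_aperture(unsigned width, uint64_t hotplug_end)
{
    uint64_t gpa_space = 1ULL << width;  /* assumes width < 64 */

    return (int64_t)(gpa_space - hotplug_end);
}
```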
>>>
>>>> Now, regarding the problem of whether "to trust or not" the guest's
>>>> physbits, I think it's an orthogonal discussion to some extent. It'd
>>>> be nice to have that check, and, as Eduardo said, to prevent
>>>> migration in such cases. But it's not really a blocker for OVMF's
>>>> big PCI64 aperture if we only increase the aperture _when
>>>> "-host-phys-bits" is used_.
>>>
>>> I don't know what exactly those flags do, but I doubt they are clearly
>>> visible to OVMF in any particular way.
>>
>> The firmware should trust whatever it reads from CPUID and thus gets
>> told by QEMU; if QEMU is doing the wrong thing there, then that's our
>> problem and we need to fix it in QEMU.
>
> This sounds good in principle, but -- as Gerd too has stated, to my
> understanding -- it has the potential to break existing usage.
>
> Consider assigning a single device with a 32G BAR -- right now that's
> supposed to work, without the X-PciMmio64Mb OVMF knob, on even the "most
> basic" hardware (36-bit host phys address width, and EPT supported). If
> OVMF suddenly starts trusting the CPUID from QEMU, and that results in a
> GPA width of 40 bits (i.e. new OVMF is run on old QEMU), then the big
> BAR (and other stuff too) could be allocated from GPA space that EPT is
> actually able to deal with. --> regression for the user.
s/able/unable/, sigh. :/

> Sometimes I can tell users "hey, given that you're building OVMF from
> source, or taking it from a 3rd party origin anyway, can you just run
> upstream QEMU too", but most of the time they just want everything to
> continue working on a 3-year-old Ubuntu LTS release or whatever. :/
>
> And again, this is *without* "X-PciMmio64Mb".
>
> Laszlo
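[Editor's note: the CPUID info channel Dave refers to is architecturally defined: leaf 0x80000008 reports the physical address width in EAX bits 7:0. A minimal sketch of reading it with GCC/Clang's cpuid.h follows; the helper names are mine, not from OVMF or QEMU.]

```c
#include <cpuid.h>

/* CPUID.80000008h:EAX bits 7:0 hold the physical address bit count. */
static unsigned phys_bits_from_eax(unsigned eax)
{
    return eax & 0xffu;
}

/* Returns the advertised physical address width, or -1 if the extended
 * leaf is unavailable (__get_cpuid returns 0 in that case). */
static int read_guest_phys_bits(void)
{
    unsigned eax, ebx, ecx, edx;

    if (!__get_cpuid(0x80000008u, &eax, &ebx, &ecx, &edx))
        return -1;
    return (int)phys_bits_from_eax(eax);
}
```

On a guest started with "-cpu host,host-phys-bits=on", this returns the real host width; otherwise it returns whatever the CPU model advertises, which is exactly the trust problem discussed above.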