* Daniel P. Berrangé (berra...@redhat.com) wrote:
> On Tue, Jun 16, 2020 at 01:10:21PM -0400, Eduardo Habkost wrote:
> > On Tue, Jun 16, 2020 at 05:57:46PM +0100, Dr. David Alan Gilbert wrote:
> > > * Gerd Hoffmann (kra...@redhat.com) wrote:
> > > >   Hi,
> > > >
> > > > > (a) We could rely in the guest physbits to calculate the PCI64
> > > > > aperture.
> > > >
> > > > I'd love to do that.  Move the 64-bit I/O window as high as possible and
> > > > use -- say -- 25% of the physical address space for it.
> > > >
> > > > Problem is we can't.
> > > >
> > > > > failure. Also, if the users are not setting the physbits in the guest,
> > > > > there must be a default (seems to be 40bit according to my
> > > > > experiments), seems to be a good idea to rely on that.
> > > >
> > > > Yes, 40 is the default, and it is used *even if the host supports less
> > > > than that*.  Typical values I've seen for intel hardware are 36 and 39.
> > > > 39 is used even by recent hardware (not the xeons, but check out a
> > > > laptop or a nuc).
> > > >
> > > > > If guest physbits is 40, why to have OVMF limiting it to 36, right?
> > > >
> > > > Things will explode in case OVMF uses more physbits than the host
> > > > supports (host physbits limit applies to ept too).  In other words: OVMF
> > > > can't trust the guest physbits, so it is conservative to be on the safe
> > > > side.
> > > >
> > > > If we can somehow make a *trustable* physbits value available to the
> > > > guest, then yes, we can go that route.  But the guest physbits we have
> > > > today unfortunately don't cut it.
> > >
> > > In downstream RH qemu, we run with host-physbits as default; so it's
> > > reasonably trustworthy; of course that doesn't help you across a
> > > migration between hosts with different sizes (e.g. an E5 Xeon to an E3).
> > > Changing upstream to do the same would seem sensible to me, but it's not
> > > a foolproof config.
> >
> > Yeah, to make it really trustworthy we would need to prevent
> > migration to hosts with mismatching phys sizes.  We would need to
> > communicate that to the guest somehow (with new hypervisor CPUID
> > flags, maybe).
>
> QEMU should be able to validate the hostphysbits >= guestphysbits when
> accepting incoming migration, and abort it.

Yeh, there's an outstanding request to validate other CPU flags as well.

> Meanwhile libvirt should be enhanced to report hostphysbits, so that
> management apps can determine that they shouldn't even pick bad hosts
> in the first place.

Sounds reasonable.

Note there are a couple of other considerations when choosing the
physbits as reported to the guest:

  a) TCG's view - I think it had a fixed size of 40 bits, but I haven't
     dug into it.
  b) We recently gained 'host-phys-bits-limit' which when used with
     host-phys-bits lets you take the host value but then limit it.
     Eduardo seems to have done that to limit the guest from flipping
     into 5-level page tables.  Hmm I've not tried with chips that do
     5-level - but maybe we also need this if you expect to migrate to
     hosts that don't have it.

(I've also got a vague memory that there's a limit in some IOMMUs
address sizes, but I can't remember what the details were).

Dave

>
> Regards,
> Daniel
> --
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
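
Illustrative note: the arithmetic discussed in this thread can be put
into a small Python model. This is only a sketch, not QEMU, OVMF or
libvirt code. The function names and parameters are hypothetical and
simply mirror the ideas above: a fixed default of 40 bits, taking the
host value and optionally capping it (the 'host-phys-bits-limit' idea),
refusing an incoming migration when the destination host has fewer bits
than the guest was told, and placing a 64-bit PCI window in the top 25%
of the address space.

```python
DEFAULT_GUEST_PHYSBITS = 40  # default value mentioned in this thread


def guest_physbits(host_bits, host_phys_bits=False, limit=0):
    """Hypothetical model of the physbits value reported to the guest."""
    if not host_phys_bits:
        # Fixed default, used even if the host supports fewer bits.
        return DEFAULT_GUEST_PHYSBITS
    if limit and host_bits > limit:
        # Take the host value, then cap it (the host-phys-bits-limit idea).
        return limit
    return host_bits


def can_accept_migration(dest_host_bits, guest_bits):
    """The proposed check: require hostphysbits >= guestphysbits."""
    return dest_host_bits >= guest_bits


def pci64_window(physbits):
    """Top 25% of the physical address space, as (base, size) in bytes."""
    size = 1 << (physbits - 2)
    base = (1 << physbits) - size
    return base, size
```

For a host with 39 physical address bits and the 40-bit default,
can_accept_migration(39, guest_physbits(39)) is false, which is the
mismatch that makes the default untrustworthy for firmware. The limit
value 48 used in the checks is just an example number, not a value
taken from this thread.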