On 17/06/2016 10:15, Dr. David Alan Gilbert wrote:
> Larger is a problem if the guest tries to map something to a high
> address that's not addressable.
Right.  It's not a problem for most emulated PCI devices (it would be
a problem for those that have large RAM BARs, but even our emulated
video cards do not have 64-bit RAM BARs, I think; emulated MMIO BARs
are not an issue because they're marked as not present or reserved in
the processor page tables).  However, it's bad for devices like
ivshmem (that do have large RAM BARs) and for assigned devices,
because EPT limits guest physical addresses to MAXPHYADDR bits, and
assigned devices use EPT page tables to map guest physical BAR
addresses to host physical BAR addresses.

> Smaller is potentially a problem if the guest plays tricks with
> what it thinks are spare bits in page tables but which are actually
> interpreted.  I believe KVM plays a trick like this.

It does, though it uses bit 51, so it's not a problem in practice
with current hardware.  It would become a problem if a processor
existed with a 52-bit physical address space.

>> The existing 40-bit default looks like a problem for 36-bit
>> systems.  Do you know what kind of systems have 36 bits only?
>> Only old ones, or recent ones too?  If only old ones, how old?
>
> My Sandy Bridge (~2.5-year-old) laptop is 36 bits; I've seen some
> other single-socket 39-bit machines (Ivy Bridge and I think newer).

I have an Ivy Bridge Core i7 laptop and it still has 36 bits.

>> Can't we have a new default that is as small as possible for the
>> VM RAM+devices configuration?
>
> That would be good to have, however:
> 1) I didn't want to change the default upstream behaviour at first,
>    this time through I just wanted a way to set it.
> 2) While we have maxmem settings to tell us the top of VM RAM, do
>    we have anything that tells us the top of IO space?  What
>    happens when we hotplug a PCI card?
> 3) Is it better to stick to sizes that correspond to real hardware
>    if you can?  For example I don't know of any machines with
>    37 bits - in practice I think it's best to stick with sizes
>    that correspond to some real hardware.

I'm not worried about (3); guests really use this value for little
more than Linux's

    iomem_resource.end = (1ULL << boot_cpu_data.x86_phys_bits) - 1;

(arch/x86/kernel/setup.c).  But I agree that (2) is a blocker: you
don't know how the guest will assign PCI BAR addresses, and as you
said there's hotplug too.

Paolo
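
For reference, MAXPHYADDR is the physical address width the processor
reports in CPUID leaf 0x80000008, EAX bits 7:0.  A minimal sketch of
reading it, assuming an x86 host and GCC/Clang's <cpuid.h> (an
illustration only, not code from QEMU or KVM):

    #include <stdio.h>
    #include <cpuid.h>

    int main(void)
    {
        unsigned int eax, ebx, ecx, edx;

        /* CPUID leaf 0x80000008: EAX[7:0] = physical address bits. */
        if (__get_cpuid(0x80000008, &eax, &ebx, &ecx, &edx)) {
            unsigned int phys_bits = eax & 0xff;
            printf("MAXPHYADDR = %u bits, top of address space = %#llx\n",
                   phys_bits, (1ULL << phys_bits) - 1);
        }
        return 0;
    }

With phys_bits = 36 the top of the address space is 0xfffffffff
(64 GiB - 1); with the 40-bit default the guest instead computes
iomem_resource.end = 0xffffffffff (1 TiB - 1), which is how a 40-bit
guest default can exceed what a 36-bit host can actually map.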
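The bit 51 trick works because bits 51:MAXPHYADDR of a page-table
entry are reserved and must be zero, so on any processor with
MAXPHYADDR < 52 software can use bit 51 as a private marker without
it ever being part of a valid physical address.  A rough sketch of
the idea (hypothetical helpers, not KVM's actual code):

    #include <stdint.h>
    #include <stdbool.h>

    /* Bits 51:MAXPHYADDR of a PTE are reserved.  With MAXPHYADDR
     * < 52, bit 51 can never belong to a valid physical address, so
     * software may use it as a marker; with MAXPHYADDR == 52 the bit
     * would carry address information and the trick breaks. */
    #define MARKER_BIT (1ULL << 51)

    static bool marker_usable(unsigned int phys_bits)
    {
        return phys_bits < 52;
    }

    static uint64_t mark_entry(uint64_t pte)
    {
        return pte | MARKER_BIT;
    }

    static bool entry_is_marked(uint64_t pte)
    {
        return (pte & MARKER_BIT) != 0;
    }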