* Michael S. Tsirkin (m...@redhat.com) wrote:
> On Tue, Jul 05, 2016 at 10:33:25AM +0100, Dr. David Alan Gilbert wrote:
> > * Michael S. Tsirkin (m...@redhat.com) wrote:
> > > On Mon, Jul 04, 2016 at 08:16:03PM +0100, Dr. David Alan Gilbert (git) wrote:
> > > > From: "Dr. David Alan Gilbert" <dgilb...@redhat.com>
> > > >
> > > > QEMU sets the guest's physical address bits to 40; this is wrong
> > > > on most hardware, and can be detected by the guest.
> > > > It also stops you using really huge multi-TB VMs.
> > > >
> > > > Red Hat has had a patch, that Andrea wrote, downstream for a couple
> > > > of years that reads the host's value and uses that in the guest. That's
> > > > correct as far as the guest sees it, and lets you create huge VMs.
> > > >
> > > > The downside is that if you've got a mix of hosts, say an i7 and a Xeon,
> > > > life gets complicated in migration; prior to 2.6 it all apparently
> > > > worked (although a guest that looked might spot the change).
> > > > In 2.6 Paolo started checking MSR writes and they failed when the
> > > > incoming MTRR mask didn't fit.
> > > >
> > > > This series:
> > > >   a) Fixes up mtrr masks so that if you're migrating between hosts
> > > >      of different physical address size it tries to do something
> > > >      sensible.
> > > >
> > > >   b) Lets you specify the guest physical address size via a CPU
> > > >      property, i.e.
> > > >        -cpu SandyBridge,phys-bits=36
> > > >
> > > >      The default on old machine types is to use the existing 40 bits
> > > >      value.
> > > >
> > > >   c) Lets you tell qemu to use the same setting as the host, i.e.
> > > >        -cpu SandyBridge,phys-bits=0
> > > >
> > > >      This is the default on new machine types.
> > > >
> > > > Note that mixed size hosts are still not necessarily safe; a guest
> > > > started on a host with a large physical address size might start using
> > > > those bits and get upset when it's moved to a small host.
> > > > However that was already potentially broken in existing qemu that
> > > > used a magic value of 40.
> > > >
> > > > There's potential to add some extra guards against people
> > > > doing silly stuff; e.g. stop people running VMs using 1TB of
> > > > address space on a tiny host.
> > > >
> > > > Dave
> > >
> > > This is all in target-i386 so if the maintainers want it this way, they
> > > can merge this, and I do not have strong objections, but I wanted to
> > > document an alternative that is IMHO somewhat nicer. Feel free to
> > > ignore. See below.
> > >
> > > How can guest use more memory than what host supports?
> > > I think there are two ways:
> > >
> > > 1. more memory than host supports is supplied
> > >    This is a configuration error. We can simply detect this
> > >    and fail init, or print a warning, no need for new flags.
> >
> > Yes we should do that; however there's a case that's potentially
> > currently working for people but actually kind of illegal.
> > That case is specifying a small amount of actual memory
> > but a large maxmem - i.e.:
> >
> >    -m 2G,slots=16,maxmem=2T
> >
> > On a host with a 39bit physaddress limit do you error
> > on that or not? I think oVirt is currently doing something
> > similar to that, but I'm trying to get confirmation.
>
> That would only be a problem since pci is allocated above
> maxmem so 64 bit pci addresses aren't accessible.
> With my proposal we can actually force firmware to avoid
> using 64 bit memory for that config.
> Will work better than today.
>
> > > 2. pci addresses out of host range assigned by guest
> > >    Again normally at least seabios will not do this,
> > >    maybe OVMF will?
> > >    we certainly can add an interface telling firmware
> > >    what the limit is.
> > >
> > > Thus an alternative is:
> > > - add interface to tell QEMU how much 64 bit memory can pci use.
> > > - teach firmware to limit itself to that
> > > - set guest bits to 48 unconditionally
> > >
> > >
> > > the disadvantage of this approach is that firmware needs to be changed
> >
> > I guess it also needs the CRS to tell the guest OS not
> > to remap PCI stuff into that space?
>
> CRS is a list of legal addresses, not list of illegal ones.
> So just don't include what's illegal there.
>
> > I thought also from the previous
> > discussions that the guest would get a different exception if it
> > actually tried to use any of the bits below 48 it didn't have.
>
> Basically if you try to map pci at an address outside CRS
> you can get any kind of crash since there could be on-board
> hardware handling these addresses.
> So I do not think we care about that.

The issue about guest bits is not purely about PCI addresses though; I thought it
was also to do with visible behaviour/exceptions in page tables.

> > > the advantage is that we get seamless migration between different
> > > hosts as long as they both can support the configuration,
> > > without any management effort.
> >
> > The reality (Linux guest) is that this already works as long as you don't
> > map anything into the high address space, and the firmware won't do
> > that unless it's pushed to by an excessive maxmem or huge
> > 64bit PCI bars.
> >
> > Dave
>
> Right. So the disadvantage isn't big at all, and I think advantages
> outweigh it.

Except that no one will ever get around to writing the firmware changes for both
sets of firmware; so we never move forward?

Dave

> > > > v2
> > > >   Default on new machine types is to read from the host
> > > >   Use the MAKE_64BIT_MASK macro
> > > >   Validate phys_bits in the realise method
> > > >   Move reading the host physical bits to the realise method
> > > >   Set phys-bits even for 32bit guests
> > > >   Add warning when your phys-bits doesn't match your host in the
> > > >   non-default case
> > > >
> > > > Dr. David Alan Gilbert (6):
> > > >   x86: Allow physical address bits to be set
> > > >   x86: Mask mtrr mask based on CPU physical address limits
> > > >   x86: fill high bits of mtrr mask
> > > >   x86: Set physical address bits based on host
> > > >   x86: fix up 32 bit phys_bits case
> > > >   x86: Add sanity checks on phys_bits
> > > >
> > > >  include/hw/i386/pc.h | 10 ++++++++
> > > >  target-i386/cpu.c    | 71 ++++++++++++++++++++++++++++++++++++++++++++++------
> > > >  target-i386/cpu.h    |  6 +++++
> > > >  target-i386/kvm.c    | 36 +++++++++++++++++++++++---
> > > >  4 files changed, 112 insertions(+), 11 deletions(-)
> > > >
> > > > --
> > > > 2.7.4
> > --
> > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
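
For illustration only, and not taken from the series: a minimal C sketch of
why an MTRR mask written on a host with more physical address bits may not
fit on a host with fewer, which is the failure mentioned in the quoted cover
letter. It assumes the usual x86 layout of a variable-range MTRR mask
register (valid flag in bit 11, mask in bits 12 up to phys_bits - 1). The
helper names are hypothetical; make_mask64() is only a stand-in for the kind
of (shift, length) mask the v2 notes refer to, not QEMU's own definition.

```c
#include <stdint.h>

/* Hypothetical helper: 'length' set bits starting at bit 'shift'.
 * Assumes 1 <= length <= 64 so that the shifts stay well defined. */
static uint64_t make_mask64(unsigned shift, unsigned length)
{
    return (~0ULL >> (64 - length)) << shift;
}

/* Mask bits a CPU with 'phys_bits' physical address bits accepts in a
 * variable-range MTRR mask register: bits 12 .. phys_bits - 1.
 * Assumes phys_bits > 12. */
static uint64_t mtrr_physmask_bits(unsigned phys_bits)
{
    return make_mask64(12, phys_bits - 12);
}

/* Trim an incoming mask value so it fits a destination with 'phys_bits'
 * address bits, keeping the valid flag (bit 11) untouched. */
static uint64_t mtrr_physmask_fit(uint64_t incoming, unsigned phys_bits)
{
    const uint64_t valid_flag = 1ULL << 11;
    return incoming & (mtrr_physmask_bits(phys_bits) | valid_flag);
}
```

With these assumptions, a mask of 0xFFFFFFF800 written by a guest that
believes it has 40 bits becomes 0xFFFFFF800 once trimmed for a 36 bit
destination. The sketch covers only the trimming direction; what the series
actually does when moving to a larger host is not shown here.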