Hi,

Sorry for the late reply. I tested the patch series with 64G and 80G guests. Both of them are OK on Xen.
Here is what it looks like inside the VM (the memory is 80G):

             total       used       free     shared    buffers     cached
Mem:      81956412     654708   81301704          0      10528      42256
-/+ buffers/cache:     601924   81354488
Swap:      4186108          0    4186108

Thanks a lot for your nice work!
Maoming

-----Original Message-----
From: Laszlo Ersek [mailto:ler...@redhat.com]
Sent: June 10, 2015 21:03
To: Maoming
Cc: edk2-devel@lists.sourceforge.net; Huangpeng (Peter); Wei Liu; Paolo Bonzini
Subject: Re: [edk2] [RFC 4/4] OvmfPkg: PlatformPei: invert MTRR setup in QemuInitializeRam()

On 06/09/15 04:15, Laszlo Ersek wrote:
> On 06/08/15 23:46, Laszlo Ersek wrote:
>> At the moment we work with a UC default MTRR type, and set three
>> memory ranges to WB:
>> - [0, 640 KB),
>> - [1 MB, LowerMemorySize),
>> - [4 GB, 4 GB + UpperMemorySize).
>>
>> Unfortunately, coverage for the third range can fail with a high
>> likelihood. If the alignment of the base (ie. 4 GB) and the alignment
>> of the size (UpperMemorySize) differ, then MtrrLib creates a series
>> of variable MTRR entries, with power-of-two sized MTRR masks. And,
>> it's really easy to run out of variable MTRR entries, dependent on
>> the alignment difference.
>>
>> This is a problem because a Linux guest will loudly reject any high
>> memory that is not covered by MTRR.
>>
>> So, let's follow the inverse pattern (loosely inspired by SeaBIOS):
>> - flip the MTRR default type to WB,
>> - set [0, 640 KB) to WB -- fixed MTRRs have precedence over the default
>>   type and variable MTRRs, so we can't avoid this,
>> - set [640 KB, 1 MB) to UC -- implemented with fixed MTRRs,
>> - set [LowerMemorySize, 4 GB) to UC -- should succeed with variable MTRRs
>>   more likely than the other scheme (due to less chaotic alignment
>>   differences).
>>
>> Effects of this patch can be observed by setting DEBUG_CACHE
>> (0x00200000) in PcdDebugPrintErrorLevel.
>>
>> BUG: Although the MTRRs look good to me in the OVMF debug log, I
>> still can't boot >= 64 GB guests with this.
>> Instead of the complaints mentioned above, the Linux guest apparently
>> spirals into an infinite loop (on KVM), or hangs with no CPU load (on
>> TCG).
>
> No, actually there is no bug in this patch (so s/RFC/PATCH/). I did
> more testing and these are the findings:
> - I can reproduce the same issue on KVM with SeaBIOS guests.
> - The exact symptoms are that as soon as the highest guest-phys address
>   is >= 64 GB, then the guest kernel doesn't boot. It gets stuck
>   somewhere after hitting Enter in grub.
> - Normally 3 GB of the guest RAM is mapped under 4 GB in guest-phys
>   address space, then there's a 1 GB PCI hole, and the rest is above
>   4 GB. This means that a 63 GB guest can be started (because 63 - 3 + 4
>   == 64), but if you add just 1 MB more, it won't boot.
> - (This was the big discovery:) I flipped the "ept" parameter of the
>   kvm_intel module on my host to N, and then things started to work. I
>   just booted a 128 GB Linux guest with this patchset. (I have 4 GB
>   RAM in my host, plus approx 250 GB swap.) The guest could see it all.
> - The TCG boot didn't hang either; I just couldn't wait earlier for
>   network initialization to complete.
>
> I'm CC'ing Paolo for help with the EPT question. Other than that, this
> series is functional. (For QEMU/KVM at least; Xen will likely need
> more fixes from others.)

We have a root cause, it seems. The issue is that the processor in my
laptop, on which I tested, has only 36 bits for physical addresses:

  $ grep 'address sizes' /proc/cpuinfo
  address sizes   : 36 bits physical, 48 bits virtual
  ...

Which matches where the problem surfaces (64 GB guest-phys address space)
with hw-supported nested paging (EPT) enabled on the host.

In order to confirm this, a colleague of mine gave me access to a server
with 96 GB of RAM, and:

  address sizes   : 46 bits physical, 48 bits virtual

On this host I booted a 72 GB OVMF guest on QEMU/KVM, with EPT enabled,
and according to the guest dmesg, the guest saw it all.
  Memory: 74160924K/75493820K available (7735K kernel code, 1149K rwdata, 3340K rodata, 1500K init, 1524K bss, 1332896K reserved, 0K cma-reserved)

Maoming: since you reported this issue, please confirm that the patch
series resolves it for you as well. In that case, I'll repost the series
with "PATCH" as subject-prefix instead of "RFC", and I'll drop the BUG
note from the last commit message.

Thanks
Laszlo

>> Cc: Maoming <maoming.maom...@huawei.com>
>> Cc: Huangpeng (Peter) <peter.huangp...@huawei.com>
>> Cc: Wei Liu <wei.l...@citrix.com>
>> Contributed-under: TianoCore Contribution Agreement 1.0
>> Signed-off-by: Laszlo Ersek <ler...@redhat.com>
>> ---
>>  OvmfPkg/PlatformPei/MemDetect.c | 43 +++++++++++++++++++++++++++++++++++++----
>>  1 file changed, 39 insertions(+), 4 deletions(-)
>>
>> diff --git a/OvmfPkg/PlatformPei/MemDetect.c b/OvmfPkg/PlatformPei/MemDetect.c
>> index 3ceb142..cceab22 100644
>> --- a/OvmfPkg/PlatformPei/MemDetect.c
>> +++ b/OvmfPkg/PlatformPei/MemDetect.c
>> @@ -194,6 +194,8 @@ QemuInitializeRam (
>>  {
>>    UINT64                      LowerMemorySize;
>>    UINT64                      UpperMemorySize;
>> +  MTRR_SETTINGS               MtrrSettings;
>> +  EFI_STATUS                  Status;
>>
>>    DEBUG ((EFI_D_INFO, "%a called\n", __FUNCTION__));
>>
>> @@ -214,12 +216,45 @@ QemuInitializeRam (
>>      }
>>    }
>>
>> -  MtrrSetMemoryAttribute (BASE_1MB, LowerMemorySize - BASE_1MB, CacheWriteBack);
>> +  //
>> +  // We'd like to keep the following ranges uncached:
>> +  // - [640 KB, 1 MB)
>> +  // - [LowerMemorySize, 4 GB)
>> +  //
>> +  // Everything else should be WB. Unfortunately, programming the inverse (ie.
>> +  // keeping the default UC, and configuring the complement set of the above as
>> +  // WB) is not reliable in general, because the end of the upper RAM can have
>> +  // practically any alignment, and we may not have enough variable MTRRs to
>> +  // cover it exactly.
>> +  //
>> +  if (IsMtrrSupported ()) {
>> +    MtrrGetAllMtrrs (&MtrrSettings);
>>
>> -  MtrrSetMemoryAttribute (0, BASE_512KB + BASE_128KB, CacheWriteBack);
>> +    //
>> +    // MTRRs disabled, fixed MTRRs disabled, default type is uncached
>> +    //
>> +    ASSERT ((MtrrSettings.MtrrDefType & BIT11) == 0);
>> +    ASSERT ((MtrrSettings.MtrrDefType & BIT10) == 0);
>> +    ASSERT ((MtrrSettings.MtrrDefType & 0xFF) == 0);
>>
>> -  if (UpperMemorySize != 0) {
>> -    MtrrSetMemoryAttribute (BASE_4GB, UpperMemorySize, CacheWriteBack);
>> +    //
>> +    // flip default type to writeback
>> +    //
>> +    SetMem (&MtrrSettings.Fixed, sizeof MtrrSettings.Fixed, 0x06);
>> +    ZeroMem (&MtrrSettings.Variables, sizeof MtrrSettings.Variables);
>> +    MtrrSettings.MtrrDefType |= BIT11 | BIT10 | 6;
>> +    MtrrSetAllMtrrs (&MtrrSettings);
>> +
>> +    //
>> +    // punch holes
>> +    //
>> +    Status = MtrrSetMemoryAttribute (BASE_512KB + BASE_128KB,
>> +               SIZE_256KB + SIZE_128KB, CacheUncacheable);
>> +    ASSERT_EFI_ERROR (Status);
>> +
>> +    Status = MtrrSetMemoryAttribute (LowerMemorySize,
>> +               SIZE_4GB - LowerMemorySize, CacheUncacheable);
>> +    ASSERT_EFI_ERROR (Status);
>>    }
>>  }
>>
>>
>
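
To illustrate the alignment argument in the commit message above, here is a
rough sketch in plain C (not EDK2 code) that counts how many naturally
aligned power-of-two blocks are needed to tile a range exactly. It assumes a
simplified model in which each block costs one variable MTRR; the real
MtrrLib algorithm may combine ranges differently. The names floor_pow2,
count_pow2_blocks and GiB are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

#define GiB (1ULL << 30)

/* Largest power of two that is <= x; x must be nonzero. */
static uint64_t floor_pow2(uint64_t x)
{
    uint64_t p = 1;

    assert(x != 0);
    while (p <= x / 2) {
        p <<= 1;
    }
    return p;
}

/*
 * Count the naturally aligned power-of-two blocks needed to tile
 * [base, base + size) exactly, taking the largest block that fits at
 * each step. Simplified model (assumption): one block is one variable
 * MTRR; this is not how MtrrLib is necessarily implemented.
 */
static unsigned count_pow2_blocks(uint64_t base, uint64_t size)
{
    unsigned count = 0;

    while (size != 0) {
        uint64_t block = floor_pow2(size);

        if (base != 0) {
            /* Lowest set bit of base: the largest alignment base satisfies. */
            uint64_t align = base & (~base + 1);

            if (align < block) {
                block = align;
            }
        }
        base += block;
        size -= block;
        count++;
    }
    return count;
}
```

For a hypothetical 80 GB guest with 3 GB mapped below 4 GB, the UC-default
scheme has to cover [4 GB, 81 GB), and count_pow2_blocks(4 * GiB, 77 * GiB)
gives 6 blocks under this model. The WB-default scheme only has to punch the
[3 GB, 4 GB) hole, for which count_pow2_blocks(3 * GiB, 1 * GiB) gives 1.
Upper memory sizes that are not a whole number of GB push the first count
higher still, while the hole below 4 GB stays cheap.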