On 06/16/15 14:54, Maoming wrote:
>
>
> -----Original Message-----
> From: Laszlo Ersek [mailto:ler...@redhat.com]
> Sent: June 15, 2015 22:08
> To: Maoming
> Cc: edk2-devel@lists.sourceforge.net; Huangpeng (Peter); Wei Liu; Paolo Bonzini
> Subject: Re: RE: [edk2] [RFC 4/4] OvmfPkg: PlatformPei: invert MTRR setup in QemuInitializeRam()
>
> On 06/15/15 15:25, Maoming wrote:
>> Hi :
>> Sorry for the late reply.
>> I tested the patch series using 64G and 80G.
>> Both of them are OK in XEN.
>>
>> Here is what it looks like inside the VM (the memory is 80G):
>>              total       used       free     shared    buffers     cached
>> Mem:      81956412     654708   81301704          0      10528      42256
>> -/+ buffers/cache:     601924   81354488
>> Swap:      4186108          0    4186108
>>
>> Thanks a lot for your nice work!
>> Maoming
>
> Thanks for reporting back!
>
> Since you mentioned earlier that you encountered the problem on qemu/KVM
> too -- can you please give that a whirl as well, with this patch series
> in place?
>
> Thank you
> Laszlo
>
>
> The patch series works well in KVM too.
> My environment is :
> version: kvm-kmod-3.6
> QEMU emulator version 2.1.0
>
> Here is what it looks like inside the VM (the memory is 90G):
>              total       used       free     shared    buffers     cached
> Mem:      92862616    1155156   91707460          0      13552      77952
> -/+ buffers/cache:    1063652   91798964
> Swap:      4063224          0    4063224
>
> Thanks!
> Maoming

Great, thank you. I'll add Wei Liu's Tested-by to patches #1 and #2
(because the other two patches don't affect Xen), and I will add your
Tested-by to all four patches.

I'll update the commit message of patch #4 and I'll resend the series as
PATCH, not RFC.

Cheers!
Laszlo

>
>
>> -----Original Message-----
>> From: Laszlo Ersek [mailto:ler...@redhat.com]
>> Sent: June 10, 2015 21:03
>> To: Maoming
>> Cc: edk2-devel@lists.sourceforge.net; Huangpeng (Peter); Wei Liu; Paolo Bonzini
>> Subject: Re: [edk2] [RFC 4/4] OvmfPkg: PlatformPei: invert MTRR setup in QemuInitializeRam()
>>
>> On 06/09/15 04:15, Laszlo Ersek wrote:
>>> On 06/08/15 23:46, Laszlo Ersek wrote:
>>>> At the moment we work with a UC default MTRR type, and set three
>>>> memory ranges to WB:
>>>> - [0, 640 KB),
>>>> - [1 MB, LowerMemorySize),
>>>> - [4 GB, 4 GB + UpperMemorySize).
>>>>
>>>> Unfortunately, coverage for the third range can fail with a high
>>>> likelihood. If the alignment of the base (i.e. 4 GB) and the alignment
>>>> of the size (UpperMemorySize) differ, then MtrrLib creates a series
>>>> of variable MTRR entries, with power-of-two sized MTRR masks. And,
>>>> it's really easy to run out of variable MTRR entries, depending on
>>>> the alignment difference.
>>>>
>>>> This is a problem because a Linux guest will loudly reject any high
>>>> memory that is not covered by MTRR.
>>>>
>>>> So, let's follow the inverse pattern (loosely inspired by SeaBIOS):
>>>> - flip the MTRR default type to WB,
>>>> - set [0, 640 KB) to WB -- fixed MTRRs have precedence over the default
>>>>   type and variable MTRRs, so we can't avoid this,
>>>> - set [640 KB, 1 MB) to UC -- implemented with fixed MTRRs,
>>>> - set [LowerMemorySize, 4 GB) to UC -- should succeed with variable MTRRs
>>>>   more likely than the other scheme (due to less chaotic alignment
>>>>   differences).
>>>>
>>>> Effects of this patch can be observed by setting DEBUG_CACHE
>>>> (0x00200000) in PcdDebugPrintErrorLevel.
>>>>
>>>> BUG: Although the MTRRs look good to me in the OVMF debug log, I
>>>> still can't boot >= 64 GB guests with this. Instead of the complaints
>>>> mentioned above, the Linux guest apparently spirals into an infinite
>>>> loop (on KVM), or hangs with no CPU load (on TCG).
>>>
>>> No, actually there is no bug in this patch (so s/RFC/PATCH/). I did
>>> more testing and these are the findings:
>>> - I can reproduce the same issue on KVM with SeaBIOS guests.
>>> - The exact symptoms are that as soon as the highest guest-phys address
>>>   is >= 64 GB, then the guest kernel doesn't boot. It gets stuck
>>>   somewhere after hitting Enter in grub.
>>> - Normally 3 GB of the guest RAM is mapped under 4 GB in guest-phys
>>>   address space, then there's a 1 GB PCI hole, and the rest is above
>>>   4 GB. This means that a 63 GB guest can be started (because 63 - 3 + 4
>>>   == 64), but if you add just 1 MB more, it won't boot.
>>> - (This was the big discovery:) I flipped the "ept" parameter of the
>>>   kvm_intel module on my host to N, and then things started to work. I
>>>   just booted a 128 GB Linux guest with this patchset. (I have 4 GB
>>>   RAM in my host, plus approx 250 GB swap.) The guest could see it all.
>>> - The TCG boot didn't hang either; I just couldn't wait earlier for
>>>   network initialization to complete.
>>>
>>> I'm CC'ing Paolo for help with the EPT question. Other than that, this
>>> series is functional. (For QEMU/KVM at least; Xen will likely need
>>> more fixes from others.)
>>
>> We have a root cause, it seems. The issue is that the processor in my
>> laptop, on which I tested, has only 36 bits for physical addresses:
>>
>> $ grep 'address sizes' /proc/cpuinfo
>> address sizes : 36 bits physical, 48 bits virtual
>> ...
>>
>> Which matches where the problem surfaces (64 GB guest-phys address
>> space) with hw-supported nested paging (EPT) enabled on the host.
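
The arithmetic behind the 64 GB boundary (63 - 3 + 4 == 64, and 2^36 bytes
== 64 GB) can be sketched in a few lines of C. This is only an illustration
under the layout described in the thread (3 GB of RAM below 4 GB, the rest
starting at 4 GB); the names guest_phys_top() and fits_in_phys_bits() are
hypothetical and do not come from QEMU, KVM or OVMF.

```c
#include <stdbool.h>
#include <stdint.h>

#define GIB (1ULL << 30)
#define MIB (1ULL << 20)

/* Assumed layout, taken from the thread: at most 3 GB of guest RAM sits
 * below 4 GB, followed by a 1 GB PCI hole; the rest starts at 4 GB. */
#define LOW_RAM_LIMIT (3 * GIB)
#define HIGH_RAM_BASE (4 * GIB)

/* Exclusive end of the guest-physical range occupied by RAM. */
uint64_t guest_phys_top(uint64_t ram_bytes)
{
    if (ram_bytes <= LOW_RAM_LIMIT) {
        return ram_bytes;
    }
    return HIGH_RAM_BASE + (ram_bytes - LOW_RAM_LIMIT);
}

/* True if all guest RAM is addressable with phys_bits address bits. */
bool fits_in_phys_bits(uint64_t ram_bytes, unsigned phys_bits)
{
    if (phys_bits >= 64) {
        return true;
    }
    return guest_phys_top(ram_bytes) <= (1ULL << phys_bits);
}
```

With these assumptions, fits_in_phys_bits(63 * GIB, 36) holds, while one
more megabyte pushes the top of RAM past the 36-bit limit.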
>>
>> In order to confirm this, a colleague of mine gave me access to a server
>> with 96 GB of RAM, and:
>>
>> address sizes : 46 bits physical, 48 bits virtual
>>
>> On this host I booted a 72 GB OVMF guest on QEMU/KVM, with EPT enabled, and
>> according to the guest dmesg, the guest saw it all.
>>
>> Memory: 74160924K/75493820K available (7735K kernel code, 1149K
>> rwdata, 3340K rodata, 1500K init, 1524K bss, 1332896K reserved, 0K
>> cma-reserved)
>>
>> Maoming: since you reported this issue, please confirm that the patch series
>> resolves it for you as well. In that case, I'll repost the series with
>> "PATCH" as subject-prefix instead of "RFC", and I'll drop the BUG note from
>> the last commit message.
>>
>> Thanks
>> Laszlo
>>
>>>> Cc: Maoming <maoming.maom...@huawei.com>
>>>> Cc: Huangpeng (Peter) <peter.huangp...@huawei.com>
>>>> Cc: Wei Liu <wei.l...@citrix.com>
>>>> Contributed-under: TianoCore Contribution Agreement 1.0
>>>> Signed-off-by: Laszlo Ersek <ler...@redhat.com>
>>>> ---
>>>>  OvmfPkg/PlatformPei/MemDetect.c | 43 +++++++++++++++++++++++++++++++++++++----
>>>>  1 file changed, 39 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/OvmfPkg/PlatformPei/MemDetect.c b/OvmfPkg/PlatformPei/MemDetect.c
>>>> index 3ceb142..cceab22 100644
>>>> --- a/OvmfPkg/PlatformPei/MemDetect.c
>>>> +++ b/OvmfPkg/PlatformPei/MemDetect.c
>>>> @@ -194,6 +194,8 @@ QemuInitializeRam (
>>>>  {
>>>>    UINT64                      LowerMemorySize;
>>>>    UINT64                      UpperMemorySize;
>>>> +  MTRR_SETTINGS               MtrrSettings;
>>>> +  EFI_STATUS                  Status;
>>>>
>>>>    DEBUG ((EFI_D_INFO, "%a called\n", __FUNCTION__));
>>>>
>>>> @@ -214,12 +216,45 @@ QemuInitializeRam (
>>>>      }
>>>>    }
>>>>
>>>> -  MtrrSetMemoryAttribute (BASE_1MB, LowerMemorySize - BASE_1MB, CacheWriteBack);
>>>> +  //
>>>> +  // We'd like to keep the following ranges uncached:
>>>> +  // - [640 KB, 1 MB)
>>>> +  // - [LowerMemorySize, 4 GB)
>>>> +  //
>>>> +  // Everything else should be WB. Unfortunately, programming the inverse (ie.
>>>> +  // keeping the default UC, and configuring the complement set of the above as
>>>> +  // WB) is not reliable in general, because the end of the upper RAM can have
>>>> +  // practically any alignment, and we may not have enough variable MTRRs to
>>>> +  // cover it exactly.
>>>> +  //
>>>> +  if (IsMtrrSupported ()) {
>>>> +    MtrrGetAllMtrrs (&MtrrSettings);
>>>>
>>>> -  MtrrSetMemoryAttribute (0, BASE_512KB + BASE_128KB, CacheWriteBack);
>>>> +    //
>>>> +    // MTRRs disabled, fixed MTRRs disabled, default type is uncached
>>>> +    //
>>>> +    ASSERT ((MtrrSettings.MtrrDefType & BIT11) == 0);
>>>> +    ASSERT ((MtrrSettings.MtrrDefType & BIT10) == 0);
>>>> +    ASSERT ((MtrrSettings.MtrrDefType & 0xFF) == 0);
>>>>
>>>> -  if (UpperMemorySize != 0) {
>>>> -    MtrrSetMemoryAttribute (BASE_4GB, UpperMemorySize, CacheWriteBack);
>>>> +    //
>>>> +    // flip default type to writeback
>>>> +    //
>>>> +    SetMem (&MtrrSettings.Fixed, sizeof MtrrSettings.Fixed, 0x06);
>>>> +    ZeroMem (&MtrrSettings.Variables, sizeof MtrrSettings.Variables);
>>>> +    MtrrSettings.MtrrDefType |= BIT11 | BIT10 | 6;
>>>> +    MtrrSetAllMtrrs (&MtrrSettings);
>>>> +
>>>> +    //
>>>> +    // punch holes
>>>> +    //
>>>> +    Status = MtrrSetMemoryAttribute (BASE_512KB + BASE_128KB,
>>>> +               SIZE_256KB + SIZE_128KB, CacheUncacheable);
>>>> +    ASSERT_EFI_ERROR (Status);
>>>> +
>>>> +    Status = MtrrSetMemoryAttribute (LowerMemorySize,
>>>> +               SIZE_4GB - LowerMemorySize, CacheUncacheable);
>>>> +    ASSERT_EFI_ERROR (Status);
>>>>    }
>>>>  }
>>>>
>>>>
>>>
>>
>
------------------------------------------------------------------------------
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/edk2-devel
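
To make the alignment argument from the commit message concrete, here is a
rough C sketch that counts how many naturally aligned, power-of-two sized
blocks are needed to cover a range exactly. It is a simplified greedy model
and an assumption on my part, not MtrrLib's actual algorithm (which may
combine or subtract ranges differently); count_aligned_blocks() and its
helpers are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define GIB (1ULL << 30)

/* Largest power of two that divides x, i.e. the natural alignment of x. */
static uint64_t lowest_set_bit(uint64_t x)
{
    assert(x != 0);
    return x & (~x + 1);
}

/* Largest power of two that is less than or equal to x. */
static uint64_t highest_power_of_two(uint64_t x)
{
    uint64_t p = 1;

    assert(x != 0);
    while (x >>= 1) {
        p <<= 1;
    }
    return p;
}

/* Greedily cover [base, base + size) with blocks whose size is a power of
 * two and whose start is aligned to that size, as a variable MTRR
 * base/mask pair requires. Returns the number of blocks used. */
unsigned count_aligned_blocks(uint64_t base, uint64_t size)
{
    unsigned count = 0;

    while (size != 0) {
        uint64_t block = highest_power_of_two(size);

        /* The block may not be larger than the alignment of its start. */
        if (base != 0 && lowest_set_bit(base) < block) {
            block = lowest_set_bit(base);
        }
        base += block;
        size -= block;
        count++;
    }
    return count;
}
```

In this model, and assuming LowerMemorySize is 3 GB, an 80 GB guest has
77 GB above 4 GB; count_aligned_blocks(4 * GIB, 77 * GIB) yields 6 blocks
for the old WB range, whereas the inverted scheme's UC hole
count_aligned_blocks(3 * GIB, 1 * GIB) needs a single block.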