PlatformPei: Set PcdCpuMaxLogicalProcessorNumber using QEMU fw_cfg

Laszlo Ersek Fri, 30 Oct 2015 06:40:36 -0700

On 10/30/15 14:04, Janusz Mocek wrote:
> W dniu 30.10.2015 o 13:26, Laszlo Ersek pisze:
>> CC'ing Xiao and Alex again.
>>
>> On 10/29/15 19:39, Jordan Justen wrote:
>>> On 2015-10-29 04:45:37, Laszlo Ersek wrote:
>>>> On 10/29/15 02:32, Jordan Justen wrote:
>>>>> +    ASSERT (MaxProcessors > 0);
>>>>> +    PcdSet32 (PcdCpuMaxLogicalProcessorNumber, MaxProcessors);
>>>> I think that when this branch is active, then
>>>> PcdCpuApInitTimeOutInMicroSeconds should *also* be set, namely to
>>>> MAX_UINT32 (~71 minutes, the closest we can get to "infinity"). When
>>>> this hint is available from QEMU, then we should practically disable
>>>> the timeout option in CpuDxe's AP counting.
>>> I think this is a good idea, but I don't think 71 minutes is useful.
>>> Perhaps 30 seconds? This seems more than adequate for hundreds of
>>> processors to startup. Or perhaps some timeout based on the number of
>>> processors?
>>>
>>> Janusz and I were discussing
>>> https://github.com/tianocore/edk2/issues/21 on irc. We increased the
>>> timeout to 10 seconds, and with only 8 processors it was still timing
>>> out.
>>>
>>> Obviously we are somehow failing to start the processors correctly, or
>>> QEMU/KVM is doing something wrong.
>>>
>>> Have you been able to reproduce this issue? It seems like we need to
>>> set the timeout to 71 minutes, and then debug QEMU/KVM to see what
>>> state the APs are in...
>>>
>>> Unfortunately I haven't yet been able to reproduce the bug on my
>>> system. :(
>> I've been staring at the following things for a few tens of minutes now:
>>
>> (1) Kernel commit b18d5431acc7. Note that the commit changes the return
>>     value of the vmx_get_mt_mask() function *exactly* in the following
>>     case:
>>
>>       kvm_arch_has_noncoherent_dma(vcpu->kvm) &&
>>       (kvm_read_cr0(vcpu) & X86_CR0_CD)
>>
>>     The first sub-condition is satisfied by GPU passthrough / device
>>     assignment, I think; the second part depends on the VCPU having
>>     turned on (or having *left* on) CR0.CD.
>>
>> (2) Consult the vmx_vcpu_reset() function in "arch/x86/kvm/vmx.c"
>>     (current upstream). You will find:
>>
>>      cr0 = X86_CR0_NW | X86_CR0_CD | X86_CR0_ET;
>>      vmx_set_cr0(vcpu, cr0); /* enter rmode */
>>
>>     Meaning a VCPU will start with CD and NW set, in real mode, after
>>     re-set.
>>
>>     This setting dates back to the birth of KVM:
>>
>>       commit 6aa8b732ca01c3d7a54e93f4d701b8aabbe60fb7
>>       Author: Avi Kivity <[email protected]>
>>       Date: Sun Dec 10 02:21:36 2006 -0800
>>
>>           [PATCH] kvm: userspace interface
>>
>>     Search that commit for "0x60000010" (the second hit, although the
>>     comment that contains the first hit is quite telling as well).
>>
>> (3) Consult the Intel SDM, Table 11-5. "Cache Operating Modes".
>>
>>     The (CD, NW) == (1, 1) setting in CR0 is documented as:
>>     - "Memory coherency is not maintained."
>>     - "(P6 family and Pentium processors.) State of the processor after
>>       a power up or reset. "
>>     - [in footnote 2] "The Pentium 4 and more recent processor families
>>       do not support this mode; setting the CD and NW bits to 1 selects
>>       the no-fill cache mode."
>>
>>     In other words, the settings implemented by vmx_vcpu_reset()
>>     actually invoke the behavior of the "no-fill cache mode" (which is
>>     (CD, NW) == (1, 0)) for all practical purposes.
>>
>> (4) Same reference.
>>
>>     The (CD, NW) == (1, 0) setting in CR0 is documented as:
>>     - "No-fill Cache Mode. Memory coherency is maintained."
>>     - "(Pentium 4 and later processor families.) State of processor
>>       after a power up or reset. "
>>
>> (5) The AsmEnableCache() function in
>>     "MdePkg/Library/BaseLib/Ia32/EnableCache.c". It clears both CD and
>>     NW in CR0.
>>
>> (6) This setting ((CD, NW) == (0, 0))is documented in the Intel SDM as:
>>     - "Normal Cache Mode. Highest performance cache operation."
>>
>> (7) The AsmEnableCache() function is invoked by MtrrLib
>>     [UefiCpuPkg/Library/MtrrLib/MtrrLib.c] after any and all MTRR
>>     changes. Consider:
>>
>>     PostMtrrChange() | MtrrSetAllMtrrs()
>>       PostMtrrChangeEnableCache()
>>         AsmEnableCache()
>>
>>     Where MtrrSetAllMtrrs() is a public function of the library; plus
>>     PostMtrrChange() is invoked by all of the following public
>>     functions:
>>
>>     - MtrrSetMemoryAttribute()
>>     - MtrrSetVariableMtrr()
>>     - MtrrSetFixedMtrr()
>>
>> (8) Because we call MtrrLib in PlatformPei first, there are two
>>     consequences:
>>
>>     (a) The boot VCPU has CR0.CD *set* in all parts of OVMF that run
>>         earlier than that.
>>
>>         This caused a widely reported boot perf regression in SEC (the
>>         LZMA decompression). Ultimately another MTRR change in KVM was
>>         reverted, so (as far as I know) this symptom has not been seen
>>         recently. (In any case, we should probably fix this sometime...)
>>
>>     (b) The other consequence is that the boot VCPU's CR0.CD is clear in
>>         the rest of OVMF. Which is what makes its speed acceptable, I
>>         guess (as long as no APs are started up).
>>
>> (9) Our AP startup code massages CR0, but only for mode switches. CR0.CD
>>     and CR0.NW are never touched.
>>
>>     Now, I guess this could be easily added to the assembly encoded as a
>>     C array ("mStartupCodeTemplate" in "UefiCpuPkg/CpuDxe/ApStartup.c")
>>     -- when cr0 is massaged anyway, just clear bits 29 and 30 too; same
>>     as in AsmEnableCache().
>>
>>     However, for testing the idea, perhaps the following one-liner
>>     suffices too -- this is the earliest an AP executes C code:
>>
>>> diff --git a/UefiCpuPkg/CpuDxe/CpuMp.c b/UefiCpuPkg/CpuDxe/CpuMp.c
>>> index 3f56faa..e7f5b41 100644
>>> --- a/UefiCpuPkg/CpuDxe/CpuMp.c
>>> +++ b/UefiCpuPkg/CpuDxe/CpuMp.c
>>> @@ -1451,6 +1451,8 @@ ApEntryPointInC (
>>>    VOID*           TopOfApStack;
>>>    UINTN           ProcessorNumber;
>>>
>>> +  AsmEnableCache ();
>>> +
>>>    if (!mAPsAlreadyInitFinished) {
>>>      FillInProcessorInformation (FALSE, mMpSystemData.NumberOfProcessors);
>>>      TopOfApStack  = (UINT8*)mApStackStart + gApStackSize;
>>     This should clear CR0.CD, and "undo" kernel commit b18d5431acc7 for
>>     the AP (by falsifying the second subcondition seen in (1)).
>>
>> Janusz, can you please test this one-liner (with no other out-of-tree
>> patch applied)?
>>
> tested, didn't solved problem with detected cpu's


Thanks for testing it. I'll try to reproduce the problem on my
workstation next week.

Thanks
Laszlo

_______________________________________________
edk2-devel mailing list
[email protected]
https://lists.01.org/mailman/listinfo/edk2-devel

Re: [edk2] [PATCH 6/6] OvmfPkg/PlatformPei: Set PcdCpuMaxLogicalProcessorNumber using QEMU fw_cfg

Reply via email to