W dniu 30.10.2015 o 13:26, Laszlo Ersek pisze:
> CC'ing Xiao and Alex again.
>
> On 10/29/15 19:39, Jordan Justen wrote:
>> On 2015-10-29 04:45:37, Laszlo Ersek wrote:
>>> On 10/29/15 02:32, Jordan Justen wrote:
>>>> + ASSERT (MaxProcessors > 0);
>>>> + PcdSet32 (PcdCpuMaxLogicalProcessorNumber, MaxProcessors);
>>> I think that when this branch is active, then
>>> PcdCpuApInitTimeOutInMicroSeconds should *also* be set, namely to
>>> MAX_UINT32 (~71 minutes, the closest we can get to "infinity"). When
>>> this hint is available from QEMU, then we should practically disable
>>> the timeout option in CpuDxe's AP counting.
>> I think this is a good idea, but I don't think 71 minutes is useful.
>> Perhaps 30 seconds? This seems more than adequate for hundreds of
>> processors to startup. Or perhaps some timeout based on the number of
>> processors?
>>
>> Janusz and I were discussing
>> https://github.com/tianocore/edk2/issues/21 on irc. We increased the
>> timeout to 10 seconds, and with only 8 processors it was still timing
>> out.
>>
>> Obviously we are somehow failing to start the processors correctly, or
>> QEMU/KVM is doing something wrong.
>>
>> Have you been able to reproduce this issue? It seems like we need to
>> set the timeout to 71 minutes, and then debug QEMU/KVM to see what
>> state the APs are in...
>>
>> Unfortunately I haven't yet been able to reproduce the bug on my
>> system. :(
> I've been staring at the following things for a few tens of minutes now:
>
> (1) Kernel commit b18d5431acc7. Note that the commit changes the return
> value of the vmx_get_mt_mask() function *exactly* in the following
> case:
>
> kvm_arch_has_noncoherent_dma(vcpu->kvm) &&
> (kvm_read_cr0(vcpu) & X86_CR0_CD)
>
> The first sub-condition is satisfied by GPU passthrough / device
> assignment, I think; the second part depends on the VCPU having
> turned on (or having *left* on) CR0.CD.
>
> (2) Consult the vmx_vcpu_reset() function in "arch/x86/kvm/vmx.c"
> (current upstream). You will find:
>
> cr0 = X86_CR0_NW | X86_CR0_CD | X86_CR0_ET;
> vmx_set_cr0(vcpu, cr0); /* enter rmode */
>
> Meaning a VCPU will start with CD and NW set, in real mode, after
> re-set.
>
> This setting dates back to the birth of KVM:
>
> commit 6aa8b732ca01c3d7a54e93f4d701b8aabbe60fb7
> Author: Avi Kivity <[email protected]>
> Date: Sun Dec 10 02:21:36 2006 -0800
>
> [PATCH] kvm: userspace interface
>
> Search that commit for "0x60000010" (the second hit, although the
> comment that contains the first hit is quite telling as well).
>
> (3) Consult the Intel SDM, Table 11-5. "Cache Operating Modes".
>
> The (CD, NW) == (1, 1) setting in CR0 is documented as:
> - "Memory coherency is not maintained."
> - "(P6 family and Pentium processors.) State of the processor after
> a power up or reset. "
> - [in footnote 2] "The Pentium 4 and more recent processor families
> do not support this mode; setting the CD and NW bits to 1 selects
> the no-fill cache mode."
>
> In other words, the settings implemented by vmx_vcpu_reset()
> actually invoke the behavior of the "no-fill cache mode" (which is
> (CD, NW) == (1, 0)) for all practical purposes.
>
> (4) Same reference.
>
> The (CD, NW) == (1, 0) setting in CR0 is documented as:
> - "No-fill Cache Mode. Memory coherency is maintained."
> - "(Pentium 4 and later processor families.) State of processor
> after a power up or reset. "
>
> (5) The AsmEnableCache() function in
> "MdePkg/Library/BaseLib/Ia32/EnableCache.c". It clears both CD and
> NW in CR0.
>
> (6) This setting ((CD, NW) == (0, 0))is documented in the Intel SDM as:
> - "Normal Cache Mode. Highest performance cache operation."
>
> (7) The AsmEnableCache() function is invoked by MtrrLib
> [UefiCpuPkg/Library/MtrrLib/MtrrLib.c] after any and all MTRR
> changes. Consider:
>
> PostMtrrChange() | MtrrSetAllMtrrs()
> PostMtrrChangeEnableCache()
> AsmEnableCache()
>
> Where MtrrSetAllMtrrs() is a public function of the library; plus
> PostMtrrChange() is invoked by all of the following public
> functions:
>
> - MtrrSetMemoryAttribute()
> - MtrrSetVariableMtrr()
> - MtrrSetFixedMtrr()
>
> (8) Because we call MtrrLib in PlatformPei first, there are two
> consequences:
>
> (a) The boot VCPU has CR0.CD *set* in all parts of OVMF that run
> earlier than that.
>
> This caused a widely reported boot perf regression in SEC (the
> LZMA decompression). Ultimately another MTRR change in KVM was
> reverted, so (as far as I know) this symptom has not been seen
> recently. (In any case, we should probably fix this sometime...)
>
> (b) The other consequence is that the boot VCPU's CR0.CD is clear in
> the rest of OVMF. Which is what makes its speed acceptable, I
> guess (as long as no APs are started up).
>
> (9) Our AP startup code massages CR0, but only for mode switches. CR0.CD
> and CR0.NW are never touched.
>
> Now, I guess this could be easily added to the assembly encoded as a
> C array ("mStartupCodeTemplate" in "UefiCpuPkg/CpuDxe/ApStartup.c")
> -- when cr0 is massaged anyway, just clear bits 29 and 30 too; same
> as in AsmEnableCache().
>
> However, for testing the idea, perhaps the following one-liner
> suffices too -- this is the earliest an AP executes C code:
>
>> diff --git a/UefiCpuPkg/CpuDxe/CpuMp.c b/UefiCpuPkg/CpuDxe/CpuMp.c
>> index 3f56faa..e7f5b41 100644
>> --- a/UefiCpuPkg/CpuDxe/CpuMp.c
>> +++ b/UefiCpuPkg/CpuDxe/CpuMp.c
>> @@ -1451,6 +1451,8 @@ ApEntryPointInC (
>> VOID* TopOfApStack;
>> UINTN ProcessorNumber;
>>
>> + AsmEnableCache ();
>> +
>> if (!mAPsAlreadyInitFinished) {
>> FillInProcessorInformation (FALSE, mMpSystemData.NumberOfProcessors);
>> TopOfApStack = (UINT8*)mApStackStart + gApStackSize;
> This should clear CR0.CD, and "undo" kernel commit b18d5431acc7 for
> the AP (by falsifying the second subcondition seen in (1)).
>
> Janusz, can you please test this one-liner (with no other out-of-tree
> patch applied)?
>
tested, didn't solved problem with detected cpu's
_______________________________________________
edk2-devel mailing list
[email protected]
https://lists.01.org/mailman/listinfo/edk2-devel