Laszlo,

I agree BZ#251 can not be closed if commit 0594ec417c89 does not work for OVMF.

I will reopen it.

Jeff

发件人: Laszlo Ersek<mailto:ler...@redhat.com>
发送时间: 2017年10月25日 1:40
收件人: Dong, Eric<mailto:eric.d...@intel.com>; 
edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org>
抄送: Ni, Ruiyu<mailto:ruiyu...@intel.com>; Paolo 
Bonzini<mailto:pbonz...@redhat.com>; Jeff Fan<mailto:vanjeff_...@hotmail.com>
主题: Re: [edk2] [Patch 2/2] UefiCpuPkg/MpInitLib: Enhance waiting for AP 
initialization logic.

Hi Eric,

On 10/24/17 17:23, Dong, Eric wrote:
> Laszlo,
>
>> -----Original Message-----
>> From: Laszlo Ersek [mailto:ler...@redhat.com]
>> Sent: Tuesday, October 24, 2017 6:16 PM
>> To: Dong, Eric <eric.d...@intel.com>; edk2-devel@lists.01.org
>> Cc: Ni, Ruiyu <ruiyu...@intel.com>; Paolo Bonzini <pbonz...@redhat.com>
>> Subject: Re: [edk2] [Patch 2/2] UefiCpuPkg/MpInitLib: Enhance waiting for
>> AP initialization logic.
>>
>> CC Paolo
>>
>> On 10/23/17 09:22, Eric Dong wrote:

>>> diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc
>>> b/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc
>>> index 976af1f..bdfe0d3 100644
>>> --- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc
>>> +++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc
>>> @@ -40,4 +40,5 @@ EnableExecuteDisableLocation  equ        LockLocation
>> + 30h
>>>  Cr3Location                   equ        LockLocation + 34h
>>>  InitFlagLocation              equ        LockLocation + 38h
>>>  CpuInfoLocation               equ        LockLocation + 3Ch
>>> +NumApsExecutingLocation       equ        LockLocation + 40h
>>>
>>> diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
>>> b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
>>> index 1b9c6a6..2b6c27d 100644
>>> --- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
>>> +++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
>>> @@ -86,6 +86,12 @@ Flat32Start:                                   ; 
>>> protected mode
>> entry point
>>>
>>>      mov        esi, ebx
>>>
>>> +    ; Increment the number of APs executing here as early as possible
>>> +    ; This is decremented in C code when AP is finished executing
>>> +    mov        edi, esi
>>> +    add        edi, NumApsExecutingLocation
>>> +    lock inc   dword [edi]
>>> +
>>>      mov         edi, esi
>>>      add         edi, EnableExecuteDisableLocation
>>>      cmp         byte [edi], 0
>>> diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c
>>> b/UefiCpuPkg/Library/MpInitLib/MpLib.c
>>> index db923c9..48f930b 100644
>>> --- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
>>> +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
>>> @@ -662,6 +662,7 @@ ApWakeupFunction (
>>>      // AP finished executing C code
>>>      //
>>>      InterlockedIncrement ((UINT32 *) &CpuMpData->FinishedCount);
>>> +    InterlockedDecrement ((UINT32 *)
>>> + &CpuMpData->MpCpuExchangeInfo->NumApsExecuting);
>>>
>>>      //
>>>      // Place AP is specified loop mode @@ -765,6 +766,7 @@
>>> FillExchangeInfoData (
>>>
>>>    ExchangeInfo->CFunction       = (UINTN) ApWakeupFunction;
>>>    ExchangeInfo->ApIndex         = 0;
>>> +  ExchangeInfo->NumApsExecuting = 0;
>>>    ExchangeInfo->InitFlag        = (UINTN) CpuMpData->InitFlag;
>>>    ExchangeInfo->CpuInfo         = (CPU_INFO_IN_HOB *) (UINTN)
>> CpuMpData->CpuInfoInHob;
>>>    ExchangeInfo->CpuMpData       = CpuMpData;
>>> @@ -934,13 +936,19 @@ WakeUpAP (
>>>      }
>>>      if (CpuMpData->InitFlag == ApInitConfig) {
>>>        //
>>> -      // Wait for all potential APs waken up in one specified period
>>> +      // Wait for one potential AP waken up in one specified period
>>>        //
>>> -      TimedWaitForApFinish (
>>> -        CpuMpData,
>>> -        PcdGet32 (PcdCpuMaxLogicalProcessorNumber) - 1,
>>> -        PcdGet32 (PcdCpuApInitTimeOutInMicroSeconds)
>>> -        );
>>> +      if (CpuMpData->CpuCount == 0) {
>>> +        TimedWaitForApFinish (
>>> +          CpuMpData,
>>> +          PcdGet32 (PcdCpuMaxLogicalProcessorNumber) - 1,
>>> +          PcdGet32 (PcdCpuApInitTimeOutInMicroSeconds)
>>> +          );
>>> +      }
>>
>> I don't understand this change. The new comment says,
>>
>>   Wait for *one* potential AP waken up in one specified period
>>
>> However, the second parameter of TimedWaitForApFinish(), namely
>> "FinishedApLimit", gets the same value as before:
>>
>>   PcdGet32 (PcdCpuMaxLogicalProcessorNumber) - 1
>>
>> It means that all of the (possible) APs are waited-for, just the same
>> as before.
>
> [[Eric]] This patch changes the collect AP count logic, original
> solution always waits for a specific time to let all APs start up. If
> the input CPU number (PcdGet32(PcdCpuMaxLogicalProcessorNumber) - 1)
> have been found or after a specific time(PcdGet32
> (PcdCpuApInitTimeOutInMicroSeconds)). BSP will not wait anymore and use
> CpuMpData->CpuCount as the found AP count.
>
> New logic also wait for a specific time, but this time is smaller than
> the original one. It just wait for the first AP(any AP) begin to do the
> initialization( do CpuMpData->MpCpuExchangeInfo->NumApsExecuting++ means
> it begin to do the initialization). When Ap finishes initialization, it
> will do CpuMpData->MpCpuExchangeInfo->NumApsExecuting--. So after BSP
> waits for a specific time at first, it just needs to check whether
> CpuMpData->MpCpuExchangeInfo->NumApsExecuting == 0 to know whether all
> Aps have finished initialization. Here we still use the original PCD
> (PcdCpuApInitTimeOutInMicroSeconds) for the new time value.
>
> When one AP do the initialization, it will also do
> CpuMpData->CpuCount++. So here I check whether CpuMpData->CpuCount != 0
> to know whether APs already begin to do the initialization. If yes, I
> not need to do the time out waiting anymore, just check the
> CpuMpData->MpCpuExchangeInfo->NumApsExecuting to know whether all Aps
> have finished initialization.

Thanks for the explanation.

The "NumApsExecuting" increment / decrement logic in this patch expects
that the APs work as follows:

(1) After the INIT-SIPI-SIPI, there may be a delay before the APs start
running. During this delay, the BSP may or may not reach the code in
question. The (CpuMpData->CpuCount != 0) check is supposed to take this
into account.

(2) After at least one AP has started running, the logic expects
"NumApsExecuting" to monotonically grow for a while, and then to
monotonically decrease, back to zero. For example, if we have 1 BSP and
7 APs, the BSP logic more or less expects the following values in
"NumApsExecuting":

1; 2; 3; 4; 5; 6; 7;
6; 5; 4; 3; 2; 1; 0


While this may be a valid expectation for physical processors (which
actually run in parallel, in the physical world), in a virtual machine,
it is not guaranteed. Dependent on hypervisor scheduling artifacts, it
is possible that, say, three APs start up *and finish* before the
remaining four APs start up *at all*. The INIT-SIPI-SIPI marks all APs
for execution / scheduling in the hypervisor, yes, but the actual code
execution may commence a lot later. For example, the BSP may witness the
following series of values in "NumApsExecuting":

1; 2; 3;
2; 1; 0;
1; 2; 3; 4;
3; 2; 1; 0

and the BSP could think that there are 3 APs only, when it sees the
first 0 value.


Now, let me get back to the use case that actually matters to OVMF and
QEMU. As I mentioned earlier, OVMF knows from QEMU the exact number of
APs. If there is a way for OVMF to tell MpInitLib to wait for this exact
number of APs -- no matter how long it takes --, then that's what I
would like to use.

Please see the original discussion around OVMF commit 45a70db3c3a59:

* In version 1, I introduced a new PCD called

  PcdCpuKnownLogicalProcessorNumber

and I modified MpInitLib to wait for this AP number, ignoring timeout
completely, if the PCD was set:

  https://lists.01.org/pipermail/edk2-devel/2016-November/005076.html

However, Jeff suggested to use the preexistent PCD
"PcdCpuMaxLogicalProcessorNumber" for this purpose:

  https://lists.01.org/pipermail/edk2-devel/2016-November/005092.html

* In version 2, I used the PCD suggested by Jeff, but I also introduced
a new special value for the timeout. Timeout=0 would mean "infinity":

  https://lists.01.org/pipermail/edk2-devel/2016-November/005209.html

Jeff didn't like the special value, and suggested that OVMF simply use
MAX_UINT32 microseconds (approx. 71 minutes) instead of "infinity":

  https://lists.01.org/pipermail/edk2-devel/2016-November/005266.html

* In v3, I implemented that, and that was pushed as:

  - UefiCpuPkg commit 6e1987f19af7 ("UefiCpuPkg/MpInitLib: wait no
    longer than necessary for initial AP startup", 2016-11-24)

  - and OvmfPkg commit 45a70db3c3a5 ("OvmfPkg/PlatformPei: take VCPU
    count from QEMU and configure MpInitLib", 2016-11-24).


So, again, the use case that I would like to cover is:

* the exact number of APs is known at boot, to OvmfPkg/PlatformPei,

* after the very first INIT-SIPI-SIPI, MpInitLib should wait for exactly
  this number of APs to "report in", regardless of:

  - how long it takes,

  - in what order / sequence the APs report in. (Again, please remember
    that some APs may complete the initialization before other APs
    execute their very first instruction.)

* Preferably, the case should be handled when the processor count grows
  from cold boot to S3 resume (VCPU hotplug). TianoCore BZ#251 used to
  track this use case separately.

(

Jeff closed BZ#251 today, with the argument that commit 0594ec417c89
(this patch) finds the CPU count dynamically anyway, so a platform can
simply set "PcdCpuMaxLogicalProcessorNumber" to the maximum possible value.

This argument does not work in a virtual machine, because commit
0594ec417c89 (this patch) may in fact not find the VCPU count correctly
-- in a VM, (NumApsExecuting==0) does not imply that all APs have finished.

)

Thanks!
Laszlo

_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel

Reply via email to