Laszlo, If an AP is in Wait-for-SIPI state, then the AP should ignore the SMI. One description of this is in Intel(R) 64 and IA-32 Architectures Software Developer's Manual Volume 3C, Section 34.2.
If QEMU could evaluate the AP state and not send an SMI to an AP in Wait-forSIPI, then updating SMIs to broadcast to all AP should work for SeaBios and OVMF. All APs in SeaBios are in Wait-for-SIPI, so the BSP will be the only CPU that receives the SMI. In OVMF, all APs can be in the HLT loop that Jeff Fan provided a patch for (and I see you have a good update for), so the SMI can be delivered to BSP and all APs. Best regards, Mike >-----Original Message----- >From: Laszlo Ersek [mailto:[email protected]] >Sent: Monday, October 26, 2015 10:37 AM >To: Fan, Jeff; [email protected] >Cc: Kinney, Michael D >Subject: Re: [Patch 3/3] UefiCpuPkg/CpuDxe: Place APs into protected mode >when ExitBootService > >On 10/26/15 14:47, Jeff Fan wrote: >> In original ExitBootService callback function, we will send INIT IPI to all >> APs and place APs into wait-for-sipi state. > >Yes. > >> However, it prevents APs to >> response any SMI. As a result, SMM BSP needs to do a specified timeout >delay >> on each SMI. > >These two sentences are not precise. The problem is not that SMI is not >delivered to those APs that are in wait-for-sipi. > >Instead, the problem is that the APs wake from the wait-for-sipi state, >execute the SMI handler, and then the RSM makes them return into >no-man's land. > >Perhaps this is not how physical hardware works, but this is what the >symptoms I'm seeing on TCG imply. > >So is *that* our problem perhaps? Namely, should the APs that wait for >SIPI simply not respond to the SMI? > >> This fix is to place APs into hlt-loop state to let APs response SMI >> correctly. > >... In our case: to allow the APs to return "somewhere" upon executing RSM. > >More comments below: > >> Moreover, we change AP mode into protected mode on x64 image. It could >avoid >> access page table when AP runs hlt-loop code, because page table is located >in >> boot service data range may re-used by OS. >> >> Contributed-under: TianoCore Contribution Agreement 1.0 >> Signed-off-by: Jeff Fan <[email protected]> >> CC: Michael Kinney <[email protected]> >> CC: Laszlo Ersek <[email protected]> >> --- >> UefiCpuPkg/CpuDxe/CpuDxe.inf | 2 ++ >> UefiCpuPkg/CpuDxe/CpuMp.c | 19 ++++++++++++++--- >> UefiCpuPkg/CpuDxe/CpuMp.h | 12 +++++++++++ >> UefiCpuPkg/CpuDxe/Ia32/ArchFuncs.c | 37 >+++++++++++++++++++++++++++++++++ >> UefiCpuPkg/CpuDxe/X64/ArchFuncs.c | 42 >++++++++++++++++++++++++++++++++++++++ >> 5 files changed, 109 insertions(+), 3 deletions(-) >> create mode 100644 UefiCpuPkg/CpuDxe/Ia32/ArchFuncs.c >> create mode 100644 UefiCpuPkg/CpuDxe/X64/ArchFuncs.c >> >> diff --git a/UefiCpuPkg/CpuDxe/CpuDxe.inf >b/UefiCpuPkg/CpuDxe/CpuDxe.inf >> index 9db5303..8ea4c38 100644 >> --- a/UefiCpuPkg/CpuDxe/CpuDxe.inf >> +++ b/UefiCpuPkg/CpuDxe/CpuDxe.inf >> @@ -62,6 +62,7 @@ >> Ia32/MpAsm.asm | MSFT >> Ia32/MpAsm.asm | INTEL >> Ia32/MpAsm.nasm | GCC >> + Ia32/ArchFuncs.c >> >> [Sources.X64] >> X64/CpuAsm.asm | MSFT >> @@ -70,6 +71,7 @@ >> X64/MpAsm.asm | MSFT >> X64/MpAsm.asm | INTEL >> X64/MpAsm.nasm | GCC >> + X64/ArchFuncs.c >> >> [Protocols] >> gEfiCpuArchProtocolGuid ## PRODUCES >> diff --git a/UefiCpuPkg/CpuDxe/CpuMp.c b/UefiCpuPkg/CpuDxe/CpuMp.c >> index 04c2f1f..883b5d7 100644 >> --- a/UefiCpuPkg/CpuDxe/CpuMp.c >> +++ b/UefiCpuPkg/CpuDxe/CpuMp.c >> @@ -30,6 +30,8 @@ VOID *mApStackStart = 0; >> volatile BOOLEAN mAPsAlreadyInitFinished = FALSE; >> volatile BOOLEAN mStopCheckAllAPsStatus = TRUE; >> >> +volatile STARTUP_CODE *StartupCode; >> + > >This should be "extern". > >Normally both GCC and MSVC should reject linking this object file >together with ApStartup.obj. Namely: > >- ApStartup.c has > > volatile STARTUP_CODE *StartupCode = NULL; > > which, according to the C standard, is an "external definition". > >- CpuMp.c has the above. According to the C standard, that qualifies as > a "tentative definition". Given this specific translation unit (i.e., > CpuMp.c), the definition is actually an external definition too, > according to the C standard. > >Therefore both MSVC and GNU binutils are incorrect to link these two >object files together; they should be rejected. > >As a side point, I checked CpuMp.obj with "nm", both without and with >"extern". In the former case, the type of the symbol is "C" (common), >otherwise it is "U" (undefined). The practical end result is the same >(because "C" -- common -- behaves identically to "U" given the external >definition in ApStartup.c). Nonetheless, this is a bug >(standards-non-conformance) in both toolchains. > >But I digress. > >> EFI_MP_SERVICES_PROTOCOL mMpServicesTemplate = { >> GetNumberOfProcessors, >> GetProcessorInfo, >> @@ -1659,11 +1661,22 @@ ExitBootServicesCallback ( >> IN VOID *Context >> ) >> { >> + EFI_STATUS Status; >> + >> // >> - // Avoid APs access invalid buff datas which allocated by BootServices, >> - // so we send INIT IPI to APs to let them wait for SIPI state. >> + // Place AP in procted mode to run hlt-loop in ACPI NVS range. >> + // It avoids OS crashed page table and this driver in boot service range. >> // >> - SendInitIpiAllExcludingSelf (); >> + Status = mMpServicesTemplate.StartupAllAPs ( >> + &mMpServicesTemplate, >> + ChangeApMode, >> + FALSE, >> + NULL, >> + 0, >> + (VOID *) &StartupCode->CpuApRunHltLoopCode, >> + NULL >> + ); >> + ASSERT_EFI_ERROR (Status); >> } > >* So, first of all I wanted to regression-test this change, so I built >OVMF without -D SMM_REQUIRE, and booted another X64 Fedora guest of >mine >with it, using QEMU machine type "pc" (i440fx) and KVM. > >Unfortunately, this change crashes the boot process with a triple fault, >according to the KVM trace. (I tested both with and without "extern".) > > kvm_exit: reason TRIPLE_FAULT rip 0x9f01f info 0 0 > ... > kvm_userspace_exit: reason KVM_EXIT_SHUTDOWN (8) > >* Second, I didn't think of using mMpServicesTemplate.StartupAllAPs() >for this purpose. > >I thought we'd use SendInitSipiSipiAllExcludingSelf(). But, >StartupAllAPs() seems reasonable I guess. > >Two points however: > >- I think we should wait in the ExitBootServicesCallback() function >until we are sure that the APs are running the HLT loop. Otherwise we >could return while some APs are still in our own idle loop (which is >"boot services code" type memory). This would probably require some >synchronization. > >- If there is only one processor (ie. no AP, just BSP), then Status will >be EFI_NOT_STARTED -- that should be accepted too: > > ASSERT (Status == EFI_SUCCESS || Status == EFI_NOT_STARTED); > >More comments below: > >> >> /** >> diff --git a/UefiCpuPkg/CpuDxe/CpuMp.h b/UefiCpuPkg/CpuDxe/CpuMp.h >> index 907e993..3040b2e 100644 >> --- a/UefiCpuPkg/CpuDxe/CpuMp.h >> +++ b/UefiCpuPkg/CpuDxe/CpuMp.h >> @@ -21,6 +21,7 @@ >> #include <Library/SynchronizationLib.h> >> #include <Library/HobLib.h> >> #include <Library/ReportStatusCodeLib.h> >> +#include <Library/DebugLib.h> >> >> #pragma pack(1) >> >> @@ -763,5 +764,16 @@ SetMtrrsFromBuffer ( >> IN VOID *Buffer >> ); >> >> +/** >> + This function will place AP to Hlt-Loop state. >> + >> + @param Buffer Pointer to Hlt-Loop entry point. >> +**/ >> +VOID >> +EFIAPI >> +ChangeApMode ( >> + IN VOID *Buffer >> + ); >> + >> #endif // _CPU_MP_H_ >> >> diff --git a/UefiCpuPkg/CpuDxe/Ia32/ArchFuncs.c >b/UefiCpuPkg/CpuDxe/Ia32/ArchFuncs.c >> new file mode 100644 >> index 0000000..32bfab5 >> --- /dev/null >> +++ b/UefiCpuPkg/CpuDxe/Ia32/ArchFuncs.c >> @@ -0,0 +1,37 @@ >> +/** @file >> + Arch specific functions. >> + >> + Copyright (c) 2015, Intel Corporation. All rights reserved.<BR> >> + This program and the accompanying materials >> + are licensed and made available under the terms and conditions of the >BSD License >> + which accompanies this distribution. The full text of the license may be >found at >> + http://opensource.org/licenses/bsd-license.php >> + >> + THE PROGRAM IS DISTRIBUTED UNDER THE BSD LICENSE ON AN "AS IS" >BASIS, >> + WITHOUT WARRANTIES OR REPRESENTATIONS OF ANY KIND, EITHER >EXPRESS OR IMPLIED. >> + >> +**/ >> + >> +#include <CpuMp.h> >> + >> +/** >> + This function will place AP to Hlt-Loop state. >> + >> + @param Buffer Pointer to Hlt-Loop entry point. >> +**/ >> +VOID >> +EFIAPI >> +ChangeApMode ( >> + IN VOID *Buffer >> + ) >> +{ >> + STACKLESS_AP_ENTRY_POINT ApHltLoopFunc; >> + >> + ApHltLoopFunc = (STACKLESS_AP_ENTRY_POINT) (UINTN) Buffer; >> + (*ApHltLoopFunc) (); > >Shouldn't we use SwitchStack() instead? > >I'm only asking because I recall that S3ResumePei uses SwitchStack() for >non-local control transfer, when it doesn't need to switch CPU modes. > >> + >> + // >> + // If should never reach here > >"It" > >> + // >> + ASSERT (FALSE); >> +} >> diff --git a/UefiCpuPkg/CpuDxe/X64/ArchFuncs.c >b/UefiCpuPkg/CpuDxe/X64/ArchFuncs.c >> new file mode 100644 >> index 0000000..4117f0a >> --- /dev/null >> +++ b/UefiCpuPkg/CpuDxe/X64/ArchFuncs.c >> @@ -0,0 +1,42 @@ >> +/** @file >> + Arch specific functions. >> + >> + Copyright (c) 2015, Intel Corporation. All rights reserved.<BR> >> + This program and the accompanying materials >> + are licensed and made available under the terms and conditions of the >BSD License >> + which accompanies this distribution. The full text of the license may be >found at >> + http://opensource.org/licenses/bsd-license.php >> + >> + THE PROGRAM IS DISTRIBUTED UNDER THE BSD LICENSE ON AN "AS IS" >BASIS, >> + WITHOUT WARRANTIES OR REPRESENTATIONS OF ANY KIND, EITHER >EXPRESS OR IMPLIED. >> + >> +**/ >> + >> +#include <CpuMp.h> >> +#include <CpuGdt.h> >> + >> +/** >> + This function will place AP to Hlt-Loop state. >> + >> + @param Buffer Pointer to Hlt-Loop entry point. >> +**/ >> +VOID >> +EFIAPI >> +ChangeApMode ( >> + IN VOID *Buffer >> + ) >> +{ >> + EFI_STATUS Status; >> + >> + AsmDisablePaging64 ( >> + SYS_CODE_SEL, >> + (UINT32) (UINTN) Buffer, >> + 0, >> + 0, >> + (UINT32) (UINTN) &Status >> + ); > >(I certainly thought of using AsmDisablePaging64() for this, but I >couldn't figure out what to pass as "Cs". I can see SYS_CODE_SEL in >"UefiCpuPkg/CpuDxe/CpuGdt.h"... I would have never found that.) > >I have one comment here (although it might not matter much): if this >module is compiled for X64 DXE, I don't think it is guaranteed that the >address of the current stack (i.e., &Status) can be expressed in 32-bits. > >Shouldn't we allocate a small array in STARTUP_CODE, to be used as stack >here? > >> + // >> + // If should never reach here > >"It" > >> + // >> + ASSERT (FALSE); >> +} >> > >In any case, the biggest problem for me is that this doesn't seem to >work :( Soon after I hit Enter in GRUB, the guest crashes with a triple >fault. > >Here's an idea: how about using SendInitSipiSipiAllExcludingSelf()? >- no stack is necessary to set up (the HLT loop needs no stack) > >- no mode switching needed, code runs immediately in real mode > >- SendInitSipiSipiAllExcludingSelf() claims to return only after the > IPI has been accepted by the target processors, so no need to wait / > synchronize in the exit boot services callback until the APs leave > "our" code (the idle loop). > >- It would take an independent, 4KB AcpiNVS allocation, independently > of STARTUP_CODE, but that's easy to do. > >- An INIT-SIPI-SIPI sequence should also safely abort whatever job the > APs were doing at the time of ExitBootServices(). > >What do you think? > >Thanks! >Laszlo _______________________________________________ edk2-devel mailing list [email protected] https://lists.01.org/mailman/listinfo/edk2-devel

