On 10/26/15 02:43, Fan, Jeff wrote: > Laszlo, > > I could provide the patch to place AP into protected mode in > ExitBootService Callback function for your test.
That would be awesome, thank you! Laszlo > > Thanks! > Jeff > > -----Original Message----- > From: Laszlo Ersek [mailto:[email protected]] > Sent: Monday, October 26, 2015 9:33 AM > To: Paolo Bonzini; Gerd Hoffmann; Justen, Jordan L; Kinney, Michael D; Fan, > Jeff; Chen Fan > Cc: edk2-devel-01 > Subject: Re: [edk2] OVMF SMM status 2015-Oct-24 > > Got some terrible news: > > On 10/24/15 02:18, Laszlo Ersek wrote: > > [snip] > >> * QEMU: >> - Current upstream "master", at bc79082e4cd1. >> >> - Plus the following patch applied: >> [PATCH] hw/isa/lpc_ich9: inject SMI on all VCPUs if APM_STS == 'Q' >> http://thread.gmane.org/gmane.comp.emulators.qemu/371195 >> >> >> * edk2 / OVMF: > > [snip] > >> - In addition, matching the QEMU patch referenced above, >> >> [PATCH v3 27/52] OvmfPkg: use relaxed AP SMM synchronization mode >> >> becomes unnecessary and is dropped, *and* the incremental OVMF patch >> that I'm attaching now purely for illustration is squashed into >> >> [PATCH v3 13/52] OvmfPkg: implement EFI_SMM_CONTROL2_PROTOCOL with a >> DXE_RUNTIME_DRIVER > > [snip] > >> * Results: >> >> accel bits guest OS OS boots efibootmgr works on S3 resume >> ----- ---- --------------- -------- ------------------- --------- >> TCG 32 Fedlet 20141209 pass[1] BSP and AP pass >> >> TCG 64 F21 XFCE LiveCD pass[1] BSP and AP fail[2] >> >> KVM 32 Fedlet 20141209 pass BSP and AP pass >> >> KVM 64 F21 XFCE LiveCD pass BSP and AP fail[2] >> >> KVM 64 Windows 8.1 pass n/a fail[2] >> >> [1] Although the boot is successful, I'm seeing one worrying sign: it >> looks like sometime after boot (when OVMF is "all done"), the AP >> starts executing the firmware from flash (I can see the SEC messages >> up to and including "DecompressMemFvs"). I don't understand why this >> happens, but it doesn't seem right. In any case, it didn't break >> these tests. > > I understand now why [1] happens. My QEMU and OVMF patches that implemented > broadcast SMI on writes to APM_CNT triggered a bug in UefiCpuPkg. > > First, please look at the ProcessorToIdleState() function in > "UefiCpuPkg/CpuDxe/CpuMp.c". The leading comment is: > > Application Processors do loop routine > after switch to its own stack. > > It has an infinite loop with CpuSleep() -- HLT -- and CpuPause() -- PAUSE -- > calls in it. > > I added the following line to this function: > > diff --git a/UefiCpuPkg/CpuDxe/CpuMp.c b/UefiCpuPkg/CpuDxe/CpuMp.c index > 04c2f1f..d1a94b1 100644 > --- a/UefiCpuPkg/CpuDxe/CpuMp.c > +++ b/UefiCpuPkg/CpuDxe/CpuMp.c > @@ -1295,6 +1295,7 @@ ProcessorToIdleState ( > } > > CpuPause (); > + asm volatile ("outb %%al,%%dx": :"d" (0x402), "a" ('Q')); > } > > CpuSleep (); > > This makes each AP print a Q character to the QEMU debug port every time they > are woken in the idle loop (spuriously or otherwise). > > If I build OVMF without -D SMM_REQUIRE, this line of code never runs, > practically. > > If I build OVMF with -D SMM_REQUIRE (and, remember, with my SMI broadcast > patches in place), then every time the APs are dragged into SMM and *return* > from SMM, they run one iteration of this loop -- they wake up, print the Q > character, then go back to sleep. Not so bad, right? It just proves that an > SMI wakes a sleeping processor, and that the RSM needs to return to somewhere. > > Now please turn your attention to: > - the ExitBootServicesCallback() function in the same file > - the commit message for that function (git 9840b129 / SVN r16397) > - and the discussion from last November that led to it: > http://thread.gmane.org/gmane.comp.bios.tianocore.devel/11244 > > The ExitBootServicesCallback() function that we have now -- which sends an > INIT IPI to the APs, so they remain dormant until a (double) SIPI -- is good > enough *assuming* that nothing wakes those APs between > ExitBootServices() and the time the runtime OS starts them up for good. > > Unfortunately, this assumption is no longer valid, with the broadcast SMI > idea: the variable services can be (and are) called in said interval, and > they wake the dormant APs. The in-SMM stuff works just fine, but the APs have > nowhere to return to, when they see the RSM. They weren't *running* when they > got the SMI! > > So that's why I see the APs uncontrollably rebooting under [1] above. > > There are two options: > > (1) I'm noping out of the broadcast SMI idea faster than you can say > "fandango on core" -- writing to APM_CNT will raise the SMI only on > the current processor. > > This means keeping Paolo's Relaxed Sync Mode patch. It also implies > a huge performance penalty for variable services that are executed > on APs -- on both TCG and KVM. > > Here's a comparison on KVM: > >> [root@ovmf-fedora-q35 ~]# time taskset -c 0 efibootmgr >> BootCurrent: 0001 >> Timeout: 0 seconds >> BootOrder: 0001,0000,0003 >> Boot0000* EFI SCSI Device >> Boot0001* Fedora >> Boot0003* EFI Internal Shell >> >> real 0m0.006s >> user 0m0.000s >> sys 0m0.005s <------- > > versus > >> [root@ovmf-fedora-q35 ~]# time taskset -c 1 efibootmgr >> BootCurrent: 0001 >> Timeout: 0 seconds >> BootOrder: 0001,0000,0003 >> Boot0000* EFI SCSI Device >> Boot0001* Fedora >> Boot0003* EFI Internal Shell >> >> real 0m36.013s >> user 0m1.637s >> sys 0m34.374s <------- 3-4 orders of magnitude slower > > > (2) The other option is keeping the broadcast SMI idea (the actual way > to configure that in QEMU remains a topic of discussion), but then > the ExitBootServices() callback in CpuDxe will have to implement > Jeff's idea from > > http://thread.gmane.org/gmane.comp.bios.tianocore.devel/11244/focus=11247 > > That is, make all APs turn off paging, and execute a halt loop. > > (So the RSM instruction from the SMI handler has a place to return > to, until the runtime OS takes control of those APs. > > Also, I think the ExitBootServices() callback shouldn't return > until it is sure that all APs are safely off of their original code > and stacks, so some synchronization will be necessary too...) > > I'm sorry but I don't know enough to work on mode switches, so I can't > volunteer for (2). Can someone else please? > > Thanks > Laszlo > _______________________________________________ edk2-devel mailing list [email protected] https://lists.01.org/mailman/listinfo/edk2-devel

