Got some terrible news:

On 10/24/15 02:18, Laszlo Ersek wrote:

[snip]

> * QEMU:
> - Current upstream "master", at bc79082e4cd1.
> 
> - Plus the following patch applied:
>   [PATCH] hw/isa/lpc_ich9: inject SMI on all VCPUs if APM_STS == 'Q'
>   http://thread.gmane.org/gmane.comp.emulators.qemu/371195
> 
> 
> * edk2 / OVMF:

[snip]

> - In addition, matching the QEMU patch referenced above,
> 
>   [PATCH v3 27/52] OvmfPkg: use relaxed AP SMM synchronization mode
> 
>   becomes unnecessary and is dropped, *and* the incremental OVMF patch
>   that I'm attaching now purely for illustration is squashed into
> 
>   [PATCH v3 13/52] OvmfPkg: implement EFI_SMM_CONTROL2_PROTOCOL with a
>                    DXE_RUNTIME_DRIVER

[snip]

> * Results:
> 
>   accel  bits  guest OS         OS boots  efibootmgr works on  S3 resume
>   -----  ----  ---------------  --------  -------------------  ---------
>   TCG    32    Fedlet 20141209  pass[1]   BSP and AP           pass
> 
>   TCG    64    F21 XFCE LiveCD  pass[1]   BSP and AP           fail[2]
> 
>   KVM    32    Fedlet 20141209  pass      BSP and AP           pass
> 
>   KVM    64    F21 XFCE LiveCD  pass      BSP and AP           fail[2]
> 
>   KVM    64    Windows 8.1      pass      n/a                  fail[2]
> 
> [1] Although the boot is successful, I'm seeing one worrying sign: it
>     looks like sometime after boot (when OVMF is "all done"), the AP
>     starts executing the firmware from flash (I can see the SEC messages
>     up to and including "DecompressMemFvs"). I don't understand why this
>     happens, but it doesn't seem right. In any case, it didn't break
>     these tests.

I understand now why [1] happens. My QEMU and OVMF patches that
implemented broadcast SMI on writes to APM_CNT triggered a bug in
UefiCpuPkg.

First, please look at the ProcessorToIdleState() function in
"UefiCpuPkg/CpuDxe/CpuMp.c". The leading comment is:

  Application Processors do loop routine
  after switch to its own stack.

It has an infinite loop with CpuSleep() -- HLT -- and CpuPause() --
PAUSE -- calls in it.

I added the following line to this function:

diff --git a/UefiCpuPkg/CpuDxe/CpuMp.c b/UefiCpuPkg/CpuDxe/CpuMp.c
index 04c2f1f..d1a94b1 100644
--- a/UefiCpuPkg/CpuDxe/CpuMp.c
+++ b/UefiCpuPkg/CpuDxe/CpuMp.c
@@ -1295,6 +1295,7 @@ ProcessorToIdleState (
     }

     CpuPause ();
+    asm volatile ("outb %%al,%%dx": :"d" (0x402), "a" ('Q'));
   }

   CpuSleep ();

This makes each AP print a Q character to the QEMU debug port every time
they are woken in the idle loop (spuriously or otherwise).

If I build OVMF without -D SMM_REQUIRE, this line of code never runs,
practically.

If I build OVMF with -D SMM_REQUIRE (and, remember, with my SMI
broadcast patches in place), then every time the APs are dragged into
SMM and *return* from SMM, they run one iteration of this loop -- they
wake up, print the Q character, then go back to sleep. Not so bad,
right? It just proves that an SMI wakes a sleeping processor, and that
the RSM needs to return to somewhere.

Now please turn your attention to:
- the ExitBootServicesCallback() function in the same file
- the commit message for that function (git 9840b129 / SVN r16397)
- and the discussion from last November that led to it:
  http://thread.gmane.org/gmane.comp.bios.tianocore.devel/11244

The ExitBootServicesCallback() function that we have now -- which sends
an INIT IPI to the APs, so they remain dormant until a (double) SIPI --
is good enough *assuming* that nothing wakes those APs between
ExitBootServices() and the time the runtime OS starts them up for good.

Unfortunately, this assumption is no longer valid, with the broadcast
SMI idea: the variable services can be (and are) called in said
interval, and they wake the dormant APs. The in-SMM stuff works just
fine, but the APs have nowhere to return to, when they see the RSM. They
weren't *running* when they got the SMI!

So that's why I see the APs uncontrollably rebooting under [1] above.

There are two options:

(1) I'm noping out of the broadcast SMI idea faster than you can say
    "fandango on core" -- writing to APM_CNT will raise the SMI only on
    the current processor.

    This means keeping Paolo's Relaxed Sync Mode patch. It also implies
    a huge performance penalty for variable services that are executed
    on APs -- on both TCG and KVM.

    Here's a comparison on KVM:

> [root@ovmf-fedora-q35 ~]# time taskset -c 0 efibootmgr
> BootCurrent: 0001
> Timeout: 0 seconds
> BootOrder: 0001,0000,0003
> Boot0000* EFI SCSI Device
> Boot0001* Fedora
> Boot0003* EFI Internal Shell
>
> real    0m0.006s
> user    0m0.000s
> sys     0m0.005s <-------

versus

> [root@ovmf-fedora-q35 ~]# time taskset -c 1 efibootmgr
> BootCurrent: 0001
> Timeout: 0 seconds
> BootOrder: 0001,0000,0003
> Boot0000* EFI SCSI Device
> Boot0001* Fedora
> Boot0003* EFI Internal Shell
>
> real    0m36.013s
> user    0m1.637s
> sys     0m34.374s <------- 3-4 orders of magnitude slower


(2) The other option is keeping the broadcast SMI idea (the actual way
    to configure that in QEMU remains a topic of discussion), but then
    the ExitBootServices() callback in CpuDxe will have to implement
    Jeff's idea from

http://thread.gmane.org/gmane.comp.bios.tianocore.devel/11244/focus=11247

    That is, make all APs turn off paging, and execute a halt loop.

    (So the RSM instruction from the SMI handler has a place to return
    to, until the runtime OS takes control of those APs.

    Also, I think the ExitBootServices() callback shouldn't return
    until it is sure that all APs are safely off of their original code
    and stacks, so some synchronization will be necessary too...)

I'm sorry but I don't know enough to work on mode switches, so I can't
volunteer for (2). Can someone else please?

Thanks
Laszlo
_______________________________________________
edk2-devel mailing list
[email protected]
https://lists.01.org/mailman/listinfo/edk2-devel

Reply via email to