Re: [edk2] [PATCH v3 00/52] OvmfPkg: support SMM for better security (steps towards MP and X64)

2015-10-21 Thread Laszlo Ersek
On 10/21/15 17:11, Paolo Bonzini wrote:
> 
> 
> On 21/10/2015 17:04, Laszlo Ersek wrote:
>> Now, on TCG, reading the APIC ID register (for device "apic") happens in:
>>
>> apic_mem_readl() [hw/intc/apic.c]
>>   val = s->id << 24
>>
>> Whereas on KVM, the same occurs in:
>>
>> kvm_apic_mem_read() [hw/i386/kvm/apic.c]
>>   return ~(uint64_t)0;
>>
>> However, such reads don't seem to reach QEMU (and the above read stub);
>> they are handled within KVM (I don't know where the distinction is made
>> in KVM).
> 
> It's here:
> 
> static u32 __apic_read(struct kvm_lapic *apic, unsigned int offset)
> {
> u32 val = 0;
> 
> if (offset >= LAPIC_MMIO_LENGTH)
> return 0;
> 
> switch (offset) {
> case APIC_ID:
> if (apic_x2apic_mode(apic))
> val = kvm_apic_id(apic);
> else
> val = kvm_apic_id(apic) << 24;
> break;

Sorry, I don't understand -- is this the distinction between normal and
x2apic mode, or the distinction whether the read access will be
forwarded to userspace vs. handled in KVM? My statement was about the
latter question.

> I'll try following the equally dizzying chains in OVMF

(not contesting that! :))

> and see why KVM
> is in x2apic mode.

I don't think KVM is in x2apic mode. I have no proof either way, but I
saw nothing implying x2apic. FWIW, LocalApicLib has two instances in edk2:

UefiCpuPkg/Library/BaseXApicLib/BaseXApicLib.inf
("supports xAPIC mode only")

UefiCpuPkg/Library/BaseXApicX2ApicLib/BaseXApicX2ApicLib.inf
("supports x2APIC capable processors which have xAPIC and x2APIC modes")

and all client modules in OVMF are linked against the former.

> In any case, "-cpu foo,-x2apic" is the next thing to
> try.

Tried it just now, no observable change.

... I gather it works on upstream or kvm/next host kernels; is that
correct? Maybe I should try a reverse bisection. *Shudder*.

Thanks!
Laszlo

___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH v3 00/52] OvmfPkg: support SMM for better security (steps towards MP and X64)

2015-10-21 Thread Laszlo Ersek
On 10/21/15 14:10, Laszlo Ersek wrote:
> On 10/15/15 00:25, Laszlo Ersek wrote:
> 
>> Test environment and results:
>>
>> Host kernel:
>> - latest RHEL-7 development kernel (3.10.0-323.el7), with Paolo's
>>   following patches backported by yours truly:
>>   - KVM: x86: clean up kvm_arch_vcpu_runnable
>>   - KVM: x86: fix SMI to halted VCPU
>>
>> QEMU:
>> - current upstream (c49d3411faae), with Paolo's patch applied:
>>   - target-i386: allow any alignment for SMBASE
>>
>> Below, the meaning of "bitness=32" is:
>> * qemu-system-i386
>> * -cpu coreduo,-nx
>>
>> Whereas "bitness=64" means:
>> * qemu-system-x86_64
>> * no special -cpu flag
>>
>> For variable access verification, "efibootmgr" is invoked (without
>> options) at the guest OS (Fedlet 20141209) root prompt.
>>
>>   bitness  accel  VCPUs  result
>>   ---  -  -  ---
>>   32   KVM1  Fedlet 20141209 boots, S3 works, variables work
>>
>>   32   KVM2  stuck in SMBASE relocation, APIC IDs look valid
> 
> Alright, so I've dug into this. It's very interesting.
> 
> First, here's the debug patch for edk2:
> 
> -
> diff --git a/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.c 
> b/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.c
> index 0e39173..bcfa075 100644
> --- a/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.c
> +++ b/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.c
> @@ -442,11 +442,15 @@ SmmRelocateBases (
>for (Index = 0; Index < mNumberOfCpus; Index++) {
>  mRebased[Index] = FALSE;
>  if (ApicId != (UINT32)gSmmCpuPrivate->ProcessorInfo[Index].ProcessorId) {
> +  DEBUG ((EFI_D_VERBOSE, "%a: sending SMI IPI to APIC ID 0x%Lx\n",
> +__FUNCTION__, gSmmCpuPrivate->ProcessorInfo[Index].ProcessorId));
>SendSmiIpi ((UINT32)gSmmCpuPrivate->ProcessorInfo[Index].ProcessorId);
>//
>// Wait for this AP to finish its 1st SMI
>//
>while (!mRebased[Index]);
> +  DEBUG ((EFI_D_VERBOSE, "%a: APIC ID 0x%Lx has processed its first 
> SMI\n",
> +__FUNCTION__, gSmmCpuPrivate->ProcessorInfo[Index].ProcessorId));
>  } else {
>//
>// BSP will be Relocated later
> -
> 
> As one can expect, the first message appears in the log:
> 
> 
> SMRAM TileSize = 0800
> CPU[000]  APIC ID=  SMBASE=7FFC1000  SaveState=7FFD0C00  Size=0400
> CPU[001]  APIC ID=0001  SMBASE=7FFC1800  SaveState=7FFD1400  Size=0400
> SmmRelocateBases: sending SMI IPI to APIC ID 0x1
> 
> 
> but the second message doesn't; the (!mRebased[Index]) condition
> never evaluates to false, so the loop is never exited.
> 
> Second, I sought to analyze the KVM trace very carefully, against the
> SendSmiIpi() source code in edk2, and against the KVM source code.
> Here comes the kicker: KVM interprets the APIC ICR (high, low) writes
> correctly, injects the SMI, VCPU#1 wakes and enters SMM (!), then
> leaves SMM with a relocated SMBASE field (!!!).
> 
> *However*, according to the KVM trace, the relocated SMBASE field is
> *wrong* -- the value being reported below, 0x7ffc1000, corresponds to
> CPU#0 above!
> 
> 
>  qemu-system-i38-22085 [000] 13634.057590: kvm_enter_smm:vcpu 1: 
> leaving SMM, smbase 0x7ffc1000
> 
> 
> Then VCPU#1 goes on to do various things (I'm too lazy to analyze all
> those trace entries), but ultimately it reaches a HLT. And the busy
> wait in SmmRelocateBases() never completes, because vcpu #1 seems to
> have looked at VCPU#0's area.
> 
> Given that this works with TCG, I *guess* it is either a KVM bug, or
> some visibility race. I'll have to look at more.

I added more debug messages to edk2, and compared the logs generated
when running on KVM against those generated when running on TCG. The
point where things go sideways in edk2 is this:

The SmmInitHandler() function
[UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.c] has the following leading
comment:

"C function for SMI handler. To change all processor's SMMBase Register."

It is called from the assembly language entry point code, in System
Management Mode, for the purposes of initial SMBASE relocation.

In this function, the processor that executes the function identifies
itself by calling GetApicId(). The retrieved APIC ID is located within
the ProcessorInfo array (in order to map the APIC ID to a CPU index),
and then the rest of the relocation code uses this CPU index, for poking
into the SMBASE and SaveState "tiles".

Now, for VCPU#1, GetApicId() returns APIC ID 0x01 on TCG, and 0x00 on
KVM. Things go downhill from there.

The GetApicId() function
[UefiCpuPkg/Library/BaseXApicLib/BaseXApicLib.c] does two (interesting)
things: it checks the initial APIC ID first (with a CPUID instruction),
then -- because the former is under 0x100 -- it does:

//
// If the initial local APIC ID is less 0x100, read APIC ID from
// XAPIC_ID_OFFSET, otherwise return the initial local APIC ID.
//
ApicId 

Re: [edk2] [PATCH v3 00/52] OvmfPkg: support SMM for better security (steps towards MP and X64)

2015-10-21 Thread Laszlo Ersek
On 10/21/15 12:27, Paolo Bonzini wrote:
> 
> 
> On 21/10/2015 12:22, Laszlo Ersek wrote:
>> On 10/21/15 11:51, Paolo Bonzini wrote:
>>>
>>>
>>> On 20/10/2015 12:08, Laszlo Ersek wrote:
>>
>>   64   KVM>=1"KVM: entry failed, hardware error 0x8021"
>>  while guest in SMBASE relocation
 Tracing KVM shows the following:

  qemu-system-x86-3236  [001] 10586.857752: kvm_enter_smm:vcpu 1: 
 entering SMM, smbase 0x3
 ...
  qemu-system-x86-3236  [001] 10586.857863: kvm_enter_smm:vcpu 1: 
 leaving SMM, smbase 0x0
  qemu-system-x86-3236  [001] 10586.857865: kvm_entry:vcpu 1
  qemu-system-x86-3236  [001] 10586.857866: kvm_exit: reason 
 UNKNOWN rip 0x0 info 0 8306
  qemu-system-x86-3236  [001] 10586.857909: kvm_userspace_exit:   reason 
 KVM_EXIT_FAIL_ENTRY (9)
>>>
>>> This seems like you do not have this patch: 
>>> https://lkml.org/lkml/2015/10/14/462
>>
>> That's right, I don't have it; but in the guest I do have your
>>
>>   UefiCpuPkg: PiSmmCpuDxeSmm: do not execute RSM from 64-bit mode
>>   http://thread.gmane.org/gmane.comp.bios.edk2.devel/3020/focus=3023
>>
>> My understanding was that either of those two patches was sufficient.
> 
> No, this one fixes return _from_ 64-bit mode.  The kernel patch fixes
> return _to_ 64-bit mode and will be in 4.3.
> 
>> IIRC you asked me to keep the guest patch around in my branch for ease
>> of testing, until the above KVM patch is released with Linux 4.4. Is
>> that correct?
> 
> This will be the kernel fix for return _from_ 64-bit mode.

I can confirm that the patch you referenced makes the above failure go away. 
Meaning, with >= 2 VCPUs, the failure is replaced with the infinite loop that 
is also seen in the 32-bit case. (In other words, with this host patch in 
place, the 32-bit and 64-bit cases, KVM, >= 2 VCPU, behave identically.)

With 1 VCPU, "Fedora-Live-Xfce-x86_64-21-5.iso" boots, but then panics with:

[0.003008] ACPI: Core revision 20140724
[0.004328] ACPI: All ACPI Tables successfully acquired
[0.005418] PANIC: double fault, error_code: 0x0
[0.005895] CPU: 0 PID: 0 Comm: (null) Not tainted 3.17.4-301.fc21.x86_64 #1
[0.006000] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 
02/06/2015
[0.006000] task:   (null) ti: c028 task.ti:   
(null)
[0.006000] RIP: 0008:[<7ffc0955>]  [<7ffc0955>] 0x7ffc0955
[0.006000] RSP: 0020:7ffaadf0  EFLAGS: 00010002
[0.006000] RAX: 0800 RBX: 0033 RCX: c080
[0.006000] RDX:  RSI: 81c03d60 RDI: 7ffa3000
[0.006000] RBP: 81c03c90 R08:  R09: 
[0.006000] R10: 7febb010 R11: 0007 R12: 81c03ed0
[0.006000] R13: 0007 R14:  R15: 0009c000
[0.006000] FS:  (0020) GS:(0020) 
knlGS:
[0.006000] CS:  0010 DS: 0020 ES: 0020 CR0: 80050033
[0.006000] CR2: 7ffaade8 CR3: 0009c000 CR4: 06b0
[0.006000] Stack:

The RIP looks suspiciously close (interpreted as a phys address) to the SMRAM 
range, but that's just a wild guess.

Anyway, we got proof that I had been missing this host kernel patch.

Thanks!
Laszlo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH v3 00/52] OvmfPkg: support SMM for better security (steps towards MP and X64)

2015-10-21 Thread Laszlo Ersek
On 10/21/15 17:04, Laszlo Ersek wrote:
> On 10/21/15 14:10, Laszlo Ersek wrote:
>> On 10/15/15 00:25, Laszlo Ersek wrote:
>>
>>> Test environment and results:
>>>
>>> Host kernel:
>>> - latest RHEL-7 development kernel (3.10.0-323.el7), with Paolo's
>>>   following patches backported by yours truly:
>>>   - KVM: x86: clean up kvm_arch_vcpu_runnable
>>>   - KVM: x86: fix SMI to halted VCPU
>>>
>>> QEMU:
>>> - current upstream (c49d3411faae), with Paolo's patch applied:
>>>   - target-i386: allow any alignment for SMBASE
>>>
>>> Below, the meaning of "bitness=32" is:
>>> * qemu-system-i386
>>> * -cpu coreduo,-nx
>>>
>>> Whereas "bitness=64" means:
>>> * qemu-system-x86_64
>>> * no special -cpu flag
>>>
>>> For variable access verification, "efibootmgr" is invoked (without
>>> options) at the guest OS (Fedlet 20141209) root prompt.
>>>
>>>   bitness  accel  VCPUs  result
>>>   ---  -  -  ---
>>>   32   KVM1  Fedlet 20141209 boots, S3 works, variables work
>>>
>>>   32   KVM2  stuck in SMBASE relocation, APIC IDs look valid
>>
>> Alright, so I've dug into this. It's very interesting.
>>
>> First, here's the debug patch for edk2:
>>
>> -
>> diff --git a/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.c 
>> b/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.c
>> index 0e39173..bcfa075 100644
>> --- a/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.c
>> +++ b/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.c
>> @@ -442,11 +442,15 @@ SmmRelocateBases (
>>for (Index = 0; Index < mNumberOfCpus; Index++) {
>>  mRebased[Index] = FALSE;
>>  if (ApicId != (UINT32)gSmmCpuPrivate->ProcessorInfo[Index].ProcessorId) 
>> {
>> +  DEBUG ((EFI_D_VERBOSE, "%a: sending SMI IPI to APIC ID 0x%Lx\n",
>> +__FUNCTION__, gSmmCpuPrivate->ProcessorInfo[Index].ProcessorId));
>>SendSmiIpi ((UINT32)gSmmCpuPrivate->ProcessorInfo[Index].ProcessorId);
>>//
>>// Wait for this AP to finish its 1st SMI
>>//
>>while (!mRebased[Index]);
>> +  DEBUG ((EFI_D_VERBOSE, "%a: APIC ID 0x%Lx has processed its first 
>> SMI\n",
>> +__FUNCTION__, gSmmCpuPrivate->ProcessorInfo[Index].ProcessorId));
>>  } else {
>>//
>>// BSP will be Relocated later
>> -
>>
>> As one can expect, the first message appears in the log:
>>
>> 
>> SMRAM TileSize = 0800
>> CPU[000]  APIC ID=  SMBASE=7FFC1000  SaveState=7FFD0C00  Size=0400
>> CPU[001]  APIC ID=0001  SMBASE=7FFC1800  SaveState=7FFD1400  Size=0400
>> SmmRelocateBases: sending SMI IPI to APIC ID 0x1
>> 
>>
>> but the second message doesn't; the (!mRebased[Index]) condition
>> never evaluates to false, so the loop is never exited.
>>
>> Second, I sought to analyze the KVM trace very carefully, against the
>> SendSmiIpi() source code in edk2, and against the KVM source code.
>> Here comes the kicker: KVM interprets the APIC ICR (high, low) writes
>> correctly, injects the SMI, VCPU#1 wakes and enters SMM (!), then
>> leaves SMM with a relocated SMBASE field (!!!).
>>
>> *However*, according to the KVM trace, the relocated SMBASE field is
>> *wrong* -- the value being reported below, 0x7ffc1000, corresponds to
>> CPU#0 above!
>>
>> 
>>  qemu-system-i38-22085 [000] 13634.057590: kvm_enter_smm:vcpu 1: 
>> leaving SMM, smbase 0x7ffc1000
>> 
>>
>> Then VCPU#1 goes on to do various things (I'm too lazy to analyze all
>> those trace entries), but ultimately it reaches a HLT. And the busy
>> wait in SmmRelocateBases() never completes, because vcpu #1 seems to
>> have looked at VCPU#0's area.
>>
>> Given that this works with TCG, I *guess* it is either a KVM bug, or
>> some visibility race. I'll have to look at more.
> 
> I added more debug messages to edk2, and compared the logs generated
> when running on KVM against those generated when running on TCG. The
> point where things go sideways in edk2 is this:
> 
> The SmmInitHandler() function
> [UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.c] has the following leading
> comment:
> 
> "C function for SMI handler. To change all processor's SMMBase Register."
> 
> It is called from the assembly language entry point code, in System
> Management Mode, for the purposes of initial SMBASE relocation.
> 
> In this function, the processor that executes the function identifies
> itself by calling GetApicId(). The retrieved APIC ID is located within
> the ProcessorInfo array (in order to map the APIC ID to a CPU index),
> and then the rest of the relocation code uses this CPU index, for poking
> into the SMBASE and SaveState "tiles".
> 
> Now, for VCPU#1, GetApicId() returns APIC ID 0x01 on TCG, and 0x00 on
> KVM. Things go downhill from there.

More strangeness: after replacing all GetApicId() calls with
GetInitialApicId() in PiSmmCpuDxeSmm, the SMBASE relocation succeeds for
(32-bit, KVM, 2 VCPUs), but (64-bit, KVM, 2 VCPUs) remains stuck in the
loop. %-/

Re: [edk2] [PATCH v3 00/52] OvmfPkg: support SMM for better security (steps towards MP and X64)

2015-10-21 Thread Paolo Bonzini


On 21/10/2015 12:22, Laszlo Ersek wrote:
> On 10/21/15 11:51, Paolo Bonzini wrote:
>>
>>
>> On 20/10/2015 12:08, Laszlo Ersek wrote:
>
>   64   KVM>=1"KVM: entry failed, hardware error 0x8021"
>  while guest in SMBASE relocation
>>> Tracing KVM shows the following:
>>>
>>>  qemu-system-x86-3236  [001] 10586.857752: kvm_enter_smm:vcpu 1: 
>>> entering SMM, smbase 0x3
>>> ...
>>>  qemu-system-x86-3236  [001] 10586.857863: kvm_enter_smm:vcpu 1: 
>>> leaving SMM, smbase 0x0
>>>  qemu-system-x86-3236  [001] 10586.857865: kvm_entry:vcpu 1
>>>  qemu-system-x86-3236  [001] 10586.857866: kvm_exit: reason 
>>> UNKNOWN rip 0x0 info 0 8306
>>>  qemu-system-x86-3236  [001] 10586.857909: kvm_userspace_exit:   reason 
>>> KVM_EXIT_FAIL_ENTRY (9)
>>
>> This seems like you do not have this patch: 
>> https://lkml.org/lkml/2015/10/14/462
> 
> That's right, I don't have it; but in the guest I do have your
> 
>   UefiCpuPkg: PiSmmCpuDxeSmm: do not execute RSM from 64-bit mode
>   http://thread.gmane.org/gmane.comp.bios.edk2.devel/3020/focus=3023
> 
> My understanding was that either of those two patches was sufficient.

No, this one fixes return _from_ 64-bit mode.  The kernel patch fixes
return _to_ 64-bit mode and will be in 4.3.

> IIRC you asked me to keep the guest patch around in my branch for ease
> of testing, until the above KVM patch is released with Linux 4.4. Is
> that correct?

This will be the kernel fix for return _from_ 64-bit mode.

Paolo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH v3 00/52] OvmfPkg: support SMM for better security (steps towards MP and X64)

2015-10-20 Thread Laszlo Ersek
On 10/15/15 00:25, Laszlo Ersek wrote:

> Test environment and results:
> 
> Host kernel:
> - latest RHEL-7 development kernel (3.10.0-323.el7), with Paolo's
>   following patches backported by yours truly:
>   - KVM: x86: clean up kvm_arch_vcpu_runnable
>   - KVM: x86: fix SMI to halted VCPU
> 
> QEMU:
> - current upstream (c49d3411faae), with Paolo's patch applied:
>   - target-i386: allow any alignment for SMBASE
> 
> Below, the meaning of "bitness=32" is:
> * qemu-system-i386
> * -cpu coreduo,-nx
> 
> Whereas "bitness=64" means:
> * qemu-system-x86_64
> * no special -cpu flag
> 
> For variable access verification, "efibootmgr" is invoked (without
> options) at the guest OS (Fedlet 20141209) root prompt.
> 
>   bitness  accel  VCPUs  result
>   ---  -  -  ---
>   32   KVM1  Fedlet 20141209 boots, S3 works, variables work
> 
>   32   KVM2  stuck in SMBASE relocation, APIC IDs look valid
> 
>   32   TCG1  Fedlet 20141209 boots, S3 works, variables work
> 
>   32   TCG2  Fedlet 20141209 boots, variables (efibootmgr)
>  are broken -- nothing is printed

I'm trying to track down this failure.

I wrote a separate reproducer, a small UEFI application, that does the
following:
- call GetMemoryMap() and ExitBootServices()
- enumerate variables with GetNextVariableName()
- shut down the VM with ResetSystem().

This app works perfectly with #vcpus = 1 and #vcpus = 2 as well, using
TCG. This proves that runtime variable access works with multiple CPUs too.

However, when the GetNextVariableName() calls are issued from within
Fedlet, they work only with #vcpus = 1; the same EFI service call fails
with #vcpus = 2.

In other words, I cannot reproduce the issue from a UEFI shell app, in
spite of calling GetNextVariableName() after ExitBootServices().

This is extremely annoying. Something gets corrupted when the SMM-backed
variable service is called from within the Fedlet OS (with #vcpus=2
only), but I have no clue what.

Maybe the problem is caused by the combination of #vcpus=2 and
SetVirtualAddressMap(). Because, one aspect of the (failing) Fedlet
environment that my UEFI app does *not* reflect is the non-1:1 mapping.

Paolo, can you please advise me on how to set up a KVM env / host kernel
that is supposed to work? What host kernel should I build? I have a
spare laptop (with Fedora 22 I believe) with unrestricted guest support
in the hardware. I think it should be possible to boot a Fedora 22
userland with a fresh upstream kernel. (I won't try that with my main,
new RHEL-7 laptop.)

I'd like to test this on KVM, because the problem doesn't seem to
reproduce unless I boot a real guest OS, and doing that on TCG is about
as exciting as watching paint dry.

Thanks!
Laszlo

> 
>   64   KVM>=1"KVM: entry failed, hardware error 0x8021"
>  while guest in SMBASE relocation
> 
>   64   TCG1  F21 XFCE LiveCD boots, variable access OK, S3
>  resume triggers InternalX86EnablePaging64()
>  ASSERT() in
>  "MdePkg/Library/BaseLib/X64/Non-existing.c".
>  Looks like a bug in S3Resume2Pei?
> 
>   64   TCG2  F21 XFCE LiveCD boots, variable access
>  (efibootmgr) is broken -- reports EINVAL
> 
> These results are not consistent with the ones seen by Paolo, which is
> why I didn't update the last (OvmfPkg/README) patch in the series, about
> the current level of functionality. (Added a note to it about this
> fact.)
> 
> What makes me glad is that the scenario that I've been testing and
> re-testing successfully since May, i.e., 32-bit/KVM/UP, *does* work.
> 
> Thanks
> Laszlo
> 
> Cc: Paolo Bonzini 
> Cc: Jordan Justen 
> Cc: Michael Kinney 
> 
> Laszlo Ersek (42):
>   UefiCpuPkg: CpuDxe: broadcast MTRR changes to APs
>   OvmfPkg: introduce -D SMM_REQUIRE and PcdSmmSmramRequire
>   OvmfPkg: Sec: force reinit of BaseExtractGuidedSectionLib handler
> table
>   OvmfPkg: Sec: assert the build-time calculated end of the scratch
> buffer
>   OvmfPkg: decompress FVs on S3 resume if SMM_REQUIRE is set
>   OvmfPkg: PlatformPei: allow caching in AddReservedMemoryBaseSizeHob()
>   OvmfPkg: PlatformPei: account for TSEG size with PcdSmmSmramRequire
> set
>   OvmfPkg: add PEIM for providing TSEG-as-SMRAM during PEI
>   OvmfPkg: add DXE_DRIVER for providing TSEG-as-SMRAM during boot-time
> DXE
>   OvmfPkg: implement EFI_SMM_CONTROL2_PROTOCOL with a DXE_RUNTIME_DRIVER
>   OvmfPkg: pull in the SMM IPL and SMM core
>   OvmfPkg: pull in CpuIo2Smm driver
>   OvmfPkg: AcpiS3SaveDxe: don't fake LockBox protocol if SMM_REQUIRE
>   OvmfPkg: LockBox: -D SMM_REQUIRE excludes our fake lockbox
>   OvmfPkg: LockBox: use SMM stack with -D SMM_REQUIRE
>   OvmfPkg: 

Re: [edk2] [PATCH v3 00/52] OvmfPkg: support SMM for better security (steps towards MP and X64)

2015-10-20 Thread Laszlo Ersek
On 10/15/15 00:25, Laszlo Ersek wrote:

> Test environment and results:
> 
> Host kernel:
> - latest RHEL-7 development kernel (3.10.0-323.el7), with Paolo's
>   following patches backported by yours truly:
>   - KVM: x86: clean up kvm_arch_vcpu_runnable
>   - KVM: x86: fix SMI to halted VCPU
> 
> QEMU:
> - current upstream (c49d3411faae), with Paolo's patch applied:
>   - target-i386: allow any alignment for SMBASE
> 
> Below, the meaning of "bitness=32" is:
> * qemu-system-i386
> * -cpu coreduo,-nx
> 
> Whereas "bitness=64" means:
> * qemu-system-x86_64
> * no special -cpu flag
> 
> For variable access verification, "efibootmgr" is invoked (without
> options) at the guest OS (Fedlet 20141209) root prompt.
> 
>   bitness  accel  VCPUs  result
>   ---  -  -  ---
>   32   KVM1  Fedlet 20141209 boots, S3 works, variables work
> 
>   32   KVM2  stuck in SMBASE relocation, APIC IDs look valid
> 
>   32   TCG1  Fedlet 20141209 boots, S3 works, variables work
> 
>   32   TCG2  Fedlet 20141209 boots, variables (efibootmgr)
>  are broken -- nothing is printed
> 
>   64   KVM>=1"KVM: entry failed, hardware error 0x8021"
>  while guest in SMBASE relocation

Tracing KVM shows the following:

 qemu-system-x86-3236  [001] 10586.857752: kvm_enter_smm:vcpu 1: 
entering SMM, smbase 0x3
...
 qemu-system-x86-3236  [001] 10586.857863: kvm_enter_smm:vcpu 1: 
leaving SMM, smbase 0x0
 qemu-system-x86-3236  [001] 10586.857865: kvm_entry:vcpu 1
 qemu-system-x86-3236  [001] 10586.857866: kvm_exit: reason UNKNOWN 
rip 0x0 info 0 8306
 qemu-system-x86-3236  [001] 10586.857909: kvm_userspace_exit:   reason 
KVM_EXIT_FAIL_ENTRY (9)


> 
>   64   TCG1  F21 XFCE LiveCD boots, variable access OK, S3
>  resume triggers InternalX86EnablePaging64()
>  ASSERT() in
>  "MdePkg/Library/BaseLib/X64/Non-existing.c".
>  Looks like a bug in S3Resume2Pei?
> 
>   64   TCG2  F21 XFCE LiveCD boots, variable access
>  (efibootmgr) is broken -- reports EINVAL


___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH v3 00/52] OvmfPkg: support SMM for better security (steps towards MP and X64)

2015-10-20 Thread Laszlo Ersek
On 10/20/15 10:52, Laszlo Ersek wrote:
> On 10/15/15 00:25, Laszlo Ersek wrote:
> 
>> Test environment and results:
>>
>> Host kernel:
>> - latest RHEL-7 development kernel (3.10.0-323.el7), with Paolo's
>>   following patches backported by yours truly:
>>   - KVM: x86: clean up kvm_arch_vcpu_runnable
>>   - KVM: x86: fix SMI to halted VCPU
>>
>> QEMU:
>> - current upstream (c49d3411faae), with Paolo's patch applied:
>>   - target-i386: allow any alignment for SMBASE
>>
>> Below, the meaning of "bitness=32" is:
>> * qemu-system-i386
>> * -cpu coreduo,-nx
>>
>> Whereas "bitness=64" means:
>> * qemu-system-x86_64
>> * no special -cpu flag
>>
>> For variable access verification, "efibootmgr" is invoked (without
>> options) at the guest OS (Fedlet 20141209) root prompt.
>>
>>   bitness  accel  VCPUs  result
>>   ---  -  -  ---
>>   32   KVM1  Fedlet 20141209 boots, S3 works, variables work
>>
>>   32   KVM2  stuck in SMBASE relocation, APIC IDs look valid
>>
>>   32   TCG1  Fedlet 20141209 boots, S3 works, variables work
>>
>>   32   TCG2  Fedlet 20141209 boots, variables (efibootmgr)
>>  are broken -- nothing is printed
> 
> I'm trying to track down this failure.
> 
> I wrote a separate reproducer, a small UEFI application, that does the
> following:
> - call GetMemoryMap() and ExitBootServices()
> - enumerate variables with GetNextVariableName()
> - shut down the VM with ResetSystem().
> 
> This app works perfectly with #vcpus = 1 and #vcpus = 2 as well, using
> TCG. This proves that runtime variable access works with multiple CPUs too.
> 
> However, when the GetNextVariableName() calls are issued from within
> Fedlet, they work only with #vcpus = 1; the same EFI service call fails
> with #vcpus = 2.
> 
> In other words, I cannot reproduce the issue from a UEFI shell app, in
> spite of calling GetNextVariableName() after ExitBootServices().
> 
> This is extremely annoying. Something gets corrupted when the SMM-backed
> variable service is called from within the Fedlet OS (with #vcpus=2
> only), but I have no clue what.
> 
> Maybe the problem is caused by the combination of #vcpus=2 and
> SetVirtualAddressMap(). Because, one aspect of the (failing) Fedlet
> environment that my UEFI app does *not* reflect is the non-1:1 mapping.

I managed to collect more details.

The variable access fails *iff* the access is made by a VCPU that is
*not* VCPU#0.

I added a bunch of debug messages to the following functions:

- InitCommunicateBuffer(), SendCommunicateBuffer()
  [MdeModulePkg/Universal/Variable/RuntimeDxe/VariableSmmRuntimeDxe.c]

- BSPHandler(), APHandler()
  [UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c]

Now, when the "efibootmgr" utility is run on CPU#0, everything works:

> [root@ovmf-fedlet ~]# taskset -c 0 efibootmgr
> BootCurrent: 0004
> Timeout: 0 seconds
> BootOrder: 0004,0003
> Boot0003* EFI Internal Shell
> Boot0004* fedlet grub

and the OVMF debug log contains a sequence of the following snippets --
the snippet is repeated once per variable:

> InitCommunicateBuffer: Function=1
> InitCommunicateBuffer: Status=Success
> BSPHandler: 318: CpuIndex=0
> BSPHandler: 544: CpuIndex=0
> SendCommunicateBuffer: ReturnStatus=Buffer Too Small
>
> InitCommunicateBuffer: Function=1
> InitCommunicateBuffer: Status=Success
> BSPHandler: 318: CpuIndex=0
> BSPHandler: 544: CpuIndex=0
> SendCommunicateBuffer: ReturnStatus=Success

Here Function=1 is

  SMM_VARIABLE_FUNCTION_GET_VARIABLE
  [MdeModulePkg/Include/Guid/SmmVariableCommon.h]

It is one of the message types that the "unprivileged" runtime variable
driver composes for the "privileged" SMM variable driver. In the above
log, you can see
(1) the message buffer being initialized for this message type -- in the
unprivileged driver --,
(2) then the BSPHandler() function starting and completing (in SMM),
(3) finally the unprivileged driver looking at the status code coming
back, in the message buffer, from the privileged driver.

The first triplet ends with "Buffer Too Small", because efibootmgr --
apparently -- only queries the buffer size it needs to allocate for
fetching the variable's contents. The second triplet succeeds.

Side note: when testing variable access from a UEFI application or the
UEFI shell directly, the code *always* runs on VCPU#0! That's why such
testing always succeeds.

Now, here's what happens when the same is attempted from VCPU#1:

> [root@ovmf-fedlet ~]# taskset -c 1 efibootmgr
> efibootmgr: efibootmgr: Invalid argument

and the matching OVMF debug log is:

> InitCommunicateBuffer: Function=1
> InitCommunicateBuffer: Status=Success
> SendCommunicateBuffer: ReturnStatus=Buffer Too Small
>
> InitCommunicateBuffer: Function=1
> InitCommunicateBuffer: Status=Success
> SendCommunicateBBSPHandler: 318: CpuIndex=0
> BSPHandler: 544: CpuIndex=0
> uffer: ReturnStatus=Buffer Too Small

Observations:
- the first triplet seems to 

Re: [edk2] [PATCH v3 00/52] OvmfPkg: support SMM for better security (steps towards MP and X64)

2015-10-20 Thread Laszlo Ersek
On 10/20/15 18:27, Laszlo Ersek wrote:
> On 10/20/15 18:24, Laszlo Ersek wrote:
>> On 10/20/15 16:37, Laszlo Ersek wrote:
>>
>>> The variable access fails *iff* the access is made by a VCPU that is
>>> *not* VCPU#0.
>>>
>>> I added a bunch of debug messages to the following functions:
>>>
>>> - InitCommunicateBuffer(), SendCommunicateBuffer()
>>>   [MdeModulePkg/Universal/Variable/RuntimeDxe/VariableSmmRuntimeDxe.c]
>>>
>>> - BSPHandler(), APHandler()
>>>   [UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c]
>>>
>>> Now, when the "efibootmgr" utility is run on CPU#0, everything works:
>>>
 [root@ovmf-fedlet ~]# taskset -c 0 efibootmgr
 BootCurrent: 0004
 Timeout: 0 seconds
 BootOrder: 0004,0003
 Boot0003* EFI Internal Shell
 Boot0004* fedlet grub
>>>
>>> and the OVMF debug log contains a sequence of the following snippets --
>>> the snippet is repeated once per variable:
>>>
 InitCommunicateBuffer: Function=1
 InitCommunicateBuffer: Status=Success
 BSPHandler: 318: CpuIndex=0
 BSPHandler: 544: CpuIndex=0
 SendCommunicateBuffer: ReturnStatus=Buffer Too Small

 InitCommunicateBuffer: Function=1
 InitCommunicateBuffer: Status=Success
 BSPHandler: 318: CpuIndex=0
 BSPHandler: 544: CpuIndex=0
 SendCommunicateBuffer: ReturnStatus=Success
>>>
>>> Here Function=1 is
>>>
>>>   SMM_VARIABLE_FUNCTION_GET_VARIABLE
>>>   [MdeModulePkg/Include/Guid/SmmVariableCommon.h]
>>>
>>> It is one of the message types that the "unprivileged" runtime variable
>>> driver composes for the "privileged" SMM variable driver. In the above
>>> log, you can see
>>> (1) the message buffer being initialized for this message type -- in the
>>> unprivileged driver --,
>>> (2) then the BSPHandler() function starting and completing (in SMM),
>>> (3) finally the unprivileged driver looking at the status code coming
>>> back, in the message buffer, from the privileged driver.
>>>
>>> The first triplet ends with "Buffer Too Small", because efibootmgr --
>>> apparently -- only queries the buffer size it needs to allocate for
>>> fetching the variable's contents. The second triplet succeeds.
>>>
>>> Side note: when testing variable access from a UEFI application or the
>>> UEFI shell directly, the code *always* runs on VCPU#0! That's why such
>>> testing always succeeds.
>>>
>>> Now, here's what happens when the same is attempted from VCPU#1:
>>>
 [root@ovmf-fedlet ~]# taskset -c 1 efibootmgr
 efibootmgr: efibootmgr: Invalid argument
>>>
>>> and the matching OVMF debug log is:
>>>
 InitCommunicateBuffer: Function=1
 InitCommunicateBuffer: Status=Success
 SendCommunicateBuffer: ReturnStatus=Buffer Too Small

 InitCommunicateBuffer: Function=1
 InitCommunicateBuffer: Status=Success
 SendCommunicateBBSPHandler: 318: CpuIndex=0
 BSPHandler: 544: CpuIndex=0
 uffer: ReturnStatus=Buffer Too Small
>>>
>>> Observations:
>>> - the first triplet seems to complete similarly, however we don't see
>>>   either BSPHandler or APHandler logs for it
>>>
>>> - the second triplet fails too (which is why efibootmgr reports EINVAL
>>>   -- the second call uses the right variable buffer size, so it should
>>>   succeed)
>>>
>>> - there are no APHandler() log entries at all, despite the efibootmgr
>>>   process being bound to VCPU#1. Also note that BSPHandler logs
>>>   CpuIndex=0.
>>>
>>>   This *cannot* be right. CpuIndex should be 1 in SmiRendezvous(),
>>>   APHandler() should be entered, and APHandler() should bring in the
>>>   BSP. The BSP should finally dispatch the SMI to the SMM core and the
>>>   variable driver.
>>
>> I think I have found the bug -- it's in QEMU.
>>
>> Namely, in ich9_apm_ctrl_changed(), the SMI is *always* injected on the
>> first CPU:
>>
>> /* SMI_EN = PMBASE + 30. SMI control and enable register */
>> if (lpc->pm.smi_en & ICH9_PMIO_SMI_EN_APMC_EN) {
>> cpu_interrupt(first_cpu, CPU_INTERRUPT_SMI);
>> }
>>
>> There's another similar function in this file, ich9_generate_smi(),
>> which similarly insists on the first CPU only, but that should be
>> alright. That function comes from:
>>
>> commit 920557971b60e53c2f3f22e5d6c620ab1ed411fd
>> Author: Paulo Alcantara 
>> Date:   Sun Jun 28 14:58:56 2015 -0300
>>
>> ich9: add TCO interface emulation
>>
>> and is only called from "hw/acpi/tco.c".
>>
>> A watchdog device may plausibly be permitted to inject the SMI on the
>> first CPU only, but for a synchronously triggered SMI, the SMI should be
>> injected on the exact CPU that triggers it.
>>
>> The bug in ich9_apm_ctrl_changed() dates back to:
>>
>> commit 4d00636e97b7f55810ff7faccff594159175e24e
>> Author: Jason Baron 
>> Date:   Wed Nov 14 15:54:05 2012 -0500
>>
>> ich9: Add the lpc chip
>>
>> Paolo, do you have an idea how the executing CPU's identity could be
>> preserved from the exec-stuff until the device emulation code? I think
>> it is trivial to know for someone who knows, but I 

Re: [edk2] [PATCH v3 00/52] OvmfPkg: support SMM for better security (steps towards MP and X64)

2015-10-20 Thread Laszlo Ersek
On 10/20/15 16:37, Laszlo Ersek wrote:

> The variable access fails *iff* the access is made by a VCPU that is
> *not* VCPU#0.
> 
> I added a bunch of debug messages to the following functions:
> 
> - InitCommunicateBuffer(), SendCommunicateBuffer()
>   [MdeModulePkg/Universal/Variable/RuntimeDxe/VariableSmmRuntimeDxe.c]
> 
> - BSPHandler(), APHandler()
>   [UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c]
> 
> Now, when the "efibootmgr" utility is run on CPU#0, everything works:
> 
>> [root@ovmf-fedlet ~]# taskset -c 0 efibootmgr
>> BootCurrent: 0004
>> Timeout: 0 seconds
>> BootOrder: 0004,0003
>> Boot0003* EFI Internal Shell
>> Boot0004* fedlet grub
> 
> and the OVMF debug log contains a sequence of the following snippets --
> the snippet is repeated once per variable:
> 
>> InitCommunicateBuffer: Function=1
>> InitCommunicateBuffer: Status=Success
>> BSPHandler: 318: CpuIndex=0
>> BSPHandler: 544: CpuIndex=0
>> SendCommunicateBuffer: ReturnStatus=Buffer Too Small
>>
>> InitCommunicateBuffer: Function=1
>> InitCommunicateBuffer: Status=Success
>> BSPHandler: 318: CpuIndex=0
>> BSPHandler: 544: CpuIndex=0
>> SendCommunicateBuffer: ReturnStatus=Success
> 
> Here Function=1 is
> 
>   SMM_VARIABLE_FUNCTION_GET_VARIABLE
>   [MdeModulePkg/Include/Guid/SmmVariableCommon.h]
> 
> It is one of the message types that the "unprivileged" runtime variable
> driver composes for the "privileged" SMM variable driver. In the above
> log, you can see
> (1) the message buffer being initialized for this message type -- in the
> unprivileged driver --,
> (2) then the BSPHandler() function starting and completing (in SMM),
> (3) finally the unprivileged driver looking at the status code coming
> back, in the message buffer, from the privileged driver.
> 
> The first triplet ends with "Buffer Too Small", because efibootmgr --
> apparently -- only queries the buffer size it needs to allocate for
> fetching the variable's contents. The second triplet succeeds.
> 
> Side note: when testing variable access from a UEFI application or the
> UEFI shell directly, the code *always* runs on VCPU#0! That's why such
> testing always succeeds.
> 
> Now, here's what happens when the same is attempted from VCPU#1:
> 
>> [root@ovmf-fedlet ~]# taskset -c 1 efibootmgr
>> efibootmgr: efibootmgr: Invalid argument
> 
> and the matching OVMF debug log is:
> 
>> InitCommunicateBuffer: Function=1
>> InitCommunicateBuffer: Status=Success
>> SendCommunicateBuffer: ReturnStatus=Buffer Too Small
>>
>> InitCommunicateBuffer: Function=1
>> InitCommunicateBuffer: Status=Success
>> SendCommunicateBBSPHandler: 318: CpuIndex=0
>> BSPHandler: 544: CpuIndex=0
>> uffer: ReturnStatus=Buffer Too Small
> 
> Observations:
> - the first triplet seems to complete similarly, however we don't see
>   either BSPHandler or APHandler logs for it
> 
> - the second triplet fails too (which is why efibootmgr reports EINVAL
>   -- the second call uses the right variable buffer size, so it should
>   succeed)
> 
> - there are no APHandler() log entries at all, despite the efibootmgr
>   process being bound to VCPU#1. Also note that BSPHandler logs
>   CpuIndex=0.
> 
>   This *cannot* be right. CpuIndex should be 1 in SmiRendezvous(),
>   APHandler() should be entered, and APHandler() should bring in the
>   BSP. The BSP should finally dispatch the SMI to the SMM core and the
>   variable driver.

I think I have found the bug -- it's in QEMU.

Namely, in ich9_apm_ctrl_changed(), the SMI is *always* injected on the
first CPU:

/* SMI_EN = PMBASE + 30. SMI control and enable register */
if (lpc->pm.smi_en & ICH9_PMIO_SMI_EN_APMC_EN) {
cpu_interrupt(first_cpu, CPU_INTERRUPT_SMI);
}

There's another similar function in this file, ich9_generate_smi(),
which similarly insists on the first CPU only, but that should be
alright. That function comes from:

commit 920557971b60e53c2f3f22e5d6c620ab1ed411fd
Author: Paulo Alcantara 
Date:   Sun Jun 28 14:58:56 2015 -0300

ich9: add TCO interface emulation

and is only called from "hw/acpi/tco.c".

A watchdog device may plausibly be permitted to inject the SMI on the
first CPU only, but for a synchronously triggered SMI, the SMI should be
injected on the exact CPU that triggers it.

The bug in ich9_apm_ctrl_changed() dates back to:

commit 4d00636e97b7f55810ff7faccff594159175e24e
Author: Jason Baron 
Date:   Wed Nov 14 15:54:05 2012 -0500

ich9: Add the lpc chip

Paolo, do you have an idea how the executing CPU's identity could be
preserved from the exec-stuff until the device emulation code? I think
it is trivial to know for someone who knows, but I don't know (yet). :)
With that info surviving, ich9_apm_ctrl_changed() should be modified to
inject the SMI on the subject CPU.

Thanks!
Laszlo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH v3 00/52] OvmfPkg: support SMM for better security (steps towards MP and X64)

2015-10-20 Thread Laszlo Ersek
On 10/20/15 18:24, Laszlo Ersek wrote:
> On 10/20/15 16:37, Laszlo Ersek wrote:
> 
>> The variable access fails *iff* the access is made by a VCPU that is
>> *not* VCPU#0.
>>
>> I added a bunch of debug messages to the following functions:
>>
>> - InitCommunicateBuffer(), SendCommunicateBuffer()
>>   [MdeModulePkg/Universal/Variable/RuntimeDxe/VariableSmmRuntimeDxe.c]
>>
>> - BSPHandler(), APHandler()
>>   [UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c]
>>
>> Now, when the "efibootmgr" utility is run on CPU#0, everything works:
>>
>>> [root@ovmf-fedlet ~]# taskset -c 0 efibootmgr
>>> BootCurrent: 0004
>>> Timeout: 0 seconds
>>> BootOrder: 0004,0003
>>> Boot0003* EFI Internal Shell
>>> Boot0004* fedlet grub
>>
>> and the OVMF debug log contains a sequence of the following snippets --
>> the snippet is repeated once per variable:
>>
>>> InitCommunicateBuffer: Function=1
>>> InitCommunicateBuffer: Status=Success
>>> BSPHandler: 318: CpuIndex=0
>>> BSPHandler: 544: CpuIndex=0
>>> SendCommunicateBuffer: ReturnStatus=Buffer Too Small
>>>
>>> InitCommunicateBuffer: Function=1
>>> InitCommunicateBuffer: Status=Success
>>> BSPHandler: 318: CpuIndex=0
>>> BSPHandler: 544: CpuIndex=0
>>> SendCommunicateBuffer: ReturnStatus=Success
>>
>> Here Function=1 is
>>
>>   SMM_VARIABLE_FUNCTION_GET_VARIABLE
>>   [MdeModulePkg/Include/Guid/SmmVariableCommon.h]
>>
>> It is one of the message types that the "unprivileged" runtime variable
>> driver composes for the "privileged" SMM variable driver. In the above
>> log, you can see
>> (1) the message buffer being initialized for this message type -- in the
>> unprivileged driver --,
>> (2) then the BSPHandler() function starting and completing (in SMM),
>> (3) finally the unprivileged driver looking at the status code coming
>> back, in the message buffer, from the privileged driver.
>>
>> The first triplet ends with "Buffer Too Small", because efibootmgr --
>> apparently -- only queries the buffer size it needs to allocate for
>> fetching the variable's contents. The second triplet succeeds.
>>
>> Side note: when testing variable access from a UEFI application or the
>> UEFI shell directly, the code *always* runs on VCPU#0! That's why such
>> testing always succeeds.
>>
>> Now, here's what happens when the same is attempted from VCPU#1:
>>
>>> [root@ovmf-fedlet ~]# taskset -c 1 efibootmgr
>>> efibootmgr: efibootmgr: Invalid argument
>>
>> and the matching OVMF debug log is:
>>
>>> InitCommunicateBuffer: Function=1
>>> InitCommunicateBuffer: Status=Success
>>> SendCommunicateBuffer: ReturnStatus=Buffer Too Small
>>>
>>> InitCommunicateBuffer: Function=1
>>> InitCommunicateBuffer: Status=Success
>>> SendCommunicateBBSPHandler: 318: CpuIndex=0
>>> BSPHandler: 544: CpuIndex=0
>>> uffer: ReturnStatus=Buffer Too Small
>>
>> Observations:
>> - the first triplet seems to complete similarly, however we don't see
>>   either BSPHandler or APHandler logs for it
>>
>> - the second triplet fails too (which is why efibootmgr reports EINVAL
>>   -- the second call uses the right variable buffer size, so it should
>>   succeed)
>>
>> - there are no APHandler() log entries at all, despite the efibootmgr
>>   process being bound to VCPU#1. Also note that BSPHandler logs
>>   CpuIndex=0.
>>
>>   This *cannot* be right. CpuIndex should be 1 in SmiRendezvous(),
>>   APHandler() should be entered, and APHandler() should bring in the
>>   BSP. The BSP should finally dispatch the SMI to the SMM core and the
>>   variable driver.
> 
> I think I have found the bug -- it's in QEMU.
> 
> Namely, in ich9_apm_ctrl_changed(), the SMI is *always* injected on the
> first CPU:
> 
> /* SMI_EN = PMBASE + 30. SMI control and enable register */
> if (lpc->pm.smi_en & ICH9_PMIO_SMI_EN_APMC_EN) {
> cpu_interrupt(first_cpu, CPU_INTERRUPT_SMI);
> }
> 
> There's another similar function in this file, ich9_generate_smi(),
> which similarly insists on the first CPU only, but that should be
> alright. That function comes from:
> 
> commit 920557971b60e53c2f3f22e5d6c620ab1ed411fd
> Author: Paulo Alcantara 
> Date:   Sun Jun 28 14:58:56 2015 -0300
> 
> ich9: add TCO interface emulation
> 
> and is only called from "hw/acpi/tco.c".
> 
> A watchdog device may plausibly be permitted to inject the SMI on the
> first CPU only, but for a synchronously triggered SMI, the SMI should be
> injected on the exact CPU that triggers it.
> 
> The bug in ich9_apm_ctrl_changed() dates back to:
> 
> commit 4d00636e97b7f55810ff7faccff594159175e24e
> Author: Jason Baron 
> Date:   Wed Nov 14 15:54:05 2012 -0500
> 
> ich9: Add the lpc chip
> 
> Paolo, do you have an idea how the executing CPU's identity could be
> preserved from the exec-stuff until the device emulation code? I think
> it is trivial to know for someone who knows, but I don't know (yet). :)
> With that info surviving, ich9_apm_ctrl_changed() should be modified to
> inject the SMI on the subject CPU.

Can I simply rely on the 

Re: [edk2] [PATCH v3 00/52] OvmfPkg: support SMM for better security (steps towards MP and X64)

2015-10-15 Thread Laszlo Ersek
On 10/15/15 05:30, Kinney, Michael D wrote:
> Laszlo,
> 
> I have 32 VCPUs booting to UEFI Shell using Windows build of 32-bit QEMU.

Great!

> If more than 32, then we run out of SMRAM.

If you'd like to experiment with more, please locate the
"gUefiOvmfPkgTokenSpaceGuid.PcdQ35TsegMbytes" PCD in the DSC file(s).
The documentation for the PCD is in the DEC file. It is possible to
configure 1, 2, or 8 MB of TSEG (SMRAM).

The Ia32 DSC sets 1 MB currently, while the other two (Ia32X64, and X64)
DSCs set 2 MB.

> 
> I have made 2 changes from your branch to make this stable:
> 1) Force 32KB SMBASE alignment.  Required for me because I am using pre-built 
> QEMU binary.

Makes sense.

> 2) Your patch to sync MTRRs to all APs in CpuDxe CpuMp.c.  I changed
> SingleThread parameter from TRUE to FALSE and it resolved all the MP
> boot stability issues on my test config.

Wow! That's very interesting.

> This should work with
> either TRUE or FALSE, so there is something else wrong here.

I agree. I also thought that the invocation should work regardless of
the SingleThread setting, but I figured SingleThread=TRUE would be more
"conservative", hence safer / more portable. It is interesting to see
the opposite being true in practice!

I can flip the SingleThread arguments in the next version, but it would
be nice to understand what's happening.

Thanks
Laszlo

> 
>   //
>   // Synchronize MTRR settings to APs.
>   //
>   MtrrGetAllMtrrs ();
>   Status = mMpServicesTemplate.StartupAllAPs (
>  , // This
>  SetMtrrsFromBuffer,   // Procedure
>  FALSE,// SingleThread 
>  NULL, // WaitEvent
>  0,// 
> TimeoutInMicrosecsond
>  ,// ProcedureArgument
>  NULL  // FailedCpuList
>  );
>   ASSERT (Status == EFI_SUCCESS || Status == EFI_NOT_STARTED);
> 
> 
> Mike
> 
>> -Original Message-
>> From: Laszlo Ersek [mailto:ler...@redhat.com]
>> Sent: Wednesday, October 14, 2015 3:26 PM
>> To: edk2-de...@ml01.01.org
>> Cc: Paolo Bonzini; Justen, Jordan L; Kinney, Michael D
>> Subject: [PATCH v3 00/52] OvmfPkg: support SMM for better security (steps
>> towards MP and X64)
>>
>> Public branch: .
>>
>> (Mike's v1 patches are again contained in this branch, but I have
>> converted them to CRLF.)
>>
>> *
>>
>> Relative to v2, the following patches are new in v3 (also marked
>> individually in the Notes sections):
>>
>> * PiSmmCpuDxeSmm fixes from Paolo, to be reviewed and squashed in by
>>  Mike, included here for completeness:
>>
>>  02/52 UefiCpuPkg: PiSmmCpuDxeSmm: prepare PT in InitPaging before filling
>> in PDE
>>  03/52 UefiCpuPkg: PiSmmCpuDxeSmm: do not execute RSM from 64-bit
>> mode
>>
>> * Build failure fix for -D SMM_REQUIRE -D SOURCE_DEBUG_ENABLE, from
>>  Mike:
>>
>>  21/52 OvmfPkg: resolve DebugAgentLib for DXE_SMM_DRIVER modules
>>
>> * Features for MP and X64 support, from Paolo (kudos!!!):
>>
>>  23/52 OvmfPkg: import SmmCpuFeaturesLib from UefiCpuPkg
>>  24/52 OvmfPkg: SmmCpuFeaturesLib: remove unnecessary bits
>>  25/52 OvmfPkg: SmmCpuFeaturesLib: implement SMRAM state save map
>> access
>>  26/52 OvmfPkg: SmmCpuFeaturesLib: customize state save map format
>>  27/52 OvmfPkg: use relaxed AP SMM synchronization mode
>>  37/52 OvmfPkg: port CpuS3DataDxe to X64
>>
>> * Patch that adapts the build to the former, from Laszlo:
>>
>>  38/52 OvmfPkg: build QuarkPort/CpuS3DataDxe for -D SMM_REQUIRE
>>
>> * SMRAM / TSEG size update for the (Ia32)X64 builds, from Laszlo:
>>
>>  51/52 OvmfPkg: double the SMRAM (TSEG) size for the 64-bit DXE phase
>> builds
>>
>> *
>>
>> The following patches have been changed (and/or reworded) from v2 (also
>> marked individually in the Notes sections). Identifying them by their v3
>> numbers here:
>>
>> * Picked up Reviewed-by tag from Jeff, dropped Cc's:
>>
>>  04/52 UefiCpuPkg: CpuDxe: broadcast MTRR changes to APs
>>
>> * Addressed v2 feedback from Mike & Jordan:
>>
>>  06/52 OvmfPkg: Sec: force reinit of BaseExtractGuidedSectionLib handler
>> table
>>  11/52 OvmfPkg: add PEIM for providing TSEG-as-SMRAM during PEI
>>
>> * Adapted to Paolo's patches:
>>
>>  28/52 OvmfPkg: build PiSmmCpuDxeSmm for -D SMM_REQUIRE
>>  29/52 OvmfPkg: add skeleton QuarkPort/CpuS3DataDxe
>>  30/52 OvmfPkg: QuarkPort/CpuS3DataDxe: fill in
>> ACPI_CPU_DATA.StartupVector
>>
>> *
>>
>> Test environment and results:
>>
>> Host kernel:
>> - latest RHEL-7 development kernel (3.10.0-323.el7), with Paolo's
>>  following patches backported by yours truly:
>>  - KVM: x86: clean up kvm_arch_vcpu_runnable
>>  - KVM: x86: fix SMI to halted VCPU
>>
>> QEMU:
>> - current upstream (c49d3411faae), with Paolo's patch applied:
>>  -