Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

2016-11-10 Thread Paolo Bonzini
> And, in my recent KVM / QEMU usage instructions for Jiewen:
> 
>   https://www.mail-archive.com/edk2-devel@lists.01.org/msg19446.html
> 
> I provided the following settings:
> 
> > # Settings for Ia32 only:
> > [...]
> > QEMU_COMMAND="qemu-system-i386 -cpu coreduo,-nx"
> >
> > # Settings for Ia32X64 only:
> > [...]
> > QEMU_COMMAND=qemu-system-x86_64
> 
> I guess the "-nx" bit can be left off with TCG, but AFAIR it is required
> for KVM.

Oh right now I remember.  The same problem exists: EFER is not saved in the
32-bit state save map.  AFAIK all processors with XD also have long mode.

That said, qemu-system-x86_64 and no -cpu option should work even with Ia32
PEI/DXE/SMM and no -cpu option.  In that case you could use XD.

Now if only Intel made the *full* format of the state save map public, we
could emulate everything more accurately...  I'm told it's in the BIOS
writers guide.

Paolo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

2016-11-10 Thread Laszlo Ersek
On 11/10/16 15:48, Yao, Jiewen wrote:

> Laszlo, your analysis will save me one day to install the Linux QEMU. J

Perfect; I can't wait till you guys adopt QEMU/KVM as a test platform! :)

Cheers
Laszlo

___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

2016-11-10 Thread Laszlo Ersek
On 11/10/16 15:53, Paolo Bonzini wrote:
> 
> 
> On 10/11/2016 15:48, Yao, Jiewen wrote:
>> I cannot reproduce it before, because all my real hardware supports XD.
>> My Windows QEMU also supports XD (to my surprise.)
> 
> QEMU can be configured to support XD or not.  Possibly Laszlo was using
> some different default, or testing both cases.

When QEMU emulates an Ia32 (32-bit) target, the SMM state save area has
no room for capturing the fact whether NX is set or clear. This is an
issue that dates back to the inception of OVMF's SMM support. The
explanation was given by Paolo, actually :)

  https://www.mail-archive.com/edk2-devel@lists.01.org/msg00970.html

We adjusted the OvmfPkg/README file accordingly:

> * QEMU binary and options specific to 32-bit guests:
>
>   $ qemu-system-i386 -cpu coreduo,-nx \
>
>   or
>
>   $ qemu-system-x86_64 -cpu ,-lm,-nx \
>

Note the "-nx" bit.

And, in my recent KVM / QEMU usage instructions for Jiewen:

  https://www.mail-archive.com/edk2-devel@lists.01.org/msg19446.html

I provided the following settings:

> # Settings for Ia32 only:
> [...]
> QEMU_COMMAND="qemu-system-i386 -cpu coreduo,-nx"
>
> # Settings for Ia32X64 only:
> [...]
> QEMU_COMMAND=qemu-system-x86_64

I guess the "-nx" bit can be left off with TCG, but AFAIR it is required
for KVM.

Thanks!
Laszlo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

2016-11-10 Thread Paolo Bonzini


On 10/11/2016 15:48, Yao, Jiewen wrote:
> I cannot reproduce it before, because all my real hardware supports XD.
> My Windows QEMU also supports XD (to my surprise.)

QEMU can be configured to support XD or not.  Possibly Laszlo was using
some different default, or testing both cases.

Paolo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

2016-11-10 Thread Yao, Jiewen
Nice shot

After I reviewed SMI entry code again, I know the root-cause.

I made a huge mistake there.

We always save CpuIndex on the top of stack.
However, during SMI entry, below code *conditionally* push EDX.

; enable NXE if supported
DB  0b0h; mov al, imm8
ASM_PFX(mXdSupported): DB  0
cmp al, 0
jz  @SkipXd
;
; Check XD disable bit
;
mov ecx, MSR_IA32_MISC_ENABLE
rdmsr
pushedx; save MSR_IA32_MISC_ENABLE[63-32]

then later, below code *unconditionally* set CpuIndex above pushed EDX.
mov ebx, [esp + 4]  ; CPU Index

I cannot reproduce it before, because all my real hardware supports XD. My 
Windows QEMU also supports XD (to my surprise.)

Now I did reproduce it, after I hardcode XD to be disabled.


Laszlo, your analysis will save me one day to install the Linux QEMU. :)

Thank you
Yao Jiewen


From: edk2-devel [mailto:edk2-devel-boun...@lists.01.org] On Behalf Of Laszlo 
Ersek
Sent: Thursday, November 10, 2016 8:02 PM
To: Yao, Jiewen <jiewen@intel.com>
Cc: Tian, Feng <feng.t...@intel.com>; edk2-de...@ml01.01.org; Kinney, Michael D 
<michael.d.kin...@intel.com>; Paolo Bonzini <pbonz...@redhat.com>; Fan, Jeff 
<jeff@intel.com>; Zeng, Star <star.z...@intel.com>
Subject: Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

On 11/10/16 11:41, Yao, Jiewen wrote:
> Thanks to report case 3 issue on bugzillar.
>
> Let's focus on Case 8.
> It seems another random failure issue.
>
> I did more test.
>
> 1)  I tested some other our internal real platform for UEFI32 OS boot. I 
> cannot reproduce the ASSERT.
>
> 2)  I wrote a small test app to call ExitBootServices and send SMI. I run 
> it on current my windows QEMU but I still cannot reproduce the ASSERT.
>
> It seem your env is the only way to repo the issue. I am trying to follow 
> your step by step to install OS on QEMU/KVM. I haven't finish all thing yet, 
> because of some network proxy issue. :(

Right, when you run a guest on TCG (QEMU's emulator) vs. on KVM (the 
virtualizer / accelerator in the host Linux kernel), you get very-very 
different timing behavior and interleaving of actions. For one, with KVM, the 
VCPUs really execute in parallel -- they are represented by host OS threads, 
and the host OS schedules them to separate "physical logical CPUs".

>
> Your information and analysis is great. It gives us some clue.
>
> I guess the same thing as you and checked: InitializeSpinLock 
> (mSmmMpSyncData->CpuData[CpuIndex].Busy);
>
> This address is initialized in InitializeMpSyncData(), with 
> gSmmCpuPrivate->SmmCoreEntryContext.NumberOfCpus which is got from 
> MpServices->GetNumberOfProcessors().
> I do not know why this address is zero.
>
> I also did not quite understand below log.
>
> * CPU #0: pc=0xc1844555 (halted) thread_id=7835
>   CPU #1: pc=0xc1844555 (halted) thread_id=7836
>   CPU #2: pc=0xc1844555 (halted) thread_id=7837
>   CPU #3: pc=0x7ffd17ca thread_id=7838
>
> As I recall, writing to B2 only cause BSP get SMI on OVMF. AP does not enter 
> SMM mode.
> So why #3 can enter SMM mode? Is that expected behavior? Or unexpected 
> behavior?
> If this is expected, how this happened? Does OS send 
> SendSmiIpiAllExcludingSelf, after ExitBootServices()?

My theory is that the OS is calling a runtime variable service during boot. 
That is supposed to pull in all APs into SMM, one way or another.

Also, during boot, the OS may call the runtime variable services genuinely on 
VCPU#3.

>
> I will see if I can finish QEMU/KVM installation tomorrow.

Thanks! Once you can test with KVM on your side, that should speed up debugging 
considerably, I think!

> If you have some idea on why and how #3 enter SMM, please let us know.

Well, I captured a KVM trace for this as well (fresh boot, up to the failure). 
Grepping the trace for entering / leaving SMM, we see:

(1) the initial SMBASE relocation:

 CPU-6948  [004] 11545.040294: kvm_enter_smm:vcpu 1: 
entering SMM, smbase 0x3
 CPU-6948  [004] 11545.040335: kvm_enter_smm:vcpu 1: 
leaving SMM, smbase 0x7ffb5000
 CPU-6949  [000] 11545.040363: kvm_enter_smm:vcpu 2: 
entering SMM, smbase 0x3
 CPU-6949  [000] 11545.040389: kvm_enter_smm:vcpu 2: 
leaving SMM, smbase 0x7ffb7000
 CPU-6950  [002] 11545.040417: kvm_enter_smm:vcpu 3: 
entering SMM, smbase 0x3
 CPU-6950  [002] 11545.040443: kvm_enter_smm:vcpu 3: 
leaving SMM, smbase 0x7ffb9000
 CPU-6947  [007] 11545.040453: kvm_enter_smm:vcpu 0: 
entering SMM, smbase 0x3
 CPU-6947  [007] 11545.040474: kvm_enter_smm:vcpu 0:

Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

2016-11-10 Thread Paolo Bonzini


On 10/11/2016 11:41, Yao, Jiewen wrote:
> I also did not quite understand below log.
> 
> * CPU #0: pc=0xc1844555 (halted) thread_id=7835
>   CPU #1: pc=0xc1844555 (halted) thread_id=7836
>   CPU #2: pc=0xc1844555 (halted) thread_id=7837
>   CPU #3: pc=0x7ffd17ca thread_id=7838
> 
> As I recall, writing to B2 only cause BSP get SMI on OVMF. AP does not enter 
> SMM mode.

It's not BSP that enters SMM, it's the currently executing processor.

So this means that CPU #3 has written to B2.

Thanks,

Paolo


> So why #3 can enter SMM mode? Is that expected behavior? Or unexpected 
> behavior?
> If this is expected, how this happened? Does OS send 
> SendSmiIpiAllExcludingSelf, after ExitBootServices()?
> 
> I will see if I can finish QEMU/KVM installation tomorrow.
> 
> If you have some idea on why and how #3 enter SMM, please let us know.
>
> 
> Thank you
> Yao Jiewen
> 
> 
> From: Laszlo Ersek [mailto:ler...@redhat.com]
> Sent: Thursday, November 10, 2016 4:46 AM
> To: Yao, Jiewen <jiewen@intel.com>
> Cc: Tian, Feng <feng.t...@intel.com>; edk2-de...@ml01.01.org; Kinney, Michael 
> D <michael.d.kin...@intel.com>; Paolo Bonzini <pbonz...@redhat.com>; Fan, 
> Jeff <jeff@intel.com>; Zeng, Star <star.z...@intel.com>
> Subject: Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.
> 
> On 11/09/16 07:25, Yao, Jiewen wrote:
>> Hi Laszlo
>> I will fix DEBUG message issue in V3 patch.
>>
>> Below is rest issues:
>>
>>
>> l  Case 13: S3 fails randomly.
>> A good news: I worked with Jeff Fan to root-cause the S3 resume issue. Here 
>> is detail.
>>
>>
>> 1)  We believe the dead CPU is AP. Not BSP.
>> The reason is that:
>>
>> 1.1)   The BSP already transfer control to OS waking vector. The GDT/IDT/CR3 
>> should be set by OS.
>>
>> 1.2)   The current dead CPU still has GDT/IDT point to a BIOS reserved 
>> memory. The CS/DS/SS is typical BIOS X64 mode setting.
>>
>> 1.3)   The current dead CPU still has CR3 in SMM. (Which is obvious wrong)
>>
>>
>> 2)  Based upon the 1), we reviewed S3 resume AP flow.
>> Current BSP will wake up AP in SMRAM, for security consideration. At that 
>> time, we are using SMM mode CR3. It is OK for BSP because BSP is NOT in SMM 
>> mode yet. Even after SMM rebase, we can still use it because SMRR is not set 
>> in first SMM rebase.
>> Current BSP just uses its own context to initialize AP. So that AP takes BSP 
>> CR3, which is SMM CR3, unfortunately.
>> After BSP initialized APs, the AP is put to HALT-LOOP in X64 mode. It is the 
>> last straw, because X64 mode halt still need paging.
>>
>>
>> 3)  The error happen, once the AP receives an interrupt (for whatever 
>> reason), AP starts executing code. However, that that time the AP might not 
>> be in SMM mode. It means SMM CR3 is not available. And then we see this.
>>
>>
>> 4)  I guess we did not see the error, or this is RANDOM issue, because 
>> it depends on if AP receives an interrupt before BSP send INIT-SIPI-SIPI.
>>
>>
>> 5)  The fix, I think, should be below:
>> We should always put AP to protected mode, so that no paging is needed.
>> We should put AP in above 1M reserved memory, instead of <1M memory, because 
>> <1M memory is restored.
>>
>>
>> Would you please file a bugzillar? I think we need assign CPU owner to fix 
>> that critical issue.
>>
>> There is no need to do more investigation. Thanks for your great help on 
>> that. :)
> 
> Thank you for your help!
> 
> I filed <https://bugzilla.tianocore.org/show_bug.cgi?id=216>. The title is
> 
> BSP exits SMM and closes SMRAM on the S3 resume path before
> meeting with AP(s)
> 
> I hope the title is mostly right. I didn't add any other details (I
> haven't gone through the thread in detail yet, and without that I can't
> even write up a semi-reasonable report myself). Instead, I referenced
> this message of yours in the report, and I also linked Paolo's analysis
> from elsewhere in the thread. I hope this will do for the report.
> 
> (Also, thank you Paolo, from the amazing analysis -- I haven't digested
> it yet, but I can already tell it's amazing! :))
> 
>> l  Case 17 - I do not think it is a real issue, because SMM is out of 
>> resource.
>>
>>
>> l  Case 8 - that is a very weird issue. I talk with Jeff again. I do not 
>> have a clear clue yet.
>>> ASSERT MdePkg/Library/BaseSynchronizationLib/SynchronizationGcc.c(73): 
>>> SpinLock != ((v

Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

2016-11-10 Thread Laszlo Ersek
t 
0x402 size 1 count 1 val 0x54  <- "T"
 CPU-6950  [005] 11550.521268: kvm_userspace_exit:   reason 
KVM_EXIT_IO (2)

This seems to be consistent with the OS calling a variable service on VCPU#3.

Also, as far as I can see, the above trace matches the assembly code in 
"UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmiEntry.nasm".

Is perhaps CpuIndex out of bounds?... Hmm, with the following debug patch:

> diff --git a/UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c 
> b/UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c
> index d0092d2f145a..29f6e783c58f 100644
> --- a/UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c
> +++ b/UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c
> @@ -1143,6 +1143,9 @@ SmiRendezvous (
>// E.g., with Relaxed AP flow, SmmStartupThisAp() may be called 
> immediately
>// after AP's present flag is detected.
>//
> +  if (CpuIndex >= 4) {
> +DEBUG ((EFI_D_ERROR, "CpuIndex=%u\n", (UINT32)CpuIndex));
> +  }
>InitializeSpinLock (mSmmMpSyncData->CpuData[CpuIndex].Busy);
>  }
>  
> 

I get the following debug output (note that my SMP configuration is 1x2x2):

> SmmInstallProtocolInterface: [EdkiiSmmExitBootServicesProtocol] 0
> MpInitExitBootServicesCallback() done!
> CpuIndex=780161211
> ASSERT MdePkg/Library/BaseSynchronizationLib/SynchronizationGcc.c(73): 
> SpinLock != ((void *) 0)

Ehm... what? :)

SmiRendezvous() is EFIAPI, is the calling convention followed in 
"Ia32/SmiEntry.nasm"?

Thanks,
Laszlo

> Thank you
> Yao Jiewen
> 
> 
> From: Laszlo Ersek [mailto:ler...@redhat.com]
> Sent: Thursday, November 10, 2016 4:46 AM
> To: Yao, Jiewen <jiewen@intel.com>
> Cc: Tian, Feng <feng.t...@intel.com>; edk2-de...@ml01.01.org; Kinney, Michael 
> D <michael.d.kin...@intel.com>; Paolo Bonzini <pbonz...@redhat.com>; Fan, 
> Jeff <jeff@intel.com>; Zeng, Star <star.z...@intel.com>
> Subject: Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.
> 
> On 11/09/16 07:25, Yao, Jiewen wrote:
>> Hi Laszlo
>> I will fix DEBUG message issue in V3 patch.
>>
>> Below is rest issues:
>>
>>
>> l  Case 13: S3 fails randomly.
>> A good news: I worked with Jeff Fan to root-cause the S3 resume issue. Here 
>> is detail.
>>
>>
>> 1)  We believe the dead CPU is AP. Not BSP.
>> The reason is that:
>>
>> 1.1)   The BSP already transfer control to OS waking vector. The GDT/IDT/CR3 
>> should be set by OS.
>>
>> 1.2)   The current dead CPU still has GDT/IDT point to a BIOS reserved 
>> memory. The CS/DS/SS is typical BIOS X64 mode setting.
>>
>> 1.3)   The current dead CPU still has CR3 in SMM. (Which is obvious wrong)
>>
>>
>> 2)  Based upon the 1), we reviewed S3 resume AP flow.
>> Current BSP will wake up AP in SMRAM, for security consideration. At that 
>> time, we are using SMM mode CR3. It is OK for BSP because BSP is NOT in SMM 
>> mode yet. Even after SMM rebase, we can still use it because SMRR is not set 
>> in first SMM rebase.
>> Current BSP just uses its own context to initialize AP. So that AP takes BSP 
>> CR3, which is SMM CR3, unfortunately.
>> After BSP initialized APs, the AP is put to HALT-LOOP in X64 mode. It is the 
>> last straw, because X64 mode halt still need paging.
>>
>>
>> 3)  The error happen, once the AP receives an interrupt (for whatever 
>> reason), AP starts executing code. However, that that time the AP might not 
>> be in SMM mode. It means SMM CR3 is not available. And then we see this.
>>
>>
>> 4)  I guess we did not see the error, or this is RANDOM issue, because 
>> it depends on if AP receives an interrupt before BSP send INIT-SIPI-SIPI.
>>
>>
>> 5)  The fix, I think, should be below:
>> We should always put AP to protected mode, so that no paging is needed.
>> We should put AP in above 1M reserved memory, instead of <1M memory, because 
>> <1M memory is restored.
>>
>>
>> Would you please file a bugzillar? I think we need assign CPU owner to fix 
>> that critical issue.
>>
>> There is no need to do more investigation. Thanks for your great help on 
>> that. :)
> 
> Thank you for your help!
> 
> I filed <https://bugzilla.tianocore.org/show_bug.cgi?id=216>. The title is
> 
> BSP exits SMM and closes SMRAM on the S3 resume path before
> meeting with AP(s)
> 
> I hope the title is mostly right. I didn't add any other details (I
> haven't gone through the thread in detail yet, and without that I can't
> even write up a semi-reasonable report myself). Ins

Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

2016-11-10 Thread Yao, Jiewen
Thanks to report case 3 issue on bugzillar.

Let's focus on Case 8.
It seems another random failure issue.

I did more test.

1)  I tested some other our internal real platform for UEFI32 OS boot. I 
cannot reproduce the ASSERT.

2)  I wrote a small test app to call ExitBootServices and send SMI. I run 
it on current my windows QEMU but I still cannot reproduce the ASSERT.

It seem your env is the only way to repo the issue. I am trying to follow your 
step by step to install OS on QEMU/KVM. I haven't finish all thing yet, because 
of some network proxy issue. :(

Your information and analysis is great. It gives us some clue.

I guess the same thing as you and checked: InitializeSpinLock 
(mSmmMpSyncData->CpuData[CpuIndex].Busy);

This address is initialized in InitializeMpSyncData(), with 
gSmmCpuPrivate->SmmCoreEntryContext.NumberOfCpus which is got from 
MpServices->GetNumberOfProcessors().
I do not know why this address is zero.

I also did not quite understand below log.

* CPU #0: pc=0xc1844555 (halted) thread_id=7835
  CPU #1: pc=0xc1844555 (halted) thread_id=7836
  CPU #2: pc=0xc1844555 (halted) thread_id=7837
  CPU #3: pc=0x7ffd17ca thread_id=7838

As I recall, writing to B2 only cause BSP get SMI on OVMF. AP does not enter 
SMM mode.
So why #3 can enter SMM mode? Is that expected behavior? Or unexpected behavior?
If this is expected, how this happened? Does OS send 
SendSmiIpiAllExcludingSelf, after ExitBootServices()?

I will see if I can finish QEMU/KVM installation tomorrow.

If you have some idea on why and how #3 enter SMM, please let us know.


Thank you
Yao Jiewen


From: Laszlo Ersek [mailto:ler...@redhat.com]
Sent: Thursday, November 10, 2016 4:46 AM
To: Yao, Jiewen <jiewen@intel.com>
Cc: Tian, Feng <feng.t...@intel.com>; edk2-de...@ml01.01.org; Kinney, Michael D 
<michael.d.kin...@intel.com>; Paolo Bonzini <pbonz...@redhat.com>; Fan, Jeff 
<jeff@intel.com>; Zeng, Star <star.z...@intel.com>
Subject: Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

On 11/09/16 07:25, Yao, Jiewen wrote:
> Hi Laszlo
> I will fix DEBUG message issue in V3 patch.
>
> Below is rest issues:
>
>
> l  Case 13: S3 fails randomly.
> A good news: I worked with Jeff Fan to root-cause the S3 resume issue. Here 
> is detail.
>
>
> 1)  We believe the dead CPU is AP. Not BSP.
> The reason is that:
>
> 1.1)   The BSP already transfer control to OS waking vector. The GDT/IDT/CR3 
> should be set by OS.
>
> 1.2)   The current dead CPU still has GDT/IDT point to a BIOS reserved 
> memory. The CS/DS/SS is typical BIOS X64 mode setting.
>
> 1.3)   The current dead CPU still has CR3 in SMM. (Which is obvious wrong)
>
>
> 2)  Based upon the 1), we reviewed S3 resume AP flow.
> Current BSP will wake up AP in SMRAM, for security consideration. At that 
> time, we are using SMM mode CR3. It is OK for BSP because BSP is NOT in SMM 
> mode yet. Even after SMM rebase, we can still use it because SMRR is not set 
> in first SMM rebase.
> Current BSP just uses its own context to initialize AP. So that AP takes BSP 
> CR3, which is SMM CR3, unfortunately.
> After BSP initialized APs, the AP is put to HALT-LOOP in X64 mode. It is the 
> last straw, because X64 mode halt still need paging.
>
>
> 3)  The error happen, once the AP receives an interrupt (for whatever 
> reason), AP starts executing code. However, that that time the AP might not 
> be in SMM mode. It means SMM CR3 is not available. And then we see this.
>
>
> 4)  I guess we did not see the error, or this is RANDOM issue, because it 
> depends on if AP receives an interrupt before BSP send INIT-SIPI-SIPI.
>
>
> 5)  The fix, I think, should be below:
> We should always put AP to protected mode, so that no paging is needed.
> We should put AP in above 1M reserved memory, instead of <1M memory, because 
> <1M memory is restored.
>
>
> Would you please file a bugzillar? I think we need assign CPU owner to fix 
> that critical issue.
>
> There is no need to do more investigation. Thanks for your great help on 
> that. :)

Thank you for your help!

I filed <https://bugzilla.tianocore.org/show_bug.cgi?id=216>. The title is

BSP exits SMM and closes SMRAM on the S3 resume path before
meeting with AP(s)

I hope the title is mostly right. I didn't add any other details (I
haven't gone through the thread in detail yet, and without that I can't
even write up a semi-reasonable report myself). Instead, I referenced
this message of yours in the report, and I also linked Paolo's analysis
from elsewhere in the thread. I hope this will do for the report.

(Also, thank you Paolo, from the amazing analysis -- I haven't digested
it yet, but I can already tell it's amazing! :))

> l  Case 17 - 

Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

2016-11-09 Thread Fan, Jeff
Laszlo,

I just sent the patch to place AP into safe hlt-loop code (in NVS range > 1MB, 
32 bit protected mode).

Could you check if it could solve the S3 unstable issue on OVMF?

Thanks!
Jeff

From: Yao, Jiewen
Sent: Thursday, November 10, 2016 9:13 AM
To: Laszlo Ersek; Paolo Bonzini
Cc: Kinney, Michael D; Tian, Feng; edk2-de...@ml01.01.org; Zeng, Star; Fan, Jeff
Subject: RE: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

So, I don't understand how the CR3s that are used by the APs when they
serve MP services PPI requests, throughout the PEI phase (*), have
anything to do with CpuS3.c's page tables (which live in SMRAM, AIUI).
[Jiewen] It is very tricky.
First, in normal boot, the SMM need prepare a CR3 as SMM page table, which is 
obvious.

In S3, the S3Resume calls AsmWriteCr3(SmmS3ResumeState->SmmS3Cr3) then jump to 
SmmS3ResumeState->SmmS3ResumeEntryPoint. Now BSP hold SmmS3Cr3 but in non-SMM 
mode.

In SmmRestoreCpu(), BSP calls EarlyInitializeCpu()/PrepareApStartupVector() in 
non-SMM mode. And mExchangeInfo->Cr3 = (UINT32) (AsmReadCr3 ()); Now AP 
holds SmmS3Cr3 in non-SMM mode. It is OK, because SMRAM is OPEN.

When SmmRelocateBases() is called, AP is waken up and does rebase. SmmS3Cr3 is 
used for AP in SMM. But it does not change the fact that SmmS3Cr3 is also used 
in non-SMM mode.

Later in InitializeCpu(), AP wakeup buffer is put to below 1M with SmmS3Cr3.

Thank you
Yao Jiewen

From: Laszlo Ersek [mailto:ler...@redhat.com]
Sent: Thursday, November 10, 2016 7:27 AM
To: Paolo Bonzini <pbonz...@redhat.com<mailto:pbonz...@redhat.com>>
Cc: Yao, Jiewen <jiewen@intel.com<mailto:jiewen@intel.com>>; Kinney, 
Michael D <michael.d.kin...@intel.com<mailto:michael.d.kin...@intel.com>>; 
Tian, Feng <feng.t...@intel.com<mailto:feng.t...@intel.com>>; 
edk2-de...@ml01.01.org<mailto:edk2-de...@ml01.01.org>; Zeng, Star 
<star.z...@intel.com<mailto:star.z...@intel.com>>; Fan, Jeff 
<jeff....@intel.com<mailto:jeff@intel.com>>
Subject: Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

On 11/09/16 23:59, Paolo Bonzini wrote:
>
>> Another question I have -- and I feel I should really know it, but I
>> don't... -- is *why* the APs are executing code from the page at
>> 0x9f000.
>
> This I can answer. :)
>
> The APs have done their INIT-SIPI-SIPI, and then went into the CLI;HLT;JMP
> loop.  When the AP exits SMM, it is in the JMP instruction.
>
> As suggested by Jiewen, edk2 could jump to a 32-bit loop that is _not_
> in the 0-640K area (perhaps it could be in what your doc calls the
> "permanent PEI memory for the S3 resume path"?).  After thinking a
> bit more about it, it seems simplest to me if CpuS3.c just uses
> SwitchStack or AsmDisablePaging64 at the end of MPRendezvousProcedure,
> to jump to a small stub like
>
> POP EAX   ; pop return address
> POP EAX   ; pop Context1 which is 
> DEC [EAX]
>  1: CLI
> HLT
> JMP 1
>
>> I could be utterly and inexcusably wrong, but I think that the
>> RIP=0x9f0fd symptom is a red herring.
>
> I wouldn't call it a red herring.  After all, CR3 points to SMM
> exactly because the CR3 that was set up for the 0x9f000 stub is
> CpuS3.c's SMRAM page table root.

Hrmpf. The stub at 0x9f000 does not belong to PiSmmCpuDxeSmm. Regardless
of the boot path (normal boot or S3 resume), it belongs to CpuMpPei, and
it partakes in the implementation of the MP services PPI. It is
practically the "parking lot" for the APs when they are not executing
any MP job, submitted by an MP services PPI client.

So, I don't understand how the CR3s that are used by the APs when they
serve MP services PPI requests, throughout the PEI phase (*), have
anything to do with CpuS3.c's page tables (which live in SMRAM, AIUI).

(*) For example, OVMF's PlatformPei uses this service to program
MSR_IA32_FEATURE_CONTROL from fw_cfg. On the resume path too, that
occurs before we do the SMBASE relocation.

(I.e., before S3RestoreConfig2() in
"UefiCpuPkg/Universal/Acpi/S3Resume2Pei/S3Resume.c" calls
SmmRestoreCpu() in "UefiCpuPkg/PiSmmCpuDxeSmm/CpuS3.c", via
SmmS3ResumeState->SmmS3ResumeEntryPoint.)

When an AP executes RSM, its CR3 should automatically be restored to the
original (non-SMM) value, should it not? I mean I do remember the CR3
value from the QEMU register dump, but now I don't understand how that's
possible with SMM=0.

Sorry if I'm being dense :)

> What _is_ a red herring is KVM's trace showing a RSM instruction
> at RIP=0x9f0fd.  That is clearly bogus, RSM was rather the last
> instruction executed _before_ getting to that RIP.
>
>>>   vcpu#0  vcpu#1  vcpu#2  vcpu#3
>>>   --  --  --  --
>>>   enter   enter
>>>   

Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

2016-11-09 Thread Yao, Jiewen
Fix a typo.

From: Yao, Jiewen
Sent: Thursday, November 10, 2016 8:49 AM
To: 'Paolo Bonzini' <pbonz...@redhat.com>; Laszlo Ersek <ler...@redhat.com>
Cc: Kinney, Michael D <michael.d.kin...@intel.com>; Tian, Feng 
<feng.t...@intel.com>; edk2-de...@ml01.01.org; Zeng, Star 
<star.z...@intel.com>; Fan, Jeff <jeff....@intel.com>
Subject: RE: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

> Anyway, I think if the BSP and the APs are properly synchronized around
> the SMI injections in S3ResumeExecuteBootScript(), then this bug is
> fixed. In that case, the APs' RSMs will restore the full context for the
> APs, including their sleep in the HLT instruction, in CpuMpPei's wakeup
> buffer. The BSP will proceed, exit PEI (restoring the CpuMpPei wakeup
> buffer -- but the APs will sleep on), and then Linux will bring up the
> APs, after taking control.

Agreed.

[Jiewen] I hold different opinion on that.

If the AP is in hlt-loop at some below 1M memory with CR3 pointing to SMRAM, it 
has 2 issues.

* The below 1M memory (0x9f0fd) might be consumed by OS with other instruction.

* If AP starts running the code, it will get exception because CR3 is obviously 
wrong. The AP cannot fetch any code.

We might have 2 possibles way to trigger this scenario, at least.

A) Jeff and I have discovered one possible case – AP may receive NMI/SMI, 
such as periodic SMI, before OS sends INIT-SIPI-SIPI to wake up AP.

B) Paolo has found one real case - AP is in SMRAM when BSP is out and about 
to close SMRAM.

IMHO, letting BSP/AP sync in S3 resume just resolved B). But it does not help 
on A).
If the system has some special SMI, such as periodic SMI. It will definitely 
trigger case A).

So we have to fix the AP state anyway.

Now, if the AP state is fixed, I do not think we need worry about the BSP/AP 
out of sync issue.
BSP and AP can be independent. AP can receive NMI/SMI at any time and just work 
in HLT-loop.

Thank you
Yao Jiewen

From: Paolo Bonzini [mailto:pbonz...@redhat.com]
Sent: Thursday, November 10, 2016 7:00 AM
To: Laszlo Ersek <ler...@redhat.com<mailto:ler...@redhat.com>>
Cc: Yao, Jiewen <jiewen@intel.com<mailto:jiewen@intel.com>>; Kinney, 
Michael D <michael.d.kin...@intel.com<mailto:michael.d.kin...@intel.com>>; 
Tian, Feng <feng.t...@intel.com<mailto:feng.t...@intel.com>>; 
edk2-de...@ml01.01.org<mailto:edk2-de...@ml01.01.org>; Zeng, Star 
<star.z...@intel.com<mailto:star.z...@intel.com>>; Fan, Jeff 
<jeff@intel.com<mailto:jeff@intel.com>>
Subject: Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.


> Another question I have -- and I feel I should really know it, but I
> don't... -- is *why* the APs are executing code from the page at
> 0x9f000.

This I can answer. :)

The APs have done their INIT-SIPI-SIPI, and then went into the CLI;HLT;JMP
loop.  When the AP exits SMM, it is in the JMP instruction.

As suggested by Jiewen, edk2 could jump to a 32-bit loop that is _not_
in the 0-640K area (perhaps it could be in what your doc calls the
"permanent PEI memory for the S3 resume path"?).  After thinking a
bit more about it, it seems simplest to me if CpuS3.c just uses
SwitchStack or AsmDisablePaging64 at the end of MPRendezvousProcedure,
to jump to a small stub like

POP EAX   ; pop return address
POP EAX   ; pop Context1 which is 
DEC [EAX]
 1: CLI
HLT
JMP 1

> I could be utterly and inexcusably wrong, but I think that the
> RIP=0x9f0fd symptom is a red herring.

I wouldn't call it a red herring.  After all, CR3 points to SMM
exactly because the CR3 that was set up for the 0x9f000 stub is
CpuS3.c's SMRAM page table root.

What _is_ a red herring is KVM's trace showing a RSM instruction
at RIP=0x9f0fd.  That is clearly bogus, RSM was rather the last
instruction executed _before_ getting to that RIP.

> >   vcpu#0  vcpu#1  vcpu#2  vcpu#3
> >   --  --  --  --
> >   enter   enter
> >enter| enter |
> >  |  |   |   |
> >leave|   |   |
> > <--- BAD
> >enter|   |   |
> >  |  |   |   |
> >leave  leave   leave   leave
>
> I don't understand why we don't get horrible faults on the APs
> *immediately* when the BSP closes/locks down SMRAM. Everything in SMRAM,
> page tables, executable code, everything, will read as 0xff on QEMU. How
> can the APs continue in SMM long enough to
>
> (a) time out and pull the BSP back into SMM,
> (b) complete the rendezvous and exit SMM?

Because the "0xff" only applies when you're out of SMM.  The three
states (open, closed, closed/locked) only apply when you're not in SMM.
While the AP is in S

Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

2016-11-09 Thread Yao, Jiewen
> Anyway, I think if the BSP and the APs are properly synchronized around
> the SMI injections in S3ResumeExecuteBootScript(), then this bug is
> fixed. In that case, the APs' RSMs will restore the full context for the
> APs, including their sleep in the HLT instruction, in CpuMpPei's wakeup
> buffer. The BSP will proceed, exit PEI (restoring the CpuMpPei wakeup
> buffer -- but the APs will sleep on), and then Linux will bring up the
> APs, after taking control.

Agreed.

[Jiewen] I hold different opinion on that.

If the AP is in hlt-loop at some below 1M memory with CR3 pointing to SMRAM, it 
has 2 issues.

* The below 1M memory (0x9f0fd) might be consumed by OS with other instruction.

* If AP starts running the code, it will get exception because CR3 is obviously 
wrong. The AP cannot fetch any code.

We might have 2 possibles way to trigger this scenario, at least.

A) Jeff and I have discovered one possible case – AP may receive NMI/SMI, 
such as periodic SMI, before OS sends INIT-SIPI-SIPI to wake up AP.

B) Paolo has found one real case - AP is in SMRAM when BSP is out and about 
to close SMRAM.

IMHO, letting BSP/AP sync in S3 resume just resolved B). But it does not help 
on A).
If the system has some special SMI, such as periodic SMI. It will definitely 
trigger case A).

So we have to fix the AP state anyway.

Now, if the AP state is fixed, I do not think we do not need worry about the 
BSP/AP out of sync issue.
BSP and AP can be independent. AP can receive NMI/SMI at any time and just work 
in HLT-loop.

Thank you
Yao Jiewen

From: Paolo Bonzini [mailto:pbonz...@redhat.com]
Sent: Thursday, November 10, 2016 7:00 AM
To: Laszlo Ersek <ler...@redhat.com>
Cc: Yao, Jiewen <jiewen@intel.com>; Kinney, Michael D 
<michael.d.kin...@intel.com>; Tian, Feng <feng.t...@intel.com>; 
edk2-de...@ml01.01.org; Zeng, Star <star.z...@intel.com>; Fan, Jeff 
<jeff....@intel.com>
Subject: Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.


> Another question I have -- and I feel I should really know it, but I
> don't... -- is *why* the APs are executing code from the page at
> 0x9f000.

This I can answer. :)

The APs have done their INIT-SIPI-SIPI, and then went into the CLI;HLT;JMP
loop.  When the AP exits SMM, it is in the JMP instruction.

As suggested by Jiewen, edk2 could jump to a 32-bit loop that is _not_
in the 0-640K area (perhaps it could be in what your doc calls the
"permanent PEI memory for the S3 resume path"?).  After thinking a
bit more about it, it seems simplest to me if CpuS3.c just uses
SwitchStack or AsmDisablePaging64 at the end of MPRendezvousProcedure,
to jump to a small stub like

POP EAX   ; pop return address
POP EAX   ; pop Context1 which is 
DEC [EAX]
 1: CLI
HLT
JMP 1

> I could be utterly and inexcusably wrong, but I think that the
> RIP=0x9f0fd symptom is a red herring.

I wouldn't call it a red herring.  After all, CR3 points to SMM
exactly because the CR3 that was set up for the 0x9f000 stub is
CpuS3.c's SMRAM page table root.

What _is_ a red herring is KVM's trace showing a RSM instruction
at RIP=0x9f0fd.  That is clearly bogus, RSM was rather the last
instruction executed _before_ getting to that RIP.

> >   vcpu#0  vcpu#1  vcpu#2  vcpu#3
> >   --  --  --  --
> >   enter   enter
> >enter| enter |
> >  |  |   |   |
> >leave|   |   |
> > <--- BAD
> >enter|   |   |
> >  |  |   |   |
> >leave  leave   leave   leave
>
> I don't understand why we don't get horrible faults on the APs
> *immediately* when the BSP closes/locks down SMRAM. Everything in SMRAM,
> page tables, executable code, everything, will read as 0xff on QEMU. How
> can the APs continue in SMM long enough to
>
> (a) time out and pull the BSP back into SMM,
> (b) complete the rendezvous and exit SMM?

Because the "0xff" only applies when you're out of SMM.  The three
states (open, closed, closed/locked) only apply when you're not in SMM.
While the AP is in SMM they are executing in a separate address space
where SMRAM is "not closed".  (In QEMU that's a separate AddressSpace
struct, smram_address_space in target-i386/kvm.c).

> I think I sort of answered question (2). (Apologies if Paolo and Jiewen
> explained the exact same thing before; I had to spell it out for
> myself.) That leaves question (1) open. Why enter SMM in
> S3ResumeExecuteBootScript() at all?
>
> Anyway, I think if the BSP and the APs are properly synchronized around
> the SMI injections in S3ResumeExecuteBootScript(), then this bug is
> fixed. In that case, the APs' RSMs will restore the full context for the
&

Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

2016-11-09 Thread Laszlo Ersek
On 11/09/16 23:59, Paolo Bonzini wrote:
> 
>> Another question I have -- and I feel I should really know it, but I
>> don't... -- is *why* the APs are executing code from the page at
>> 0x9f000.
> 
> This I can answer. :)
> 
> The APs have done their INIT-SIPI-SIPI, and then went into the CLI;HLT;JMP
> loop.  When the AP exits SMM, it is in the JMP instruction.
> 
> As suggested by Jiewen, edk2 could jump to a 32-bit loop that is _not_
> in the 0-640K area (perhaps it could be in what your doc calls the
> "permanent PEI memory for the S3 resume path"?).  After thinking a
> bit more about it, it seems simplest to me if CpuS3.c just uses
> SwitchStack or AsmDisablePaging64 at the end of MPRendezvousProcedure,
> to jump to a small stub like
> 
> POP EAX   ; pop return address
> POP EAX   ; pop Context1 which is 
> DEC [EAX]
>  1: CLI
> HLT
> JMP 1
> 
>> I could be utterly and inexcusably wrong, but I think that the
>> RIP=0x9f0fd symptom is a red herring.
> 
> I wouldn't call it a red herring.  After all, CR3 points to SMM
> exactly because the CR3 that was set up for the 0x9f000 stub is
> CpuS3.c's SMRAM page table root.

Hrmpf. The stub at 0x9f000 does not belong to PiSmmCpuDxeSmm. Regardless
of the boot path (normal boot or S3 resume), it belongs to CpuMpPei, and
it partakes in the implementation of the MP services PPI. It is
practically the "parking lot" for the APs when they are not executing
any MP job, submitted by an MP services PPI client.

So, I don't understand how the CR3s that are used by the APs when they
serve MP services PPI requests, throughout the PEI phase (*), have
anything to do with CpuS3.c's page tables (which live in SMRAM, AIUI).

(*) For example, OVMF's PlatformPei uses this service to program
MSR_IA32_FEATURE_CONTROL from fw_cfg. On the resume path too, that
occurs before we do the SMBASE relocation.

(I.e., before S3RestoreConfig2() in
"UefiCpuPkg/Universal/Acpi/S3Resume2Pei/S3Resume.c" calls
SmmRestoreCpu() in "UefiCpuPkg/PiSmmCpuDxeSmm/CpuS3.c", via
SmmS3ResumeState->SmmS3ResumeEntryPoint.)

When an AP executes RSM, its CR3 should automatically be restored to the
original (non-SMM) value, should it not? I mean I do remember the CR3
value from the QEMU register dump, but now I don't understand how that's
possible with SMM=0.

Sorry if I'm being dense :)

> What _is_ a red herring is KVM's trace showing a RSM instruction
> at RIP=0x9f0fd.  That is clearly bogus, RSM was rather the last
> instruction executed _before_ getting to that RIP.
> 
>>>   vcpu#0  vcpu#1  vcpu#2  vcpu#3
>>>   --  --  --  --
>>>   enter   enter
>>>enter| enter |
>>>  |  |   |   |
>>>leave|   |   |
>>> <--- BAD
>>>enter|   |   |
>>>  |  |   |   |
>>>leave  leave   leave   leave
>>
>> I don't understand why we don't get horrible faults on the APs
>> *immediately* when the BSP closes/locks down SMRAM. Everything in SMRAM,
>> page tables, executable code, everything, will read as 0xff on QEMU. How
>> can the APs continue in SMM long enough to
>>
>> (a) time out and pull the BSP back into SMM,
>> (b) complete the rendezvous and exit SMM?
> 
> Because the "0xff" only applies when you're out of SMM.  The three
> states (open, closed, closed/locked) only apply when you're not in SMM.
> While the AP is in SMM they are executing in a separate address space
> where SMRAM is "not closed".  (In QEMU that's a separate AddressSpace
> struct, smram_address_space in target-i386/kvm.c).

Sigh, in retrospect, this should have been obvious. :) Thanks for
pointing it out!

Laszlo

>> I think I sort of answered question (2). (Apologies if Paolo and Jiewen
>> explained the exact same thing before; I had to spell it out for
>> myself.) That leaves question (1) open. Why enter SMM in
>> S3ResumeExecuteBootScript() at all?
>>
>> Anyway, I think if the BSP and the APs are properly synchronized around
>> the SMI injections in S3ResumeExecuteBootScript(), then this bug is
>> fixed. In that case, the APs' RSMs will restore the full context for the
>> APs, including their sleep in the HLT instruction, in CpuMpPei's wakeup
>> buffer. The BSP will proceed, exit PEI (restoring the CpuMpPei wakeup
>> buffer -- but the APs will sleep on), and then Linux will bring up the
>> APs, after taking control.
> 
> Agreed.
> 
> Paolo
> 

___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

2016-11-09 Thread Paolo Bonzini

> Another question I have -- and I feel I should really know it, but I
> don't... -- is *why* the APs are executing code from the page at
> 0x9f000.

This I can answer. :)

The APs have done their INIT-SIPI-SIPI, and then went into the CLI;HLT;JMP
loop.  When the AP exits SMM, it is in the JMP instruction.

As suggested by Jiewen, edk2 could jump to a 32-bit loop that is _not_
in the 0-640K area (perhaps it could be in what your doc calls the
"permanent PEI memory for the S3 resume path"?).  After thinking a
bit more about it, it seems simplest to me if CpuS3.c just uses
SwitchStack or AsmDisablePaging64 at the end of MPRendezvousProcedure,
to jump to a small stub like

POP EAX   ; pop return address
POP EAX   ; pop Context1 which is 
DEC [EAX]
 1: CLI
HLT
JMP 1

> I could be utterly and inexcusably wrong, but I think that the
> RIP=0x9f0fd symptom is a red herring.

I wouldn't call it a red herring.  After all, CR3 points to SMM
exactly because the CR3 that was set up for the 0x9f000 stub is
CpuS3.c's SMRAM page table root.

What _is_ a red herring is KVM's trace showing a RSM instruction
at RIP=0x9f0fd.  That is clearly bogus, RSM was rather the last
instruction executed _before_ getting to that RIP.

> >   vcpu#0  vcpu#1  vcpu#2  vcpu#3
> >   --  --  --  --
> >   enter   enter
> >enter| enter |
> >  |  |   |   |
> >leave|   |   |
> > <--- BAD
> >enter|   |   |
> >  |  |   |   |
> >leave  leave   leave   leave
> 
> I don't understand why we don't get horrible faults on the APs
> *immediately* when the BSP closes/locks down SMRAM. Everything in SMRAM,
> page tables, executable code, everything, will read as 0xff on QEMU. How
> can the APs continue in SMM long enough to
> 
> (a) time out and pull the BSP back into SMM,
> (b) complete the rendezvous and exit SMM?

Because the "0xff" only applies when you're out of SMM.  The three
states (open, closed, closed/locked) only apply when you're not in SMM.
While the AP is in SMM they are executing in a separate address space
where SMRAM is "not closed".  (In QEMU that's a separate AddressSpace
struct, smram_address_space in target-i386/kvm.c).

> I think I sort of answered question (2). (Apologies if Paolo and Jiewen
> explained the exact same thing before; I had to spell it out for
> myself.) That leaves question (1) open. Why enter SMM in
> S3ResumeExecuteBootScript() at all?
> 
> Anyway, I think if the BSP and the APs are properly synchronized around
> the SMI injections in S3ResumeExecuteBootScript(), then this bug is
> fixed. In that case, the APs' RSMs will restore the full context for the
> APs, including their sleep in the HLT instruction, in CpuMpPei's wakeup
> buffer. The BSP will proceed, exit PEI (restoring the CpuMpPei wakeup
> buffer -- but the APs will sleep on), and then Linux will bring up the
> APs, after taking control.

Agreed.

Paolo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

2016-11-09 Thread Laszlo Ersek
On 11/09/16 16:54, Paolo Bonzini wrote:
> 
> 
> On 09/11/2016 16:01, Yao, Jiewen wrote:
>> 1)  CpuS3.c – EarlyInitializeCpu()
>> 2)  CpuS3.c – SmmRelocateBases()
>> 3)  CpuS3.c – InitializeCpu()
>> 4)  S3Resume.c – SendSmiIpiAllExcludingSelf()
>>
>> I believe we can guarantee 1/2/3 is good, because I found we check BSP
>> check mNumberToFinish.
>>
>> 4 is a risk, because there is no AP finish check. If the AP is in below
>> 1M with CR3 in SMRAM, it will be a trouble.
>>
>> Once the AP executes RSM and return to non-SMM, the CR3 is no longer
>> valid and AP must be crashed immediately. WoW!
>>
>> The fix, I believe, is same.
>>
>> We should make 1) AP is in above 1M reserved memory,
> 
> Is this because of the NMI case?
> 
>> and 2) AP is in protected mode with paging disabled.
> 
> It is not clear to me what the (4) SIPI done is there for,

After reading through your great analysis with a keen focus :), I wanted
to ask the exact same thing. I managed to follow / recall the control
flow mostly, but when I saw that SMI, I didn't (and don't) understand
that it was (is) good for.

After all, we're not setting up any request parameters etc. for the
processors to handle in SMM. What's happening there?

Another question I have -- and I feel I should really know it, but I
don't... -- is *why* the APs are executing code from the page at
0x9f000. When the BSP exits SMM, replays the S3 boot script, and finally
finishes off the PEI phase and restores the page at 0x9f000, the APs
seem to be affected -- but why do they care about that page at all? That
page never belonged to PiSmmCpuSmmDxe, it belongs CpuMpPei.

I do understand that the CR3 registers for the APs point into SMRAM,
while they wait for the BSP in SMM. Thus, the BSP closing/locking down
SMRAM, in S3ResumeExecuteBootScript(), breaks the APs -- that's
understandable.

What I don't get is, again:
(1) why S3ResumeExecuteBootScript() raises SMIs at all, before locking
down SMRAM,
(2) what the AP SMM routine (from PiSmmCpuDxeSmm) has to do with the
Wakeup buffer that is allocated and used *solely* by CpuMpPei.

I could be utterly and inexcusably wrong, but I think that the
RIP=0x9f0fd symptom is a red herring. I wrote,

>   vcpu#0  vcpu#1  vcpu#2  vcpu#3
>   --  --  --  --
>   enter
>|
>   leave
>
>   enter
> |
>   leave
>
>   enter
> |
>   leave
>
>   enter
> |
>   leave
>
>   enter   enter
>enter| enter |
>  |  |   |   |
>leave|   |   |
> <--- BAD
>enter|   |   |
>  |  |   |   |
>leave  leave   leave   leave

Thanks to Paolo's analysis, we now know where that gap comes from and
what it does (so I marked it with BAD now) -- in the gap, the BSP leaves
SMM alone, closes/locks SMRAM, finishes off the PEI phase, restores the
contents of the borrowed wakeup buffer of CpuMpPei, and even transfers
control to Linux's S3 resume vector.

I don't understand why we don't get horrible faults on the APs
*immediately* when the BSP closes/locks down SMRAM. Everything in SMRAM,
page tables, executable code, everything, will read as 0xff on QEMU. How
can the APs continue in SMM long enough to

(a) time out and pull the BSP back into SMM,
(b) complete the rendezvous and exit SMM?

... Anyway, I think I do have an idea for question (2). Namely, when the
BSP starts executing S3ResumeExecuteBootScript(), in
"UefiCpuPkg/Universal/Acpi/S3Resume2Pei/S3Resume.c" -- for which the cue
is ultimately given by the DXE IPL PEIM, as the last action in PEI --,
CpuMpPei has been dispatched already! And, CpuMpPei has placed all the
APs into their comfy HLT loops, so that the MP services PPI could serve
multiprocessing requests.

Thus, the APs are executing code (the HLT loop) from CpuMpPei's wakeup
buffer on page 0x9f000 as *normal business*. That is where the SMI,
raised by the BSP in S3ResumeExecuteBootScript(), rips them out of. And
that's also where KVM tries to return them to, once they finish in SMM
and execute RSM. Too bad by the time KVM returns them there, the wakeup
page has been restored by the BSP.

In other words, the address RIP=0x9f0fd *is* a red herring, that's
simply where the APs happened to be when the SMI was raised, and where
KVM remembers to return the APs to, once the APs execute RSM.

I think I sort of answered question (2). (Apologies if Paolo and Jiewen
explained the exact same thing before; I had to spell it out for
myself.) That leaves question (1) open. Why enter SMM in
S3ResumeExecuteBootScript() at all?

Anyway, I think if the BSP and the APs are properly synchronized around
the SMI injections in S3ResumeExecuteBootScript(), then this bug is
fixed. In that case, 

Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

2016-11-09 Thread Laszlo Ersek
c 10sub$0x10,%esp
  volatile UINTN  Index;

  for (Index = 0; Index == 0;);
87c3:   c7 45 fc 00 00 00 00movl   $0x0,-0x4(%ebp)
87ca:   8b 45 fcmov-0x4(%ebp),%eax  <-- HERE
87cd:   85 c0   test   %eax,%eax
87cf:   74 f9   je 87ca <CpuDeadLoop+0xd>
}
87d1:   c9  leave
87d2:   c3  ret

This seems consistent with an assertion failure.

I searched UefiCpuPkg/PiSmmCpuDxeSmm/ for InitializeSpinLock(), and the
SmiRendezvous() function [UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c] looks
like a possible caller:

  //
  // The BUSY lock is initialized to Released state. This needs to
  // be done early enough to be ready for BSP's SmmStartupThisAp()
  // call. E.g., with Relaxed AP flow, SmmStartupThisAp() may be
  // called immediately after AP's present flag is detected.
  //
  InitializeSpinLock (mSmmMpSyncData->CpuData[CpuIndex].Busy);

Just a guess, of course.

> At same time, all my OS test is on real platform. I have not setup OVMF env 
> to run an OS yet.
> If you can share a step by step to me, that would be great.

(1) Grab a host computer with a CPU that supports VMX and EPT.

(2) Download and install Fedora 24 (for example):

https://getfedora.org/en/workstation/download/
http://docs.fedoraproject.org/install-guide

(3) Install the "qemu-system-x86" package with DNF

dnf install qemu-system-x86

(4) clone edk2 with git

(5) embed OpenSSL optionally (for secure boot); see
"CryptoPkg/Library/OpensslLib/Patch-HOWTO.txt"

(6) build OVMF:

source edksetup.sh
make -C "$EDK_TOOLS_PATH"

# Ia32
build \
  -a IA32 \
  -p OvmfPkg/OvmfPkgIa32.dsc \
  -D SMM_REQUIRE -D SECURE_BOOT_ENABLE \
  -t GCC5 -b DEBUG

# Ia32X64
build \
  -a IA32 -a X64 \
  -p OvmfPkg/OvmfPkgIa32X64.dsc \
  -D SMM_REQUIRE -D SECURE_BOOT_ENABLE \
  -t GCC5 -b DEBUG

(7) Create disk images:

qemu-img create -f qcow2 -o compat=1.1 -o cluster_size=65536 \
  -o preallocation=metadata -o lazy_refcounts=on disk-ia32.img 100G

qemu-img create -f qcow2 -o compat=1.1 -o cluster_size=65536 \
  -o preallocation=metadata -o lazy_refcounts=on disk-ia32x64.img 100G

(8) For a 64-bit guest OS, you can again use the Fedora 24 Workstation
that you downloaded already (the ISO image).

For 32-bit guest OS, this one used to work:

https://www.happyassassin.net/fedlet-a-fedora-remix-for-bay-trail-tablets/

minimally the 20141209 release. Hm... actually, I think the maintainer
of that image has discontinued the downloadable files :(

So, I don't know what 32-bit UEFI OS to recommend for testing.

32-bit Windows doesn't boot on OVMF (I looked into that earlier, several
times, with some help from a Microsoft developer, but we couldn't solve
it), so I can't recommend Windows as an alternative.

Perhaps you can use

https://linuxiumcomau.blogspot.com/2016/10/running-ubuntu-on-intel-bay-trail-and.html

as a 32-bit guest OS, I never tried.

(9) Anyway, once you have an installer ISO, set the "ISO" environment
variable to the ISO image's full pathname, and then run QEMU like this:

# Settings for Ia32 only:

ISO=...
DISK=.../disk-ia32.img
FW=.../Build/OvmfIa32/DEBUG_GCC5/FV/OVMF_CODE.fd
TEMPLATE=.../Build/OvmfIa32/DEBUG_GCC5/FV/OVMF_VARS.fd
VARS=vars-32.fd
QEMU_COMMAND="qemu-system-i386 -cpu coreduo,-nx"
DEBUG=debug-32.log

# Settings for Ia32X64 only:

ISO=...
DISK=.../disk-ia32x64.img
FW=.../Build/Ovmf3264/DEBUG_GCC5/FV/OVMF_CODE.fd
TEMPLATE=.../Build/Ovmf3264/DEBUG_GCC5/FV/OVMF_VARS.fd
VARS=vars-3264.fd
QEMU_COMMAND=qemu-system-x86_64
DEBUG=debug-3264.log

# Common commands for both target arches:

# create variable store from varstore template
# if the former doesn't exist yet
if ! [ -e "$VARS" ]; then
  cp -- "$TEMPLATE" "$VARS"
fi

$QEMU_COMMAND \
  -machine q35,smm=on,accel=kvm \
  -m 4096 \
  -smp sockets=1,cores=2,threads=2 \
  -global driver=cfi.pflash01,property=secure,value=on \
  -drive if=pflash,format=raw,unit=0,file=${FW},readonly=on \
  -drive if=pflash,format=raw,unit=1,file=${VARS} \
  \
  -chardev file,id=debugfile,path=$DEBUG \
  -device isa-debugcon,iobase=0x402,chardev=debugfile \
  \
  -chardev stdio,id=char0,signal=off,mux=on \
  -mon chardev=char0,mode=readline,default \
  -serial chardev:char0 \
  \
  -drive id=iso,if=none,format=raw,readonly,file=$ISO \
  -drive id=disk,if=none,format=qcow2,file=$DISK \
  \
  -device virtio-scsi-pci,id=scsi0 \
  -device scsi-cd,drive=iso,bus=scsi0.0,bootindex=2 \
  -device scsi-hd,drive=disk,bus=scsi0.0,bootindex=1 \
  \
  -device VGA

This will capture the OVMF debug output in the $DEBUG file. Also, the
terminal where you run the command can be switched between the guest's
serial console and the QEMU monitor with [Ctrl-A C].

Thanks
Laszlo

> 
> Thank you
> Yao Jiewen
> 
> From: edk2-devel [mailto:edk2-devel-bo

Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

2016-11-09 Thread Paolo Bonzini


On 09/11/2016 16:54, Paolo Bonzini wrote:
>> > and 2) AP is in protected mode with paging disabled.
> It is not clear to me what the (4) SIPI done is there for, and why it is
> triggered in S3Resume.c rather than CpuS3.c.  And why does it take so
> much for APs to complete it?

SMI of course, not SIPI.

Paolo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

2016-11-09 Thread Paolo Bonzini


On 09/11/2016 16:01, Yao, Jiewen wrote:
> 1)  CpuS3.c – EarlyInitializeCpu()
> 2)  CpuS3.c – SmmRelocateBases()
> 3)  CpuS3.c – InitializeCpu()
> 4)  S3Resume.c – SendSmiIpiAllExcludingSelf()
> 
> I believe we can guarantee 1/2/3 is good, because I found we check BSP
> check mNumberToFinish.
> 
> 4 is a risk, because there is no AP finish check. If the AP is in below
> 1M with CR3 in SMRAM, it will be a trouble.
> 
> Once the AP executes RSM and return to non-SMM, the CR3 is no longer
> valid and AP must be crashed immediately. WoW!
> 
> The fix, I believe, is same.
> 
> We should make 1) AP is in above 1M reserved memory,

Is this because of the NMI case?

> and 2) AP is in protected mode with paging disabled.

It is not clear to me what the (4) SIPI done is there for, and why it is
triggered in S3Resume.c rather than CpuS3.c.  And why does it take so
much for APs to complete it?

That said, by the time you close and lock SMRAM, you aren't even sure
that you have reached the cli;hlt loop in the rendezvous funnel.  In
practice you will be there, but there is still a theoretical race.

InterlockedDecrement () should be moved from
EarlyMPRendezvousProcedure/MPRendezvousProcedure to GoToSleep, and
GoToSleep should leave 64-bit mode before doing it.  This will fix the
S3 bug as well.  It's only needed for 64-bit mode, but it is doable for
the Ia32 version as well.

Perhaps EarlyMPRendezvousProcedure and MPRendezvousProcedure can return
 what do you think?

Paolo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

2016-11-09 Thread Yao, Jiewen
Great work! I appreciate that.

It seems the slow emulated SMM keeps exposing the corner case on the code. :)

We will fix the bad AP in another patch.

Thank you
Yao Jiewen

From: Paolo Bonzini [mailto:pbonz...@redhat.com]
Sent: Wednesday, November 9, 2016 7:24 PM
To: Laszlo Ersek <ler...@redhat.com>
Cc: Yao, Jiewen <jiewen@intel.com>; edk2-de...@ml01.01.org; Kinney, Michael 
D <michael.d.kin...@intel.com>; Tian, Feng <feng.t...@intel.com>; Fan, Jeff 
<jeff@intel.com>; Zeng, Star <star.z...@intel.com>
Subject: Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.


>   * Second, the instruction that causes things to blow up is <0f aa>,
> i.e., RSM. I have absolutely no clue why RSM is executed:

It's probably not RSM.  RSM is probably the last instruction executed
before, and it's still in the buffer because, as you said, there's no
way that you can fetch an instruction while CR3 points into SMM.

My first thought was that the MMU is for some reason out of contact
with reality, but actually the CR3 write is correct:

 CPU-24446 [002] 39841.871040: kvm_exit: reason 
CR_ACCESS rip 0x9f05e info 103 0
 CPU-24446 [002] 39841.871040: kvm_cr:   cr_write 3 = 
0x7ff7f000

and it's coming from the stub as well.  So the second thought was that
the wakeup buffer has the wrong CR3 put into the wakeup buffer's Cr3 location.
I'm glad I kept looking because it was much more entertaining.  Especially
knowing that I (probably) will not have to fix it. :)

The basic idea for debugging was to look for interesting events and
use 0x402 writes to correlate them to the debug log.  For example, most
accesses to 0x9f??? are obviously not traced by KVM, but the first ones
are:

31519-  CPU-2 [006] 39841.783344: kvm_exit: reason 
EPT_VIOLATION rip 0x855d82 info 181 0
31520:  CPU-2 [006] 39841.783344: kvm_page_fault:   address 
9f000 error_code 181
280224- CPU-2 [006] 39841.860940: kvm_exit: reason 
EPT_VIOLATION rip 0x7ffd0d15 info 182 0
280225: CPU-2 [006] 39841.860940: kvm_page_fault:   address 
9f000 error_code 182

(The number is just the line number in the trace).  Luckily your machine
didn't have EPT accessed/dirty bits, so KVM trapped both the first read
and the first write.

The read is at

WakeupBufferStart = 9F000, WakeupBufferSize = 1000

but it's not too interesting.  The second is a good one to start debugging
because it's from SMRAM (though not from SMM, since the first kvm_enter_smm
happens later at 305930).  So it makes sense that it writes an SMRAM CR3.
There is a write to the debug log just before, at 279993, and it writes
"SmmRestoreCpu()".  As expected, the write is followed by a flurry of MSR
writes, the APIC programming at 280131, so I am pretty sure that the write to
mExchangeInfo->Cr3 comes from PrepareApStartupVector.

FWIW, I first looked at the call chain up from BackupAndPrepareWakeupBuffer,
but that led me nowhere for an hour.  So I was a bit lucky indeed. :)

Anyhow, SmmRestoreCpu is the SmmS3ResumeEntryPoint for S3Resume2Pei, and
indeed, earlier in the log you have this debugging output from S3Resume2Pei:

SMM S3 CR3  = 7FF7F000

Doh, maybe I should have looked at the log before the trace.  Who knows.
Anyway, the SMM_S3_RESUME_STATE is initialized by InitSmmS3ResumeState,
so the CR3 is the one that is initialized by InitSmmS3Cr3 in
UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmmProfileArch.c.  At this point I
was still thinking that this CR3 was wrong, but by looking at the
places where SMM is entered, and correlating that with debug log writes,
the puzzle was relatively easy to solve:

1) SMBASE relocation, done by SmmRestoreCpu:

305930: CPU-24445 [005] 39841.871264: kvm_enter_smm:vcpu 1: 
entering SMM, smbase 0x3
306000: CPU-24445 [005] 39841.871318: kvm_enter_smm:vcpu 1: 
leaving SMM, smbase 0x7ffb3000
306051: CPU-24446 [002] 39841.871349: kvm_enter_smm:vcpu 2: 
entering SMM, smbase 0x3
306108: CPU-24446 [002] 39841.871390: kvm_enter_smm:vcpu 2: 
leaving SMM, smbase 0x7ffb5000
306161: CPU-24447 [004] 39841.871421: kvm_enter_smm:vcpu 3: 
entering SMM, smbase 0x3
306218: CPU-24447 [004] 39841.871463: kvm_enter_smm:vcpu 3: 
leaving SMM, smbase 0x7ffb7000
306254: CPU-2 [006] 39841.871473: kvm_enter_smm:vcpu 0: 
entering SMM, smbase 0x3
306311: CPU-2 [006] 39841.871512: kvm_enter_smm:vcpu 0: 
leaving SMM, smbase 0x7ffb1000

2) S3ResumeExecuteBootScript (again, the previous 0x402 write ends
at 334597 and promptly gives us a clue):

334698: CPU-24445 [005] 39841.882706: kvm_enter_smm:vcpu 1: 
entering SMM, smbase 0x7ffb3000
334699: CPU-24447 [004] 39841.882706: k

Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

2016-11-09 Thread Yao, Jiewen
 What I found is that the BSP doesn't wait for the AP rendezvous before closing 
SMRAM.
[Jiewen] That is a good catch. Thanks to explain.
I believe that is more convincible than AP getting interrupt. :)

We have some places where BSP talking to AP in S3.

1)  CpuS3.c - EarlyInitializeCpu()

2)  CpuS3.c - SmmRelocateBases()

3)  CpuS3.c - InitializeCpu()

4)  S3Resume.c - SendSmiIpiAllExcludingSelf()

I believe we can guarantee 1/2/3 is good, because I found we check BSP check 
mNumberToFinish.
4 is a risk, because there is no AP finish check. If the AP is in below 1M with 
CR3 in SMRAM, it will be a trouble.

Once the AP executes RSM and return to non-SMM, the CR3 is no longer valid and 
AP must be crashed immediately. WoW!

The fix, I believe, is same.
We should make 1) AP is in above 1M reserved memory, and 2) AP is in protected 
mode with paging disabled.

Thank you
Yao Jiewen

From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo Bonzini
Sent: Wednesday, November 9, 2016 7:30 PM
To: Yao, Jiewen <jiewen@intel.com>; Laszlo Ersek <ler...@redhat.com>
Cc: Tian, Feng <feng.t...@intel.com>; edk2-de...@ml01.01.org; Kinney, Michael D 
<michael.d.kin...@intel.com>; Fan, Jeff <jeff@intel.com>; Zeng, Star 
<star.z...@intel.com>
Subject: Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.



On 09/11/2016 07:25, Yao, Jiewen wrote:
> Current BSP just uses its own context to initialize AP. So that AP
> takes BSP CR3, which is SMM CR3, unfortunately. After BSP initialized
> APs, the AP is put to HALT-LOOP in X64 mode. It is the last straw,
> because X64 mode halt still need paging.
>
> 3)  The error happen, once the AP receives an interrupt (for
> whatever reason), AP starts executing code. However, that that time
> the AP might not be in SMM mode. It means SMM CR3 is not available.
> And then we see this.
>
> 4)  I guess we did not see the error, or this is RANDOM issue,
> because it depends on if AP receives an interrupt before BSP send
> INIT-SIPI-SIPI.
>
> 5)  The fix, I think, should be below: We should always put AP to
> protected mode, so that no paging is needed. We should put AP in
> above 1M reserved memory, instead of <1M memory, because <1M memory
> is restored.

For what it's worth, this is not what I observed.  What I found is that
the BSP doesn't wait for the AP rendezvous before closing SMRAM.

I'm not sure if the two things are related, but (3) would be a much
worse bug.  APs should not be receiving an interrupt.  Perhaps an NMI if
API is sitting in a CLI;HLT loop, but this is not what is happening.

Paolo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

2016-11-09 Thread Paolo Bonzini


On 09/11/2016 07:25, Yao, Jiewen wrote:
> Current BSP just uses its own context to initialize AP. So that AP
> takes BSP CR3, which is SMM CR3, unfortunately. After BSP initialized
> APs, the AP is put to HALT-LOOP in X64 mode. It is the last straw,
> because X64 mode halt still need paging.
> 
> 3)  The error happen, once the AP receives an interrupt (for
> whatever reason), AP starts executing code. However, that that time
> the AP might not be in SMM mode. It means SMM CR3 is not available.
> And then we see this.
> 
> 4)  I guess we did not see the error, or this is RANDOM issue,
> because it depends on if AP receives an interrupt before BSP send
> INIT-SIPI-SIPI.
> 
> 5)  The fix, I think, should be below: We should always put AP to
> protected mode, so that no paging is needed. We should put AP in
> above 1M reserved memory, instead of <1M memory, because <1M memory
> is restored.

For what it's worth, this is not what I observed.  What I found is that
the BSP doesn't wait for the AP rendezvous before closing SMRAM.

I'm not sure if the two things are related, but (3) would be a much
worse bug.  APs should not be receiving an interrupt.  Perhaps an NMI if
API is sitting in a CLI;HLT loop, but this is not what is happening.

Paolo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

2016-11-09 Thread Paolo Bonzini

>   * Second, the instruction that causes things to blow up is <0f aa>,
> i.e., RSM. I have absolutely no clue why RSM is executed:

It's probably not RSM.  RSM is probably the last instruction executed
before, and it's still in the buffer because, as you said, there's no
way that you can fetch an instruction while CR3 points into SMM.

My first thought was that the MMU is for some reason out of contact
with reality, but actually the CR3 write is correct:

 CPU-24446 [002] 39841.871040: kvm_exit: reason 
CR_ACCESS rip 0x9f05e info 103 0
 CPU-24446 [002] 39841.871040: kvm_cr:   cr_write 3 = 
0x7ff7f000

and it's coming from the stub as well.  So the second thought was that
the wakeup buffer has the wrong CR3 put into the wakeup buffer's Cr3 location.
I'm glad I kept looking because it was much more entertaining.  Especially
knowing that I (probably) will not have to fix it. :)

The basic idea for debugging was to look for interesting events and
use 0x402 writes to correlate them to the debug log.  For example, most
accesses to 0x9f??? are obviously not traced by KVM, but the first ones
are:

31519-  CPU-2 [006] 39841.783344: kvm_exit: reason 
EPT_VIOLATION rip 0x855d82 info 181 0
31520:  CPU-2 [006] 39841.783344: kvm_page_fault:   address 
9f000 error_code 181
280224- CPU-2 [006] 39841.860940: kvm_exit: reason 
EPT_VIOLATION rip 0x7ffd0d15 info 182 0
280225: CPU-2 [006] 39841.860940: kvm_page_fault:   address 
9f000 error_code 182

(The number is just the line number in the trace).  Luckily your machine
didn't have EPT accessed/dirty bits, so KVM trapped both the first read
and the first write.

The read is at

WakeupBufferStart = 9F000, WakeupBufferSize = 1000

but it's not too interesting.  The second is a good one to start debugging
because it's from SMRAM (though not from SMM, since the first kvm_enter_smm
happens later at 305930).  So it makes sense that it writes an SMRAM CR3.
There is a write to the debug log just before, at 279993, and it writes
"SmmRestoreCpu()".  As expected, the write is followed by a flurry of MSR
writes, the APIC programming at 280131, so I am pretty sure that the write to
mExchangeInfo->Cr3 comes from PrepareApStartupVector.

FWIW, I first looked at the call chain up from BackupAndPrepareWakeupBuffer,
but that led me nowhere for an hour.  So I was a bit lucky indeed. :)

Anyhow, SmmRestoreCpu is the SmmS3ResumeEntryPoint for S3Resume2Pei, and
indeed, earlier in the log you have this debugging output from S3Resume2Pei:

SMM S3 CR3  = 7FF7F000

Doh, maybe I should have looked at the log before the trace.  Who knows.
Anyway, the SMM_S3_RESUME_STATE is initialized by InitSmmS3ResumeState,
so the CR3 is the one that is initialized by InitSmmS3Cr3 in
UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmmProfileArch.c.  At this point I
was still thinking that this CR3 was wrong, but by looking at the
places where SMM is entered, and correlating that with debug log writes,
the puzzle was relatively easy to solve:

1) SMBASE relocation, done by SmmRestoreCpu:

305930: CPU-24445 [005] 39841.871264: kvm_enter_smm:vcpu 1: 
entering SMM, smbase 0x3
306000: CPU-24445 [005] 39841.871318: kvm_enter_smm:vcpu 1: 
leaving SMM, smbase 0x7ffb3000
306051: CPU-24446 [002] 39841.871349: kvm_enter_smm:vcpu 2: 
entering SMM, smbase 0x3
306108: CPU-24446 [002] 39841.871390: kvm_enter_smm:vcpu 2: 
leaving SMM, smbase 0x7ffb5000
306161: CPU-24447 [004] 39841.871421: kvm_enter_smm:vcpu 3: 
entering SMM, smbase 0x3
306218: CPU-24447 [004] 39841.871463: kvm_enter_smm:vcpu 3: 
leaving SMM, smbase 0x7ffb7000
306254: CPU-2 [006] 39841.871473: kvm_enter_smm:vcpu 0: 
entering SMM, smbase 0x3
306311: CPU-2 [006] 39841.871512: kvm_enter_smm:vcpu 0: 
leaving SMM, smbase 0x7ffb1000

2) S3ResumeExecuteBootScript (again, the previous 0x402 write ends
at 334597 and promptly gives us a clue):

334698: CPU-24445 [005] 39841.882706: kvm_enter_smm:vcpu 1: 
entering SMM, smbase 0x7ffb3000
334699: CPU-24447 [004] 39841.882706: kvm_enter_smm:vcpu 3: 
entering SMM, smbase 0x7ffb7000
334741: CPU-2 [006] 39841.882723: kvm_enter_smm:vcpu 0: 
entering SMM, smbase 0x7ffb1000
334742: CPU-24446 [002] 39841.882724: kvm_enter_smm:vcpu 2: 
entering SMM, smbase 0x7ffb5000
334875: CPU-2 [006] 39841.882755: kvm_enter_smm:vcpu 0: 
leaving SMM, smbase 0x7ffb1000

Here I think that it's where things go awry.  The lines after
S3ResumeExecuteBootScript() are

   Close all SMRAM regions before executing boot script
   Lock all SMRAM regions before executing boot script

and indeed the first is at 334898, 

Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

2016-11-08 Thread Yao, Jiewen
Hi Laszlo
I will fix DEBUG message issue in V3 patch.

Below is rest issues:


l  Case 13: S3 fails randomly.
A good news: I worked with Jeff Fan to root-cause the S3 resume issue. Here is 
detail.


1)  We believe the dead CPU is AP. Not BSP.
The reason is that:

1.1)   The BSP already transfer control to OS waking vector. The GDT/IDT/CR3 
should be set by OS.

1.2)   The current dead CPU still has GDT/IDT point to a BIOS reserved memory. 
The CS/DS/SS is typical BIOS X64 mode setting.

1.3)   The current dead CPU still has CR3 in SMM. (Which is obvious wrong)


2)  Based upon the 1), we reviewed S3 resume AP flow.
Current BSP will wake up AP in SMRAM, for security consideration. At that time, 
we are using SMM mode CR3. It is OK for BSP because BSP is NOT in SMM mode yet. 
Even after SMM rebase, we can still use it because SMRR is not set in first SMM 
rebase.
Current BSP just uses its own context to initialize AP. So that AP takes BSP 
CR3, which is SMM CR3, unfortunately.
After BSP initialized APs, the AP is put to HALT-LOOP in X64 mode. It is the 
last straw, because X64 mode halt still need paging.


3)  The error happen, once the AP receives an interrupt (for whatever 
reason), AP starts executing code. However, that that time the AP might not be 
in SMM mode. It means SMM CR3 is not available. And then we see this.


4)  I guess we did not see the error, or this is RANDOM issue, because it 
depends on if AP receives an interrupt before BSP send INIT-SIPI-SIPI.


5)  The fix, I think, should be below:
We should always put AP to protected mode, so that no paging is needed.
We should put AP in above 1M reserved memory, instead of <1M memory, because 
<1M memory is restored.


Would you please file a bugzillar? I think we need assign CPU owner to fix that 
critical issue.

There is no need to do more investigation. Thanks for your great help on that. 
:)




l  Case 17 - I do not think it is a real issue, because SMM is out of resource.


l  Case 8 - that is a very weird issue. I talk with Jeff again. I do not have a 
clear clue yet.
> ASSERT MdePkg/Library/BaseSynchronizationLib/SynchronizationGcc.c(73): 
> SpinLock != ((void *) 0)
Here is code. We do not know why there is some code need InitializeSpinLock 
after ExitBootServices.
SPIN_LOCK *
EFIAPI
InitializeSpinLock (
  OUT  SPIN_LOCK *SpinLock
  )
{
  ASSERT (SpinLock != NULL);

  _ReadWriteBarrier();
  *SpinLock = SPIN_LOCK_RELEASED;
  _ReadWriteBarrier();

  return SpinLock;
}

If you can have a quick check on below, that would be great.

1)  Which processor triggers this ASSERT? BSP or AP.

2)  Which module triggers this ASSERT? Which module contains current RIP 
value?

At same time, all my OS test is on real platform. I have not setup OVMF env to 
run an OS yet.
If you can share a step by step to me, that would be great.

Thank you
Yao Jiewen

From: edk2-devel [mailto:edk2-devel-boun...@lists.01.org] On Behalf Of Laszlo 
Ersek
Sent: Tuesday, November 8, 2016 9:22 AM
To: Yao, Jiewen <jiewen@intel.com>
Cc: Tian, Feng <feng.t...@intel.com>; edk2-de...@ml01.01.org; Kinney, Michael D 
<michael.d.kin...@intel.com>; Paolo Bonzini <pbonz...@redhat.com>; Fan, Jeff 
<jeff@intel.com>; Zeng, Star <star.z...@intel.com>
Subject: Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

On 11/04/16 10:30, Jiewen Yao wrote:
>  below is V2 description 
> 1) PiSmmCpu: resolve OVMF multiple processors boot hang issue.
> 2) PiSmmCpu: Add debug info on StartupAp() fails.
> 3) PiSmmCpu: Add ASSERT for AllocatePages().
> 4) PiSmmCpu: Add protection detail in commit message.
> 5) UefiCpuPkg.dsc: Add page table footprint info in commit message.
>
>  below is V1 description 
> This series patch enables SMM page level protection.
> Features are:
> 1) PiSmmCore reports SMM PE image code/data information
> in EdkiiPiSmmMemoryAttributeTable, if the SMM image is page aligned.
> 2) PiSmmCpu consumes EdkiiPiSmmMemoryAttributeTable
> and set XD for data page and RO for code page.
> 3) PiSmmCpu enables Static Paging for X64 according to
> PcdCpuSmmStaticPageTable. If it is true, 1G paging for above 4G
> is used as long as it is supported.
> 4) PiSmmCpu sets importance data structure to be read only,
> such as Gdt, Idt, SmmEntrypoint, and PageTable itself.
>
> tested platform:
> 1) Intel internal platform (X64).
> 2) EDKII Quark IA32
> 3) EDKII Vlv2  X64
> 4) EDKII OVMF IA32 and IA32X64. (with -smp 8)
>
> Cc: Jeff Fan <jeff@intel.com<mailto:jeff@intel.com>>
> Cc: Feng Tian <feng.t...@intel.com<mailto:feng.t...@intel.com>>
> Cc: Star Zeng <star.z...@intel.com<mailto:star.z...@intel.com>>
> Cc: Michael D Kinney 
> <michael.d.kin...@intel.com<mailto:michael.d.kin...@intel.com>>
> Cc: Laszlo Ersek <ler.

Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

2016-11-08 Thread Yao, Jiewen
Yes, it is a good idea to check "(mSmmMpSyncData->EffectiveSyncMode == 
SmmCpuSyncModeTradition)".
I agree.

Thank you
Yao Jiewen

From: Laszlo Ersek [mailto:ler...@redhat.com]
Sent: Tuesday, November 8, 2016 9:23 PM
To: Yao, Jiewen <jiewen@intel.com>
Cc: Tian, Feng <feng.t...@intel.com>; edk2-de...@ml01.01.org; Kinney, Michael D 
<michael.d.kin...@intel.com>; Paolo Bonzini <pbonz...@redhat.com>; Fan, Jeff 
<jeff@intel.com>; Zeng, Star <star.z...@intel.com>
Subject: Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

On 11/08/16 13:59, Yao, Jiewen wrote:
> HI Laszlo
>
> Thanks for the detail test result.
>
>
>
> Quick comment for the debug message:
>
> 1)  For "ConvertPageEntryAttribute 0x7F92B067->0x7F92B065", I agree
> to change to DEBUG_VERBOSE, because it pure debug purpose.
>
>
>
> 2)  For "!mSmmMpSyncData->CpuData[1].Present", I think people has
> interest to know startup failure reason. I would prefer to keep current
> DEBUG_ERROR.

I agree that DEBUG_ERROR is approprite for messages that can directly
relate to startup failures.

However, does this condition unavoidably imply startup failure? Because,
as demonstrated by QEMU + OVMF, a platform where an SMI does not pull
all processors into SMM at once can still work with PiSmmCpuDxeSmm,
assuming the appropriate PCD settings.

Therefore, can we make this error message conditional on

  (mSmmMpSyncData->EffectiveSyncMode == SmmCpuSyncModeTradition)

? Because, "not present" is an error for the traditional sync mode, but
for the relaxed / directed mode, "not present" is expected. Isn't it?


> At same time, I understand your OVMF concern on too many debug message
> in FlushTlb. So I plan to resolve problem in another way.
>
> I will check "mSmmMpSyncData->CpuData[1].Present" before calling
> SmmBlockingStartupThisAp(). So you will not see any debug message in
> FlashTlb(). J
>
>
>
> What about your idea?

If we cannot omit (or downgrade) the message for
SmmCpuSyncModeRelaxedAp, then decreasing its frequency would be appreciated.

Thanks
Laszlo

>
>
> *From:*edk2-devel [mailto:edk2-devel-boun...@lists.01.org] *On Behalf Of
> *Laszlo Ersek
> *Sent:* Tuesday, November 8, 2016 9:22 AM
> *To:* Yao, Jiewen <jiewen@intel.com<mailto:jiewen@intel.com>>
> *Cc:* Tian, Feng <feng.t...@intel.com<mailto:feng.t...@intel.com>>; 
> edk2-de...@ml01.01.org<mailto:edk2-de...@ml01.01.org>; Kinney,
> Michael D <michael.d.kin...@intel.com<mailto:michael.d.kin...@intel.com>>; 
> Paolo Bonzini
> <pbonz...@redhat.com<mailto:pbonz...@redhat.com>>; Fan, Jeff 
> <jeff@intel.com<mailto:jeff@intel.com>>; Zeng, Star
> <star.z...@intel.com<mailto:star.z...@intel.com>>
> *Subject:* Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.
>
>
>
> On 11/04/16 10:30, Jiewen Yao wrote:
>>  below is V2 description 
>> 1) PiSmmCpu: resolve OVMF multiple processors boot hang issue.
>> 2) PiSmmCpu: Add debug info on StartupAp() fails.
>> 3) PiSmmCpu: Add ASSERT for AllocatePages().
>> 4) PiSmmCpu: Add protection detail in commit message.
>> 5) UefiCpuPkg.dsc: Add page table footprint info in commit message.
>>
>>  below is V1 description 
>> This series patch enables SMM page level protection.
>> Features are:
>> 1) PiSmmCore reports SMM PE image code/data information
>> in EdkiiPiSmmMemoryAttributeTable, if the SMM image is page aligned.
>> 2) PiSmmCpu consumes EdkiiPiSmmMemoryAttributeTable
>> and set XD for data page and RO for code page.
>> 3) PiSmmCpu enables Static Paging for X64 according to
>> PcdCpuSmmStaticPageTable. If it is true, 1G paging for above 4G
>> is used as long as it is supported.
>> 4) PiSmmCpu sets importance data structure to be read only,
>> such as Gdt, Idt, SmmEntrypoint, and PageTable itself.
>>
>> tested platform:
>> 1) Intel internal platform (X64).
>> 2) EDKII Quark IA32
>> 3) EDKII Vlv2  X64
>> 4) EDKII OVMF IA32 and IA32X64. (with -smp 8)
>>
>> Cc: Jeff Fan <jeff@intel.com 
>> <mailto:jeff@intel.com<mailto:jeff@intel.com 
>> %3cmailto:jeff@intel.com>>>
>> Cc: Feng Tian <feng.t...@intel.com 
>> <mailto:feng.t...@intel.com<mailto:feng.t...@intel.com 
>> %3cmailto:feng.t...@intel.com>>>
>> Cc: Star Zeng <star.z...@intel.com 
>> <mailto:star.z...@intel.com<mailto:star.z...@intel.com 
>> %3cmailto:star.z...@intel.com>>>
>> Cc: Michael D Kinney <michael.d.kin...@intel.com 
>> <mailto:michae

Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

2016-11-08 Thread Laszlo Ersek
On 11/08/16 13:59, Yao, Jiewen wrote:
> HI Laszlo
> 
> Thanks for the detail test result.
> 
>  
> 
> Quick comment for the debug message:
> 
> 1)  For “ConvertPageEntryAttribute 0x7F92B067->0x7F92B065”, I agree
> to change to DEBUG_VERBOSE, because it pure debug purpose.
> 
>  
> 
> 2)  For “!mSmmMpSyncData->CpuData[1].Present”, I think people has
> interest to know startup failure reason. I would prefer to keep current
> DEBUG_ERROR.

I agree that DEBUG_ERROR is approprite for messages that can directly
relate to startup failures.

However, does this condition unavoidably imply startup failure? Because,
as demonstrated by QEMU + OVMF, a platform where an SMI does not pull
all processors into SMM at once can still work with PiSmmCpuDxeSmm,
assuming the appropriate PCD settings.

Therefore, can we make this error message conditional on

  (mSmmMpSyncData->EffectiveSyncMode == SmmCpuSyncModeTradition)

? Because, "not present" is an error for the traditional sync mode, but
for the relaxed / directed mode, "not present" is expected. Isn't it?


> At same time, I understand your OVMF concern on too many debug message
> in FlushTlb. So I plan to resolve problem in another way.
> 
> I will check “mSmmMpSyncData->CpuData[1].Present” before calling
> SmmBlockingStartupThisAp(). So you will not see any debug message in
> FlashTlb(). J
> 
>  
> 
> What about your idea?

If we cannot omit (or downgrade) the message for
SmmCpuSyncModeRelaxedAp, then decreasing its frequency would be appreciated.

Thanks
Laszlo

>  
> 
> *From:*edk2-devel [mailto:edk2-devel-boun...@lists.01.org] *On Behalf Of
> *Laszlo Ersek
> *Sent:* Tuesday, November 8, 2016 9:22 AM
> *To:* Yao, Jiewen <jiewen@intel.com>
> *Cc:* Tian, Feng <feng.t...@intel.com>; edk2-de...@ml01.01.org; Kinney,
> Michael D <michael.d.kin...@intel.com>; Paolo Bonzini
> <pbonz...@redhat.com>; Fan, Jeff <jeff@intel.com>; Zeng, Star
> <star.z...@intel.com>
> *Subject:* Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.
> 
>  
> 
> On 11/04/16 10:30, Jiewen Yao wrote:
>>  below is V2 description 
>> 1) PiSmmCpu: resolve OVMF multiple processors boot hang issue.
>> 2) PiSmmCpu: Add debug info on StartupAp() fails.
>> 3) PiSmmCpu: Add ASSERT for AllocatePages().
>> 4) PiSmmCpu: Add protection detail in commit message.
>> 5) UefiCpuPkg.dsc: Add page table footprint info in commit message.
>>
>>  below is V1 description 
>> This series patch enables SMM page level protection.
>> Features are:
>> 1) PiSmmCore reports SMM PE image code/data information
>> in EdkiiPiSmmMemoryAttributeTable, if the SMM image is page aligned.
>> 2) PiSmmCpu consumes EdkiiPiSmmMemoryAttributeTable
>> and set XD for data page and RO for code page.
>> 3) PiSmmCpu enables Static Paging for X64 according to
>> PcdCpuSmmStaticPageTable. If it is true, 1G paging for above 4G
>> is used as long as it is supported.
>> 4) PiSmmCpu sets importance data structure to be read only,
>> such as Gdt, Idt, SmmEntrypoint, and PageTable itself.
>>
>> tested platform:
>> 1) Intel internal platform (X64).
>> 2) EDKII Quark IA32
>> 3) EDKII Vlv2  X64
>> 4) EDKII OVMF IA32 and IA32X64. (with -smp 8)
>>
>> Cc: Jeff Fan <jeff@intel.com <mailto:jeff@intel.com>>
>> Cc: Feng Tian <feng.t...@intel.com <mailto:feng.t...@intel.com>>
>> Cc: Star Zeng <star.z...@intel.com <mailto:star.z...@intel.com>>
>> Cc: Michael D Kinney <michael.d.kin...@intel.com 
>> <mailto:michael.d.kin...@intel.com>>
>> Cc: Laszlo Ersek <ler...@redhat.com <mailto:ler...@redhat.com>>
>> Contributed-under: TianoCore Contribution Agreement 1.0
>> Signed-off-by: Jiewen Yao <jiewen@intel.com 
>> <mailto:jiewen@intel.com>>
> 
> I have new test results. Let's start with the table again:
> 
> Legend:
> 
> - "untested" means the test was not executed because the same test
>   failed or proved unreliable in a less demanding configuration already,
> 
> - "n/a" means a setting or test case was impossible,
> 
> - "fail" and "unreliable" (lower case) are outside the scope of this
>   series; they either capture the pre-series status, or are expected
>   even with the series applied due to the pre-series status,
> 
> - "FAIL" and "UNRELIABLE" mean regressions caused (or exposed) by the
>   series.
> 
> In all cases, 36 bits were used as address width in the CPU HOB (--> up
> to 64GB guest-phys address space).
> 
>series  

Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

2016-11-08 Thread Yao, Jiewen
HI Laszlo
Thanks for the detail test result.

Quick comment for the debug message:

1)  For "ConvertPageEntryAttribute 0x7F92B067->0x7F92B065", I agree to 
change to DEBUG_VERBOSE, because it pure debug purpose.


2)  For "!mSmmMpSyncData->CpuData[1].Present", I think people has interest 
to know startup failure reason. I would prefer to keep current DEBUG_ERROR.



At same time, I understand your OVMF concern on too many debug message in 
FlushTlb. So I plan to resolve problem in another way.

I will check "mSmmMpSyncData->CpuData[1].Present" before calling 
SmmBlockingStartupThisAp(). So you will not see any debug message in 
FlashTlb(). :)

What about your idea?

Thank you
Yao Jiewen

From: edk2-devel [mailto:edk2-devel-boun...@lists.01.org] On Behalf Of Laszlo 
Ersek
Sent: Tuesday, November 8, 2016 9:22 AM
To: Yao, Jiewen <jiewen@intel.com>
Cc: Tian, Feng <feng.t...@intel.com>; edk2-de...@ml01.01.org; Kinney, Michael D 
<michael.d.kin...@intel.com>; Paolo Bonzini <pbonz...@redhat.com>; Fan, Jeff 
<jeff....@intel.com>; Zeng, Star <star.z...@intel.com>
Subject: Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

On 11/04/16 10:30, Jiewen Yao wrote:
>  below is V2 description 
> 1) PiSmmCpu: resolve OVMF multiple processors boot hang issue.
> 2) PiSmmCpu: Add debug info on StartupAp() fails.
> 3) PiSmmCpu: Add ASSERT for AllocatePages().
> 4) PiSmmCpu: Add protection detail in commit message.
> 5) UefiCpuPkg.dsc: Add page table footprint info in commit message.
>
>  below is V1 description 
> This series patch enables SMM page level protection.
> Features are:
> 1) PiSmmCore reports SMM PE image code/data information
> in EdkiiPiSmmMemoryAttributeTable, if the SMM image is page aligned.
> 2) PiSmmCpu consumes EdkiiPiSmmMemoryAttributeTable
> and set XD for data page and RO for code page.
> 3) PiSmmCpu enables Static Paging for X64 according to
> PcdCpuSmmStaticPageTable. If it is true, 1G paging for above 4G
> is used as long as it is supported.
> 4) PiSmmCpu sets importance data structure to be read only,
> such as Gdt, Idt, SmmEntrypoint, and PageTable itself.
>
> tested platform:
> 1) Intel internal platform (X64).
> 2) EDKII Quark IA32
> 3) EDKII Vlv2  X64
> 4) EDKII OVMF IA32 and IA32X64. (with -smp 8)
>
> Cc: Jeff Fan <jeff@intel.com<mailto:jeff@intel.com>>
> Cc: Feng Tian <feng.t...@intel.com<mailto:feng.t...@intel.com>>
> Cc: Star Zeng <star.z...@intel.com<mailto:star.z...@intel.com>>
> Cc: Michael D Kinney 
> <michael.d.kin...@intel.com<mailto:michael.d.kin...@intel.com>>
> Cc: Laszlo Ersek <ler...@redhat.com<mailto:ler...@redhat.com>>
> Contributed-under: TianoCore Contribution Agreement 1.0
> Signed-off-by: Jiewen Yao <jiewen@intel.com<mailto:jiewen@intel.com>>

I have new test results. Let's start with the table again:

Legend:

- "untested" means the test was not executed because the same test
  failed or proved unreliable in a less demanding configuration already,

- "n/a" means a setting or test case was impossible,

- "fail" and "unreliable" (lower case) are outside the scope of this
  series; they either capture the pre-series status, or are expected
  even with the series applied due to the pre-series status,

- "FAIL" and "UNRELIABLE" mean regressions caused (or exposed) by the
  series.

In all cases, 36 bits were used as address width in the CPU HOB (--> up
to 64GB guest-phys address space).

   series  OVMF  
VCPU boot   S3 resume
 # applied platform PcdCpuMaxLogicalProcessorNumber PcdCpuSmmStaticPageTable 
topology result result
-- ---  ---  
 -- -
 1 no  Ia32  64 n/a  
1x2x2pass   unreliable
 2 no  Ia32 255 n/a  
52x2x2   pass   untested
 3 no  Ia32 255 n/a  
53x2x2   unreliable untested
 4 no  Ia32X64   64 n/a  
1x2x2pass   unreliable
 5 no  Ia32X64  255 n/a  
52x2x2   pass   untested
 6 no  Ia32X64  255 n/a  
54x2x2   fail   n/a
 7 v2  Ia32  64 FALSE
1x2x2pass   untested
 8 v2  Ia32  64 TRUE 
1x2x2FAIL   untested
 9 v2  Ia32 255 FALSE

Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

2016-11-07 Thread Laszlo Ersek
On 11/04/16 10:30, Jiewen Yao wrote:
>  below is V2 description 
> 1) PiSmmCpu: resolve OVMF multiple processors boot hang issue.
> 2) PiSmmCpu: Add debug info on StartupAp() fails.
> 3) PiSmmCpu: Add ASSERT for AllocatePages().
> 4) PiSmmCpu: Add protection detail in commit message.
> 5) UefiCpuPkg.dsc: Add page table footprint info in commit message.
>
>  below is V1 description 
> This series patch enables SMM page level protection.
> Features are:
> 1) PiSmmCore reports SMM PE image code/data information
> in EdkiiPiSmmMemoryAttributeTable, if the SMM image is page aligned.
> 2) PiSmmCpu consumes EdkiiPiSmmMemoryAttributeTable
> and set XD for data page and RO for code page.
> 3) PiSmmCpu enables Static Paging for X64 according to
> PcdCpuSmmStaticPageTable. If it is true, 1G paging for above 4G
> is used as long as it is supported.
> 4) PiSmmCpu sets importance data structure to be read only,
> such as Gdt, Idt, SmmEntrypoint, and PageTable itself.
>
> tested platform:
> 1) Intel internal platform (X64).
> 2) EDKII Quark IA32
> 3) EDKII Vlv2  X64
> 4) EDKII OVMF IA32 and IA32X64. (with -smp 8)
>
> Cc: Jeff Fan 
> Cc: Feng Tian 
> Cc: Star Zeng 
> Cc: Michael D Kinney 
> Cc: Laszlo Ersek 
> Contributed-under: TianoCore Contribution Agreement 1.0
> Signed-off-by: Jiewen Yao 

I have new test results. Let's start with the table again:

Legend:

- "untested" means the test was not executed because the same test
  failed or proved unreliable in a less demanding configuration already,

- "n/a" means a setting or test case was impossible,

- "fail" and "unreliable" (lower case) are outside the scope of this
  series; they either capture the pre-series status, or are expected
  even with the series applied due to the pre-series status,

- "FAIL" and "UNRELIABLE" mean regressions caused (or exposed) by the
  series.

In all cases, 36 bits were used as address width in the CPU HOB (--> up
to 64GB guest-phys address space).

   series  OVMF  
VCPU boot   S3 resume
 # applied platform PcdCpuMaxLogicalProcessorNumber PcdCpuSmmStaticPageTable 
topology result result
-- ---  ---  
 -- -
 1 no  Ia32  64 n/a  
1x2x2pass   unreliable
 2 no  Ia32 255 n/a  
52x2x2   pass   untested
 3 no  Ia32 255 n/a  
53x2x2   unreliable untested
 4 no  Ia32X64   64 n/a  
1x2x2pass   unreliable
 5 no  Ia32X64  255 n/a  
52x2x2   pass   untested
 6 no  Ia32X64  255 n/a  
54x2x2   fail   n/a
 7 v2  Ia32  64 FALSE
1x2x2pass   untested
 8 v2  Ia32  64 TRUE 
1x2x2FAIL   untested
 9 v2  Ia32 255 FALSE
52x2x2   pass   untested
10 v2  Ia32 255 FALSE
53x2x2   untested   untested
11 v2  Ia32 255 TRUE 
52x2x2   untested   untested
12 v2  Ia32 255 TRUE 
53x2x2   untested   untested
13 v2  Ia32X64   64 FALSE
1x2x2pass   unreliable
14 v2  Ia32X64   64 TRUE 
1x2x2pass   untested
15 v2  Ia32X64  255 FALSE
52x2x2   pass   untested
16 v2  Ia32X64  255 FALSE
54x2x2   untested   untested
17 v2  Ia32X64  255 TRUE 
52x2x2   FAIL   untested
18 v2  Ia32X64  255 TRUE 
54x2x2   untested   untested

* Case 8: this test case failed with v2 as well, but this time with
  different symptoms:

> FSOpen: Open '\EFI\fedora\grubia32.efi' Success
> InstallProtocolInterface: [EfiLoadedImageProtocol] 7E4037A8
> Loading driver at 0x0007DA7F000 EntryPoint=0x0007DA7F400
> InstallProtocolInterface: [EfiLoadedImageDevicePathProtocol] 7E403A90
> PixelBlueGreenRedReserved8BitPerColor
> ConvertPages: Incompatible memory types
> PixelBlueGreenRedReserved8BitPerColor
> ConvertPages: Incompatible memory types
> SmmInstallProtocolInterface: [EdkiiSmmExitBootServicesProtocol] 0
> MpInitExitBootServicesCallback() done!
> ASSERT 

Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

2016-11-04 Thread Laszlo Ersek
On 11/04/16 23:46, Yao, Jiewen wrote:
> Ah, yes. Laszlo. You are right.
> 
> I forget to push the last update yesterday. Thank you to remind me.
> Now it is synced.

Thanks! The commit message updates and the v1->v2 differences look
good/reasonable to me (I diffed the code-level end results of the two
versions, plus I compared the commit messages pairwise). I hope to test
v2 sometime next week, and I intend to look into the S3 instability too
(I took note of Paolo's advice with the "info tlb" QEMU monitor command).

Going through the (now documented) SMRAM impact again, I realize the
platform can elect to set PcdCpuSmmStaticPageTable dynamically as well.
I'm sort of guessing that we might want to set the PCD in OVMF's
PlatformPei, based on the guest-phys address width (which we also
calculate in PlatformPei), in combination with availability of 1G
paging. The case we should likely avoid is

> A) If the system only supports 2M paging,
> When the whole memory/MMIO is 48bit, we need 1+256+256*256 pages
>   (~ 257M)

Anyway, I don't want to be too clever about this until we see a problem
(out-of-SMRAM) in practice.

Thanks!
Laszlo

> From: Laszlo Ersek [mailto:ler...@redhat.com]
> Sent: Saturday, November 5, 2016 6:40 AM
> To: Yao, Jiewen <jiewen@intel.com>; edk2-de...@ml01.01.org
> Cc: Kinney, Michael D <michael.d.kin...@intel.com>; Tian, Feng 
> <feng.t...@intel.com>; Fan, Jeff <jeff....@intel.com>; Zeng, Star 
> <star.z...@intel.com>
> Subject: Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.
> 
> On 11/04/16 10:30, Jiewen Yao wrote:
>>  below is V2 description 
>> 1) PiSmmCpu: resolve OVMF multiple processors boot hang issue.
>> 2) PiSmmCpu: Add debug info on StartupAp() fails.
>> 3) PiSmmCpu: Add ASSERT for AllocatePages().
>> 4) PiSmmCpu: Add protection detail in commit message.
>> 5) UefiCpuPkg.dsc: Add page table footprint info in commit message.
> 
> Jiewen, can you please push this series to a new branch in your repo?
> 
> I see a branch called "SmmProtection_V2", but it seems to end with an
> incomplete patch (26f482d8b611d0fcb07d3ffbf3f4468fd249767b, subject
> "pismmcpu"), so I figured I'd ask explicitly.
> 
> Thanks
> Laszlo
> 
>>  below is V1 description 
>> This series patch enables SMM page level protection.
>> Features are:
>> 1) PiSmmCore reports SMM PE image code/data information
>> in EdkiiPiSmmMemoryAttributeTable, if the SMM image is page aligned.
>> 2) PiSmmCpu consumes EdkiiPiSmmMemoryAttributeTable
>> and set XD for data page and RO for code page.
>> 3) PiSmmCpu enables Static Paging for X64 according to
>> PcdCpuSmmStaticPageTable. If it is true, 1G paging for above 4G
>> is used as long as it is supported.
>> 4) PiSmmCpu sets importance data structure to be read only,
>> such as Gdt, Idt, SmmEntrypoint, and PageTable itself.
>>
>> tested platform:
>> 1) Intel internal platform (X64).
>> 2) EDKII Quark IA32
>> 3) EDKII Vlv2  X64
>> 4) EDKII OVMF IA32 and IA32X64. (with -smp 8)
>>
>> Cc: Jeff Fan <jeff@intel.com<mailto:jeff@intel.com>>
>> Cc: Feng Tian <feng.t...@intel.com<mailto:feng.t...@intel.com>>
>> Cc: Star Zeng <star.z...@intel.com<mailto:star.z...@intel.com>>
>> Cc: Michael D Kinney 
>> <michael.d.kin...@intel.com<mailto:michael.d.kin...@intel.com>>
>> Cc: Laszlo Ersek <ler...@redhat.com<mailto:ler...@redhat.com>>
>> Contributed-under: TianoCore Contribution Agreement 1.0
>> Signed-off-by: Jiewen Yao <jiewen@intel.com<mailto:jiewen@intel.com>>
>>
>> Jiewen Yao (6):
>>   MdeModulePkg/Include: Add PiSmmMemoryAttributesTable.h
>>   MdeModulePkg/dec: Add gEdkiiPiSmmMemoryAttributesTableGuid.
>>   MdeModulePkg/PiSmmCore: Add MemoryAttributes support.
>>   UefiCpuPkg/dec: Add PcdCpuSmmStaticPageTable.
>>   UefiCpuPkg/PiSmmCpuDxeSmm: Add paging protection.
>>   QuarkPlatformPkg/dsc: enable Smm paging protection.
>>
>>  MdeModulePkg/Core/PiSmmCore/Dispatcher.c   |   66 +
>>  MdeModulePkg/Core/PiSmmCore/MemoryAttributesTable.c| 1509 
>> 
>>  MdeModulePkg/Core/PiSmmCore/Page.c |  775 +-
>>  MdeModulePkg/Core/PiSmmCore/PiSmmCore.c|   40 +
>>  MdeModulePkg/Core/PiSmmCore/PiSmmCore.h|   91 ++
>>  MdeModulePkg/Core/PiSmmCore/PiSmmCore.inf  |2 +
>>  MdeModulePkg/Core/PiSmmCore/Pool.c |   16 +
>>  MdeModulePkg/Include/Guid/PiSmmMemoryAttributesTable.h |   51 +
>>  MdeModulePkg/MdeMo

Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

2016-11-04 Thread Yao, Jiewen
Ah, yes. Laszlo. You are right.

I forget to push the last update yesterday. Thank you to remind me.
Now it is synced.

Thank you
Yao Jiewen

From: Laszlo Ersek [mailto:ler...@redhat.com]
Sent: Saturday, November 5, 2016 6:40 AM
To: Yao, Jiewen <jiewen@intel.com>; edk2-de...@ml01.01.org
Cc: Kinney, Michael D <michael.d.kin...@intel.com>; Tian, Feng 
<feng.t...@intel.com>; Fan, Jeff <jeff@intel.com>; Zeng, Star 
<star.z...@intel.com>
Subject: Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

On 11/04/16 10:30, Jiewen Yao wrote:
>  below is V2 description 
> 1) PiSmmCpu: resolve OVMF multiple processors boot hang issue.
> 2) PiSmmCpu: Add debug info on StartupAp() fails.
> 3) PiSmmCpu: Add ASSERT for AllocatePages().
> 4) PiSmmCpu: Add protection detail in commit message.
> 5) UefiCpuPkg.dsc: Add page table footprint info in commit message.

Jiewen, can you please push this series to a new branch in your repo?

I see a branch called "SmmProtection_V2", but it seems to end with an
incomplete patch (26f482d8b611d0fcb07d3ffbf3f4468fd249767b, subject
"pismmcpu"), so I figured I'd ask explicitly.

Thanks
Laszlo

>  below is V1 description 
> This series patch enables SMM page level protection.
> Features are:
> 1) PiSmmCore reports SMM PE image code/data information
> in EdkiiPiSmmMemoryAttributeTable, if the SMM image is page aligned.
> 2) PiSmmCpu consumes EdkiiPiSmmMemoryAttributeTable
> and set XD for data page and RO for code page.
> 3) PiSmmCpu enables Static Paging for X64 according to
> PcdCpuSmmStaticPageTable. If it is true, 1G paging for above 4G
> is used as long as it is supported.
> 4) PiSmmCpu sets importance data structure to be read only,
> such as Gdt, Idt, SmmEntrypoint, and PageTable itself.
>
> tested platform:
> 1) Intel internal platform (X64).
> 2) EDKII Quark IA32
> 3) EDKII Vlv2  X64
> 4) EDKII OVMF IA32 and IA32X64. (with -smp 8)
>
> Cc: Jeff Fan <jeff@intel.com<mailto:jeff@intel.com>>
> Cc: Feng Tian <feng.t...@intel.com<mailto:feng.t...@intel.com>>
> Cc: Star Zeng <star.z...@intel.com<mailto:star.z...@intel.com>>
> Cc: Michael D Kinney 
> <michael.d.kin...@intel.com<mailto:michael.d.kin...@intel.com>>
> Cc: Laszlo Ersek <ler...@redhat.com<mailto:ler...@redhat.com>>
> Contributed-under: TianoCore Contribution Agreement 1.0
> Signed-off-by: Jiewen Yao <jiewen@intel.com<mailto:jiewen@intel.com>>
>
> Jiewen Yao (6):
>   MdeModulePkg/Include: Add PiSmmMemoryAttributesTable.h
>   MdeModulePkg/dec: Add gEdkiiPiSmmMemoryAttributesTableGuid.
>   MdeModulePkg/PiSmmCore: Add MemoryAttributes support.
>   UefiCpuPkg/dec: Add PcdCpuSmmStaticPageTable.
>   UefiCpuPkg/PiSmmCpuDxeSmm: Add paging protection.
>   QuarkPlatformPkg/dsc: enable Smm paging protection.
>
>  MdeModulePkg/Core/PiSmmCore/Dispatcher.c   |   66 +
>  MdeModulePkg/Core/PiSmmCore/MemoryAttributesTable.c| 1509 
> 
>  MdeModulePkg/Core/PiSmmCore/Page.c |  775 +-
>  MdeModulePkg/Core/PiSmmCore/PiSmmCore.c|   40 +
>  MdeModulePkg/Core/PiSmmCore/PiSmmCore.h|   91 ++
>  MdeModulePkg/Core/PiSmmCore/PiSmmCore.inf  |2 +
>  MdeModulePkg/Core/PiSmmCore/Pool.c |   16 +
>  MdeModulePkg/Include/Guid/PiSmmMemoryAttributesTable.h |   51 +
>  MdeModulePkg/MdeModulePkg.dec  |3 +
>  QuarkPlatformPkg/Quark.dsc |6 +
>  UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/PageTbl.c   |   71 +-
>  UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmiEntry.S  |   67 +-
>  UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmiEntry.asm|   68 +-
>  UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmiEntry.nasm   |   70 +-
>  UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmiException.S  |  226 +--
>  UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmiException.asm|   36 +-
>  UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmiException.nasm   |   36 +-
>  UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmmFuncsArch.c  |   37 +-
>  UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmmProfileArch.c|4 +-
>  UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c  |  127 +-
>  UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.c |  142 +-
>  UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.h |  156 +-
>  UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.inf   |5 +-
>  UefiCpuPkg/PiSmmCpuDxeSmm/SmmCpuMemoryManagement.c |  871 +++
>  UefiCpuPkg/PiSmmCpuDxeSmm/SmmProfile.c |   39 +-
>  UefiCpuPkg/PiSmmCpuDxeSmm/SmmProfile.h |   15 +-
>  UefiCpuPkg/PiSmmCpuDxeSmm/X64/PageTbl.c|  274 +++-
>  UefiCpuPkg/PiSmmC

[edk2] [PATCH V2 0/6] Enable SMM page level protection.

2016-11-04 Thread Jiewen Yao
 below is V2 description 
1) PiSmmCpu: resolve OVMF multiple processors boot hang issue.
2) PiSmmCpu: Add debug info on StartupAp() fails.
3) PiSmmCpu: Add ASSERT for AllocatePages().
4) PiSmmCpu: Add protection detail in commit message.
5) UefiCpuPkg.dsc: Add page table footprint info in commit message.

 below is V1 description 
This series patch enables SMM page level protection.
Features are:
1) PiSmmCore reports SMM PE image code/data information
in EdkiiPiSmmMemoryAttributeTable, if the SMM image is page aligned.
2) PiSmmCpu consumes EdkiiPiSmmMemoryAttributeTable
and set XD for data page and RO for code page.
3) PiSmmCpu enables Static Paging for X64 according to
PcdCpuSmmStaticPageTable. If it is true, 1G paging for above 4G
is used as long as it is supported.
4) PiSmmCpu sets importance data structure to be read only,
such as Gdt, Idt, SmmEntrypoint, and PageTable itself.

tested platform:
1) Intel internal platform (X64).
2) EDKII Quark IA32
3) EDKII Vlv2  X64
4) EDKII OVMF IA32 and IA32X64. (with -smp 8)

Cc: Jeff Fan 
Cc: Feng Tian 
Cc: Star Zeng 
Cc: Michael D Kinney 
Cc: Laszlo Ersek 
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jiewen Yao 

Jiewen Yao (6):
  MdeModulePkg/Include: Add PiSmmMemoryAttributesTable.h
  MdeModulePkg/dec: Add gEdkiiPiSmmMemoryAttributesTableGuid.
  MdeModulePkg/PiSmmCore: Add MemoryAttributes support.
  UefiCpuPkg/dec: Add PcdCpuSmmStaticPageTable.
  UefiCpuPkg/PiSmmCpuDxeSmm: Add paging protection.
  QuarkPlatformPkg/dsc: enable Smm paging protection.

 MdeModulePkg/Core/PiSmmCore/Dispatcher.c   |   66 +
 MdeModulePkg/Core/PiSmmCore/MemoryAttributesTable.c| 1509 

 MdeModulePkg/Core/PiSmmCore/Page.c |  775 +-
 MdeModulePkg/Core/PiSmmCore/PiSmmCore.c|   40 +
 MdeModulePkg/Core/PiSmmCore/PiSmmCore.h|   91 ++
 MdeModulePkg/Core/PiSmmCore/PiSmmCore.inf  |2 +
 MdeModulePkg/Core/PiSmmCore/Pool.c |   16 +
 MdeModulePkg/Include/Guid/PiSmmMemoryAttributesTable.h |   51 +
 MdeModulePkg/MdeModulePkg.dec  |3 +
 QuarkPlatformPkg/Quark.dsc |6 +
 UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/PageTbl.c   |   71 +-
 UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmiEntry.S  |   67 +-
 UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmiEntry.asm|   68 +-
 UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmiEntry.nasm   |   70 +-
 UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmiException.S  |  226 +--
 UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmiException.asm|   36 +-
 UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmiException.nasm   |   36 +-
 UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmmFuncsArch.c  |   37 +-
 UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmmProfileArch.c|4 +-
 UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c  |  127 +-
 UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.c |  142 +-
 UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.h |  156 +-
 UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.inf   |5 +-
 UefiCpuPkg/PiSmmCpuDxeSmm/SmmCpuMemoryManagement.c |  871 +++
 UefiCpuPkg/PiSmmCpuDxeSmm/SmmProfile.c |   39 +-
 UefiCpuPkg/PiSmmCpuDxeSmm/SmmProfile.h |   15 +-
 UefiCpuPkg/PiSmmCpuDxeSmm/X64/PageTbl.c|  274 +++-
 UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmiEntry.S   |   51 +-
 UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmiEntry.asm |   54 +-
 UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmiEntry.nasm|   61 +-
 UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmiException.S   |  250 +---
 UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmiException.asm |   35 +-
 UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmiException.nasm|   31 +-
 UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmmFuncsArch.c   |   30 +-
 UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmmProfileArch.c |7 +-
 UefiCpuPkg/UefiCpuPkg.dec  |8 +
 36 files changed, 4529 insertions(+), 801 deletions(-)
 create mode 100644 MdeModulePkg/Core/PiSmmCore/MemoryAttributesTable.c
 create mode 100644 MdeModulePkg/Include/Guid/PiSmmMemoryAttributesTable.h
 create mode 100644 UefiCpuPkg/PiSmmCpuDxeSmm/SmmCpuMemoryManagement.c

-- 
2.7.4.windows.1

___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel