On 09/04/2017 04:49 AM, Laszlo Ersek wrote:
On 09/03/17 04:12, Ge Song wrote:
In earlier PEI stage, temporary memory(Cache As Ram) is employed as stack
Please remove the "Cache As Ram" reference; it doesn't apply to OVMF.

I suggest "temporary memory at PcdOvmfSecPeiTempRamBase"

and heap. We move them to the new room and do some relocation fixup when
permanent memory becomes available. TemporaryRamMigration()
is responsible for switching the stack.

In the begining of the TemporaryRamMigration(),
I suggest rewording this as "before entering TemporaryRamMigration()" --
when I read the above, I went looking for Ebp/Rbp manipulation within
TemporaryRamMigration().

Ebp/Rbp is populated with content of Esp/Rsp and used as frame pointer,
after the execution of SetJump/LongJump, stack migrates to new position
Please start a new sentence with "after the execution of ...".

while the context keeps unchanged. But when TemporaryRamMigration() exits,
Esp/Rsp is filled with the content of Ebp/Rbp to destroy this stack frame.
The result is, stack switches back to previous temporary momery.
s/momery/memory/
Thanks for your detailed guidance, I'll revise the things you mentioned in
V2 patch.

And I must express my thanks for your article
"Laszlo's unkempt git guide for edk2 contributors and maintainers".
It helps me a lot!
But, more importantly, what are the practical consequences of this bug?

Does it mean that the temporary SEC/PEI stack is used until the end of
PEI, even after permanent RAM is installed, during both normal boot and
S3 resume?
Yes, I think that's exactly true.
If that's the case, I wonder how it was caught.
Once I was tracing a variable and found the abnormal value in SP.
If it was caught during S3 resume, then the only symptom could have been
stack overflow. Because during normal boot, both temporary and permanent
PEI RAM are reserved, so even if we stay on the temporary stack during
S3, if we don't overflow it, we cannot corrupt data:

/**
   Publish system RAM and reserve memory regions

**/
VOID
InitializeRamRegions (
   VOID
   )
{
   if (!mXen) {
     QemuInitializeRam ();
   } else {
     XenPublishRamRegions ();
   }

   if (mS3Supported && mBootMode != BOOT_ON_S3_RESUME) {
     //
     // This is the memory range that will be used for PEI on S3 resume
     //
     BuildMemoryAllocationHob (
       mS3AcpiReservedMemoryBase,
       mS3AcpiReservedMemorySize,
       EfiACPIMemoryNVS
       );

     //
     // Cover the initial RAM area used as stack and temporary PEI heap.
     //
     // This is reserved as ACPI NVS so it can be used on S3 resume.
     //
     BuildMemoryAllocationHob (
       PcdGet32 (PcdOvmfSecPeiTempRamBase),
       PcdGet32 (PcdOvmfSecPeiTempRamSize),
       EfiACPIMemoryNVS
       );
With stack overflow during S3 resume, we may have corrupted OS data
residing very low.
Yes, I think this is the real problem.

In current implementation, stack located in temporary memory will not be
exhausted during the whole PEI phase so there won't be an system crash
during S3 resume caused by this.

The size of stack in temporary memory is 32K, when in permanent memory it
is 128K. This approximately simulate the scene in real machine. Logically the
stack should be switched to permanent memory when the latter becomes
available.

Since hardware protection mechanism for stack is disabled, currently using
larger stack can reduced the possibility of stack overflow, thus, the possibility
of crash during S3 resume

Check the GDT set up by reset vector module:
GDT_BASE:
; null descriptor
NULL_SEL            equ $-GDT_BASE
    DW      0            ; limit 15:0
    DW      0            ; base 15:0
    DB      0            ; base 23:16
    DB      0            ; sys flag, dpl, type
    DB      0            ; limit 19:16, flags
    DB      0            ; base 31:24

; linear data segment descriptor
LINEAR_SEL          equ $-GDT_BASE
    DW      0xffff       ; limit 15:0
    DW      0            ; base 15:0
    DB      0            ; base 23:16
    DB PRESENT_FLAG(1)|DPL(0)|SYSTEM_FLAG(1)|DESC_TYPE(DATA32_TYPE)
DB GRANULARITY_FLAG(1)|DEFAULT_SIZE32(1)|CODE64_FLAG(0)|UPPER_LIMIT(0xf)
    DB      0            ; base 31:24

; linear code segment descriptor
LINEAR_CODE_SEL     equ $-GDT_BASE
    DW      0xffff       ; limit 15:0
    DW      0            ; base 15:0
    DB      0            ; base 23:16
    DB PRESENT_FLAG(1)|DPL(0)|SYSTEM_FLAG(1)|DESC_TYPE(CODE32_TYPE)
DB GRANULARITY_FLAG(1)|DEFAULT_SIZE32(1)|CODE64_FLAG(0)|UPPER_LIMIT(0xf)
    DB      0            ; base 31:24

%ifdef ARCH_X64
; linear code (64-bit) segment descriptor
LINEAR_CODE64_SEL   equ $-GDT_BASE
    DW      0xffff       ; limit 15:0
    DW      0            ; base 15:0
    DB      0            ; base 23:16
    DB PRESENT_FLAG(1)|DPL(0)|SYSTEM_FLAG(1)|DESC_TYPE(CODE64_TYPE)
DB GRANULARITY_FLAG(1)|DEFAULT_SIZE32(0)|CODE64_FLAG(1)|UPPER_LIMIT(0xf)
    DB      0            ; base 31:24
%endif

GDT_END:

SS is set to use LINEAR_SEL descriptor, actually stack protection provided by hardware is disabled

BTW, is it better to utilize the hardware mechanism to set up stack overflow protection in both
stage(pre-memory and post-memory)?

However, if the bug was caught during normal boot, then I don't know
what the symptom may have been. The permanent PEI RAM for normal boot is
pretty close to the top of low RAM (and the default RAM size for QEMU is
128MB).  The temporary stack is very low however ("Temp Stack :
BaseAddress=0x814000 Length=0x4000"). So even if we overflow the
temporary stack (i.e. go below the address 0x814000) during normal boot,
I don't think anything allocated from the permanent PEI RAM could be
corrupted by that.
I strongly agree with you. Normal boot shouldn't be affected even stack overflows.
I'll do some testing in the next few days.

The analysis and the patch look great (and impressive!) to me. I'd like
the commit message to be cleaned up a bit -- please see my notes above,
and please describe:

- practical consequences of the bug,

- how the bug was caught.

I have one more (superficial) comment below:

More detailed information:
TemporaryRamMigration (PeiServices=0x817790,
TemporaryMemoryBase=8454144, PermanentMemoryBase=469016576,
CopySize=32768)
     at /home/bird/src/edk2/OvmfPkg/Sec/SecMain.c:938
938         LongJump (&JumpBuffer, (UINTN)-1);
(gdb) info registers
rax            0x1bf4d3b0       469029808
rbx            0x810248 8454728
rcx            0x8173e8 8483816
rdx            0x8173b0 8483760
rsi            0x1bf4a000       469016576
rdi            0x8000   32768
rbp            0x817520 0x817520
rsp            0x8173b0 0x8173b0
r8             0x0      0
...
rip            0xfffcd409       0xfffcd409 <TemporaryRamMigration+365>
eflags         0x2      [ ]
cs             0x18     24
ss             0x8      8
...
After execution of LongJump:
943       return EFI_SUCCESS;
(gdb) info registers
rax            0x0      0
rbx            0x810248 8454728
rcx            0x0      0
rdx            0xffffffffffffffff       -1
rsi            0x1bf4a000       469016576
rdi            0x8000   32768
rbp            0x817520 0x817520
rsp            0x1bf4d3b0       0x1bf4d3b0
r8             0x0      0
...
rip            0xfffcd42a       0xfffcd42a <TemporaryRamMigration+398>
eflags         0x86     [ PF SF ]
cs             0x18     24
ss             0x8      8
...
We can find rsp has changed to new permanent memory

When leaving TemporaryRamMigration(), the stack swithes back to previous
temporary memory:
(gdb) finish
Run till exit from #0  TemporaryRamMigration (PeiServices=0x817790,
TemporaryMemoryBase=8454144, PermanentMemoryBase=469016576,
CopySize=32768)
     at /home/bird/src/edk2/OvmfPkg/Sec/SecMain.c:943
PeiCheckAndSwitchStack (SecCoreData=0x1bf4df78, Private=0x1bf4d788) at
/home/bird/src/edk2/MdeModulePkg/Core/Pei/Dispatcher/Dispatcher.c:806
806           PeiCore (SecCoreData, NULL, Private);
Value returned is $1 = 0
(gdb) info registers
rax            0x0      0
rbx            0x810248 8454728
rcx            0x0      0
rdx            0xffffffffffffffff       -1
rsi            0x1bf4a000       469016576
rdi            0x8000   32768
rbp            0x817630 0x817630
rsp            0x817530 0x817530
r8             0x0      0
...
rip            0x828135 0x828135 <PeiCheckAndSwitchStack+1573>
eflags         0x86     [ PF SF ]
cs             0x18     24
ss             0x8      8
...
(gdb) disassemble /r
Dump of assembler code for function TemporaryRamMigration:
   0x00000000fffcd29c <+0>:       55      push   %rbp
   0x00000000fffcd29d <+1>:       48 89 e5        mov    %rsp,%rbp
   0x00000000fffcd2a0 <+4>:       48 81 ec 70 01 00 00    sub
$0x170,%rsp
    ...
    ...
    0x00000000fffcd425 <+393>:    e8 80 10 00 00  callq  0xfffce4aa
<SaveAndSetDebugTimerInterrupt>
=> 0x00000000fffcd42a <+398>:  b8 00 00 00 00  mov    $0x0,%eax
    0x00000000fffcd42f <+403>:    c9      leaveq
    0x00000000fffcd430 <+404>:    c3      retq
End of assembler dump.
See the description of leave(opcode: c9), from
IntelĀ® 64 and IA-32 Architectures Software Developers Manual, Volume 2A
I prefer keeping commit messages ASCII-clean.

Thanks
Laszlo

"Releases the stack frame set up by an earlier ENTER instruction. The
LEAVE instruction copies the frame pointer (in the EBP register) into
the stack pointer register (ESP), which releases the stack space
allocated to the stack frame. The old frame pointer (the frame pointer
for the calling procedure that was saved by the ENTER instruction) is
then popped from the stack into the EBP register, restoring the calling
procedures stack frame."

To solve this, update Ebp/Rbp too when Esp/Rsp is updated

Cc: Jordan Justen <[email protected]>
Cc: Laszlo Ersek <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Ge Song <[email protected]>
---
  OvmfPkg/Sec/SecMain.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/OvmfPkg/Sec/SecMain.c b/OvmfPkg/Sec/SecMain.c
index e1993ec347b5..f7fec3d8c03b 100644
--- a/OvmfPkg/Sec/SecMain.c
+++ b/OvmfPkg/Sec/SecMain.c
@@ -931,9 +931,11 @@ TemporaryRamMigration (
    if (SetJump (&JumpBuffer) == 0) {
  #if defined (MDE_CPU_IA32)
      JumpBuffer.Esp = JumpBuffer.Esp + DebugAgentContext.StackMigrateOffset;
+    JumpBuffer.Ebp = JumpBuffer.Ebp + DebugAgentContext.StackMigrateOffset;
  #endif
  #if defined (MDE_CPU_X64)
      JumpBuffer.Rsp = JumpBuffer.Rsp + DebugAgentContext.StackMigrateOffset;
+    JumpBuffer.Rbp = JumpBuffer.Rbp + DebugAgentContext.StackMigrateOffset;
  #endif
      LongJump (&JumpBuffer, (UINTN)-1);
    }


On 09/04/2017 04:49 AM, Laszlo Ersek wrote:
On 09/03/17 04:12, Ge Song wrote:
In earlier PEI stage, temporary memory(Cache As Ram) is employed as stack
Please remove the "Cache As Ram" reference; it doesn't apply to OVMF.

I suggest "temporary memory at PcdOvmfSecPeiTempRamBase"

and heap. We move them to the new room and do some relocation fixup when
permanent memory becomes available. TemporaryRamMigration()
is responsible for switching the stack.

In the begining of the TemporaryRamMigration(),
I suggest rewording this as "before entering TemporaryRamMigration()" --
when I read the above, I went looking for Ebp/Rbp manipulation within
TemporaryRamMigration().

Ebp/Rbp is populated with content of Esp/Rsp and used as frame pointer,
after the execution of SetJump/LongJump, stack migrates to new position
Please start a new sentence with "after the execution of ...".

while the context keeps unchanged. But when TemporaryRamMigration() exits,
Esp/Rsp is filled with the content of Ebp/Rbp to destroy this stack frame.
The result is, stack switches back to previous temporary momery.
s/momery/memory/


But, more importantly, what are the practical consequences of this bug?

Does it mean that the temporary SEC/PEI stack is used until the end of
PEI, even after permanent RAM is installed, during both normal boot and
S3 resume?

If that's the case, I wonder how it was caught.

If it was caught during S3 resume, then the only symptom could have been
stack overflow. Because during normal boot, both temporary and permanent
PEI RAM are reserved, so even if we stay on the temporary stack during
S3, if we don't overflow it, we cannot corrupt data:

/**
   Publish system RAM and reserve memory regions

**/
VOID
InitializeRamRegions (
   VOID
   )
{
   if (!mXen) {
     QemuInitializeRam ();
   } else {
     XenPublishRamRegions ();
   }

   if (mS3Supported && mBootMode != BOOT_ON_S3_RESUME) {
     //
     // This is the memory range that will be used for PEI on S3 resume
     //
     BuildMemoryAllocationHob (
       mS3AcpiReservedMemoryBase,
       mS3AcpiReservedMemorySize,
       EfiACPIMemoryNVS
       );

     //
     // Cover the initial RAM area used as stack and temporary PEI heap.
     //
     // This is reserved as ACPI NVS so it can be used on S3 resume.
     //
     BuildMemoryAllocationHob (
       PcdGet32 (PcdOvmfSecPeiTempRamBase),
       PcdGet32 (PcdOvmfSecPeiTempRamSize),
       EfiACPIMemoryNVS
       );
With stack overflow during S3 resume, we may have corrupted OS data
residing very low.

However, if the bug was caught during normal boot, then I don't know
what the symptom may have been. The permanent PEI RAM for normal boot is
pretty close to the top of low RAM (and the default RAM size for QEMU is
128MB).  The temporary stack is very low however ("Temp Stack :
BaseAddress=0x814000 Length=0x4000"). So even if we overflow the
temporary stack (i.e. go below the address 0x814000) during normal boot,
I don't think anything allocated from the permanent PEI RAM could be
corrupted by that.

I'll do some testing in the next few days.

The analysis and the patch look great (and impressive!) to me. I'd like
the commit message to be cleaned up a bit -- please see my notes above,
and please describe:

- practical consequences of the bug,

- how the bug was caught.

I have one more (superficial) comment below:

More detailed information:
TemporaryRamMigration (PeiServices=0x817790,
TemporaryMemoryBase=8454144, PermanentMemoryBase=469016576,
CopySize=32768)
     at /home/bird/src/edk2/OvmfPkg/Sec/SecMain.c:938
938         LongJump (&JumpBuffer, (UINTN)-1);
(gdb) info registers
rax            0x1bf4d3b0       469029808
rbx            0x810248 8454728
rcx            0x8173e8 8483816
rdx            0x8173b0 8483760
rsi            0x1bf4a000       469016576
rdi            0x8000   32768
rbp            0x817520 0x817520
rsp            0x8173b0 0x8173b0
r8             0x0      0
...
rip            0xfffcd409       0xfffcd409 <TemporaryRamMigration+365>
eflags         0x2      [ ]
cs             0x18     24
ss             0x8      8
...
After execution of LongJump:
943       return EFI_SUCCESS;
(gdb) info registers
rax            0x0      0
rbx            0x810248 8454728
rcx            0x0      0
rdx            0xffffffffffffffff       -1
rsi            0x1bf4a000       469016576
rdi            0x8000   32768
rbp            0x817520 0x817520
rsp            0x1bf4d3b0       0x1bf4d3b0
r8             0x0      0
...
rip            0xfffcd42a       0xfffcd42a <TemporaryRamMigration+398>
eflags         0x86     [ PF SF ]
cs             0x18     24
ss             0x8      8
...
We can find rsp has changed to new permanent memory

When leaving TemporaryRamMigration(), the stack swithes back to previous
temporary memory:
(gdb) finish
Run till exit from #0  TemporaryRamMigration (PeiServices=0x817790,
TemporaryMemoryBase=8454144, PermanentMemoryBase=469016576,
CopySize=32768)
     at /home/bird/src/edk2/OvmfPkg/Sec/SecMain.c:943
PeiCheckAndSwitchStack (SecCoreData=0x1bf4df78, Private=0x1bf4d788) at
/home/bird/src/edk2/MdeModulePkg/Core/Pei/Dispatcher/Dispatcher.c:806
806           PeiCore (SecCoreData, NULL, Private);
Value returned is $1 = 0
(gdb) info registers
rax            0x0      0
rbx            0x810248 8454728
rcx            0x0      0
rdx            0xffffffffffffffff       -1
rsi            0x1bf4a000       469016576
rdi            0x8000   32768
rbp            0x817630 0x817630
rsp            0x817530 0x817530
r8             0x0      0
...
rip            0x828135 0x828135 <PeiCheckAndSwitchStack+1573>
eflags         0x86     [ PF SF ]
cs             0x18     24
ss             0x8      8
...
(gdb) disassemble /r
Dump of assembler code for function TemporaryRamMigration:
   0x00000000fffcd29c <+0>:       55      push   %rbp
   0x00000000fffcd29d <+1>:       48 89 e5        mov    %rsp,%rbp
   0x00000000fffcd2a0 <+4>:       48 81 ec 70 01 00 00    sub
$0x170,%rsp
    ...
    ...
    0x00000000fffcd425 <+393>:    e8 80 10 00 00  callq  0xfffce4aa
<SaveAndSetDebugTimerInterrupt>
=> 0x00000000fffcd42a <+398>:  b8 00 00 00 00  mov    $0x0,%eax
    0x00000000fffcd42f <+403>:    c9      leaveq
    0x00000000fffcd430 <+404>:    c3      retq
End of assembler dump.
See the description of leave(opcode: c9), from
IntelĀ® 64 and IA-32 Architectures Software Developers Manual, Volume 2A
I prefer keeping commit messages ASCII-clean.

Thanks
Laszlo

"Releases the stack frame set up by an earlier ENTER instruction. The
LEAVE instruction copies the frame pointer (in the EBP register) into
the stack pointer register (ESP), which releases the stack space
allocated to the stack frame. The old frame pointer (the frame pointer
for the calling procedure that was saved by the ENTER instruction) is
then popped from the stack into the EBP register, restoring the calling
procedures stack frame."

To solve this, update Ebp/Rbp too when Esp/Rsp is updated

Cc: Jordan Justen <[email protected]>
Cc: Laszlo Ersek <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Ge Song <[email protected]>
---
  OvmfPkg/Sec/SecMain.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/OvmfPkg/Sec/SecMain.c b/OvmfPkg/Sec/SecMain.c
index e1993ec347b5..f7fec3d8c03b 100644
--- a/OvmfPkg/Sec/SecMain.c
+++ b/OvmfPkg/Sec/SecMain.c
@@ -931,9 +931,11 @@ TemporaryRamMigration (
    if (SetJump (&JumpBuffer) == 0) {
  #if defined (MDE_CPU_IA32)
      JumpBuffer.Esp = JumpBuffer.Esp + DebugAgentContext.StackMigrateOffset;
+    JumpBuffer.Ebp = JumpBuffer.Ebp + DebugAgentContext.StackMigrateOffset;
  #endif
  #if defined (MDE_CPU_X64)
      JumpBuffer.Rsp = JumpBuffer.Rsp + DebugAgentContext.StackMigrateOffset;
+    JumpBuffer.Rbp = JumpBuffer.Rbp + DebugAgentContext.StackMigrateOffset;
  #endif
      LongJump (&JumpBuffer, (UINTN)-1);
    }


_______________________________________________
edk2-devel mailing list
[email protected]
https://lists.01.org/mailman/listinfo/edk2-devel

Reply via email to