On 4/26/19 3:27 PM, Laszlo Ersek wrote:
On 04/25/19 19:53, Michael D Kinney wrote:
Use CPUID in IA32 implementation of AsmLfence() to verify
that SSE2 is supported before using the LFENCE instruction.

Cc: Liming Gao <liming....@intel.com>
Signed-off-by: Michael D Kinney <michael.d.kin...@intel.com>
---
  MdePkg/Library/BaseLib/Ia32/Lfence.nasm | 14 +++++++++++++-
  1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/MdePkg/Library/BaseLib/Ia32/Lfence.nasm 
b/MdePkg/Library/BaseLib/Ia32/Lfence.nasm
index 44478be35f..0a60ae1d57 100644
--- a/MdePkg/Library/BaseLib/Ia32/Lfence.nasm
+++ b/MdePkg/Library/BaseLib/Ia32/Lfence.nasm
@@ -1,5 +1,5 @@
  
;------------------------------------------------------------------------------ 
;
-; Copyright (c) 2018, Intel Corporation. All rights reserved.<BR>
+; Copyright (c) 2018 - 2019, Intel Corporation. All rights reserved.<BR>
  ; SPDX-License-Identifier: BSD-2-Clause-Patent
  ;
  ; Module Name:
@@ -26,5 +26,17 @@
  
;------------------------------------------------------------------------------
  global ASM_PFX(AsmLfence)
  ASM_PFX(AsmLfence):
+    ;
+    ; Use CPUID instruction (CPUID.01H:EDX.SSE2[bit 26] = 1) to test whether 
the
+    ; processor supports SSE2 instruction.  Save/restore non-volatile register
+    ; EBX that is modified by CPUID
+    ;
+    push    ebx
+    mov     eax, 1
+    cpuid
+    bt      edx, 26
+    jnc     Done
      lfence
+Done:
+    pop     ebx
      ret


The SDM seems to confirm that lfence depends on SSE2.

However, that raises another question:

AsmLfence() is used for implementing SpeculationBarrier() on IA32/X64.
And so I wonder: the plaforms where lfence is unavailable, do they *not*
need a speculation barrier at all, or do they need a *different*
implementation?

Thanks,
Laszlo

Several issues:

This patch introduces stronger fencing than is required. The SDM says, "Reads or writes cannot be reordered with I/O instructions, locked instructions, or serializing instructions." (vol 3a, sec 8.2.2.) The cpuid instruction is a "serializing instruction" (sec 8.3). So the cpuid is essentially a load fence plus a store fence (plus other functionality, such as draining the memory queues.) That makes the lfence following it redundant.

Cpuid is a fairly heavyweight instruction due to its serializing behavior. It provides the load fencing semantics of lfence, but can introduce a significant performance hit if it's called often. Lfence is a lot lighter weight instruction. So using cpuid in AsmLfence may make it a lot slower than the caller expects.

Also, skipping a fencing instruction if it's not supported doesn't seem like the right approach in any case. The caller expects AsmLfence to provide certain fencing semantics... the implementation isn't free to just do nothing (unless it's on an architecture where load fencing is not required, since memory is always ordered.) Plus, the routine is called "AsmLfence", so I'd expect it to always use lfence, and cause an exception if the instruction isn't implemented. I'd think the callers should be modified to call AsmLfence or routines using other instructions (such as cpuid) to provide fencing according to the CPU architecture they are on.

Is there a way to make this a compile-time decision? I haven't tried to track down all the callers of AsmLfence....

Thanks,
--
Brian J. Johnson
Enterprise X86 Lab

Hewlett Packard Enterprise


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#39666): https://edk2.groups.io/g/devel/message/39666
Mute This Topic: https://groups.io/mt/31345225/21656
Group Owner: devel+ow...@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-

Reply via email to