On 4/26/19 3:27 PM, Laszlo Ersek wrote:
On 04/25/19 19:53, Michael D Kinney wrote:
Use CPUID in IA32 implementation of AsmLfence() to verify
that SSE2 is supported before using the LFENCE instruction.
Cc: Liming Gao <liming....@intel.com>
Signed-off-by: Michael D Kinney <michael.d.kin...@intel.com>
---
MdePkg/Library/BaseLib/Ia32/Lfence.nasm | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/MdePkg/Library/BaseLib/Ia32/Lfence.nasm
b/MdePkg/Library/BaseLib/Ia32/Lfence.nasm
index 44478be35f..0a60ae1d57 100644
--- a/MdePkg/Library/BaseLib/Ia32/Lfence.nasm
+++ b/MdePkg/Library/BaseLib/Ia32/Lfence.nasm
@@ -1,5 +1,5 @@
;------------------------------------------------------------------------------
;
-; Copyright (c) 2018, Intel Corporation. All rights reserved.<BR>
+; Copyright (c) 2018 - 2019, Intel Corporation. All rights reserved.<BR>
; SPDX-License-Identifier: BSD-2-Clause-Patent
;
; Module Name:
@@ -26,5 +26,17 @@
;------------------------------------------------------------------------------
global ASM_PFX(AsmLfence)
ASM_PFX(AsmLfence):
+ ;
+ ; Use CPUID instruction (CPUID.01H:EDX.SSE2[bit 26] = 1) to test whether
the
+ ; processor supports SSE2 instruction. Save/restore non-volatile register
+ ; EBX that is modified by CPUID
+ ;
+ push ebx
+ mov eax, 1
+ cpuid
+ bt edx, 26
+ jnc Done
lfence
+Done:
+ pop ebx
ret
The SDM seems to confirm that lfence depends on SSE2.
However, that raises another question:
AsmLfence() is used for implementing SpeculationBarrier() on IA32/X64.
And so I wonder: the plaforms where lfence is unavailable, do they *not*
need a speculation barrier at all, or do they need a *different*
implementation?
Thanks,
Laszlo
Several issues:
This patch introduces stronger fencing than is required. The SDM says,
"Reads or writes cannot be reordered with I/O instructions, locked
instructions, or serializing instructions." (vol 3a, sec 8.2.2.) The
cpuid instruction is a "serializing instruction" (sec 8.3). So the
cpuid is essentially a load fence plus a store fence (plus other
functionality, such as draining the memory queues.) That makes the
lfence following it redundant.
Cpuid is a fairly heavyweight instruction due to its serializing
behavior. It provides the load fencing semantics of lfence, but can
introduce a significant performance hit if it's called often. Lfence is
a lot lighter weight instruction. So using cpuid in AsmLfence may make
it a lot slower than the caller expects.
Also, skipping a fencing instruction if it's not supported doesn't seem
like the right approach in any case. The caller expects AsmLfence to
provide certain fencing semantics... the implementation isn't free to
just do nothing (unless it's on an architecture where load fencing is
not required, since memory is always ordered.) Plus, the routine is
called "AsmLfence", so I'd expect it to always use lfence, and cause an
exception if the instruction isn't implemented. I'd think the callers
should be modified to call AsmLfence or routines using other
instructions (such as cpuid) to provide fencing according to the CPU
architecture they are on.
Is there a way to make this a compile-time decision? I haven't tried to
track down all the callers of AsmLfence....
Thanks,
--
Brian J. Johnson
Enterprise X86 Lab
Hewlett Packard Enterprise
-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#39666): https://edk2.groups.io/g/devel/message/39666
Mute This Topic: https://groups.io/mt/31345225/21656
Group Owner: devel+ow...@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-