Oops! I forgot to attach it.

- Michael

On Tue, 2013-02-19 at 14:11 -0800, Michael Liao wrote:
> Here is the patch 0002-Add-HLE-target-feature.patch
>
> Yours
> - Michael
>
> On Tue, 2013-02-19 at 14:07 -0800, Michael Liao wrote:
> > Hi All,
> >
> > I'd like to add HLE support in LLVM/clang consistent with GCC's style [1].
> > HLE from Intel TSX [2] is a legacy-compatible instruction set extension
> > that specifies transactional regions by adding XACQUIRE and XRELEASE
> > prefixes. To support that, GCC extends the memory order flag in the
> > __atomic_* builtins with a target-specific memory model in the high
> > bits (bits 31-16 for the target-specific memory model, bits 15-0 for
> > the general memory model). To follow a similar approach, I propose to
> > change LLVM/clang by adding:
> >
> > + a metadata 'targetflags' in LLVM atomic IR to pass this
> >   target-specific memory model hint
> >
> > + one extra target flag in AtomicSDNode & MemIntrinsicSDNode to specify
> >   XACQUIRE or XRELEASE hints
> >
> >   This extra target flag is embedded into the SubclassData fields. The
> >   following is the rationale for how such target flags are embedded
> >   into SubclassData in SDNode.
> >
> >   Here is the current SDNode class hierarchy of memory-related nodes:
> >
> >   SDNode -> MemSDNode -> LSBaseNode -> LoadSDNode
> >                       |             + -> StoreSDNode
> >                       + -> AtomicSDNode
> >                       + -> MemIntrinsicSDNode
> >
> >   Here are the current SubclassData definitions:
> >
> >   bit 0~1 : extension type used in LoadSDNode
> >   bit 0   : truncating store in StoreSDNode
> >   bit 2~4 : addressing mode in LSBaseNode
> >   bit 5   : volatile bit in MemSDNode
> >   bit 6   : non-temporal bit in MemSDNode
> >   bit 7   : invariant bit in MemSDNode
> >   bit 8~11: memory order in AtomicSDNode
> >   bit 12  : synch scope in AtomicSDNode
> >
> >   Considering the class hierarchy, we can safely reuse bits 0~1 as the
> >   target flags in AtomicSDNode/MemIntrinsicSDNode, since those bits are
> >   only used by LoadSDNode and StoreSDNode.
> >
> > + the X86 backend is modified to generate an additional
> >   XACQUIRE/XRELEASE prefix based on the specified target flag
> >
> >
> > The following are details of each patch:
> >
> > * 0001-Add-targetflags-in-AtomicSDNode-MemIntrinsicSDNode.patch
> >
> > This patch adds 'targetflags' support in AtomicSDNode and
> > MemIntrinsicSDNode. It checks the metadata 'targetflags' and embeds
> > its value into SubclassData. Currently, only two bits are defined.
> >
> > * 0002-Add-HLE-target-feature.patch
> >
> > This patch adds the HLE feature and auto-detection support.
> >
> > * 0003-Add-XACQ-XREL-prefix-and-encoding-asm-printer-suppor.patch
> >
> > This patch adds the XACQUIRE/XRELEASE prefixes and their
> > assembler/encoding support.
> >
> > * 0004-Enable-HLE-code-generation.patch
> >
> > This patch enables HLE code generation by extending the current logic to
> > handle 'targetflags'.
> >
> > * 0001-Add-target-flags-support-for-atomic-ops.patch
> >
> > This patch adds target flags support in the __atomic_* builtins. It
> > splits the whole 32-bit order word into high and low 16-bit parts. The
> > low 16 bits are the original memory order; the high 16 bits are
> > re-defined as target-specific flags and passed through the
> > 'targetflags' metadata.
> >
> > * 0002-Add-mhle-option-support-and-populate-pre-defined-mac.patch
> >
> > It adds the '-m[no]hle' option to turn the HLE feature on or off. Once
> > the HLE feature is turned on, two more macros (__ATOMIC_HLE_ACQUIRE and
> > __ATOMIC_HLE_RELEASE) are defined for developers to mark atomic
> > builtins.
> >
> > Thanks for your time to review!
> >
> > Yours
> > - Michael
> > ---
> > [1] http://gcc.gnu.org/ml/gcc-patches/2012-04/msg01073.html
> > [2] http://software.intel.com/sites/default/files/319433-014.pdf

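For anyone following the order-word split described in the quoted message, here is a minimal sketch of how a 32-bit order argument could be divided into the general memory order (bits 15-0) and the target-specific flags (bits 31-16). It is not code from the patches. The names SplitOrder, splitOrderWord, makeOrderWord, kHLEAcquire and kHLERelease are hypothetical, and the two flag values assume the encoding GCC uses for __ATOMIC_HLE_ACQUIRE (1 << 16) and __ATOMIC_HLE_RELEASE (1 << 17).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type, not taken from the patches.
struct SplitOrder {
  std::uint32_t MemoryOrder;  // bits 15-0: general memory model
  std::uint32_t TargetFlags;  // bits 31-16: target-specific hints
};

// Assumed flag values, following GCC's encoding of the HLE hints.
constexpr std::uint32_t kHLEAcquire = 1u << 16;
constexpr std::uint32_t kHLERelease = 1u << 17;

// Split a 32-bit order word into its low and high 16-bit halves.
constexpr SplitOrder splitOrderWord(std::uint32_t Order) {
  return SplitOrder{Order & 0xFFFFu, Order >> 16};
}

// Inverse operation: pack the two halves back into one order word.
inline std::uint32_t makeOrderWord(std::uint32_t MemoryOrder,
                                   std::uint32_t TargetFlags) {
  assert(MemoryOrder <= 0xFFFFu && TargetFlags <= 0xFFFFu);
  return (TargetFlags << 16) | MemoryOrder;
}
```

Under these assumptions, an argument such as __ATOMIC_ACQUIRE | __ATOMIC_HLE_ACQUIRE (2 | 65536) would split into memory order 2 and target flags 1; the latter is what would travel through the 'targetflags' metadata.
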
From 5f18d83c4c633c43becfcb2557f831e3df717815 Mon Sep 17 00:00:00 2001
From: Michael Liao <[email protected]>
Date: Thu, 5 Jul 2012 23:38:57 -0700
Subject: [PATCH 2/4] Add HLE target feature

---
 lib/Target/X86/X86.td           |    4 +++-
 lib/Target/X86/X86InstrInfo.td  |    1 +
 lib/Target/X86/X86Subtarget.cpp |    5 +++++
 lib/Target/X86/X86Subtarget.h   |    4 ++++
 4 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/lib/Target/X86/X86.td b/lib/Target/X86/X86.td
index 0216252..810acee 100644
--- a/lib/Target/X86/X86.td
+++ b/lib/Target/X86/X86.td
@@ -120,6 +120,8 @@ def FeatureBMI2 : SubtargetFeature<"bmi2", "HasBMI2", "true",
                                    "Support BMI2 instructions">;
 def FeatureRTM : SubtargetFeature<"rtm", "HasRTM", "true",
                                   "Support RTM instructions">;
+def FeatureHLE : SubtargetFeature<"hle", "HasHLE", "true",
+                                  "Support HLE">;
 def FeatureADX : SubtargetFeature<"adx", "HasADX", "true",
                                   "Support ADX instructions">;
 def FeatureLeaForSP : SubtargetFeature<"lea-sp", "UseLeaForSP", "true",
@@ -201,7 +203,7 @@ def : Proc<"core-avx2", [FeatureAVX2, FeatureCMPXCHG16B, FeatureFastUAMem,
                          FeatureRDRAND, FeatureF16C, FeatureFSGSBase,
                          FeatureMOVBE, FeatureLZCNT, FeatureBMI,
                          FeatureBMI2, FeatureFMA,
-                         FeatureRTM]>;
+                         FeatureRTM, FeatureHLE]>;
 
 def : Proc<"k6", [FeatureMMX]>;
 def : Proc<"k6-2", [Feature3DNow]>;
diff --git a/lib/Target/X86/X86InstrInfo.td b/lib/Target/X86/X86InstrInfo.td
index 84c278c..46daaad 100644
--- a/lib/Target/X86/X86InstrInfo.td
+++ b/lib/Target/X86/X86InstrInfo.td
@@ -603,6 +603,7 @@ def HasLZCNT : Predicate<"Subtarget->hasLZCNT()">;
 def HasBMI : Predicate<"Subtarget->hasBMI()">;
 def HasBMI2 : Predicate<"Subtarget->hasBMI2()">;
 def HasRTM : Predicate<"Subtarget->hasRTM()">;
+def HasHLE : Predicate<"Subtarget->hasHLE()">;
 def HasADX : Predicate<"Subtarget->hasADX()">;
 def FPStackf32 : Predicate<"!Subtarget->hasSSE1()">;
 def FPStackf64 : Predicate<"!Subtarget->hasSSE2()">;
diff --git a/lib/Target/X86/X86Subtarget.cpp b/lib/Target/X86/X86Subtarget.cpp
index 0f2c008..a9955ce 100644
--- a/lib/Target/X86/X86Subtarget.cpp
+++ b/lib/Target/X86/X86Subtarget.cpp
@@ -310,6 +310,10 @@ void X86Subtarget::AutoDetectSubtargetFeatures() {
         HasBMI = true;
         ToggleFeature(X86::FeatureBMI);
       }
+      if ((EBX >> 4) & 0x1) {
+        HasHLE = true;
+        ToggleFeature(X86::FeatureHLE);
+      }
       if (IsIntel && ((EBX >> 5) & 0x1)) {
         X86SSELevel = AVX2;
         ToggleFeature(X86::FeatureAVX2);
@@ -439,6 +443,7 @@ void X86Subtarget::initializeEnvironment() {
   HasBMI = false;
   HasBMI2 = false;
   HasRTM = false;
+  HasHLE = false;
   HasADX = false;
   IsBTMemSlow = false;
   IsUAMemFast = false;
diff --git a/lib/Target/X86/X86Subtarget.h b/lib/Target/X86/X86Subtarget.h
index e97da4b..411494a 100644
--- a/lib/Target/X86/X86Subtarget.h
+++ b/lib/Target/X86/X86Subtarget.h
@@ -121,6 +121,9 @@ protected:
   /// HasRTM - Processor has RTM instructions.
   bool HasRTM;
 
+  /// HasHLE - Processor has HLE.
+  bool HasHLE;
+
   /// HasADX - Processor has ADX instructions.
   bool HasADX;
 
@@ -253,6 +256,7 @@ public:
   bool hasBMI() const { return HasBMI; }
   bool hasBMI2() const { return HasBMI2; }
   bool hasRTM() const { return HasRTM; }
+  bool hasHLE() const { return HasHLE; }
   bool hasADX() const { return HasADX; }
   bool isBTMemSlow() const { return IsBTMemSlow; }
   bool isUnalignedMemAccessFast() const { return IsUAMemFast; }
-- 
1.7.9.5

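As a reading aid for the auto-detection hunk in X86Subtarget.cpp, here is a minimal standalone sketch of the same bit test. It assumes EBX holds the value returned by CPUID leaf 7 (EAX=7, ECX=0), where Intel documents HLE at bit 4 and AVX2 at bit 5. The helper names ebxHasHLE, ebxHasAVX2 and exampleUsage are hypothetical and do not appear in the patch, and the EBX value used below is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: EBX is the register value returned by CPUID with EAX=7, ECX=0.
// Bit 4 reports HLE, mirroring the `(EBX >> 4) & 0x1` test in the patch.
constexpr bool ebxHasHLE(std::uint32_t EBX) {
  return ((EBX >> 4) & 0x1u) != 0;
}

// Bit 5 reports AVX2, mirroring the neighbouring check in the same function.
constexpr bool ebxHasAVX2(std::uint32_t EBX) {
  return ((EBX >> 5) & 0x1u) != 0;
}

// Usage example with a made-up EBX value that has bits 4 and 5 set.
inline void exampleUsage() {
  const std::uint32_t EBX = 0x30u;
  assert(ebxHasHLE(EBX));
  assert(ebxHasAVX2(EBX));
}
```

Keeping the test as a pure function of EBX makes it checkable without running CPUID; in the patch itself the equivalent check sets HasHLE and toggles X86::FeatureHLE.
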
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
