Oops! Forgot to attach it.

- michael

On Tue, 2013-02-19 at 14:11 -0800, Michael Liao wrote:
> Here is the patch 0002-Add-HLE-target-feature.patch
> 
> Yours
> - Michael
> 
> On Tue, 2013-02-19 at 14:07 -0800, Michael Liao wrote:
> > Hi All,
> > 
> > I'd like to add HLE support to LLVM/clang in a way consistent with GCC's
> > style [1]. HLE, from Intel TSX [2], is a legacy-compatible instruction set
> > extension that specifies a transactional region by adding XACQUIRE and
> > XRELEASE prefixes. To support it, GCC extends the memory order flag in
> > the __atomic_* builtins with a target-specific memory model in the high
> > bits (bits 31-16 for the target-specific memory model, bits 15-0 for the
> > general memory model). Following a similar approach, I propose to
> > change LLVM/clang by adding:
> > 
> > + a metadata 'targetflags' in LLVM atomic IR to pass this
> >   target-specific memory model hint
> > 
> > + one extra target flag in AtomicSDNode & MemIntrinsicSDNode to specify
> > XACQUIRE or XRELEASE hints
> >   This extra target flag is embedded into the SubclassData field. The
> > following is the rationale for how such target flags are embedded into
> > SubclassData in SDNode.
> > 
> >   Here is the current SDNode class hierarchy of memory-related nodes:
> > 
> >   SDNode -> MemSDNode -> LSBaseNode -> LoadSDNode
> >                     |             + -> StoreSDNode
> >                     + -> AtomicSDNode
> >                     + -> MemIntrinsicSDNode
> > 
> >   Here are the current SubclassData definitions:
> > 
> >   bit 0~1 : extension type used in LoadSDNode
> >   bit 0   : truncating store in StoreSDNode
> >   bit 2~4 : addressing mode in LSBaseNode
> >   bit 5   : volatile bit in MemSDNode
> >   bit 6   : non-temporal bit in MemSDNode
> >   bit 7   : invariant bit in MemSDNode
> >   bit 8~11: memory order in AtomicSDNode
> >   bit 12  : synch scope in AtomicSDNode
> > 
> >   Considering the class hierarchy, we can safely reuse bits 0~1 as the
> > target flags in AtomicSDNode/MemIntrinsicSDNode.
> >   
> > + X86 backend is modified to generate additional XACQUIRE/XRELEASE
> > prefix based on the specified target flag
> > 
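
To make the proposed bit reuse concrete, here is a minimal sketch of how the
target flags could share SubclassData with the existing atomic fields. The bit
positions come from the list above; every name is hypothetical, the 16-bit
width of SubclassData is an assumption, and none of this is taken from the
patch itself.

```cpp
#include <cassert>
#include <cstdint>

// Bit positions follow the SubclassData list above. All names are
// hypothetical, and a 16-bit SubclassData field is an assumption.
enum : unsigned {
  TargetFlagsShift = 0,   // bits 0-1: proposed target flags
  TargetFlagsMask = 0x3,
  VolatileShift = 5,      // bit 5: volatile bit in MemSDNode
  OrderingShift = 8,      // bits 8-11: memory order in AtomicSDNode
  OrderingMask = 0xF,
  SynchScopeShift = 12    // bit 12: synch scope in AtomicSDNode
};

// Pack the fields of an atomic node into one 16-bit word.
inline std::uint16_t encodeAtomicSubclassData(unsigned targetFlags,
                                              bool isVolatile,
                                              unsigned ordering,
                                              unsigned synchScope) {
  assert(targetFlags <= TargetFlagsMask);
  assert(ordering <= OrderingMask);
  assert(synchScope <= 1);
  unsigned data = (targetFlags << TargetFlagsShift) |
                  (static_cast<unsigned>(isVolatile) << VolatileShift) |
                  (ordering << OrderingShift) |
                  (synchScope << SynchScopeShift);
  return static_cast<std::uint16_t>(data);
}

// Read the two-bit target flags back out of the packed word.
inline unsigned getTargetFlags(std::uint16_t data) {
  return (data >> TargetFlagsShift) & TargetFlagsMask;
}

// Read the four-bit memory order back out of the packed word.
inline unsigned getOrdering(std::uint16_t data) {
  return (data >> OrderingShift) & OrderingMask;
}
```

Bits 0~1 are otherwise used only by LoadSDNode and StoreSDNode, which sit on a
different branch of the hierarchy, so the fields do not collide in this sketch.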
> > 
> > The following are details of each patch:
> > 
> > * 0001-Add-targetflags-in-AtomicSDNode-MemIntrinsicSDNode.patch
> > 
> > This patch adds 'targetflags' support in AtomicSDNode and
> > MemIntrinsicSDNode. It checks the 'targetflags' metadata and embeds
> > its value into SubclassData. Currently, only two bits are defined.
> > 
> > * 0002-Add-HLE-target-feature.patch
> > 
> > This patch adds the HLE feature and auto-detection support.
> > 
> > * 0003-Add-XACQ-XREL-prefix-and-encoding-asm-printer-suppor.patch
> > 
> > This patch adds the XACQUIRE/XRELEASE prefixes and their
> > assembler/encoding support.
> > 
> > * 0004-Enable-HLE-code-generation.patch
> > 
> > This patch enables HLE code generation by extending the current logic to
> > handle 'targetflags'.
> > 
> > * 0001-Add-target-flags-support-for-atomic-ops.patch
> > 
> > This patch adds target flags support in __atomic_* builtins. It splits
> > the whole 32-bit order word into high and low 16-bit parts. The low
> > 16 bits are the original memory order, and the high 16 bits are
> > redefined as target-specific flags and passed through the 'targetflags'
> > metadata.
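
As an illustration of the split described above, the following sketch separates
a 32-bit order word into its low and high 16-bit halves and puts it back
together. The helper and constant names are hypothetical and only mirror the
description; the actual patch may be structured differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names, for illustration only.
constexpr std::uint32_t kMemoryOrderMask = 0x0000FFFFu;  // bits 15-0
constexpr unsigned kTargetFlagsShift = 16;               // bits 31-16

struct SplitOrder {
  std::uint32_t memoryOrder;  // general memory model (low 16 bits)
  std::uint32_t targetFlags;  // target-specific flags (high 16 bits)
};

// Split the order word passed to an __atomic_* builtin into its two halves.
inline SplitOrder splitOrderWord(std::uint32_t order) {
  SplitOrder result;
  result.memoryOrder = order & kMemoryOrderMask;
  result.targetFlags = order >> kTargetFlagsShift;
  return result;
}

// Rebuild the order word from a memory order and a set of target flags.
inline std::uint32_t makeOrderWord(std::uint32_t memoryOrder,
                                   std::uint32_t targetFlags) {
  assert(memoryOrder <= kMemoryOrderMask);
  assert(targetFlags <= 0xFFFFu);
  return (targetFlags << kTargetFlagsShift) | memoryOrder;
}
```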
> > 
> > * 0002-Add-mhle-option-support-and-populate-pre-defined-mac.patch
> > 
> > It adds the '-m[no]hle' option to turn the HLE feature on or off. Once
> > the HLE feature is turned on, two more macros (__ATOMIC_HLE_ACQUIRE and
> > __ATOMIC_HLE_RELEASE) are defined for developers to mark atomic
> > builtins.
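
For reference, here is a rough sketch of how a developer might mark atomic
builtins with these macros, in the style of a simple spin lock. The function
and constant names are hypothetical. The fallback to 0 when the macros are
not defined is an assumption added so the sketch still builds with compilers
that lack HLE support.

```cpp
#include <cassert>

// Use the HLE macros when the compiler defines them. The fallback value of 0
// is an assumption made for this sketch; it simply drops the hint.
#if defined(__ATOMIC_HLE_ACQUIRE) && defined(__ATOMIC_HLE_RELEASE)
constexpr int kHleAcquire = __ATOMIC_HLE_ACQUIRE;
constexpr int kHleRelease = __ATOMIC_HLE_RELEASE;
#else
constexpr int kHleAcquire = 0;
constexpr int kHleRelease = 0;
#endif

// Acquire the lock; the HLE hint is OR'ed into the memory order word.
inline void elidedLock(int *lock) {
  assert(lock != nullptr);
  while (__atomic_exchange_n(lock, 1, __ATOMIC_ACQUIRE | kHleAcquire)) {
    // Spin until the previous value was 0, i.e. the lock was free.
  }
}

// Release the lock with the matching release hint.
inline void elidedUnlock(int *lock) {
  assert(lock != nullptr);
  __atomic_store_n(lock, 0, __ATOMIC_RELEASE | kHleRelease);
}
```

Because the prefixes are legacy compatible, the same binary should still run
on processors without HLE, where the hint is ignored.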
> > 
> > Thanks for taking the time to review!
> > 
> > Yours
> > - Michael
> > ---
> > [1] http://gcc.gnu.org/ml/gcc-patches/2012-04/msg01073.html
> > [2] http://software.intel.com/sites/default/files/319433-014.pdf
> > 
> 

From 5f18d83c4c633c43becfcb2557f831e3df717815 Mon Sep 17 00:00:00 2001
From: Michael Liao <[email protected]>
Date: Thu, 5 Jul 2012 23:38:57 -0700
Subject: [PATCH 2/4] Add HLE target feature

---
 lib/Target/X86/X86.td           |    4 +++-
 lib/Target/X86/X86InstrInfo.td  |    1 +
 lib/Target/X86/X86Subtarget.cpp |    5 +++++
 lib/Target/X86/X86Subtarget.h   |    4 ++++
 4 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/lib/Target/X86/X86.td b/lib/Target/X86/X86.td
index 0216252..810acee 100644
--- a/lib/Target/X86/X86.td
+++ b/lib/Target/X86/X86.td
@@ -120,6 +120,8 @@ def FeatureBMI2    : SubtargetFeature<"bmi2", "HasBMI2", "true",
                                       "Support BMI2 instructions">;
 def FeatureRTM     : SubtargetFeature<"rtm", "HasRTM", "true",
                                       "Support RTM instructions">;
+def FeatureHLE     : SubtargetFeature<"hle", "HasHLE", "true",
+                                      "Support HLE">;
 def FeatureADX     : SubtargetFeature<"adx", "HasADX", "true",
                                       "Support ADX instructions">;
 def FeatureLeaForSP : SubtargetFeature<"lea-sp", "UseLeaForSP", "true",
@@ -201,7 +203,7 @@ def : Proc<"core-avx2",       [FeatureAVX2, FeatureCMPXCHG16B, FeatureFastUAMem,
                                FeatureRDRAND, FeatureF16C, FeatureFSGSBase,
                                FeatureMOVBE, FeatureLZCNT, FeatureBMI,
                                FeatureBMI2, FeatureFMA,
-                               FeatureRTM]>;
+                               FeatureRTM, FeatureHLE]>;
 
 def : Proc<"k6",              [FeatureMMX]>;
 def : Proc<"k6-2",            [Feature3DNow]>;
diff --git a/lib/Target/X86/X86InstrInfo.td b/lib/Target/X86/X86InstrInfo.td
index 84c278c..46daaad 100644
--- a/lib/Target/X86/X86InstrInfo.td
+++ b/lib/Target/X86/X86InstrInfo.td
@@ -603,6 +603,7 @@ def HasLZCNT     : Predicate<"Subtarget->hasLZCNT()">;
 def HasBMI       : Predicate<"Subtarget->hasBMI()">;
 def HasBMI2      : Predicate<"Subtarget->hasBMI2()">;
 def HasRTM       : Predicate<"Subtarget->hasRTM()">;
+def HasHLE       : Predicate<"Subtarget->hasHLE()">;
 def HasADX       : Predicate<"Subtarget->hasADX()">;
 def FPStackf32   : Predicate<"!Subtarget->hasSSE1()">;
 def FPStackf64   : Predicate<"!Subtarget->hasSSE2()">;
diff --git a/lib/Target/X86/X86Subtarget.cpp b/lib/Target/X86/X86Subtarget.cpp
index 0f2c008..a9955ce 100644
--- a/lib/Target/X86/X86Subtarget.cpp
+++ b/lib/Target/X86/X86Subtarget.cpp
@@ -310,6 +310,10 @@ void X86Subtarget::AutoDetectSubtargetFeatures() {
         HasBMI = true;
         ToggleFeature(X86::FeatureBMI);
       }
+      if ((EBX >> 4) & 0x1) {
+        HasHLE = true;
+        ToggleFeature(X86::FeatureHLE);
+      }
       if (IsIntel && ((EBX >> 5) & 0x1)) {
         X86SSELevel = AVX2;
         ToggleFeature(X86::FeatureAVX2);
@@ -439,6 +443,7 @@ void X86Subtarget::initializeEnvironment() {
   HasBMI = false;
   HasBMI2 = false;
   HasRTM = false;
+  HasHLE = false;
   HasADX = false;
   IsBTMemSlow = false;
   IsUAMemFast = false;
diff --git a/lib/Target/X86/X86Subtarget.h b/lib/Target/X86/X86Subtarget.h
index e97da4b..411494a 100644
--- a/lib/Target/X86/X86Subtarget.h
+++ b/lib/Target/X86/X86Subtarget.h
@@ -121,6 +121,9 @@ protected:
   /// HasRTM - Processor has RTM instructions.
   bool HasRTM;
 
+  /// HasHLE - Processor has HLE.
+  bool HasHLE;
+
   /// HasADX - Processor has ADX instructions.
   bool HasADX;
 
@@ -253,6 +256,7 @@ public:
   bool hasBMI() const { return HasBMI; }
   bool hasBMI2() const { return HasBMI2; }
   bool hasRTM() const { return HasRTM; }
+  bool hasHLE() const { return HasHLE; }
   bool hasADX() const { return HasADX; }
   bool isBTMemSlow() const { return IsBTMemSlow; }
   bool isUnalignedMemAccessFast() const { return IsUAMemFast; }
-- 
1.7.9.5
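
The auto-detection hunk in X86Subtarget.cpp above tests single bits of EBX.
As a stand-alone sketch of that check, the helper below decodes the HLE bit
(bit 4) and the AVX2 bit (bit 5) from an EBX value, assumed here to come from
CPUID leaf 7, subleaf 0. The struct and function names are hypothetical, and
the IsIntel condition that the patch applies to AVX2 is left out.

```cpp
#include <cassert>
#include <cstdint>

// Test one bit of a 32-bit register value.
inline bool testBit(std::uint32_t value, unsigned bit) {
  assert(bit < 32);
  return ((value >> bit) & 0x1u) != 0;
}

// Hypothetical container for the two features decoded in this sketch.
struct Leaf7Features {
  bool hasHLE;
  bool hasAVX2;
};

// Decode feature bits from EBX, using the same bit positions as the patch.
inline Leaf7Features decodeLeaf7EBX(std::uint32_t ebx) {
  Leaf7Features features;
  features.hasHLE = testBit(ebx, 4);   // matches (EBX >> 4) & 0x1
  features.hasAVX2 = testBit(ebx, 5);  // matches (EBX >> 5) & 0x1
  return features;
}
```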

_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits