llvmbot wrote:

<!--LLVM PR SUMMARY COMMENT-->

@llvm/pr-subscribers-backend-x86

Author: Tharun V K (tharunvk)

<details>
<summary>Changes</summary>

This patch prevents the unwanted promotion of _Float16 FNEG and FABS 
operations to float at -O0, ensuring bitwise correctness and 
improving performance.

Previously, at -O0, Clang would emit fpext/fptrunc sequences for 
these operations because HasFastHalfType was false. This could 
potentially alter NaN payloads and introduced unnecessary 
instruction overhead.
Changes:
  Clang: Set HasFastHalfType to true for X86 targets with SSE2+( already set 
for avxfp16)
   This signals the frontend that the backend can handle half-precision 
   bitwise operations natively in IR.

Fixes : https://github.com/llvm/llvm-project/issues/188201

---
Full diff: https://github.com/llvm/llvm-project/pull/189395.diff


3 Files Affected:

- (modified) clang/lib/Basic/Targets/X86.cpp (+2) 
- (added) clang/test/CodeGen/X86/Float16-fneg-fabs.c (+18) 
- (added) llvm/test/CodeGen/X86/half-fneg-fabs.ll (+46) 


``````````diff
diff --git a/clang/lib/Basic/Targets/X86.cpp b/clang/lib/Basic/Targets/X86.cpp
index cb941c94c84a7..b6138fee89b4c 100644
--- a/clang/lib/Basic/Targets/X86.cpp
+++ b/clang/lib/Basic/Targets/X86.cpp
@@ -471,6 +471,8 @@ bool 
X86TargetInfo::handleTargetFeatures(std::vector<std::string> &Features,
     SSELevel = std::max(SSELevel, Level);
 
     HasFloat16 = SSELevel >= SSE2;
+    if (HasFloat16)
+      HasFastHalfType = true;
 
     // X86 target has bfloat16 emulation support in the backend, where
     // bfloat16 is treated as a 32-bit float, arithmetic operations are
diff --git a/clang/test/CodeGen/X86/Float16-fneg-fabs.c 
b/clang/test/CodeGen/X86/Float16-fneg-fabs.c
new file mode 100644
index 0000000000000..92760bd377926
--- /dev/null
+++ b/clang/test/CodeGen/X86/Float16-fneg-fabs.c
@@ -0,0 +1,18 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +sse2 -O0 
-emit-llvm %s -o - | FileCheck %s --check-prefixes=CHECK
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +f16c -O0 
-emit-llvm %s -o - | FileCheck %s --check-prefixes=CHECK
+
+_Float16 test_fneg(_Float16 x) {
+  // CHECK-LABEL: define {{.*}} @test_fneg
+  // CHECK-NOT: fpext
+  // CHECK: fneg half
+  // CHECK-NOT: fptrunc
+  return -x;
+}
+
+_Float16 test_fabs(_Float16 x) {
+  // CHECK-LABEL: define {{.*}} @test_fabs
+  // CHECK-NOT: fpext
+  // CHECK: call half @llvm.fabs.f16(half
+  // CHECK-NOT: fptrunc
+  return __builtin_fabsf16(x);
+}
diff --git a/llvm/test/CodeGen/X86/half-fneg-fabs.ll 
b/llvm/test/CodeGen/X86/half-fneg-fabs.ll
new file mode 100644
index 0000000000000..d24a5fbe1dfe7
--- /dev/null
+++ b/llvm/test/CodeGen/X86/half-fneg-fabs.ll
@@ -0,0 +1,46 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s -mtriple=x86_64-unknown-linux-gnu -mattr=+sse2 | FileCheck %s 
--check-prefixes=SSE2
+; RUN: llc < %s -mtriple=x86_64-unknown-linux-gnu -mattr=+avx | FileCheck %s 
--check-prefixes=AVX
+; RUN: llc < %s -mtriple=x86_64-unknown-linux-gnu -mattr=+f16c | FileCheck %s 
--check-prefixes=F16C
+
+; Test Floating Point Negation (fneg)
+define half @test_fneg(half %a) nounwind {
+; SSE2-LABEL: test_fneg:
+; SSE2:       # %bb.0:
+; SSE2-NEXT:    xorps {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0
+; SSE2-NEXT:    retq
+;
+; AVX-LABEL: test_fneg:
+; AVX:       # %bb.0:
+; AVX-NEXT:    vxorps {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0, %xmm0
+; AVX-NEXT:    retq
+;
+; F16C-LABEL: test_fneg:
+; F16C:       # %bb.0:
+; F16C-NEXT:    vxorps {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0, %xmm0
+; F16C-NEXT:    retq
+  %res = fneg half %a
+  ret half %res
+}
+
+; Test Floating Point Absolute Value (fabs)
+define half @test_fabs(half %a) nounwind {
+; SSE2-LABEL: test_fabs:
+; SSE2:       # %bb.0:
+; SSE2-NEXT:    andps {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0
+; SSE2-NEXT:    retq
+;
+; AVX-LABEL: test_fabs:
+; AVX:       # %bb.0:
+; AVX-NEXT:    vandps {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0, %xmm0
+; AVX-NEXT:    retq
+;
+; F16C-LABEL: test_fabs:
+; F16C:       # %bb.0:
+; F16C-NEXT:    vandps {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0, %xmm0
+; F16C-NEXT:    retq
+  %res = call half @llvm.fabs.f16(half %a)
+  ret half %res
+}
+
+declare half @llvm.fabs.f16(half)

``````````

</details>


https://github.com/llvm/llvm-project/pull/189395
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to