[PATCH] D128571: [X86] Support `_Float16` on SSE2 and up

Phoebe Wang via Phabricator via cfe-commits Sat, 25 Jun 2022 18:14:20 -0700

pengfei updated this revision to Diff 440016.
pengfei marked 3 inline comments as done.
pengfei added a comment.


Address review comments. Thanks @rjmccall !


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D128571/new/

https://reviews.llvm.org/D128571

Files:
  clang/docs/LanguageExtensions.rst
  clang/docs/ReleaseNotes.rst
  clang/lib/Basic/Targets/X86.cpp
  clang/test/CodeGen/X86/Float16-arithmetic.c
  clang/test/CodeGen/X86/Float16-complex.c
  clang/test/CodeGen/X86/avx512fp16-complex.c
  clang/test/Sema/Float16.c
  clang/test/Sema/conversion-target-dep.c
  clang/test/SemaCXX/Float16.cpp

Index: clang/test/SemaCXX/Float16.cpp
===================================================================
--- clang/test/SemaCXX/Float16.cpp
+++ clang/test/SemaCXX/Float16.cpp
@@ -1,4 +1,6 @@
-// RUN: %clang_cc1 -fsyntax-only -verify -triple x86_64-linux-pc %s
+// RUN: %clang_cc1 -fsyntax-only -verify -triple i686-linux-pc %s
+// RUN: %clang_cc1 -fsyntax-only -verify -triple i686-linux-pc -target-feature +sse2 %s -DHAVE
+// RUN: %clang_cc1 -fsyntax-only -verify -triple x86_64-linux-pc %s -DHAVE
 // RUN: %clang_cc1 -fsyntax-only -verify -triple spir-unknown-unknown %s -DHAVE
 // RUN: %clang_cc1 -fsyntax-only -verify -triple armv7a-linux-gnu %s -DHAVE
 // RUN: %clang_cc1 -fsyntax-only -verify -triple aarch64-linux-gnu %s -DHAVE
Index: clang/test/Sema/conversion-target-dep.c
===================================================================
--- clang/test/Sema/conversion-target-dep.c
+++ clang/test/Sema/conversion-target-dep.c
@@ -6,7 +6,7 @@
 
 long double ld;
 double d;
-_Float16 f16; // x86-error {{_Float16 is not supported on this target}}
+_Float16 f16;
 
 int main(void) {
   ld = d; // x86-warning {{implicit conversion increases floating-point precision: 'double' to 'long double'}}
Index: clang/test/Sema/Float16.c
===================================================================
--- clang/test/Sema/Float16.c
+++ clang/test/Sema/Float16.c
@@ -1,5 +1,6 @@
-// RUN: %clang_cc1 -fsyntax-only -verify -triple x86_64-linux-pc %s
-// RUN: %clang_cc1 -fsyntax-only -verify -triple x86_64-linux-pc -target-feature +avx512fp16 %s -DHAVE
+// RUN: %clang_cc1 -fsyntax-only -verify -triple i686-linux-pc %s
+// RUN: %clang_cc1 -fsyntax-only -verify -triple i686-linux-pc -target-feature +sse2 %s -DHAVE
+// RUN: %clang_cc1 -fsyntax-only -verify -triple x86_64-linux-pc %s -DHAVE
 // RUN: %clang_cc1 -fsyntax-only -verify -triple spir-unknown-unknown %s -DHAVE
 // RUN: %clang_cc1 -fsyntax-only -verify -triple armv7a-linux-gnu %s -DHAVE
 // RUN: %clang_cc1 -fsyntax-only -verify -triple aarch64-linux-gnu %s -DHAVE
Index: clang/test/CodeGen/X86/Float16-complex.c
===================================================================
--- clang/test/CodeGen/X86/Float16-complex.c
+++ clang/test/CodeGen/X86/Float16-complex.c
@@ -1,4 +1,5 @@
 // RUN: %clang_cc1 %s -O0 -emit-llvm -triple x86_64-unknown-unknown -target-feature +avx512fp16 -o - | FileCheck %s --check-prefix=X86
+// RUN: %clang_cc1 %s -O0 -emit-llvm -triple x86_64-unknown-unknown -o - | FileCheck %s --check-prefix=X86
 
 _Float16 _Complex add_half_rr(_Float16 a, _Float16 b) {
   // X86-LABEL: @add_half_rr(
Index: clang/test/CodeGen/X86/Float16-arithmetic.c
===================================================================
--- /dev/null
+++ clang/test/CodeGen/X86/Float16-arithmetic.c
@@ -0,0 +1,29 @@
+// RUN: %clang_cc1 -triple  x86_64-unknown-unknown \
+// RUN: -emit-llvm -o - %s  | FileCheck %s --check-prefixes=CHECK
+
+// CHECK-NOT: fpext
+// CHECK-NOT: fptrunc
+
+_Float16 add1(_Float16 a, _Float16 b) {
+  return a + b;
+}
+
+_Float16 add2(_Float16 a, _Float16 b, _Float16 c) {
+  return a + b + c;
+}
+
+_Float16 div(_Float16 a, _Float16 b) {
+  return a / b;
+}
+
+_Float16 mul(_Float16 a, _Float16 b) {
+  return a * b;
+}
+
+_Float16 add_and_mul1(_Float16 a, _Float16 b, _Float16 c, _Float16 d) {
+  return a * b + c * d;
+}
+
+_Float16 add_and_mul2(_Float16 a, _Float16 b, _Float16 c, _Float16 d) {
+  return (a - 6 * b) + c;
+}
Index: clang/lib/Basic/Targets/X86.cpp
===================================================================
--- clang/lib/Basic/Targets/X86.cpp
+++ clang/lib/Basic/Targets/X86.cpp
@@ -239,7 +239,6 @@
       HasAVX512ER = true;
     } else if (Feature == "+avx512fp16") {
       HasAVX512FP16 = true;
-      HasFloat16 = true;
     } else if (Feature == "+avx512pf") {
       HasAVX512PF = true;
     } else if (Feature == "+avx512dq") {
@@ -355,6 +354,9 @@
                            .Default(NoSSE);
     SSELevel = std::max(SSELevel, Level);
 
+    // Turn on _float16 for x86 (feature sse2)
+    HasFloat16 = SSELevel >= SSE2;
+
     MMX3DNowEnum ThreeDNowLevel = llvm::StringSwitch<MMX3DNowEnum>(Feature)
                                       .Case("+3dnowa", AMD3DNowAthlon)
                                       .Case("+3dnow", AMD3DNow)
Index: clang/docs/ReleaseNotes.rst
===================================================================
--- clang/docs/ReleaseNotes.rst
+++ clang/docs/ReleaseNotes.rst
@@ -513,6 +513,9 @@
 
 - Support ``-mharden-sls=[none|all|return|indirect-jmp]`` for straight-line
   speculation hardening.
+- Support for the ``_Float16`` type has been added for all targets with SSE2.
+  When AVX512-FP16 is not available, arithmetic on ``_Float16`` is emulated
+  using ``float``.
 
 DWARF Support in Clang
 ----------------------
Index: clang/docs/LanguageExtensions.rst
===================================================================
--- clang/docs/LanguageExtensions.rst
+++ clang/docs/LanguageExtensions.rst
@@ -743,7 +743,13 @@
 * 64-bit ARM (AArch64)
 * AMDGPU
 * SPIR
-* X86 (Only available under feature AVX512-FP16)
+* X86 (see below)
+
+On X86 targets, ``_Float16`` is supported as long as SSE2 is available, which
+includes all 64-bit and all recent 32-bit processors. When the target supports
+AVX512-FP16, ``_Float16`` arithmetic is performed using that native support.
+Otherwise, ``_Float16`` arithmetic is performed by promoting to ``float``,
+performing the operation, and then truncating to ``_Float16``.
 
 ``_Float16`` will be supported on more targets as they define ABIs for it.

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D128571: [X86] Support `_Float16` on SSE2 and up

Reply via email to