[clang] [X86] Finely handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx (PR #84136)

2024-03-06 Thread Freddy Ye via cfe-commits

FreddyLeaf wrote:

> > > Still not understand. Shouldn't the target feature of the caller be 
> > > passed to the intrinsic when checking the argument of the intrinsic?
> > 
> > 
> > The target features passing from option can be passed to Semachecking. It's 
> > special for attribute_target, which is handled in `handleTargetAttr`, which 
> > is after Sema::CheckBuiltinFunctionCall.
> 
> Then why not move the call `handleTargetAttr` before 
> `Sema::CheckBuiltinFunctionCall`?

Mainly because all Sema::CheckBuiltinFunctionCall before doesn't dependent on 
target feature, right? We don't have any other builtins have different range 
limit of same const parameter. But clang::CodegenFunction has the information 
of the whole function, which I think is a more reasonable place to do the check.

https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Finely handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx (PR #84136)

2024-03-06 Thread Shengchen Kan via cfe-commits

KanRobert wrote:

> > Still not understand. Shouldn't the target feature of the caller be passed 
> > to the intrinsic when checking the argument of the intrinsic?
> 
> The target features passing from option can be passed to Semachecking. It's 
> special for attribute_target, which is handled in `handleTargetAttr`, which 
> is after Sema::CheckBuiltinFunctionCall.

Then why not move the call `handleTargetAttr` before 
`Sema::CheckBuiltinFunctionCall`?

https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Finely handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx (PR #84136)

2024-03-06 Thread Freddy Ye via cfe-commits

FreddyLeaf wrote:

> Still not understand. Shouldn't the target feature of the caller be passed to 
> the intrinsic when checking the argument of the intrinsic?
The target features passing from option can be passed to Semachecking. It's 
special for attribute_target, which is handled in `handleTargetAttr`, which is 
after Sema::CheckBuiltinFunctionCall.



https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Finely handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx (PR #84136)

2024-03-06 Thread Shengchen Kan via cfe-commits

KanRobert wrote:

Still not understand. Shouldn't the target feature of the caller be passed to 
the intrinsic when checking the argument of the intrinsic? 

https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Finely handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx (PR #84136)

2024-03-06 Thread Freddy Ye via cfe-commits


@@ -2613,6 +2614,24 @@ void CGBuilderInserter::InsertHelper(
 // called function.
 void CodeGenFunction::checkTargetFeatures(const CallExpr *E,
   const FunctionDecl *TargetDecl) {
+  // SemaCheking cannot handle below x86 builtins because they have different
+  // parameter ranges with different TargetAttribute of caller.

FreddyLeaf wrote:

Updated in description.

https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Finely handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx (PR #84136)

2024-03-06 Thread Freddy Ye via cfe-commits

https://github.com/FreddyLeaf edited 
https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Finely handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx (PR #84136)

2024-03-06 Thread Freddy Ye via cfe-commits


@@ -4745,6 +4745,127 @@ static __inline__ __m128d __DEFAULT_FN_ATTRS 
_mm_castsi128_pd(__m128i __a) {
   return (__m128d)__a;
 }
 
+/// Compares each of the corresponding double-precision values of two
+///128-bit vectors of [2 x double], using the operation specified by the
+///immediate integer operand.
+///
+///Returns a [2 x double] vector consisting of two doubles corresponding to
+///the two comparison results: zero if the comparison is false, and all 1's
+///if the comparison is true.
+///
+/// \headerfile 
+///
+/// \code
+/// __m128d _mm_cmp_pd(__m128d a, __m128d b, const int c);
+/// \endcode
+///
+/// This intrinsic corresponds to the  (V)CMPPD  instruction.
+///
+/// \param a
+///A 128-bit vector of [2 x double].
+/// \param b
+///A 128-bit vector of [2 x double].
+/// \param c
+///An immediate integer operand, with bits [4:0] specifying which 
comparison

FreddyLeaf wrote:

Refined in 51eda069c797c91fd3072f347e1dd4a20ab53702

https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Finely handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx (PR #84136)

2024-03-06 Thread Freddy Ye via cfe-commits


@@ -2940,6 +2940,134 @@ _mm_movemask_ps(__m128 __a)
   return __builtin_ia32_movmskps((__v4sf)__a);
 }
 
+/* Compare */
+#define _CMP_EQ_OQ0x00 /* Equal (ordered, non-signaling)  */
+#define _CMP_LT_OS0x01 /* Less-than (ordered, signaling)  */
+#define _CMP_LE_OS0x02 /* Less-than-or-equal (ordered, signaling)  */
+#define _CMP_UNORD_Q  0x03 /* Unordered (non-signaling)  */
+#define _CMP_NEQ_UQ   0x04 /* Not-equal (unordered, non-signaling)  */
+#define _CMP_NLT_US   0x05 /* Not-less-than (unordered, signaling)  */
+#define _CMP_NLE_US   0x06 /* Not-less-than-or-equal (unordered, signaling)  */
+#define _CMP_ORD_Q0x07 /* Ordered (non-signaling)   */
+
+/// Compares each of the corresponding values of two 128-bit vectors of
+///[4 x float], using the operation specified by the immediate integer
+///operand.
+///
+///Returns a [4 x float] vector consisting of four floats corresponding to
+///the four comparison results: zero if the comparison is false, and all 
1's
+///if the comparison is true.
+///
+/// \headerfile 
+///
+/// \code
+/// __m128 _mm_cmp_ps(__m128 a, __m128 b, const int c);
+/// \endcode
+///
+/// This intrinsic corresponds to the  (V)CMPPS  instruction.
+///
+/// \param a
+///A 128-bit vector of [4 x float].
+/// \param b
+///A 128-bit vector of [4 x float].
+/// \param c
+///An immediate integer operand, with bits [4:0] specifying which 
comparison
+///operation to use: \n
+///(Note that without avx enabled, only bits [2:0] are supported) \n
+///0x00: Equal (ordered, non-signaling) \n
+///0x01: Less-than (ordered, signaling) \n
+///0x02: Less-than-or-equal (ordered, signaling) \n
+///0x03: Unordered (non-signaling) \n
+///0x04: Not-equal (unordered, non-signaling) \n
+///0x05: Not-less-than (unordered, signaling) \n
+///0x06: Not-less-than-or-equal (unordered, signaling) \n
+///0x07: Ordered (non-signaling) \n
+///0x08: Equal (unordered, non-signaling) \n
+///0x09: Not-greater-than-or-equal (unordered, signaling) \n
+///0x0A: Not-greater-than (unordered, signaling) \n
+///0x0B: False (ordered, non-signaling) \n
+///0x0C: Not-equal (ordered, non-signaling) \n
+///0x0D: Greater-than-or-equal (ordered, signaling) \n
+///0x0E: Greater-than (ordered, signaling) \n
+///0x0F: True (unordered, non-signaling) \n
+///0x10: Equal (ordered, signaling) \n
+///0x11: Less-than (ordered, non-signaling) \n
+///0x12: Less-than-or-equal (ordered, non-signaling) \n
+///0x13: Unordered (signaling) \n
+///0x14: Not-equal (unordered, signaling) \n
+///0x15: Not-less-than (unordered, non-signaling) \n
+///0x16: Not-less-than-or-equal (unordered, non-signaling) \n
+///0x17: Ordered (signaling) \n
+///0x18: Equal (unordered, signaling) \n
+///0x19: Not-greater-than-or-equal (unordered, non-signaling) \n
+///0x1A: Not-greater-than (unordered, non-signaling) \n
+///0x1B: False (ordered, signaling) \n
+///0x1C: Not-equal (ordered, signaling) \n
+///0x1D: Greater-than-or-equal (ordered, non-signaling) \n
+///0x1E: Greater-than (ordered, non-signaling) \n
+///0x1F: True (unordered, signaling)

FreddyLeaf wrote:

Refined in 51eda069c797c91fd3072f347e1dd4a20ab53702

https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Finely handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx (PR #84136)

2024-03-06 Thread Freddy Ye via cfe-commits

https://github.com/FreddyLeaf updated 
https://github.com/llvm/llvm-project/pull/84136

>From 2ba698b222fa1dc963d21850d5931562aa65533d Mon Sep 17 00:00:00 2001
From: Freddy Ye 
Date: Tue, 26 Sep 2023 16:44:01 +0800
Subject: [PATCH 1/5] [X86] Change target of __builtin_ia32_cmp[p|s][s|d] from
 avx into sse/sse2

---
 clang/include/clang/Basic/BuiltinsX86.def |   8 +-
 clang/lib/Headers/avxintrin.h | 247 --
 clang/lib/Headers/emmintrin.h | 121 +
 clang/lib/Headers/xmmintrin.h | 128 +
 clang/test/CodeGen/X86/avx-builtins.c |  96 ---
 .../test/CodeGen/X86/cmp-avx-builtins-error.c |  22 ++
 clang/test/CodeGen/X86/sse-builtins.c |  54 
 clang/test/CodeGen/X86/sse2-builtins.c|  54 
 clang/test/CodeGen/target-features-error-2.c  |   4 +-
 9 files changed, 385 insertions(+), 349 deletions(-)
 create mode 100644 clang/test/CodeGen/X86/cmp-avx-builtins-error.c

diff --git a/clang/include/clang/Basic/BuiltinsX86.def 
b/clang/include/clang/Basic/BuiltinsX86.def
index 207cde0414b54e..eafcc219c10966 100644
--- a/clang/include/clang/Basic/BuiltinsX86.def
+++ b/clang/include/clang/Basic/BuiltinsX86.def
@@ -226,6 +226,8 @@ TARGET_BUILTIN(__builtin_ia32_minps, "V4fV4fV4f", 
"ncV:128:", "sse")
 TARGET_BUILTIN(__builtin_ia32_maxps, "V4fV4fV4f", "ncV:128:", "sse")
 TARGET_BUILTIN(__builtin_ia32_minss, "V4fV4fV4f", "ncV:128:", "sse")
 TARGET_BUILTIN(__builtin_ia32_maxss, "V4fV4fV4f", "ncV:128:", "sse")
+TARGET_BUILTIN(__builtin_ia32_cmpps, "V4fV4fV4fIc", "ncV:128:", "sse")
+TARGET_BUILTIN(__builtin_ia32_cmpss, "V4fV4fV4fIc", "ncV:128:", "sse")
 
 TARGET_BUILTIN(__builtin_ia32_cmpeqpd, "V2dV2dV2d", "ncV:128:", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmpltpd, "V2dV2dV2d", "ncV:128:", "sse2")
@@ -243,6 +245,8 @@ TARGET_BUILTIN(__builtin_ia32_cmpneqsd, "V2dV2dV2d", 
"ncV:128:", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmpnltsd, "V2dV2dV2d", "ncV:128:", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmpnlesd, "V2dV2dV2d", "ncV:128:", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmpordsd, "V2dV2dV2d", "ncV:128:", "sse2")
+TARGET_BUILTIN(__builtin_ia32_cmpsd, "V2dV2dV2dIc", "ncV:128:", "sse2")
+TARGET_BUILTIN(__builtin_ia32_cmppd, "V2dV2dV2dIc", "ncV:128:", "sse2")
 TARGET_BUILTIN(__builtin_ia32_minpd, "V2dV2dV2d", "ncV:128:", "sse2")
 TARGET_BUILTIN(__builtin_ia32_maxpd, "V2dV2dV2d", "ncV:128:", "sse2")
 TARGET_BUILTIN(__builtin_ia32_minsd, "V2dV2dV2d", "ncV:128:", "sse2")
@@ -462,12 +466,8 @@ TARGET_BUILTIN(__builtin_ia32_blendvps256, "V8fV8fV8fV8f", 
"ncV:256:", "avx")
 TARGET_BUILTIN(__builtin_ia32_shufpd256, "V4dV4dV4dIi", "ncV:256:", "avx")
 TARGET_BUILTIN(__builtin_ia32_shufps256, "V8fV8fV8fIi", "ncV:256:", "avx")
 TARGET_BUILTIN(__builtin_ia32_dpps256, "V8fV8fV8fIc", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_cmppd, "V2dV2dV2dIc", "ncV:128:", "avx")
 TARGET_BUILTIN(__builtin_ia32_cmppd256, "V4dV4dV4dIc", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_cmpps, "V4fV4fV4fIc", "ncV:128:", "avx")
 TARGET_BUILTIN(__builtin_ia32_cmpps256, "V8fV8fV8fIc", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_cmpsd, "V2dV2dV2dIc", "ncV:128:", "avx")
-TARGET_BUILTIN(__builtin_ia32_cmpss, "V4fV4fV4fIc", "ncV:128:", "avx")
 TARGET_BUILTIN(__builtin_ia32_vextractf128_pd256, "V2dV4dIi", "ncV:256:", 
"avx")
 TARGET_BUILTIN(__builtin_ia32_vextractf128_ps256, "V4fV8fIi", "ncV:256:", 
"avx")
 TARGET_BUILTIN(__builtin_ia32_vextractf128_si256, "V4iV8iIi", "ncV:256:", 
"avx")
diff --git a/clang/lib/Headers/avxintrin.h b/clang/lib/Headers/avxintrin.h
index f116d8bc3a94c7..d6192518ea24ba 100644
--- a/clang/lib/Headers/avxintrin.h
+++ b/clang/lib/Headers/avxintrin.h
@@ -1573,15 +1573,6 @@ _mm256_blendv_ps(__m256 __a, __m256 __b, __m256 __c)
   ((__m256d)__builtin_ia32_shufpd256((__v4df)(__m256d)(a), \
  (__v4df)(__m256d)(b), (int)(mask)))
 
-/* Compare */
-#define _CMP_EQ_OQ0x00 /* Equal (ordered, non-signaling)  */
-#define _CMP_LT_OS0x01 /* Less-than (ordered, signaling)  */
-#define _CMP_LE_OS0x02 /* Less-than-or-equal (ordered, signaling)  */
-#define _CMP_UNORD_Q  0x03 /* Unordered (non-signaling)  */
-#define _CMP_NEQ_UQ   0x04 /* Not-equal (unordered, non-signaling)  */
-#define _CMP_NLT_US   0x05 /* Not-less-than (unordered, signaling)  */
-#define _CMP_NLE_US   0x06 /* Not-less-than-or-equal (unordered, signaling)  */
-#define _CMP_ORD_Q0x07 /* Ordered (non-signaling)   */
 #define _CMP_EQ_UQ0x08 /* Equal (unordered, non-signaling)  */
 #define _CMP_NGE_US   0x09 /* Not-greater-than-or-equal (unordered, signaling) 
 */
 #define _CMP_NGT_US   0x0a /* Not-greater-than (unordered, signaling)  */
@@ -1607,126 +1598,6 @@ _mm256_blendv_ps(__m256 __a, __m256 __b, __m256 __c)
 #define _CMP_GT_OQ0x1e /* Greater-than (ordered, non-signaling)  */
 #define _CMP_TRUE_US  0x1f /* True (unordered, signaling)  */
 
-/// Compares each of the corresponding double-precision values of two
-///

[clang] [X86] Finely handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx (PR #84136)

2024-03-06 Thread Freddy Ye via cfe-commits


@@ -34,6 +34,60 @@ __m128 test_mm_andnot_ps(__m128 A, __m128 B) {
   return _mm_andnot_ps(A, B);
 }
 
+__m128 test_mm_cmp_ps_eq_oq(__m128 a, __m128 b) {
+  // CHECK-LABEL: test_mm_cmp_ps_eq_oq
+  // CHECK: fcmp oeq <4 x float> %{{.*}}, %{{.*}}
+  return _mm_cmp_ps(a, b, _CMP_EQ_OQ);

FreddyLeaf wrote:

Oh the remaining 24 are already there. So I only moved the first 8 from avx to 
sse/sse2. which is expected.

https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Finely handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx (PR #84136)

2024-03-06 Thread Shengchen Kan via cfe-commits


@@ -2613,6 +2614,24 @@ void CGBuilderInserter::InsertHelper(
 // called function.
 void CodeGenFunction::checkTargetFeatures(const CallExpr *E,
   const FunctionDecl *TargetDecl) {
+  // SemaCheking cannot handle below x86 builtins because they have different
+  // parameter ranges with different TargetAttribute of caller.

KanRobert wrote:

I see. They corresponds to different instructions.

https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Finely handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx (PR #84136)

2024-03-06 Thread Freddy Ye via cfe-commits


@@ -2613,6 +2614,24 @@ void CGBuilderInserter::InsertHelper(
 // called function.
 void CodeGenFunction::checkTargetFeatures(const CallExpr *E,
   const FunctionDecl *TargetDecl) {
+  // SemaCheking cannot handle below x86 builtins because they have different
+  // parameter ranges with different TargetAttribute of caller.

FreddyLeaf wrote:

because SDM defined so?

https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Finely handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx (PR #84136)

2024-03-06 Thread Shengchen Kan via cfe-commits


@@ -2613,6 +2614,24 @@ void CGBuilderInserter::InsertHelper(
 // called function.
 void CodeGenFunction::checkTargetFeatures(const CallExpr *E,
   const FunctionDecl *TargetDecl) {
+  // SemaCheking cannot handle below x86 builtins because they have different
+  // parameter ranges with different TargetAttribute of caller.

KanRobert wrote:

This is the behavior. But why? 

https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Finely handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx (PR #84136)

2024-03-06 Thread Freddy Ye via cfe-commits


@@ -34,6 +34,60 @@ __m128 test_mm_andnot_ps(__m128 A, __m128 B) {
   return _mm_andnot_ps(A, B);
 }
 
+__m128 test_mm_cmp_ps_eq_oq(__m128 a, __m128 b) {
+  // CHECK-LABEL: test_mm_cmp_ps_eq_oq
+  // CHECK: fcmp oeq <4 x float> %{{.*}}, %{{.*}}
+  return _mm_cmp_ps(a, b, _CMP_EQ_OQ);

FreddyLeaf wrote:

These condition codes are [0,7], so they won't throw errors. What about adding 
8 more tests to avx-builtins.c with condition codes of [8,15]?

https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Finely handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx (PR #84136)

2024-03-06 Thread Simon Pilgrim via cfe-commits


@@ -2940,6 +2940,134 @@ _mm_movemask_ps(__m128 __a)
   return __builtin_ia32_movmskps((__v4sf)__a);
 }
 
+/* Compare */
+#define _CMP_EQ_OQ0x00 /* Equal (ordered, non-signaling)  */
+#define _CMP_LT_OS0x01 /* Less-than (ordered, signaling)  */
+#define _CMP_LE_OS0x02 /* Less-than-or-equal (ordered, signaling)  */
+#define _CMP_UNORD_Q  0x03 /* Unordered (non-signaling)  */
+#define _CMP_NEQ_UQ   0x04 /* Not-equal (unordered, non-signaling)  */
+#define _CMP_NLT_US   0x05 /* Not-less-than (unordered, signaling)  */
+#define _CMP_NLE_US   0x06 /* Not-less-than-or-equal (unordered, signaling)  */
+#define _CMP_ORD_Q0x07 /* Ordered (non-signaling)   */
+
+/// Compares each of the corresponding values of two 128-bit vectors of
+///[4 x float], using the operation specified by the immediate integer
+///operand.
+///
+///Returns a [4 x float] vector consisting of four floats corresponding to
+///the four comparison results: zero if the comparison is false, and all 
1's
+///if the comparison is true.
+///
+/// \headerfile 
+///
+/// \code
+/// __m128 _mm_cmp_ps(__m128 a, __m128 b, const int c);
+/// \endcode
+///
+/// This intrinsic corresponds to the  (V)CMPPS  instruction.
+///
+/// \param a
+///A 128-bit vector of [4 x float].
+/// \param b
+///A 128-bit vector of [4 x float].
+/// \param c
+///An immediate integer operand, with bits [4:0] specifying which 
comparison
+///operation to use: \n
+///(Note that without avx enabled, only bits [2:0] are supported) \n
+///0x00: Equal (ordered, non-signaling) \n
+///0x01: Less-than (ordered, signaling) \n
+///0x02: Less-than-or-equal (ordered, signaling) \n
+///0x03: Unordered (non-signaling) \n
+///0x04: Not-equal (unordered, non-signaling) \n
+///0x05: Not-less-than (unordered, signaling) \n
+///0x06: Not-less-than-or-equal (unordered, signaling) \n
+///0x07: Ordered (non-signaling) \n
+///0x08: Equal (unordered, non-signaling) \n
+///0x09: Not-greater-than-or-equal (unordered, signaling) \n
+///0x0A: Not-greater-than (unordered, signaling) \n
+///0x0B: False (ordered, non-signaling) \n
+///0x0C: Not-equal (ordered, non-signaling) \n
+///0x0D: Greater-than-or-equal (ordered, signaling) \n
+///0x0E: Greater-than (ordered, signaling) \n
+///0x0F: True (unordered, non-signaling) \n
+///0x10: Equal (ordered, signaling) \n
+///0x11: Less-than (ordered, non-signaling) \n
+///0x12: Less-than-or-equal (ordered, non-signaling) \n
+///0x13: Unordered (signaling) \n
+///0x14: Not-equal (unordered, signaling) \n
+///0x15: Not-less-than (unordered, non-signaling) \n
+///0x16: Not-less-than-or-equal (unordered, non-signaling) \n
+///0x17: Ordered (signaling) \n
+///0x18: Equal (unordered, signaling) \n
+///0x19: Not-greater-than-or-equal (unordered, non-signaling) \n
+///0x1A: Not-greater-than (unordered, non-signaling) \n
+///0x1B: False (ordered, signaling) \n
+///0x1C: Not-equal (ordered, signaling) \n
+///0x1D: Greater-than-or-equal (ordered, non-signaling) \n
+///0x1E: Greater-than (ordered, non-signaling) \n
+///0x1F: True (unordered, signaling)

RKSimon wrote:

No objections, but I'd prefer to see the new macros (and tests) and comment 
removal to be done as a followup patch

https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Finely handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx (PR #84136)

2024-03-06 Thread Phoebe Wang via cfe-commits


@@ -34,6 +34,60 @@ __m128 test_mm_andnot_ps(__m128 A, __m128 B) {
   return _mm_andnot_ps(A, B);
 }
 
+__m128 test_mm_cmp_ps_eq_oq(__m128 a, __m128 b) {
+  // CHECK-LABEL: test_mm_cmp_ps_eq_oq
+  // CHECK: fcmp oeq <4 x float> %{{.*}}, %{{.*}}
+  return _mm_cmp_ps(a, b, _CMP_EQ_OQ);

phoebewang wrote:

I don't understand why moving them here. Shouldn't they emit error without avx?

https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Finely handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx (PR #84136)

2024-03-06 Thread Freddy Ye via cfe-commits


@@ -462,12 +462,12 @@ TARGET_BUILTIN(__builtin_ia32_blendvps256, 
"V8fV8fV8fV8f", "ncV:256:", "avx")
 TARGET_BUILTIN(__builtin_ia32_shufpd256, "V4dV4dV4dIi", "ncV:256:", "avx")
 TARGET_BUILTIN(__builtin_ia32_shufps256, "V8fV8fV8fIi", "ncV:256:", "avx")
 TARGET_BUILTIN(__builtin_ia32_dpps256, "V8fV8fV8fIc", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_cmppd, "V2dV2dV2dIc", "ncV:128:", "avx")
+TARGET_BUILTIN(__builtin_ia32_cmppd, "V2dV2dV2dIc", "ncV:128:", "avx|sse2")
 TARGET_BUILTIN(__builtin_ia32_cmppd256, "V4dV4dV4dIc", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_cmpps, "V4fV4fV4fIc", "ncV:128:", "avx")
+TARGET_BUILTIN(__builtin_ia32_cmpps, "V4fV4fV4fIc", "ncV:128:", "avx|sse")
 TARGET_BUILTIN(__builtin_ia32_cmpps256, "V8fV8fV8fIc", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_cmpsd, "V2dV2dV2dIc", "ncV:128:", "avx")
-TARGET_BUILTIN(__builtin_ia32_cmpss, "V4fV4fV4fIc", "ncV:128:", "avx")
+TARGET_BUILTIN(__builtin_ia32_cmpsd, "V2dV2dV2dIc", "ncV:128:", "avx|sse2")
+TARGET_BUILTIN(__builtin_ia32_cmpss, "V4fV4fV4fIc", "ncV:128:", "avx|sse")

FreddyLeaf wrote:

[d8979e2](https://github.com/llvm/llvm-project/pull/84136/commits/d8979e29e1932b42bb54424e1c6aa97ba9ba07e7)

https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Finely handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx (PR #84136)

2024-03-06 Thread Freddy Ye via cfe-commits


@@ -1719,3 +1719,57 @@ __m128i test_mm_xor_si128(__m128i A, __m128i B) {
   // CHECK: xor <2 x i64> %{{.*}}, %{{.*}}
   return _mm_xor_si128(A, B);
 }
+
+__m128d test_mm_cmp_pd_eq_oq(__m128d a, __m128d b) {

FreddyLeaf wrote:

[d8979e2](https://github.com/llvm/llvm-project/pull/84136/commits/d8979e29e1932b42bb54424e1c6aa97ba9ba07e7)

https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Finely handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx (PR #84136)

2024-03-06 Thread Freddy Ye via cfe-commits

https://github.com/FreddyLeaf deleted 
https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Finely handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx (PR #84136)

2024-03-06 Thread Freddy Ye via cfe-commits


@@ -2613,6 +2614,24 @@ void CGBuilderInserter::InsertHelper(
 // called function.
 void CodeGenFunction::checkTargetFeatures(const CallExpr *E,
   const FunctionDecl *TargetDecl) {
+  // SemaCheking cannot handle below x86 builtins because they have different
+  // parameter ranges with different TargetAttribute of caller.

FreddyLeaf wrote:

[d8979e2](https://github.com/llvm/llvm-project/pull/84136/commits/d8979e29e1932b42bb54424e1c6aa97ba9ba07e7)

https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Finely handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx (PR #84136)

2024-03-06 Thread Freddy Ye via cfe-commits

https://github.com/FreddyLeaf deleted 
https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Finely handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx (PR #84136)

2024-03-06 Thread Freddy Ye via cfe-commits


@@ -2613,6 +2614,24 @@ void CGBuilderInserter::InsertHelper(
 // called function.
 void CodeGenFunction::checkTargetFeatures(const CallExpr *E,
   const FunctionDecl *TargetDecl) {
+  // SemaCheking cannot handle below x86 builtins because they have different
+  // parameter ranges with different TargetAttribute of caller.

FreddyLeaf wrote:

[d8979e2](https://github.com/llvm/llvm-project/pull/84136/commits/d8979e29e1932b42bb54424e1c6aa97ba9ba07e7)

https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Finely handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx (PR #84136)

2024-03-06 Thread Freddy Ye via cfe-commits

https://github.com/FreddyLeaf updated 
https://github.com/llvm/llvm-project/pull/84136

>From 2ba698b222fa1dc963d21850d5931562aa65533d Mon Sep 17 00:00:00 2001
From: Freddy Ye 
Date: Tue, 26 Sep 2023 16:44:01 +0800
Subject: [PATCH 1/3] [X86] Change target of __builtin_ia32_cmp[p|s][s|d] from
 avx into sse/sse2

---
 clang/include/clang/Basic/BuiltinsX86.def |   8 +-
 clang/lib/Headers/avxintrin.h | 247 --
 clang/lib/Headers/emmintrin.h | 121 +
 clang/lib/Headers/xmmintrin.h | 128 +
 clang/test/CodeGen/X86/avx-builtins.c |  96 ---
 .../test/CodeGen/X86/cmp-avx-builtins-error.c |  22 ++
 clang/test/CodeGen/X86/sse-builtins.c |  54 
 clang/test/CodeGen/X86/sse2-builtins.c|  54 
 clang/test/CodeGen/target-features-error-2.c  |   4 +-
 9 files changed, 385 insertions(+), 349 deletions(-)
 create mode 100644 clang/test/CodeGen/X86/cmp-avx-builtins-error.c

diff --git a/clang/include/clang/Basic/BuiltinsX86.def 
b/clang/include/clang/Basic/BuiltinsX86.def
index 207cde0414b54e..eafcc219c10966 100644
--- a/clang/include/clang/Basic/BuiltinsX86.def
+++ b/clang/include/clang/Basic/BuiltinsX86.def
@@ -226,6 +226,8 @@ TARGET_BUILTIN(__builtin_ia32_minps, "V4fV4fV4f", 
"ncV:128:", "sse")
 TARGET_BUILTIN(__builtin_ia32_maxps, "V4fV4fV4f", "ncV:128:", "sse")
 TARGET_BUILTIN(__builtin_ia32_minss, "V4fV4fV4f", "ncV:128:", "sse")
 TARGET_BUILTIN(__builtin_ia32_maxss, "V4fV4fV4f", "ncV:128:", "sse")
+TARGET_BUILTIN(__builtin_ia32_cmpps, "V4fV4fV4fIc", "ncV:128:", "sse")
+TARGET_BUILTIN(__builtin_ia32_cmpss, "V4fV4fV4fIc", "ncV:128:", "sse")
 
 TARGET_BUILTIN(__builtin_ia32_cmpeqpd, "V2dV2dV2d", "ncV:128:", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmpltpd, "V2dV2dV2d", "ncV:128:", "sse2")
@@ -243,6 +245,8 @@ TARGET_BUILTIN(__builtin_ia32_cmpneqsd, "V2dV2dV2d", 
"ncV:128:", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmpnltsd, "V2dV2dV2d", "ncV:128:", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmpnlesd, "V2dV2dV2d", "ncV:128:", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmpordsd, "V2dV2dV2d", "ncV:128:", "sse2")
+TARGET_BUILTIN(__builtin_ia32_cmpsd, "V2dV2dV2dIc", "ncV:128:", "sse2")
+TARGET_BUILTIN(__builtin_ia32_cmppd, "V2dV2dV2dIc", "ncV:128:", "sse2")
 TARGET_BUILTIN(__builtin_ia32_minpd, "V2dV2dV2d", "ncV:128:", "sse2")
 TARGET_BUILTIN(__builtin_ia32_maxpd, "V2dV2dV2d", "ncV:128:", "sse2")
 TARGET_BUILTIN(__builtin_ia32_minsd, "V2dV2dV2d", "ncV:128:", "sse2")
@@ -462,12 +466,8 @@ TARGET_BUILTIN(__builtin_ia32_blendvps256, "V8fV8fV8fV8f", 
"ncV:256:", "avx")
 TARGET_BUILTIN(__builtin_ia32_shufpd256, "V4dV4dV4dIi", "ncV:256:", "avx")
 TARGET_BUILTIN(__builtin_ia32_shufps256, "V8fV8fV8fIi", "ncV:256:", "avx")
 TARGET_BUILTIN(__builtin_ia32_dpps256, "V8fV8fV8fIc", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_cmppd, "V2dV2dV2dIc", "ncV:128:", "avx")
 TARGET_BUILTIN(__builtin_ia32_cmppd256, "V4dV4dV4dIc", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_cmpps, "V4fV4fV4fIc", "ncV:128:", "avx")
 TARGET_BUILTIN(__builtin_ia32_cmpps256, "V8fV8fV8fIc", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_cmpsd, "V2dV2dV2dIc", "ncV:128:", "avx")
-TARGET_BUILTIN(__builtin_ia32_cmpss, "V4fV4fV4fIc", "ncV:128:", "avx")
 TARGET_BUILTIN(__builtin_ia32_vextractf128_pd256, "V2dV4dIi", "ncV:256:", 
"avx")
 TARGET_BUILTIN(__builtin_ia32_vextractf128_ps256, "V4fV8fIi", "ncV:256:", 
"avx")
 TARGET_BUILTIN(__builtin_ia32_vextractf128_si256, "V4iV8iIi", "ncV:256:", 
"avx")
diff --git a/clang/lib/Headers/avxintrin.h b/clang/lib/Headers/avxintrin.h
index f116d8bc3a94c7..d6192518ea24ba 100644
--- a/clang/lib/Headers/avxintrin.h
+++ b/clang/lib/Headers/avxintrin.h
@@ -1573,15 +1573,6 @@ _mm256_blendv_ps(__m256 __a, __m256 __b, __m256 __c)
   ((__m256d)__builtin_ia32_shufpd256((__v4df)(__m256d)(a), \
  (__v4df)(__m256d)(b), (int)(mask)))
 
-/* Compare */
-#define _CMP_EQ_OQ0x00 /* Equal (ordered, non-signaling)  */
-#define _CMP_LT_OS0x01 /* Less-than (ordered, signaling)  */
-#define _CMP_LE_OS0x02 /* Less-than-or-equal (ordered, signaling)  */
-#define _CMP_UNORD_Q  0x03 /* Unordered (non-signaling)  */
-#define _CMP_NEQ_UQ   0x04 /* Not-equal (unordered, non-signaling)  */
-#define _CMP_NLT_US   0x05 /* Not-less-than (unordered, signaling)  */
-#define _CMP_NLE_US   0x06 /* Not-less-than-or-equal (unordered, signaling)  */
-#define _CMP_ORD_Q0x07 /* Ordered (non-signaling)   */
 #define _CMP_EQ_UQ0x08 /* Equal (unordered, non-signaling)  */
 #define _CMP_NGE_US   0x09 /* Not-greater-than-or-equal (unordered, signaling) 
 */
 #define _CMP_NGT_US   0x0a /* Not-greater-than (unordered, signaling)  */
@@ -1607,126 +1598,6 @@ _mm256_blendv_ps(__m256 __a, __m256 __b, __m256 __c)
 #define _CMP_GT_OQ0x1e /* Greater-than (ordered, non-signaling)  */
 #define _CMP_TRUE_US  0x1f /* True (unordered, signaling)  */
 
-/// Compares each of the corresponding double-precision values of two
-///

[clang] [X86] Finely handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx (PR #84136)

2024-03-06 Thread Freddy Ye via cfe-commits

https://github.com/FreddyLeaf edited 
https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Finely handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx (PR #84136)

2024-03-06 Thread Freddy Ye via cfe-commits


@@ -2613,6 +2614,24 @@ void CGBuilderInserter::InsertHelper(
 // called function.
 void CodeGenFunction::checkTargetFeatures(const CallExpr *E,
   const FunctionDecl *TargetDecl) {
+  // SemaCheking cannot handle below x86 builtins because they have different
+  // parameter ranges with different TargetAttribute of caller.

FreddyLeaf wrote:

Semachecking doesn't have the information of the caller of the intrinsic. While 
here we require to know if caller has "avx" or not when the condition code is 
[8, 31].
Considering such case:
```
#include 

__attribute__((target("sse2")))
__m128 test1(__m128 a, __m128 b) {
  return _mm_cmp_ps(a, b, 14); // shouldn't report error.
}

__attribute__((target("avx")))
__m128 test2(__m128 a, __m128 b) {
  return _mm_cmp_ps(a, b, 14); // should report error.
}


https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Finely handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx (PR #84136)

2024-03-06 Thread Freddy Ye via cfe-commits

FreddyLeaf wrote:

> My understanding is that you change the required feature of 
> __builtin_ia32_cmp[p|s][s|d] from avx to sse for some condition codes. The 
> title is not accurate.

You are right. Updated.

https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Finely handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx (PR #84136)

2024-03-06 Thread Freddy Ye via cfe-commits


@@ -462,12 +462,12 @@ TARGET_BUILTIN(__builtin_ia32_blendvps256, 
"V8fV8fV8fV8f", "ncV:256:", "avx")
 TARGET_BUILTIN(__builtin_ia32_shufpd256, "V4dV4dV4dIi", "ncV:256:", "avx")
 TARGET_BUILTIN(__builtin_ia32_shufps256, "V8fV8fV8fIi", "ncV:256:", "avx")
 TARGET_BUILTIN(__builtin_ia32_dpps256, "V8fV8fV8fIc", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_cmppd, "V2dV2dV2dIc", "ncV:128:", "avx")
+TARGET_BUILTIN(__builtin_ia32_cmppd, "V2dV2dV2dIc", "ncV:128:", "avx|sse2")
 TARGET_BUILTIN(__builtin_ia32_cmppd256, "V4dV4dV4dIc", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_cmpps, "V4fV4fV4fIc", "ncV:128:", "avx")
+TARGET_BUILTIN(__builtin_ia32_cmpps, "V4fV4fV4fIc", "ncV:128:", "avx|sse")
 TARGET_BUILTIN(__builtin_ia32_cmpps256, "V8fV8fV8fIc", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_cmpsd, "V2dV2dV2dIc", "ncV:128:", "avx")
-TARGET_BUILTIN(__builtin_ia32_cmpss, "V4fV4fV4fIc", "ncV:128:", "avx")
+TARGET_BUILTIN(__builtin_ia32_cmpsd, "V2dV2dV2dIc", "ncV:128:", "avx|sse2")
+TARGET_BUILTIN(__builtin_ia32_cmpss, "V4fV4fV4fIc", "ncV:128:", "avx|sse")

FreddyLeaf wrote:

Good catch. No strong objections.

https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Finely handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx (PR #84136)

2024-03-06 Thread Freddy Ye via cfe-commits


@@ -462,12 +462,12 @@ TARGET_BUILTIN(__builtin_ia32_blendvps256, 
"V8fV8fV8fV8f", "ncV:256:", "avx")
 TARGET_BUILTIN(__builtin_ia32_shufpd256, "V4dV4dV4dIi", "ncV:256:", "avx")
 TARGET_BUILTIN(__builtin_ia32_shufps256, "V8fV8fV8fIi", "ncV:256:", "avx")
 TARGET_BUILTIN(__builtin_ia32_dpps256, "V8fV8fV8fIc", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_cmppd, "V2dV2dV2dIc", "ncV:128:", "avx")
+TARGET_BUILTIN(__builtin_ia32_cmppd, "V2dV2dV2dIc", "ncV:128:", "avx|sse2")
 TARGET_BUILTIN(__builtin_ia32_cmppd256, "V4dV4dV4dIc", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_cmpps, "V4fV4fV4fIc", "ncV:128:", "avx")
+TARGET_BUILTIN(__builtin_ia32_cmpps, "V4fV4fV4fIc", "ncV:128:", "avx|sse")
 TARGET_BUILTIN(__builtin_ia32_cmpps256, "V8fV8fV8fIc", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_cmpsd, "V2dV2dV2dIc", "ncV:128:", "avx")
-TARGET_BUILTIN(__builtin_ia32_cmpss, "V4fV4fV4fIc", "ncV:128:", "avx")
+TARGET_BUILTIN(__builtin_ia32_cmpsd, "V2dV2dV2dIc", "ncV:128:", "avx|sse2")
+TARGET_BUILTIN(__builtin_ia32_cmpss, "V4fV4fV4fIc", "ncV:128:", "avx|sse")

FreddyLeaf wrote:

Good catch. No strong objections.

https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Finely handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx (PR #84136)

2024-03-06 Thread Freddy Ye via cfe-commits

https://github.com/FreddyLeaf edited 
https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits