[clang] [clang] Split up `SemaDeclAttr.cpp` (PR #93966)

2024-05-31 Thread Simon Pilgrim via cfe-commits

RKSimon wrote:

Would it make sense to add a new header (SemaUtils.h? SemaTargetUtils.h?) to hold most of the exposed templated helpers, instead of putting them in Sema.h?
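
A rough sketch of the shape being suggested - every file and helper name below is hypothetical, purely to illustrate where such a header could sit relative to Sema.h:
```cpp
// SemaUtils.h (hypothetical): shared templated helpers used by the split-out
// per-target Sema*.cpp files, so they no longer need to be declared in Sema.h.
#ifndef LLVM_CLANG_SEMA_SEMAUTILS_H
#define LLVM_CLANG_SEMA_SEMAUTILS_H

namespace clang {

// Example shape of an exposed templated helper that could live here instead.
template <typename AttrTy, typename DeclTy>
bool hasAttrOfKind(const DeclTy *D) {
  return D && D->template hasAttr<AttrTy>();
}

} // namespace clang

#endif // LLVM_CLANG_SEMA_SEMAUTILS_H
```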

https://github.com/llvm/llvm-project/pull/93966


[clang] [X86] Add support for MS inp functions. (PR #93804)

2024-05-31 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon commented:

Maybe add this to ReleaseNotes?

https://github.com/llvm/llvm-project/pull/93804


[clang] [X86]Add support for _outp{|w|d} (PR #93774)

2024-05-31 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon commented:

Maybe add a line to ReleaseNotes? But otherwise SGTM

https://github.com/llvm/llvm-project/pull/93774


[clang] [llvm] [X86] Support EGPR for inline assembly. (PR #92338)

2024-05-30 Thread Simon Pilgrim via cfe-commits

RKSimon wrote:

@FreddyLeaf This is corrupting git checkouts on Windows - please can you revert?

https://github.com/llvm/llvm-project/pull/92338


[clang] [X86]Add support for _outp{|w|d} (PR #93774)

2024-05-30 Thread Simon Pilgrim via cfe-commits


@@ -63,6 +63,91 @@ unsigned __int64 test__emulu(unsigned int a, unsigned int b) 
{
 // CHECK: [[RES:%[0-9]+]] = mul nuw i64 [[Y]], [[X]]
 // CHECK: ret i64 [[RES]]
 
+//
+// CHECK-I386-LABEL: define dso_local noundef i32 @test_outp(
+// CHECK-I386-SAME: i16 noundef zeroext [[PORT:%.*]], i32 noundef returned 
[[DATA:%.*]]) local_unnamed_addr #[[ATTR2:[0-9]+]] {
+// CHECK-I386-NEXT:  [[ENTRY:.*:]]
+// CHECK-I386-NEXT:tail call void asm sideeffect "outb ${0:b}, ${1:w}", 
"{ax},N{dx},~{dirflag},~{fpsr},~{flags}"(i32 [[DATA]], i16 [[PORT]]) 
#[[ATTR3:[0-9]+]], !srcloc [[META4:![0-9]+]]
+// CHECK-I386-NEXT:ret i32 [[DATA]]
+//
+// CHECK-X64-LABEL: define dso_local noundef i32 @test_outp(
+// CHECK-X64-SAME: i16 noundef [[PORT:%.*]], i32 noundef returned 
[[DATA:%.*]]) local_unnamed_addr #[[ATTR1:[0-9]+]] {
+// CHECK-X64-NEXT:  [[ENTRY:.*:]]
+// CHECK-X64-NEXT:tail call void asm sideeffect "outb ${0:b}, ${1:w}", 
"{ax},N{dx},~{dirflag},~{fpsr},~{flags}"(i32 [[DATA]], i16 [[PORT]]) 
#[[ATTR5:[0-9]+]], !srcloc [[META3:![0-9]+]]
+// CHECK-X64-NEXT:ret i32 [[DATA]]
+//

RKSimon wrote:

(pedantic) all other checks in this file are after the test

https://github.com/llvm/llvm-project/pull/93774


[clang] f3fb7f5 - [X86] x86-atomic-float.c - cleanup unused check prefixes

2024-05-29 Thread Simon Pilgrim via cfe-commits

Author: Simon Pilgrim
Date: 2024-05-29T10:38:02+01:00
New Revision: f3fb7f569936db418feef98e4ae68777a9a4cd2a

URL: 
https://github.com/llvm/llvm-project/commit/f3fb7f569936db418feef98e4ae68777a9a4cd2a
DIFF: 
https://github.com/llvm/llvm-project/commit/f3fb7f569936db418feef98e4ae68777a9a4cd2a.diff

LOG: [X86] x86-atomic-float.c - cleanup unused check prefixes

Added: 


Modified: 
clang/test/CodeGen/X86/x86-atomic-float.c

Removed: 




diff  --git a/clang/test/CodeGen/X86/x86-atomic-float.c 
b/clang/test/CodeGen/X86/x86-atomic-float.c
index 2d3c72d2a0299..6ee441c2dd7a8 100644
--- a/clang/test/CodeGen/X86/x86-atomic-float.c
+++ b/clang/test/CodeGen/X86/x86-atomic-float.c
@@ -1,11 +1,11 @@
-// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 4
-// RUN: %clang_cc1 -triple x86_64-linux-gnu -target-cpu core2 %s -emit-llvm -o 
- | FileCheck -check-prefixes=CHECK,CHECK64 %s
-// RUN: %clang_cc1 -triple i686-linux-gnu -target-cpu core2 %s -emit-llvm -o - 
| FileCheck -check-prefixes=CHECK,CHECK32 %s
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 5
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -target-cpu core2 %s -emit-llvm -o 
- | FileCheck %s
+// RUN: %clang_cc1 -triple i686-linux-gnu -target-cpu core2 %s -emit-llvm -o - 
| FileCheck %s
 
 
 // CHECK-LABEL: define dso_local i32 @test_int_inc(
 // CHECK-SAME: ) #[[ATTR0:[0-9]+]] {
-// CHECK-NEXT:  entry:
+// CHECK-NEXT:  [[ENTRY:.*:]]
 // CHECK-NEXT:[[TMP0:%.*]] = atomicrmw add ptr @test_int_inc.n, i32 1 
seq_cst, align 4
 // CHECK-NEXT:ret i32 [[TMP0]]
 //
@@ -17,7 +17,7 @@ int test_int_inc()
 
 // CHECK-LABEL: define dso_local float @test_float_post_inc(
 // CHECK-SAME: ) #[[ATTR0]] {
-// CHECK-NEXT:  entry:
+// CHECK-NEXT:  [[ENTRY:.*:]]
 // CHECK-NEXT:[[TMP0:%.*]] = atomicrmw fadd ptr @test_float_post_inc.n, 
float 1.00e+00 seq_cst, align 4
 // CHECK-NEXT:ret float [[TMP0]]
 //
@@ -29,7 +29,7 @@ float test_float_post_inc()
 
 // CHECK-LABEL: define dso_local float @test_float_post_dc(
 // CHECK-SAME: ) #[[ATTR0]] {
-// CHECK-NEXT:  entry:
+// CHECK-NEXT:  [[ENTRY:.*:]]
 // CHECK-NEXT:[[TMP0:%.*]] = atomicrmw fsub ptr @test_float_post_dc.n, 
float 1.00e+00 seq_cst, align 4
 // CHECK-NEXT:ret float [[TMP0]]
 //
@@ -41,7 +41,7 @@ float test_float_post_dc()
 
 // CHECK-LABEL: define dso_local float @test_float_pre_dc(
 // CHECK-SAME: ) #[[ATTR0]] {
-// CHECK-NEXT:  entry:
+// CHECK-NEXT:  [[ENTRY:.*:]]
 // CHECK-NEXT:[[TMP0:%.*]] = atomicrmw fsub ptr @test_float_pre_dc.n, 
float 1.00e+00 seq_cst, align 4
 // CHECK-NEXT:[[TMP1:%.*]] = fsub float [[TMP0]], 1.00e+00
 // CHECK-NEXT:ret float [[TMP1]]
@@ -54,7 +54,7 @@ float test_float_pre_dc()
 
 // CHECK-LABEL: define dso_local float @test_float_pre_inc(
 // CHECK-SAME: ) #[[ATTR0]] {
-// CHECK-NEXT:  entry:
+// CHECK-NEXT:  [[ENTRY:.*:]]
 // CHECK-NEXT:[[TMP0:%.*]] = atomicrmw fadd ptr @test_float_pre_inc.n, 
float 1.00e+00 seq_cst, align 4
 // CHECK-NEXT:[[TMP1:%.*]] = fadd float [[TMP0]], 1.00e+00
 // CHECK-NEXT:ret float [[TMP1]]
@@ -64,6 +64,3 @@ float test_float_pre_inc()
 static _Atomic float n;
 return ++n;
 }
- NOTE: These prefixes are unused and the list is autogenerated. Do not add 
tests below this line:
-// CHECK32: {{.*}}
-// CHECK64: {{.*}}





[clang] 9c42ed1 - [X86] Add x86-atomic-double.c double test coverage

2024-05-29 Thread Simon Pilgrim via cfe-commits

Author: Simon Pilgrim
Date: 2024-05-29T10:38:03+01:00
New Revision: 9c42ed1371ee8c211aedcfe8aed16662a9befb69

URL: 
https://github.com/llvm/llvm-project/commit/9c42ed1371ee8c211aedcfe8aed16662a9befb69
DIFF: 
https://github.com/llvm/llvm-project/commit/9c42ed1371ee8c211aedcfe8aed16662a9befb69.diff

LOG: [X86] Add x86-atomic-double.c double test coverage

Added: 
clang/test/CodeGen/X86/x86-atomic-double.c

Modified: 


Removed: 




diff  --git a/clang/test/CodeGen/X86/x86-atomic-double.c 
b/clang/test/CodeGen/X86/x86-atomic-double.c
new file mode 100644
index 0..2354c89cc2b17
--- /dev/null
+++ b/clang/test/CodeGen/X86/x86-atomic-double.c
@@ -0,0 +1,104 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 4
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -target-cpu core2 %s -emit-llvm -o 
- | FileCheck -check-prefixes=X64 %s
+// RUN: %clang_cc1 -triple i686-linux-gnu -target-cpu core2 %s -emit-llvm -o - 
| FileCheck -check-prefixes=X86 %s
+
+
+// X64-LABEL: define dso_local double @test_double_post_inc(
+// X64-SAME: ) #[[ATTR0:[0-9]+]] {
+// X64-NEXT:  entry:
+// X64-NEXT:[[RETVAL:%.*]] = alloca double, align 8
+// X64-NEXT:[[TMP0:%.*]] = atomicrmw fadd ptr @test_double_post_inc.n, 
float 1.00e+00 seq_cst, align 8
+// X64-NEXT:store float [[TMP0]], ptr [[RETVAL]], align 8
+// X64-NEXT:[[TMP1:%.*]] = load double, ptr [[RETVAL]], align 8
+// X64-NEXT:ret double [[TMP1]]
+//
+// X86-LABEL: define dso_local double @test_double_post_inc(
+// X86-SAME: ) #[[ATTR0:[0-9]+]] {
+// X86-NEXT:  entry:
+// X86-NEXT:[[RETVAL:%.*]] = alloca double, align 4
+// X86-NEXT:[[TMP0:%.*]] = atomicrmw fadd ptr @test_double_post_inc.n, 
float 1.00e+00 seq_cst, align 8
+// X86-NEXT:store float [[TMP0]], ptr [[RETVAL]], align 4
+// X86-NEXT:[[TMP1:%.*]] = load double, ptr [[RETVAL]], align 4
+// X86-NEXT:ret double [[TMP1]]
+//
+double test_double_post_inc()
+{
+static _Atomic double n;
+return n++;
+}
+
+// X64-LABEL: define dso_local double @test_double_post_dc(
+// X64-SAME: ) #[[ATTR0]] {
+// X64-NEXT:  entry:
+// X64-NEXT:[[RETVAL:%.*]] = alloca double, align 8
+// X64-NEXT:[[TMP0:%.*]] = atomicrmw fsub ptr @test_double_post_dc.n, 
float 1.00e+00 seq_cst, align 8
+// X64-NEXT:store float [[TMP0]], ptr [[RETVAL]], align 8
+// X64-NEXT:[[TMP1:%.*]] = load double, ptr [[RETVAL]], align 8
+// X64-NEXT:ret double [[TMP1]]
+//
+// X86-LABEL: define dso_local double @test_double_post_dc(
+// X86-SAME: ) #[[ATTR0]] {
+// X86-NEXT:  entry:
+// X86-NEXT:[[RETVAL:%.*]] = alloca double, align 4
+// X86-NEXT:[[TMP0:%.*]] = atomicrmw fsub ptr @test_double_post_dc.n, 
float 1.00e+00 seq_cst, align 8
+// X86-NEXT:store float [[TMP0]], ptr [[RETVAL]], align 4
+// X86-NEXT:[[TMP1:%.*]] = load double, ptr [[RETVAL]], align 4
+// X86-NEXT:ret double [[TMP1]]
+//
+double test_double_post_dc()
+{
+static _Atomic double n;
+return n--;
+}
+
+// X64-LABEL: define dso_local double @test_double_pre_dc(
+// X64-SAME: ) #[[ATTR0]] {
+// X64-NEXT:  entry:
+// X64-NEXT:[[RETVAL:%.*]] = alloca double, align 8
+// X64-NEXT:[[TMP0:%.*]] = atomicrmw fsub ptr @test_double_pre_dc.n, float 
1.00e+00 seq_cst, align 8
+// X64-NEXT:[[TMP1:%.*]] = fsub float [[TMP0]], 1.00e+00
+// X64-NEXT:store float [[TMP1]], ptr [[RETVAL]], align 8
+// X64-NEXT:[[TMP2:%.*]] = load double, ptr [[RETVAL]], align 8
+// X64-NEXT:ret double [[TMP2]]
+//
+// X86-LABEL: define dso_local double @test_double_pre_dc(
+// X86-SAME: ) #[[ATTR0]] {
+// X86-NEXT:  entry:
+// X86-NEXT:[[RETVAL:%.*]] = alloca double, align 4
+// X86-NEXT:[[TMP0:%.*]] = atomicrmw fsub ptr @test_double_pre_dc.n, float 
1.00e+00 seq_cst, align 8
+// X86-NEXT:[[TMP1:%.*]] = fsub float [[TMP0]], 1.00e+00
+// X86-NEXT:store float [[TMP1]], ptr [[RETVAL]], align 4
+// X86-NEXT:[[TMP2:%.*]] = load double, ptr [[RETVAL]], align 4
+// X86-NEXT:ret double [[TMP2]]
+//
+double test_double_pre_dc()
+{
+static _Atomic double n;
+return --n;
+}
+
+// X64-LABEL: define dso_local double @test_double_pre_inc(
+// X64-SAME: ) #[[ATTR0]] {
+// X64-NEXT:  entry:
+// X64-NEXT:[[RETVAL:%.*]] = alloca double, align 8
+// X64-NEXT:[[TMP0:%.*]] = atomicrmw fadd ptr @test_double_pre_inc.n, 
float 1.00e+00 seq_cst, align 8
+// X64-NEXT:[[TMP1:%.*]] = fadd float [[TMP0]], 1.00e+00
+// X64-NEXT:store float [[TMP1]], ptr [[RETVAL]], align 8
+// X64-NEXT:[[TMP2:%.*]] = load double, ptr [[RETVAL]], align 8
+// X64-NEXT:ret double [[TMP2]]
+//
+// X86-LABEL: define dso_local double @test_double_pre_inc(
+// X86-SAME: ) #[[ATTR0]] {
+// X86-NEXT:  entry:
+// X86-NEXT:[[RETVAL:%.*]] = alloca double, align 4
+// X86-NEXT:[[TMP0:%.*]] = atomicrmw fadd ptr @test_double_pre_inc.n, 

[clang] 4bb6974 - [X86] x86-atomic-long_double.c - cleanup check prefixes

2024-05-29 Thread Simon Pilgrim via cfe-commits

Author: Simon Pilgrim
Date: 2024-05-29T10:38:03+01:00
New Revision: 4bb6974a87e495f19faea4b13475a65e842473f0

URL: 
https://github.com/llvm/llvm-project/commit/4bb6974a87e495f19faea4b13475a65e842473f0
DIFF: 
https://github.com/llvm/llvm-project/commit/4bb6974a87e495f19faea4b13475a65e842473f0.diff

LOG: [X86] x86-atomic-long_double.c - cleanup check prefixes

Added: 


Modified: 
clang/test/CodeGen/X86/x86-atomic-long_double.c

Removed: 




diff  --git a/clang/test/CodeGen/X86/x86-atomic-long_double.c 
b/clang/test/CodeGen/X86/x86-atomic-long_double.c
index 74a22d5db151e..2c3f381f13511 100644
--- a/clang/test/CodeGen/X86/x86-atomic-long_double.c
+++ b/clang/test/CodeGen/X86/x86-atomic-long_double.c
@@ -1,170 +1,171 @@
-// RUN: %clang_cc1 -triple x86_64-linux-gnu -target-cpu core2 %s -emit-llvm -o 
- | FileCheck %s
-// RUN: %clang_cc1 -triple i686-linux-gnu -target-cpu core2 %s -emit-llvm -o - 
| FileCheck -check-prefix=CHECK32 %s
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 5
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -target-cpu core2 %s -emit-llvm -o 
- | FileCheck --check-prefixes=X64 %s
+// RUN: %clang_cc1 -triple i686-linux-gnu -target-cpu core2 %s -emit-llvm -o - 
| FileCheck --check-prefixes=X86 %s
 
-// CHECK-LABEL: define dso_local x86_fp80 @testinc(
-// CHECK-SAME: ptr noundef [[ADDR:%.*]]) #[[ATTR0:[0-9]+]] {
-// CHECK-NEXT:  entry:
-// CHECK-NEXT:[[RETVAL:%.*]] = alloca x86_fp80, align 16
-// CHECK-NEXT:[[ADDR_ADDR:%.*]] = alloca ptr, align 8
-// CHECK-NEXT:store ptr [[ADDR]], ptr [[ADDR_ADDR]], align 8
-// CHECK-NEXT:[[TMP0:%.*]] = load ptr, ptr [[ADDR_ADDR]], align 8
-// CHECK-NEXT:[[TMP1:%.*]] = atomicrmw fadd ptr [[TMP0]], float 
1.00e+00 seq_cst, align 16
-// CHECK-NEXT:[[TMP2:%.*]] = fadd float [[TMP1]], 1.00e+00
-// CHECK-NEXT:store float [[TMP2]], ptr [[RETVAL]], align 16
-// CHECK-NEXT:[[TMP3:%.*]] = load x86_fp80, ptr [[RETVAL]], align 16
-// CHECK-NEXT:ret x86_fp80 [[TMP3]]
+// X64-LABEL: define dso_local x86_fp80 @testinc(
+// X64-SAME: ptr noundef [[ADDR:%.*]]) #[[ATTR0:[0-9]+]] {
+// X64-NEXT:  [[ENTRY:.*:]]
+// X64-NEXT:[[RETVAL:%.*]] = alloca x86_fp80, align 16
+// X64-NEXT:[[ADDR_ADDR:%.*]] = alloca ptr, align 8
+// X64-NEXT:store ptr [[ADDR]], ptr [[ADDR_ADDR]], align 8
+// X64-NEXT:[[TMP0:%.*]] = load ptr, ptr [[ADDR_ADDR]], align 8
+// X64-NEXT:[[TMP1:%.*]] = atomicrmw fadd ptr [[TMP0]], float 1.00e+00 
seq_cst, align 16
+// X64-NEXT:[[TMP2:%.*]] = fadd float [[TMP1]], 1.00e+00
+// X64-NEXT:store float [[TMP2]], ptr [[RETVAL]], align 16
+// X64-NEXT:[[TMP3:%.*]] = load x86_fp80, ptr [[RETVAL]], align 16
+// X64-NEXT:ret x86_fp80 [[TMP3]]
 //
-// CHECK32-LABEL: define dso_local x86_fp80 @testinc(
-// CHECK32-SAME: ptr noundef [[ADDR:%.*]]) #[[ATTR0:[0-9]+]] {
-// CHECK32-NEXT:  entry:
-// CHECK32-NEXT:[[RETVAL:%.*]] = alloca x86_fp80, align 4
-// CHECK32-NEXT:[[ADDR_ADDR:%.*]] = alloca ptr, align 4
-// CHECK32-NEXT:store ptr [[ADDR]], ptr [[ADDR_ADDR]], align 4
-// CHECK32-NEXT:[[TMP0:%.*]] = load ptr, ptr [[ADDR_ADDR]], align 4
-// CHECK32-NEXT:[[TMP1:%.*]] = atomicrmw fadd ptr [[TMP0]], float 
1.00e+00 seq_cst, align 4
-// CHECK32-NEXT:[[TMP2:%.*]] = fadd float [[TMP1]], 1.00e+00
-// CHECK32-NEXT:store float [[TMP2]], ptr [[RETVAL]], align 4
-// CHECK32-NEXT:[[TMP3:%.*]] = load x86_fp80, ptr [[RETVAL]], align 4
-// CHECK32-NEXT:ret x86_fp80 [[TMP3]]
+// X86-LABEL: define dso_local x86_fp80 @testinc(
+// X86-SAME: ptr noundef [[ADDR:%.*]]) #[[ATTR0:[0-9]+]] {
+// X86-NEXT:  [[ENTRY:.*:]]
+// X86-NEXT:[[RETVAL:%.*]] = alloca x86_fp80, align 4
+// X86-NEXT:[[ADDR_ADDR:%.*]] = alloca ptr, align 4
+// X86-NEXT:store ptr [[ADDR]], ptr [[ADDR_ADDR]], align 4
+// X86-NEXT:[[TMP0:%.*]] = load ptr, ptr [[ADDR_ADDR]], align 4
+// X86-NEXT:[[TMP1:%.*]] = atomicrmw fadd ptr [[TMP0]], float 1.00e+00 
seq_cst, align 4
+// X86-NEXT:[[TMP2:%.*]] = fadd float [[TMP1]], 1.00e+00
+// X86-NEXT:store float [[TMP2]], ptr [[RETVAL]], align 4
+// X86-NEXT:[[TMP3:%.*]] = load x86_fp80, ptr [[RETVAL]], align 4
+// X86-NEXT:ret x86_fp80 [[TMP3]]
 //
 long double testinc(_Atomic long double *addr) {
 
   return ++*addr;
 }
 
-// CHECK-LABEL: define dso_local x86_fp80 @testdec(
-// CHECK-SAME: ptr noundef [[ADDR:%.*]]) #[[ATTR0]] {
-// CHECK-NEXT:  entry:
-// CHECK-NEXT:[[RETVAL:%.*]] = alloca x86_fp80, align 16
-// CHECK-NEXT:[[ADDR_ADDR:%.*]] = alloca ptr, align 8
-// CHECK-NEXT:store ptr [[ADDR]], ptr [[ADDR_ADDR]], align 8
-// CHECK-NEXT:[[TMP0:%.*]] = load ptr, ptr [[ADDR_ADDR]], align 8
-// CHECK-NEXT:[[TMP1:%.*]] = atomicrmw fsub ptr [[TMP0]], float 
1.00e+00 seq_cst, align 16
-// CHECK-NEXT:store float [[TMP1]], ptr [[RETVAL]], align 16
-// 

[clang] [llvm] Reland "[X86] Remove knl/knm specific ISAs supports (#92883)" (PR #93136)

2024-05-23 Thread Simon Pilgrim via cfe-commits

RKSimon wrote:

Not sure - CI checks aren't running either

https://github.com/llvm/llvm-project/pull/93136


[clang] [clang] Introduce `SemaX86` (PR #93098)

2024-05-23 Thread Simon Pilgrim via cfe-commits

RKSimon wrote:

@endilll Thanks for working on this - out of interest are you intending to do 
this for CGBuiltin as well? Is the plan to no longer have to include all target 
builtins in all clang builds?

https://github.com/llvm/llvm-project/pull/93098


[clang] [llvm] [polly] [X86] Remove knl/knm specific ISAs supports (PR #92883)

2024-05-22 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon approved this pull request.

LGTM with one minor

https://github.com/llvm/llvm-project/pull/92883


[clang] [llvm] [polly] [X86] Remove knl/knm specific ISAs supports (PR #92883)

2024-05-22 Thread Simon Pilgrim via cfe-commits


@@ -139,6 +139,9 @@ Changes to the Windows Target
 Changes to the X86 Backend
 --
 
+- Removed knl/knm specific ISA lowerings: AVX512PF, AVX512ER, PREFETCHWT1,

RKSimon wrote:

"ISA intrinsics" instead?

https://github.com/llvm/llvm-project/pull/92883


[clang] [llvm] [polly] [X86] Remove knl/knm specific ISAs supports (PR #92883)

2024-05-22 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon edited 
https://github.com/llvm/llvm-project/pull/92883


[clang] [Clang] Add __builtin_selectvector and use it for AVX512 intrinsics (PR #91306)

2024-05-16 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon commented:

constexpr handling?

https://github.com/llvm/llvm-project/pull/91306


[clang] [Clang] Add __builtin_selectvector and use it for AVX512 intrinsics (PR #91306)

2024-05-16 Thread Simon Pilgrim via cfe-commits


@@ -232,225 +232,225 @@ typedef char __v2qi __attribute__((__vector_size__(2)));
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_mask_add_epi32(__m256i __W, __mmask8 __U, __m256i __A, __m256i __B)
 {
-  return (__m256i)__builtin_ia32_selectd_256((__mmask8)__U,
- (__v8si)_mm256_add_epi32(__A, 
__B),
- (__v8si)__W);
+  return (__m256i)__builtin_selectvector((__v8si)_mm256_add_epi32(__A, __B),
+ (__v8si)__W,
+ __builtin_bit_cast(__vecmask8, __U));
 }
 
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_maskz_add_epi32(__mmask8 __U, __m256i __A, __m256i __B)
 {
-  return (__m256i)__builtin_ia32_selectd_256((__mmask8)__U,
- (__v8si)_mm256_add_epi32(__A, 
__B),
- (__v8si)_mm256_setzero_si256());
+  return (__m256i)__builtin_selectvector((__v8si)_mm256_add_epi32(__A, __B),
+ (__v8si)_mm256_setzero_si256(),
+ __builtin_bit_cast(__vecmask8, __U));
 }
 
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_mask_add_epi64(__m256i __W, __mmask8 __U, __m256i __A, __m256i __B)
 {
-  return (__m256i)__builtin_ia32_selectq_256((__mmask8)__U,
- (__v4di)_mm256_add_epi64(__A, 
__B),
- (__v4di)__W);
+  return (__m256i)__builtin_selectvector((__v4di)_mm256_add_epi64(__A, __B),
+ (__v4di)__W,
+ __builtin_bit_cast(__vecmask4, __U));
 }
 
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_maskz_add_epi64(__mmask8 __U, __m256i __A, __m256i __B)
 {
-  return (__m256i)__builtin_ia32_selectq_256((__mmask8)__U,
- (__v4di)_mm256_add_epi64(__A, 
__B),
- (__v4di)_mm256_setzero_si256());
+  return (__m256i)__builtin_selectvector((__v4di)_mm256_add_epi64(__A, __B),
+ (__v4di)_mm256_setzero_si256(),
+ __builtin_bit_cast(__vecmask4, __U));
 }
 
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_mask_sub_epi32(__m256i __W, __mmask8 __U, __m256i __A, __m256i __B)
 {
-  return (__m256i)__builtin_ia32_selectd_256((__mmask8)__U,
- (__v8si)_mm256_sub_epi32(__A, 
__B),
- (__v8si)__W);
+  return (__m256i)__builtin_selectvector((__v8si)_mm256_sub_epi32(__A, __B),
+ (__v8si)__W,
+ __builtin_bit_cast(__vecmask8, __U));
 }
 
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_maskz_sub_epi32(__mmask8 __U, __m256i __A, __m256i __B)
 {
-  return (__m256i)__builtin_ia32_selectd_256((__mmask8)__U,
- (__v8si)_mm256_sub_epi32(__A, 
__B),
- (__v8si)_mm256_setzero_si256());
+  return (__m256i)__builtin_selectvector((__v8si)_mm256_sub_epi32(__A, __B),
+ (__v8si)_mm256_setzero_si256(),
+ __builtin_bit_cast(__vecmask8, __U));
 }
 
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_mask_sub_epi64(__m256i __W, __mmask8 __U, __m256i __A, __m256i __B)
 {
-  return (__m256i)__builtin_ia32_selectq_256((__mmask8)__U,
- (__v4di)_mm256_sub_epi64(__A, 
__B),
- (__v4di)__W);
+  return (__m256i)__builtin_selectvector((__v4di)_mm256_sub_epi64(__A, __B),
+ (__v4di)__W,
+ __builtin_bit_cast(__vecmask4, __U));
 }
 
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_maskz_sub_epi64(__mmask8 __U, __m256i __A, __m256i __B)
 {
-  return (__m256i)__builtin_ia32_selectq_256((__mmask8)__U,
- (__v4di)_mm256_sub_epi64(__A, 
__B),
- (__v4di)_mm256_setzero_si256());
+  return (__m256i)__builtin_selectvector((__v4di)_mm256_sub_epi64(__A, __B),
+ (__v4di)_mm256_setzero_si256(),
+ __builtin_bit_cast(__vecmask4, __U));
 }
 
 static __inline__ __m128i __DEFAULT_FN_ATTRS128
 _mm_mask_add_epi32(__m128i __W, __mmask8 __U, __m128i __A, __m128i __B)
 {
-  return (__m128i)__builtin_ia32_selectd_128((__mmask8)__U,
- (__v4si)_mm_add_epi32(__A, __B),
- (__v4si)__W);
+  return (__m128i)__builtin_selectvector((__v4si)_mm_add_epi32(__A, 

[clang] [Clang] Add __builtin_selectvector and use it for AVX512 intrinsics (PR #91306)

2024-05-16 Thread Simon Pilgrim via cfe-commits


@@ -3019,6 +3019,26 @@ C-style cast applied to each element of the first 
argument.
 
 Query for this feature with ``__has_builtin(__builtin_convertvector)``.
 
+``__builtin_selectvector``
+--
+
+``__builtin_selectvector`` is used to express generic vector element selection.

RKSimon wrote:

Extend this description to explicitly describe the input/output types and mechanism - don't just rely on the code snippet (although that's a nice accompaniment): the inputs must all be vectors with the same number of elements, the first two operands must be of the same type, etc. (basically everything enforced in SemaChecking).
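
To make that concrete, here is a rough sketch of the constraints being asked for, inferred only from how the builtin is used in this PR's avx512 header changes; the mask typedef is an assumed stand-in for the PR's __vecmask8, every name is illustrative, and the code only compiles with this PR applied (the authoritative rules are whatever SemaChecking enforces):
```cpp
// Hypothetical demo types, not the PR's definitions.
typedef int  v8si_demo   __attribute__((__vector_size__(32)));
typedef bool vmask8_demo __attribute__((__ext_vector_type__(8)));

// Constraints the documentation should spell out:
//  * the first two operands are vectors of the same type,
//  * the mask is a boolean vector with the same number of elements,
//  * element i of the result is lhs[i] when mask[i] is set, rhs[i] otherwise.
static inline v8si_demo select_demo(v8si_demo lhs, v8si_demo rhs,
                                    unsigned char mask_bits) {
  return __builtin_selectvector(lhs, rhs,
                                __builtin_bit_cast(vmask8_demo, mask_bits));
}
```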

https://github.com/llvm/llvm-project/pull/91306


[clang] Avoid unevaluated implicit private (PR #92055)

2024-05-14 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon updated 
https://github.com/llvm/llvm-project/pull/92055

>From 6946c9f1285d5a27eafcdbf13f79c0641736198d Mon Sep 17 00:00:00 2001
From: Sunil Kuravinakop 
Date: Thu, 9 May 2024 12:09:15 -0500
Subject: [PATCH 1/3] Avoiding DeclRefExpr with "non_odr_use_unevaluated" to
 declare "Implicit Private variable" DeclRefExpr.

  Changes to be committed:
modified:   clang/lib/Sema/SemaOpenMP.cpp
---
 clang/lib/Sema/SemaOpenMP.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/clang/lib/Sema/SemaOpenMP.cpp b/clang/lib/Sema/SemaOpenMP.cpp
index cf5447f223d45..bb6518099b4df 100644
--- a/clang/lib/Sema/SemaOpenMP.cpp
+++ b/clang/lib/Sema/SemaOpenMP.cpp
@@ -3757,7 +3757,8 @@ class DSAAttrChecker final : public 
StmtVisitor {
   void VisitDeclRefExpr(DeclRefExpr *E) {
 if (TryCaptureCXXThisMembers || E->isTypeDependent() ||
 E->isValueDependent() || E->containsUnexpandedParameterPack() ||
-E->isInstantiationDependent())
+E->isInstantiationDependent() ||
+E->isNonOdrUse() == clang::NOUR_Unevaluated)
   return;
 if (auto *VD = dyn_cast(E->getDecl())) {
   // Check the datasharing rules for the expressions in the clauses.

>From 862907f4a6d7cebfb1b816e9ec890c39d0da112e Mon Sep 17 00:00:00 2001
From: Sunil Kuravinakop 
Date: Mon, 13 May 2024 01:28:59 -0500
Subject: [PATCH 2/3] Adding checks for proper declaration of DeclRefExpr under
 the task directive (when variable can be non_odr_use_unevaluated).

  Changes to be committed:
modified:   clang/test/OpenMP/task_ast_print.cpp
---
 clang/test/OpenMP/task_ast_print.cpp | 34 
 1 file changed, 34 insertions(+)

diff --git a/clang/test/OpenMP/task_ast_print.cpp 
b/clang/test/OpenMP/task_ast_print.cpp
index 12923e6ab4244..9d545c5f6716c 100644
--- a/clang/test/OpenMP/task_ast_print.cpp
+++ b/clang/test/OpenMP/task_ast_print.cpp
@@ -5,6 +5,7 @@
 // RUN: %clang_cc1 -verify -Wno-vla -fopenmp-simd -ast-print %s | FileCheck %s
 // RUN: %clang_cc1 -fopenmp-simd -x c++ -std=c++11 -emit-pch -o %t %s
 // RUN: %clang_cc1 -fopenmp-simd -std=c++11 -include-pch %t -verify -Wno-vla 
%s -ast-print | FileCheck %s
+// RUN: %clang_cc1 -triple x86_64-pc-linux-gnu -fopenmp -ast-dump  %s | 
FileCheck %s --check-prefix=DUMP
 // expected-no-diagnostics
 
 #ifndef HEADER
@@ -208,4 +209,37 @@ int main(int argc, char **argv) {
 extern template int S::TS;
 extern template long S::TS;
 
+int
+implicit_firstprivate() {
+
+#pragma omp parallel num_threads(1)
+  {
+int i = 0;
+// DUMP : OMPTaskDirective
+// DUMP-NEXT : OMPFirstprivateClause
+// DUMP-NEXT : DeclRefExpr {{.+}} 'i' {{.+}} 
refers_to_enclosing_variable_or_capture
+#pragma omp task
+{
+   int j = sizeof(i);
+   j = i;
+}
+  }
+}
+
+int
+no_implicit_firstprivate() {
+
+#pragma omp parallel num_threads(1)
+  {
+int i = 0;
+// DUMP : OMPTaskDirective
+// DUMP-NEXT : CapturedStmt
+// DUMP : DeclRefExpr {{.+}} 'i' {{.+}} non_odr_use_unevaluated 
refers_to_enclosing_variable_or_capture
+#pragma omp task
+{
+   int j = sizeof(i);
+}
+  }
+}
+
 #endif

>From 96c997ad09d87e48af7929b527259c5037242e10 Mon Sep 17 00:00:00 2001
From: Sunil Kuravinakop 
Date: Tue, 14 May 2024 04:47:00 -0500
Subject: [PATCH 3/3] Minor changes to the test case.   Changes to be
 committed: modified:   clang/test/OpenMP/task_ast_print.cpp

---
 clang/test/OpenMP/task_ast_print.cpp | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/clang/test/OpenMP/task_ast_print.cpp 
b/clang/test/OpenMP/task_ast_print.cpp
index 9d545c5f6716c..cb2cc63f63214 100644
--- a/clang/test/OpenMP/task_ast_print.cpp
+++ b/clang/test/OpenMP/task_ast_print.cpp
@@ -209,15 +209,16 @@ int main(int argc, char **argv) {
 extern template int S::TS;
 extern template long S::TS;
 
-int
+// DUMP-LABEL:  FunctionDecl {{.*}} implicit_firstprivate
+void
 implicit_firstprivate() {
 
 #pragma omp parallel num_threads(1)
   {
 int i = 0;
-// DUMP : OMPTaskDirective
-// DUMP-NEXT : OMPFirstprivateClause
-// DUMP-NEXT : DeclRefExpr {{.+}} 'i' {{.+}} 
refers_to_enclosing_variable_or_capture
+// DUMP: OMPTaskDirective 
+// DUMP-NEXT: OMPFirstprivateClause
+// DUMP-NEXT: DeclRefExpr {{.+}} 'i' {{.+}} 
refers_to_enclosing_variable_or_capture
 #pragma omp task
 {
int j = sizeof(i);
@@ -226,15 +227,16 @@ implicit_firstprivate() {
   }
 }
 
-int
+// DUMP-LABEL:  FunctionDecl {{.*}} no_implicit_firstprivate
+void
 no_implicit_firstprivate() {
 
 #pragma omp parallel num_threads(1)
   {
 int i = 0;
-// DUMP : OMPTaskDirective
-// DUMP-NEXT : CapturedStmt
-// DUMP : DeclRefExpr {{.+}} 'i' {{.+}} non_odr_use_unevaluated 
refers_to_enclosing_variable_or_capture
+// DUMP: OMPTaskDirective
+// DUMP-NEXT: CapturedStmt
+// DUMP: DeclRefExpr {{.+}} 'i' {{.+}} non_odr_use_unevaluated 
refers_to_enclosing_variable_or_capture
 

[clang] [clang-tools-extra] [flang] [llvm] [mlir] [polly] [test]: fix filecheck annotation typos (PR #91854)

2024-05-13 Thread Simon Pilgrim via cfe-commits


@@ -285,7 +285,7 @@ define i32 @xunp(ptr %p) nounwind readnone {
 ; BMI264-NEXT:rorxl $7, (%rdi), %eax
 ; BMI264-NEXT:retq
 entry:
-; shld-label: xunp:
+; shld-LABEL: xunp:
 ; shld: shldl $25

RKSimon wrote:

remove these 2 checks (the auto checks above cover this)

https://github.com/llvm/llvm-project/pull/91854


[clang] [compiler-rt] [libc] [libclc] [libcxxabi] [lld] [lldb] [llvm] [mlir] Add clarifying parenthesis around non-trivial conditions in ternary expressions. (PR #90391)

2024-05-04 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon closed 
https://github.com/llvm/llvm-project/pull/90391


[clang] [compiler-rt] [libc] [libclc] [libcxxabi] [lld] [lldb] [llvm] [mlir] Add clarifying parenthesis around non-trivial conditions in ternary expressions. (PR #90391)

2024-05-03 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/90391


[clang] [Clang][CodeGen] Optimised LLVM IR for atomic increments/decrements on floats (PR #89362)

2024-05-02 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon closed 
https://github.com/llvm/llvm-project/pull/89362


[clang] [Clang][CodeGen] Optimised LLVM IR for atomic increments/decrements on floats (PR #89362)

2024-05-01 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/89362


[clang] [Clang][CodeGen] Optimised LLVM IR for atomic increments/decrements on floats (PR #89362)

2024-05-01 Thread Simon Pilgrim via cfe-commits


@@ -0,0 +1,69 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 4
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -target-cpu core2 %s -S -emit-llvm 
-o - | FileCheck -check-prefixes=CHECK,CHECK64 %s
+// RUN: %clang_cc1 -triple i686-linux-gnu -target-cpu core2 %s -S -emit-llvm 
-o - | FileCheck -check-prefixes=CHECK,CHECK32 %s
+
+
+// CHECK-LABEL: define dso_local i32 @test_int_inc(
+// CHECK-SAME: ) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[TMP0:%.*]] = atomicrmw add ptr @test_int_inc.n, i32 1 
seq_cst, align 4
+// CHECK-NEXT:ret i32 [[TMP0]]
+//
+int test_int_inc()
+{
+static _Atomic int n;
+return n++;
+}
+
+// CHECK-LABEL: define dso_local float @test_float_post_inc(
+// CHECK-SAME: ) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[TMP0:%.*]] = atomicrmw fadd ptr @test_float_post_inc.n, 
float 1.00e+00 seq_cst, align 4
+// CHECK-NEXT:ret float [[TMP0]]
+//
+float test_float_post_inc()
+{
+static _Atomic float n;
+return n++;
+}
+
+// CHECK-LABEL: define dso_local float @test_float_post_dc(
+// CHECK-SAME: ) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[TMP0:%.*]] = atomicrmw fsub ptr @test_float_post_dc.n, 
float -1.00e+00 seq_cst, align 4

RKSimon wrote:

`fsub x, -1.0`? Should it be `fsub x, 1.0`?

https://github.com/llvm/llvm-project/pull/89362


[clang] [Clang][CodeGen] Optimised LLVM IR for atomic increments/decrements on floats (PR #89362)

2024-05-01 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon edited 
https://github.com/llvm/llvm-project/pull/89362


[clang] [compiler-rt] [libc] [libclc] [libcxxabi] [lld] [lldb] [llvm] [mlir] llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp:3804: lacking () for c… (PR #90391)

2024-04-28 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon commented:

Please address the clang-format warnings the CI has reported

https://github.com/llvm/llvm-project/pull/90391


[clang] [compiler-rt] [libc] [libclc] [libcxxabi] [lld] [lldb] [llvm] [mlir] llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp:3804: lacking () for c… (PR #90391)

2024-04-28 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon commented:

Please update the PR subject as it's a lot more than just X86AsmParser.cpp

https://github.com/llvm/llvm-project/pull/90391


[clang] [clang]MveEmitter: Pass Args as a const reference (PR #89551)

2024-04-22 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon closed 
https://github.com/llvm/llvm-project/pull/89551


[clang] [clang]MveEmitter: Pass Args as a const reference (PR #89551)

2024-04-22 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/89551


[clang] [clang]MveEmitter:Pass Args as const references (PR #89202)

2024-04-21 Thread Simon Pilgrim via cfe-commits


@@ -660,7 +660,7 @@ class IRBuilderResult : public Result {
   std::map IntegerArgs;
   IRBuilderResult(StringRef CallPrefix, std::vector Args,

RKSimon wrote:

std::vector Args?

https://github.com/llvm/llvm-project/pull/89202


[clang] [clang]MveEmitter:Pass Args as const references (PR #89202)

2024-04-21 Thread Simon Pilgrim via cfe-commits


@@ -660,7 +660,7 @@ class IRBuilderResult : public Result {
   std::map IntegerArgs;
   IRBuilderResult(StringRef CallPrefix, std::vector Args,
   std::set AddressArgs,

RKSimon wrote:

std::set AddressArgs?

https://github.com/llvm/llvm-project/pull/89202


[clang] [clang]MveEmitter:Pass Args as const references (PR #89202)

2024-04-21 Thread Simon Pilgrim via cfe-commits


@@ -728,7 +728,7 @@ class IRIntrinsicResult : public Result {
   std::vector ParamTypes;
   std::vector Args;
   IRIntrinsicResult(StringRef IntrinsicID, std::vector 
ParamTypes,

RKSimon wrote:

std::vector ParamTypes?

https://github.com/llvm/llvm-project/pull/89202


[clang] cppcheck: use move semantics for 'NodeKinds' and update possible callers to use it (PR #87273)

2024-04-20 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon closed 
https://github.com/llvm/llvm-project/pull/87273


[clang] [Clang][CodeGen] Optimised LLVM IR for atomic increments/decrements on floats (PR #89362)

2024-04-19 Thread Simon Pilgrim via cfe-commits


@@ -0,0 +1,64 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 4
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -target-cpu core2 %s -S -emit-llvm 
-o - | FileCheck %s

RKSimon wrote:

Please can you add the i686 test coverage back? I mean that in most cases you 
should be able to share the check-prefixes

https://github.com/llvm/llvm-project/pull/89362


[clang] cppcheck: use move semantics for 'NodeKinds' and update possible callers to use it (PR #87273)

2024-04-19 Thread Simon Pilgrim via cfe-commits

RKSimon wrote:

@Amila-Rukshan please can you rebase this patch? The merge is currently failing.

https://github.com/llvm/llvm-project/pull/87273


[clang] cppcheck: use move semantics for 'NodeKinds' and update possible callers to use it (PR #87273)

2024-04-19 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/87273


[clang] [Clang][CodeGen] Optimised LLVM IR for atomic increments/decrements on floats (PR #89362)

2024-04-19 Thread Simon Pilgrim via cfe-commits


@@ -0,0 +1,97 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 4
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -target-cpu core2 %s -S -emit-llvm 
-o - | FileCheck %s
+// RUN: %clang_cc1 -triple i686-linux-gnu -target-cpu core2 %s -S -emit-llvm 
-o - | FileCheck -check-prefix=CHECK32 %s

RKSimon wrote:

Reduce a lot of duplicate checks:
```
// RUN: %clang_cc1 -triple x86_64-linux-gnu -target-cpu core2 %s -S -emit-llvm -o - | FileCheck -check-prefixes=CHECK,CHECK64 %s
// RUN: %clang_cc1 -triple i686-linux-gnu -target-cpu core2 %s -S -emit-llvm -o - | FileCheck -check-prefixes=CHECK,CHECK32 %s
```

https://github.com/llvm/llvm-project/pull/89362


[clang] [clang] Constexpr for __builtin_shufflevector and __builtin_convertvector (PR #76615)

2024-04-15 Thread Simon Pilgrim via cfe-commits


RKSimon wrote:

Other than fixing the ReleaseNotes.rst conflict is there anything outstanding 
on this now?

https://github.com/llvm/llvm-project/pull/76615


[clang] 6fd2fdc - [VectorCombine] foldShuffleOfCastops - extend shuffle(bitcast(x),bitcast(y)) -> bitcast(shuffle(x,y)) support

2024-04-11 Thread Simon Pilgrim via cfe-commits

Author: Simon Pilgrim
Date: 2024-04-11T14:02:56+01:00
New Revision: 6fd2fdccf2f28fc155f614eec41f785492aad618

URL: 
https://github.com/llvm/llvm-project/commit/6fd2fdccf2f28fc155f614eec41f785492aad618
DIFF: 
https://github.com/llvm/llvm-project/commit/6fd2fdccf2f28fc155f614eec41f785492aad618.diff

LOG: [VectorCombine] foldShuffleOfCastops - extend 
shuffle(bitcast(x),bitcast(y)) -> bitcast(shuffle(x,y)) support

Handle shuffle mask scaling for cases where the bitcast src/dst element counts are different.

Added: 


Modified: 
clang/test/CodeGen/X86/avx-shuffle-builtins.c
llvm/lib/Transforms/Vectorize/VectorCombine.cpp
llvm/test/Transforms/PhaseOrdering/X86/pr67803.ll
llvm/test/Transforms/VectorCombine/X86/shuffle-of-casts.ll
llvm/test/Transforms/VectorCombine/X86/shuffle.ll

Removed: 




diff  --git a/clang/test/CodeGen/X86/avx-shuffle-builtins.c 
b/clang/test/CodeGen/X86/avx-shuffle-builtins.c
index 49a56e73230d7d..d184d28f3e07aa 100644
--- a/clang/test/CodeGen/X86/avx-shuffle-builtins.c
+++ b/clang/test/CodeGen/X86/avx-shuffle-builtins.c
@@ -61,8 +61,7 @@ __m256 test_mm256_permute2f128_ps(__m256 a, __m256 b) {
 
 __m256i test_mm256_permute2f128_si256(__m256i a, __m256i b) {
   // CHECK-LABEL: test_mm256_permute2f128_si256
-  // X64: shufflevector{{.*}}
-  // X86: shufflevector{{.*}}
+  // CHECK: shufflevector{{.*}}
   return _mm256_permute2f128_si256(a, b, 0x20);
 }
 

diff  --git a/llvm/lib/Transforms/Vectorize/VectorCombine.cpp 
b/llvm/lib/Transforms/Vectorize/VectorCombine.cpp
index b74fdf27d213a1..658e8e74fe5b80 100644
--- a/llvm/lib/Transforms/Vectorize/VectorCombine.cpp
+++ b/llvm/lib/Transforms/Vectorize/VectorCombine.cpp
@@ -1448,9 +1448,9 @@ bool VectorCombine::foldShuffleOfBinops(Instruction ) {
 /// into "castop (shuffle)".
 bool VectorCombine::foldShuffleOfCastops(Instruction ) {
   Value *V0, *V1;
-  ArrayRef Mask;
+  ArrayRef OldMask;
   if (!match(, m_Shuffle(m_OneUse(m_Value(V0)), m_OneUse(m_Value(V1)),
-   m_Mask(Mask
+   m_Mask(OldMask
 return false;
 
   auto *C0 = dyn_cast(V0);
@@ -1473,12 +1473,32 @@ bool VectorCombine::foldShuffleOfCastops(Instruction 
) {
   auto *ShuffleDstTy = dyn_cast(I.getType());
   auto *CastDstTy = dyn_cast(C0->getDestTy());
   auto *CastSrcTy = dyn_cast(C0->getSrcTy());
-  if (!ShuffleDstTy || !CastDstTy || !CastSrcTy ||
-  CastDstTy->getElementCount() != CastSrcTy->getElementCount())
+  if (!ShuffleDstTy || !CastDstTy || !CastSrcTy)
 return false;
 
+  unsigned NumSrcElts = CastSrcTy->getNumElements();
+  unsigned NumDstElts = CastDstTy->getNumElements();
+  assert((NumDstElts == NumSrcElts || Opcode == Instruction::BitCast) &&
+ "Only bitcasts expected to alter src/dst element counts");
+
+  SmallVector NewMask;
+  if (NumSrcElts >= NumDstElts) {
+// The bitcast is from wide to narrow/equal elements. The shuffle mask can
+// always be expanded to the equivalent form choosing narrower elements.
+assert(NumSrcElts % NumDstElts == 0 && "Unexpected shuffle mask");
+unsigned ScaleFactor = NumSrcElts / NumDstElts;
+narrowShuffleMaskElts(ScaleFactor, OldMask, NewMask);
+  } else {
+// The bitcast is from narrow elements to wide elements. The shuffle mask
+// must choose consecutive elements to allow casting first.
+assert(NumDstElts % NumSrcElts == 0 && "Unexpected shuffle mask");
+unsigned ScaleFactor = NumDstElts / NumSrcElts;
+if (!widenShuffleMaskElts(ScaleFactor, OldMask, NewMask))
+  return false;
+  }
+
   auto *NewShuffleDstTy =
-  FixedVectorType::get(CastSrcTy->getScalarType(), Mask.size());
+  FixedVectorType::get(CastSrcTy->getScalarType(), NewMask.size());
 
   // Try to replace a castop with a shuffle if the shuffle is not costly.
   TTI::TargetCostKind CostKind = TTI::TCK_RecipThroughput;
@@ -1489,11 +1509,11 @@ bool VectorCombine::foldShuffleOfCastops(Instruction 
) {
   TTI.getCastInstrCost(C1->getOpcode(), CastDstTy, CastSrcTy,
TTI::CastContextHint::None, CostKind);
   OldCost +=
-  TTI.getShuffleCost(TargetTransformInfo::SK_PermuteTwoSrc, CastDstTy, 
Mask,
- CostKind, 0, nullptr, std::nullopt, );
+  TTI.getShuffleCost(TargetTransformInfo::SK_PermuteTwoSrc, CastDstTy,
+ OldMask, CostKind, 0, nullptr, std::nullopt, );
 
   InstructionCost NewCost = TTI.getShuffleCost(
-  TargetTransformInfo::SK_PermuteTwoSrc, CastSrcTy, Mask, CostKind);
+  TargetTransformInfo::SK_PermuteTwoSrc, CastSrcTy, NewMask, CostKind);
   NewCost += TTI.getCastInstrCost(Opcode, ShuffleDstTy, NewShuffleDstTy,
   TTI::CastContextHint::None, CostKind);
 
@@ -1503,8 +1523,8 @@ bool VectorCombine::foldShuffleOfCastops(Instruction ) {
   if (NewCost > OldCost)
 return false;
 
-  Value *Shuf =
-  

[clang] [clang][Interp] Integral pointers (PR #84159)

2024-04-11 Thread Simon Pilgrim via cfe-commits


@@ -22,7 +22,11 @@ class FunctionPointer final {
   const Function *Func;
 
 public:
-  FunctionPointer() : Func(nullptr) {}
+  // FIXME: We might want to track the fact that the Function pointer
+  // has been created from an integer and is most likely garbage anyway.
+  FunctionPointer(int IntVal = 0, const Descriptor *Desc = nullptr)

RKSimon wrote:

For reference @AaronBallman fixed this in 
4d80dff819d1164775d0d55fc68bffedb90ba53c

https://github.com/llvm/llvm-project/pull/84159


[clang] 798e04f - Fix MSVC "not all control paths return a value" warning. NFC.

2024-04-10 Thread Simon Pilgrim via cfe-commits

Author: Simon Pilgrim
Date: 2024-04-10T17:50:13+01:00
New Revision: 798e04f93769318db857b27f51020e7115e00301

URL: 
https://github.com/llvm/llvm-project/commit/798e04f93769318db857b27f51020e7115e00301
DIFF: 
https://github.com/llvm/llvm-project/commit/798e04f93769318db857b27f51020e7115e00301.diff

LOG: Fix MSVC "not all control paths return a value" warning. NFC.

Added: 


Modified: 
clang/include/clang/Basic/OpenACCKinds.h

Removed: 




diff  --git a/clang/include/clang/Basic/OpenACCKinds.h 
b/clang/include/clang/Basic/OpenACCKinds.h
index e191e9e0a5a153..3414df1701 100644
--- a/clang/include/clang/Basic/OpenACCKinds.h
+++ b/clang/include/clang/Basic/OpenACCKinds.h
@@ -430,6 +430,7 @@ inline StreamTy (StreamTy 
,
   case OpenACCDefaultClauseKind::Invalid:
 return Out << "";
   }
+  llvm_unreachable("Unknown OpenACCDefaultClauseKind enum");
 }
 
 inline const StreamingDiagnostic <<(const StreamingDiagnostic ,





[clang] [clang][Interp] Integral pointers (PR #84159)

2024-04-10 Thread Simon Pilgrim via cfe-commits


@@ -22,7 +22,11 @@ class FunctionPointer final {
   const Function *Func;
 
 public:
-  FunctionPointer() : Func(nullptr) {}
+  // FIXME: We might want to track the fact that the Function pointer
+  // has been created from an integer and is most likely garbage anyway.
+  FunctionPointer(int IntVal = 0, const Descriptor *Desc = nullptr)

RKSimon wrote:

@tbaederr This is causing a lot of MSVC warnings - `int IntVal` -> `intptr_t IntVal`?

`warning C4312: 'reinterpret_cast': conversion from 'int' to 'const clang::interp::Function *' of greater size`
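
A minimal, self-contained sketch of the suggested change - the trimmed-down class and the reinterpret_cast initializer are stand-ins inferred from the diff and the warning text above, not the actual clang::interp sources:
```cpp
#include <cstdint>

namespace sketch {
class Function;
class Descriptor;

class FunctionPointer {
  const Function *Func;

public:
  // Taking std::intptr_t keeps the integer operand pointer-sized, so the
  // reinterpret_cast below no longer widens an 'int' (MSVC warning C4312).
  FunctionPointer(std::intptr_t IntVal = 0,
                  const Descriptor * /*Desc*/ = nullptr)
      : Func(reinterpret_cast<const Function *>(IntVal)) {}
};
} // namespace sketch
```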

https://github.com/llvm/llvm-project/pull/84159


[clang] 4ae33c5 - Fix MSVC "switch statement contains 'default' but no 'case' labels" warning. NFC.

2024-04-09 Thread Simon Pilgrim via cfe-commits

Author: Simon Pilgrim
Date: 2024-04-09T09:59:57+01:00
New Revision: 4ae33c52f794dbd64924dd006570cdc409c297bc

URL: 
https://github.com/llvm/llvm-project/commit/4ae33c52f794dbd64924dd006570cdc409c297bc
DIFF: 
https://github.com/llvm/llvm-project/commit/4ae33c52f794dbd64924dd006570cdc409c297bc.diff

LOG: Fix MSVC "switch statement contains 'default' but no 'case' labels" 
warning. NFC.

Added: 


Modified: 
clang/lib/Sema/SemaOpenACC.cpp

Removed: 




diff  --git a/clang/lib/Sema/SemaOpenACC.cpp b/clang/lib/Sema/SemaOpenACC.cpp
index f520b9bfe81193..2ba1e49b5739db 100644
--- a/clang/lib/Sema/SemaOpenACC.cpp
+++ b/clang/lib/Sema/SemaOpenACC.cpp
@@ -39,14 +39,11 @@ bool diagnoseConstructAppertainment(SemaOpenACC , 
OpenACCDirectiveKind K,
 
 bool doesClauseApplyToDirective(OpenACCDirectiveKind DirectiveKind,
 OpenACCClauseKind ClauseKind) {
-  switch (ClauseKind) {
-// FIXME: For each clause as we implement them, we can add the
-// 'legalization' list here.
-  default:
-// Do nothing so we can go to the 'unimplemented' diagnostic instead.
-return true;
-  }
-  llvm_unreachable("Invalid clause kind");
+  // FIXME: For each clause as we implement them, we can add the
+  // 'legalization' list here.
+
+  // Do nothing so we can go to the 'unimplemented' diagnostic instead.
+  return true;
 }
 } // namespace
 





[clang] f139387 - Fix MSVC "not all control paths return a value" warning. NFC.

2024-04-08 Thread Simon Pilgrim via cfe-commits

Author: Simon Pilgrim
Date: 2024-04-08T14:31:46+01:00
New Revision: f139387fb6e76a5249e8d7c2d124565e6b566ef4

URL: 
https://github.com/llvm/llvm-project/commit/f139387fb6e76a5249e8d7c2d124565e6b566ef4
DIFF: 
https://github.com/llvm/llvm-project/commit/f139387fb6e76a5249e8d7c2d124565e6b566ef4.diff

LOG: Fix MSVC "not all control paths return a value" warning. NFC.

Added: 


Modified: 
clang/lib/InstallAPI/DiagnosticBuilderWrappers.cpp

Removed: 




diff  --git a/clang/lib/InstallAPI/DiagnosticBuilderWrappers.cpp 
b/clang/lib/InstallAPI/DiagnosticBuilderWrappers.cpp
index 1fa988f93bdd5c..cc252d51e3b677 100644
--- a/clang/lib/InstallAPI/DiagnosticBuilderWrappers.cpp
+++ b/clang/lib/InstallAPI/DiagnosticBuilderWrappers.cpp
@@ -81,8 +81,9 @@ const DiagnosticBuilder <<(const DiagnosticBuilder 
,
 return DB;
   case FileType::Invalid:
   case FileType::All:
-llvm_unreachable("Unexpected file type for diagnostics.");
+break;
   }
+  llvm_unreachable("Unexpected file type for diagnostics.");
 }
 
 const DiagnosticBuilder <<(const DiagnosticBuilder ,





[clang] 0e87366 - TextNodeDumper.cpp - remove empty switch to fix MSVC "switch statement contains 'default' but no 'case' labels" warning. NFC.

2024-04-08 Thread Simon Pilgrim via cfe-commits

Author: Simon Pilgrim
Date: 2024-04-08T14:31:46+01:00
New Revision: 0e8736694f752898ed7957a11a11c42f8f6a98d1

URL: 
https://github.com/llvm/llvm-project/commit/0e8736694f752898ed7957a11a11c42f8f6a98d1
DIFF: 
https://github.com/llvm/llvm-project/commit/0e8736694f752898ed7957a11a11c42f8f6a98d1.diff

LOG: TextNodeDumper.cpp - remove empty switch to fix MSVC "switch statement 
contains 'default' but no 'case' labels" warning. NFC.

Added: 


Modified: 
clang/lib/AST/TextNodeDumper.cpp

Removed: 




diff  --git a/clang/lib/AST/TextNodeDumper.cpp 
b/clang/lib/AST/TextNodeDumper.cpp
index 0ffbf47c9a2f4e..f498de6374348e 100644
--- a/clang/lib/AST/TextNodeDumper.cpp
+++ b/clang/lib/AST/TextNodeDumper.cpp
@@ -390,14 +390,6 @@ void TextNodeDumper::Visit(const OpenACCClause *C) {
   {
 ColorScope Color(OS, ShowColors, AttrColor);
 OS << C->getClauseKind();
-
-// Handle clauses with parens for types that have no children, likely
-// because there is no sub expression.
-switch (C->getClauseKind()) {
-default:
-  // Nothing to do here.
-  break;
-}
   }
   dumpPointer(C);
   dumpSourceRange(SourceRange(C->getBeginLoc(), C->getEndLoc()));





[clang] 110e933 - CGOpenMPRuntime.cpp - fix Wparentheses warning. NFC.

2024-04-04 Thread Simon Pilgrim via cfe-commits

Author: Simon Pilgrim
Date: 2024-04-04T14:59:00+01:00
New Revision: 110e933b7ae9150710a48b586fd3da39439079c2

URL: 
https://github.com/llvm/llvm-project/commit/110e933b7ae9150710a48b586fd3da39439079c2
DIFF: 
https://github.com/llvm/llvm-project/commit/110e933b7ae9150710a48b586fd3da39439079c2.diff

LOG: CGOpenMPRuntime.cpp - fix Wparentheses warning. NFC.

Added: 


Modified: 
clang/lib/CodeGen/CGOpenMPRuntime.cpp

Removed: 




diff  --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp 
b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index bc363313dec6f8..8eb10584699fad 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -2648,9 +2648,9 @@ void CGOpenMPRuntime::emitDistributeStaticInit(
 void CGOpenMPRuntime::emitForStaticFinish(CodeGenFunction ,
   SourceLocation Loc,
   OpenMPDirectiveKind DKind) {
-  assert(DKind == OMPD_distribute || DKind == OMPD_for ||
- DKind == OMPD_sections &&
- "Expected distribute, for, or sections directive kind");
+  assert((DKind == OMPD_distribute || DKind == OMPD_for ||
+  DKind == OMPD_sections) &&
+ "Expected distribute, for, or sections directive kind");
   if (!CGF.HaveInsertPoint())
 return;
   // Call __kmpc_for_static_fini(ident_t *loc, kmp_int32 tid);





[clang] [clang-format] Lambda parameter should be passed by const reference (PR #87306)

2024-04-02 Thread Simon Pilgrim via cfe-commits


@@ -3578,7 +3578,7 @@ cleanupAroundReplacements(StringRef Code, const 
tooling::Replacements ,
   // We need to use lambda function here since there are two versions of
   // `cleanup`.
   auto Cleanup = [](const FormatStyle , StringRef Code,
-std::vector Ranges,
+const std::vector ,

RKSimon wrote:

Since cleanup() takes an `ArrayRef` should this also?
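
For context on the idiom: llvm::ArrayRef binds implicitly to a std::vector (and other contiguous containers), so changing the parameter type would not affect any caller. A generic, hypothetical illustration (names invented, not taken from clang-format):
```cpp
#include "llvm/ADT/ArrayRef.h"
#include <vector>

// ArrayRef is just a pointer + length view: nothing is copied, and the
// signature is not committed to one particular container type.
static int sumLengths(llvm::ArrayRef<int> Lengths) {
  int Total = 0;
  for (int L : Lengths)
    Total += L;
  return Total;
}

int demo() {
  std::vector<int> Lengths = {1, 2, 3};
  return sumLengths(Lengths); // implicit std::vector -> ArrayRef conversion
}
```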

https://github.com/llvm/llvm-project/pull/87306


[clang] [clang] Constexpr for __builtin_shufflevector and __builtin_convertvector (PR #76615)

2024-03-21 Thread Simon Pilgrim via cfe-commits


RKSimon wrote:

@Destroyerrrocket reverse-ping - are you still working on this?

https://github.com/llvm/llvm-project/pull/76615


[clang] [X86][Headers] Specify result of NaN comparisons (PR #85862)

2024-03-21 Thread Simon Pilgrim via cfe-commits


@@ -207,6 +207,8 @@ _mm256_div_ps(__m256 __a, __m256 __b)
 /// Compares two 256-bit vectors of [4 x double] and returns the greater
 ///of each pair of values.
 ///
+///If either value in a comparison is NaN, returns the value from \a __b.

RKSimon wrote:

I don't think so either - we don't need to start explaining general fp 
comparison behaviour, just any x86/sse quirks.

https://github.com/llvm/llvm-project/pull/85862


[clang] 25d61be - [X86] avx-shuffle-builtins.c - limit to x86 targets

2024-03-20 Thread Simon Pilgrim via cfe-commits

Author: Simon Pilgrim
Date: 2024-03-20T16:59:11Z
New Revision: 25d61be8a5e563988661709c5d01f67c06b388e2

URL: 
https://github.com/llvm/llvm-project/commit/25d61be8a5e563988661709c5d01f67c06b388e2
DIFF: 
https://github.com/llvm/llvm-project/commit/25d61be8a5e563988661709c5d01f67c06b388e2.diff

LOG: [X86] avx-shuffle-builtins.c - limit to x86 targets

Attempt to fix an issue with non-x86 buildbots (sorry, it's a blind fix - I can't test this myself).

Added: 


Modified: 
clang/test/CodeGen/X86/avx-shuffle-builtins.c

Removed: 




diff  --git a/clang/test/CodeGen/X86/avx-shuffle-builtins.c 
b/clang/test/CodeGen/X86/avx-shuffle-builtins.c
index 82be43bc05049f..49a56e73230d7d 100644
--- a/clang/test/CodeGen/X86/avx-shuffle-builtins.c
+++ b/clang/test/CodeGen/X86/avx-shuffle-builtins.c
@@ -1,3 +1,4 @@
+// REQUIRES: x86-registered-target
 // RUN: %clang_cc1 -ffreestanding %s -O3 -triple=x86_64-apple-darwin 
-target-feature +avx -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK,X64
 // RUN: %clang_cc1 -ffreestanding %s -O3 -triple=i386-apple-darwin 
-target-feature +avx -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK,X86
 // FIXME: This is testing optimized generation of shuffle instructions and 
should be fixed.





[clang] 7812fcf - [VectorCombine] foldBitcastShuf - add support for binary shuffles (REAPPLIED)

2024-03-20 Thread Simon Pilgrim via cfe-commits

Author: Simon Pilgrim
Date: 2024-03-20T15:06:19Z
New Revision: 7812fcf3d79ef7fe9ec6bcdfc8fd9143864956cb

URL: 
https://github.com/llvm/llvm-project/commit/7812fcf3d79ef7fe9ec6bcdfc8fd9143864956cb
DIFF: 
https://github.com/llvm/llvm-project/commit/7812fcf3d79ef7fe9ec6bcdfc8fd9143864956cb.diff

LOG: [VectorCombine] foldBitcastShuf - add support for binary shuffles 
(REAPPLIED)

Generalise fold to "bitcast (shuf V0, V1, MaskC) --> shuf (bitcast V0), 
(bitcast V1), MaskC'".

Reapplied with a clang codegen test fix.

Further prep work for #67803

Added: 


Modified: 
clang/test/CodeGen/X86/avx-shuffle-builtins.c
llvm/lib/Transforms/Vectorize/VectorCombine.cpp
llvm/test/Transforms/PhaseOrdering/X86/pr67803.ll

Removed: 




diff  --git a/clang/test/CodeGen/X86/avx-shuffle-builtins.c 
b/clang/test/CodeGen/X86/avx-shuffle-builtins.c
index 9109247e534f4f..82be43bc05049f 100644
--- a/clang/test/CodeGen/X86/avx-shuffle-builtins.c
+++ b/clang/test/CodeGen/X86/avx-shuffle-builtins.c
@@ -60,7 +60,8 @@ __m256 test_mm256_permute2f128_ps(__m256 a, __m256 b) {
 
 __m256i test_mm256_permute2f128_si256(__m256i a, __m256i b) {
   // CHECK-LABEL: test_mm256_permute2f128_si256
-  // CHECK: shufflevector{{.*}} <8 x i32> 
+  // X64: shufflevector{{.*}}
+  // X86: shufflevector{{.*}}
   return _mm256_permute2f128_si256(a, b, 0x20);
 }
 
@@ -104,7 +105,8 @@ __m256d test_mm256_insertf128_pd_0(__m256d a, __m128d b) {
 
 __m256i test_mm256_insertf128_si256_0(__m256i a, __m128i b) {
   // CHECK-LABEL: test_mm256_insertf128_si256_0
-  // CHECK: shufflevector{{.*}}
+  // X64: shufflevector{{.*}}
+  // X86: shufflevector{{.*}}
   return _mm256_insertf128_si256(a, b, 0);
 }
 
@@ -122,7 +124,8 @@ __m256d test_mm256_insertf128_pd_1(__m256d a, __m128d b) {
 
 __m256i test_mm256_insertf128_si256_1(__m256i a, __m128i b) {
   // CHECK-LABEL: test_mm256_insertf128_si256_1
-  // CHECK: shufflevector{{.*}}
+  // X64: shufflevector{{.*}}
+  // X86: shufflevector{{.*}}
   return _mm256_insertf128_si256(a, b, 1);
 }
 

diff  --git a/llvm/lib/Transforms/Vectorize/VectorCombine.cpp 
b/llvm/lib/Transforms/Vectorize/VectorCombine.cpp
index 0b16a8b7676923..23494314f132c9 100644
--- a/llvm/lib/Transforms/Vectorize/VectorCombine.cpp
+++ b/llvm/lib/Transforms/Vectorize/VectorCombine.cpp
@@ -684,10 +684,10 @@ bool VectorCombine::foldInsExtFNeg(Instruction ) {
 /// destination type followed by shuffle. This can enable further transforms by
 /// moving bitcasts or shuffles together.
 bool VectorCombine::foldBitcastShuffle(Instruction ) {
-  Value *V0;
+  Value *V0, *V1;
   ArrayRef Mask;
   if (!match(, m_BitCast(m_OneUse(
- m_Shuffle(m_Value(V0), m_Undef(), m_Mask(Mask))
+ m_Shuffle(m_Value(V0), m_Value(V1), m_Mask(Mask))
 return false;
 
   // 1) Do not fold bitcast shuffle for scalable type. First, shuffle cost for
@@ -728,17 +728,21 @@ bool VectorCombine::foldBitcastShuffle(Instruction ) {
   FixedVectorType::get(DestTy->getScalarType(), NumSrcElts);
   auto *OldShuffleTy =
   FixedVectorType::get(SrcTy->getScalarType(), Mask.size());
+  bool IsUnary = isa(V1);
+  unsigned NumOps = IsUnary ? 1 : 2;
 
   // The new shuffle must not cost more than the old shuffle.
   TargetTransformInfo::TargetCostKind CK =
   TargetTransformInfo::TCK_RecipThroughput;
   TargetTransformInfo::ShuffleKind SK =
-  TargetTransformInfo::SK_PermuteSingleSrc;
+  IsUnary ? TargetTransformInfo::SK_PermuteSingleSrc
+  : TargetTransformInfo::SK_PermuteTwoSrc;
 
   InstructionCost DestCost =
   TTI.getShuffleCost(SK, NewShuffleTy, NewMask, CK) +
-  TTI.getCastInstrCost(Instruction::BitCast, NewShuffleTy, SrcTy,
-   TargetTransformInfo::CastContextHint::None, CK);
+  (NumOps * TTI.getCastInstrCost(Instruction::BitCast, NewShuffleTy, SrcTy,
+                                 TargetTransformInfo::CastContextHint::None,
+                                 CK));
   InstructionCost SrcCost =
   TTI.getShuffleCost(SK, SrcTy, Mask, CK) +
   TTI.getCastInstrCost(Instruction::BitCast, DestTy, OldShuffleTy,
@@ -746,10 +750,11 @@ bool VectorCombine::foldBitcastShuffle(Instruction ) {
   if (DestCost > SrcCost || !DestCost.isValid())
 return false;
 
-  // bitcast (shuf V0, MaskC) --> shuf (bitcast V0), MaskC'
+  // bitcast (shuf V0, V1, MaskC) --> shuf (bitcast V0), (bitcast V1), MaskC'
   ++NumShufOfBitcast;
-  Value *CastV = Builder.CreateBitCast(V0, NewShuffleTy);
-  Value *Shuf = Builder.CreateShuffleVector(CastV, NewMask);
+  Value *CastV0 = Builder.CreateBitCast(V0, NewShuffleTy);
+  Value *CastV1 = Builder.CreateBitCast(V1, NewShuffleTy);
+  Value *Shuf = Builder.CreateShuffleVector(CastV0, CastV1, NewMask);
   replaceValue(I, *Shuf);
   return true;
 }

diff  --git a/llvm/test/Transforms/PhaseOrdering/X86/pr67803.ll 

[clang] [X86_64] fix arg pass error in struct. (PR #85394)

2024-03-15 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon edited 
https://github.com/llvm/llvm-project/pull/85394
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [compiler-rt] [flang] [libc] [libcxx] [lldb] [llvm] [mlir] [X86] Fast AVX-512-VNNI vpdpwssd tuning (PR #85033)

2024-03-14 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon requested changes to this pull request.

This patch needs to be cleanly rebased on trunk (force push is OK in PR branches)

https://github.com/llvm/llvm-project/pull/85033
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Finally handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx (PR #84136)

2024-03-08 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon approved this pull request.

LGTM - but @pogo59 needs a heads-up as it will affect some ongoing documentation 
cleanup work

https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Headers][X86] Add specific results to comparisons (PR #83316)

2024-03-08 Thread Simon Pilgrim via cfe-commits

RKSimon wrote:

> @RKSimon note this will affect what the tooltips show. Is that okay?

I think so - we're just losing the extra info about -1/0 or 1/0 result values?

https://github.com/llvm/llvm-project/pull/83316
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][dataflow] Fix u8 string error with C++20. (PR #84302)

2024-03-07 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon approved this pull request.

LGTM - cheers

https://github.com/llvm/llvm-project/pull/84302
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [dataflow][nfc] Fix u8 string usage with c++20 (PR #84291)

2024-03-07 Thread Simon Pilgrim via cfe-commits


@@ -500,7 +500,7 @@ class HTMLLogger : public Logger {
 for (unsigned I = 0; I < CFG.getNumBlockIDs(); ++I) {
   std::string Name = blockID(I);
   // Rightwards arrow, vertical line
-  char ConvergenceMarker[] = u8"\\n\u2192\u007c";
+  char ConvergenceMarker[] = "\\n\u2192\u007c";

RKSimon wrote:

@thevinster @martinboehme This change now fails on MSVC builds:

`warning C4566: character represented by universal-character-name '\u2192' 
cannot be represented in the current code page (1252)`

Maybe try this instead?
```c
const char *ConvergenceMarker = (const char*)u8"\\n\u2192\u007c";
```

https://github.com/llvm/llvm-project/pull/84291
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Finally handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx (PR #84136)

2024-03-07 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon edited 
https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Finely handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx (PR #84136)

2024-03-06 Thread Simon Pilgrim via cfe-commits


@@ -2940,6 +2940,134 @@ _mm_movemask_ps(__m128 __a)
   return __builtin_ia32_movmskps((__v4sf)__a);
 }
 
+/* Compare */
+#define _CMP_EQ_OQ    0x00 /* Equal (ordered, non-signaling)  */
+#define _CMP_LT_OS    0x01 /* Less-than (ordered, signaling)  */
+#define _CMP_LE_OS    0x02 /* Less-than-or-equal (ordered, signaling)  */
+#define _CMP_UNORD_Q  0x03 /* Unordered (non-signaling)  */
+#define _CMP_NEQ_UQ   0x04 /* Not-equal (unordered, non-signaling)  */
+#define _CMP_NLT_US   0x05 /* Not-less-than (unordered, signaling)  */
+#define _CMP_NLE_US   0x06 /* Not-less-than-or-equal (unordered, signaling)  */
+#define _CMP_ORD_Q    0x07 /* Ordered (non-signaling)   */
+
+/// Compares each of the corresponding values of two 128-bit vectors of
+///[4 x float], using the operation specified by the immediate integer
+///operand.
+///
+///Returns a [4 x float] vector consisting of four floats corresponding to
+///the four comparison results: zero if the comparison is false, and all 
1's
+///if the comparison is true.
+///
+/// \headerfile 
+///
+/// \code
+/// __m128 _mm_cmp_ps(__m128 a, __m128 b, const int c);
+/// \endcode
+///
+/// This intrinsic corresponds to the  (V)CMPPS  instruction.
+///
+/// \param a
+///A 128-bit vector of [4 x float].
+/// \param b
+///A 128-bit vector of [4 x float].
+/// \param c
+///An immediate integer operand, with bits [4:0] specifying which 
comparison
+///operation to use: \n
+///(Note that without avx enabled, only bits [2:0] are supported) \n
+///0x00: Equal (ordered, non-signaling) \n
+///0x01: Less-than (ordered, signaling) \n
+///0x02: Less-than-or-equal (ordered, signaling) \n
+///0x03: Unordered (non-signaling) \n
+///0x04: Not-equal (unordered, non-signaling) \n
+///0x05: Not-less-than (unordered, signaling) \n
+///0x06: Not-less-than-or-equal (unordered, signaling) \n
+///0x07: Ordered (non-signaling) \n
+///0x08: Equal (unordered, non-signaling) \n
+///0x09: Not-greater-than-or-equal (unordered, signaling) \n
+///0x0A: Not-greater-than (unordered, signaling) \n
+///0x0B: False (ordered, non-signaling) \n
+///0x0C: Not-equal (ordered, non-signaling) \n
+///0x0D: Greater-than-or-equal (ordered, signaling) \n
+///0x0E: Greater-than (ordered, signaling) \n
+///0x0F: True (unordered, non-signaling) \n
+///0x10: Equal (ordered, signaling) \n
+///0x11: Less-than (ordered, non-signaling) \n
+///0x12: Less-than-or-equal (ordered, non-signaling) \n
+///0x13: Unordered (signaling) \n
+///0x14: Not-equal (unordered, signaling) \n
+///0x15: Not-less-than (unordered, non-signaling) \n
+///0x16: Not-less-than-or-equal (unordered, non-signaling) \n
+///0x17: Ordered (signaling) \n
+///0x18: Equal (unordered, signaling) \n
+///0x19: Not-greater-than-or-equal (unordered, non-signaling) \n
+///0x1A: Not-greater-than (unordered, non-signaling) \n
+///0x1B: False (ordered, signaling) \n
+///0x1C: Not-equal (ordered, signaling) \n
+///0x1D: Greater-than-or-equal (ordered, non-signaling) \n
+///0x1E: Greater-than (ordered, non-signaling) \n
+///0x1F: True (unordered, signaling)

RKSimon wrote:

No objections, but I'd prefer to see the new macros (and tests) and comment 
removal to be done as a followup patch
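
For context, a small usage sketch of the documented predicate macros (my own example; it assumes the relaxed SSE-only availability this PR is proposing — with today's headers it still needs `-mavx`):

```c
#include <immintrin.h>

// Branchless select: keep elements of a where a < b, otherwise take b.
__m128 min_by_cmp(__m128 a, __m128 b) {
  __m128 mask = _mm_cmp_ps(a, b, _CMP_LT_OS); // all-ones lanes where a < b
  return _mm_or_ps(_mm_and_ps(mask, a), _mm_andnot_ps(mask, b));
}
```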

https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Change target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2 (PR #84136)

2024-03-06 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon edited 
https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Change target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2 (PR #84136)

2024-03-06 Thread Simon Pilgrim via cfe-commits


@@ -1719,3 +1719,57 @@ __m128i test_mm_xor_si128(__m128i A, __m128i B) {
   // CHECK: xor <2 x i64> %{{.*}}, %{{.*}}
   return _mm_xor_si128(A, B);
 }
+
+__m128d test_mm_cmp_pd_eq_oq(__m128d a, __m128d b) {

RKSimon wrote:

Sorting

https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Change target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2 (PR #84136)

2024-03-06 Thread Simon Pilgrim via cfe-commits


@@ -462,12 +462,12 @@ TARGET_BUILTIN(__builtin_ia32_blendvps256, 
"V8fV8fV8fV8f", "ncV:256:", "avx")
 TARGET_BUILTIN(__builtin_ia32_shufpd256, "V4dV4dV4dIi", "ncV:256:", "avx")
 TARGET_BUILTIN(__builtin_ia32_shufps256, "V8fV8fV8fIi", "ncV:256:", "avx")
 TARGET_BUILTIN(__builtin_ia32_dpps256, "V8fV8fV8fIc", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_cmppd, "V2dV2dV2dIc", "ncV:128:", "avx")
+TARGET_BUILTIN(__builtin_ia32_cmppd, "V2dV2dV2dIc", "ncV:128:", "avx|sse2")
 TARGET_BUILTIN(__builtin_ia32_cmppd256, "V4dV4dV4dIc", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_cmpps, "V4fV4fV4fIc", "ncV:128:", "avx")
+TARGET_BUILTIN(__builtin_ia32_cmpps, "V4fV4fV4fIc", "ncV:128:", "avx|sse")
 TARGET_BUILTIN(__builtin_ia32_cmpps256, "V8fV8fV8fIc", "ncV:256:", "avx")
-TARGET_BUILTIN(__builtin_ia32_cmpsd, "V2dV2dV2dIc", "ncV:128:", "avx")
-TARGET_BUILTIN(__builtin_ia32_cmpss, "V4fV4fV4fIc", "ncV:128:", "avx")
+TARGET_BUILTIN(__builtin_ia32_cmpsd, "V2dV2dV2dIc", "ncV:128:", "avx|sse2")
+TARGET_BUILTIN(__builtin_ia32_cmpss, "V4fV4fV4fIc", "ncV:128:", "avx|sse")

RKSimon wrote:

Why not move these up to the SSE/SSE2 implementations? And is the "avx|" 
component necessary anymore?

https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Change target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2 (PR #84136)

2024-03-06 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon commented:

Should we update the existing SSE/SSE2 cmpeq/cmplt intrinsics to use 
__builtin_ia32_cmp* instead of __builtin_ia32_cmpeq??/__builtin_ia32_cmplt?? 
etc?

https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Change target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2 (PR #84136)

2024-03-06 Thread Simon Pilgrim via cfe-commits


@@ -2613,6 +2614,24 @@ void CGBuilderInserter::InsertHelper(
 // called function.
 void CodeGenFunction::checkTargetFeatures(const CallExpr *E,
   const FunctionDecl *TargetDecl) {
+  // SemaCheking cannot handle below x86 builtins because they have different

RKSimon wrote:

Semachecking

https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [X86] Change target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2 (PR #84136)

2024-03-06 Thread Simon Pilgrim via cfe-commits


@@ -813,3 +813,57 @@ __m128 test_mm_xor_ps(__m128 A, __m128 B) {
   // CHECK: xor <4 x i32>
   return _mm_xor_ps(A, B);
 }
+
+__m128 test_mm_cmp_ps_eq_oq(__m128 a, __m128 b) {

RKSimon wrote:

Move these up (they should be approximately alpha sorted)

https://github.com/llvm/llvm-project/pull/84136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [X86] Add Support for X86 TLSDESC Relocations (PR #83136)

2024-03-06 Thread Simon Pilgrim via cfe-commits


@@ -18515,20 +18515,20 @@ X86TargetLowering::LowerGlobalAddress(SDValue Op, 
SelectionDAG ) const {
   return LowerGlobalOrExternal(Op, DAG, /*ForCall=*/false);
 }
 
-static SDValue
-GetTLSADDR(SelectionDAG , SDValue Chain, GlobalAddressSDNode *GA,
-   SDValue *InGlue, const EVT PtrVT, unsigned ReturnReg,
-   unsigned char OperandFlags, bool LocalDynamic = false) {
+static SDValue getTLSADDR(SelectionDAG , SDValue Chain,
+  GlobalAddressSDNode *GA, SDValue *InGlue,
+  const EVT PtrVT, unsigned ReturnReg,
+  unsigned char OperandFlags, bool UseTLSDESC = false,

RKSimon wrote:

Do we need a default arg for UseTLSDESC - all the calls to getTLSADDR seem to 
set it?

https://github.com/llvm/llvm-project/pull/83136
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [X86] Use generic CPU tuning when tune-cpu is empty (PR #83631)

2024-03-02 Thread Simon Pilgrim via cfe-commits

RKSimon wrote:

Please check the CI - these failures look relevant
```
Failed Tests (4):
  lld :: COFF/lto-cpu-string.ll
  lld :: COFF/lto.ll
  lld :: ELF/lto/cpu-string.ll
  lld :: MachO/lto-cpu-string.ll
```

https://github.com/llvm/llvm-project/pull/83631
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Headers][X86] Add specific results to comparisons (PR #83316)

2024-03-02 Thread Simon Pilgrim via cfe-commits

RKSimon wrote:

@pogo This doesn't match what we did for the various cmp intrinsics in 
emmintrin.h - should it?
```cpp
/// Compares each of the corresponding signed 32-bit values of the
///128-bit integer vectors to determine if the values in the first operand
///are greater than those in the second operand.
///
///Each comparison yields 0x0 for false, 0x for true.
///
/// \headerfile 
///
/// This intrinsic corresponds to the  VPCMPGTD / PCMPGTD  instruction.
```

https://github.com/llvm/llvm-project/pull/83316
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] d50dec6 - Fix MSVC "not all control paths return a value" warnings. NFC.

2024-03-01 Thread Simon Pilgrim via cfe-commits

Author: Simon Pilgrim
Date: 2024-03-01T09:57:09Z
New Revision: d50dec6f413ce1953bede94bdd11261b6684c7c4

URL: 
https://github.com/llvm/llvm-project/commit/d50dec6f413ce1953bede94bdd11261b6684c7c4
DIFF: 
https://github.com/llvm/llvm-project/commit/d50dec6f413ce1953bede94bdd11261b6684c7c4.diff

LOG: Fix MSVC "not all control paths return a value" warnings. NFC.

Added: 


Modified: 
clang/include/clang/Basic/TargetInfo.h

Removed: 




diff  --git a/clang/include/clang/Basic/TargetInfo.h 
b/clang/include/clang/Basic/TargetInfo.h
index b94d13609c3dd2..7682f84e491c7b 100644
--- a/clang/include/clang/Basic/TargetInfo.h
+++ b/clang/include/clang/Basic/TargetInfo.h
@@ -1386,7 +1386,7 @@ class TargetInfo : public TransferrableTargetInfo,
   case LangOptions::SignReturnAddressScopeKind::All:
 return "all";
   }
-  assert(false && "Unexpected SignReturnAddressScopeKind");
+  llvm_unreachable("Unexpected SignReturnAddressScopeKind");
 }
 
 const char *getSignKeyStr() const {
@@ -1396,7 +1396,7 @@ class TargetInfo : public TransferrableTargetInfo,
   case LangOptions::SignReturnAddressKeyKind::BKey:
 return "b_key";
   }
-  assert(false && "Unexpected SignReturnAddressKeyKind");
+  llvm_unreachable("Unexpected SignReturnAddressKeyKind");
 }
   };
 



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang] Implement __builtin_popcountg (PR #82359)

2024-02-21 Thread Simon Pilgrim via cfe-commits

RKSimon wrote:

> @RKSimon The builtin currently can't be used with `constexpr`. Support for 
> constant evaluation is planned for a follow-up PR unless you would like me to 
> add it in this one. Should I remove the `Constexpr` attribute from the 
> builtin in Builtins.td for now?

Yes please!

https://github.com/llvm/llvm-project/pull/82359
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang] Implement __builtin_popcountg (PR #82359)

2024-02-21 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon commented:

Please can you add constexpr test coverage?
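
For reference, a sketch of what such coverage might look like (hypothetical: per the follow-up in this thread, constant evaluation for the builtin is deferred, so this would only compile once that lands):

```c
// 0xF0F0 has eight bits set; __builtin_popcountg accepts any unsigned type.
_Static_assert(__builtin_popcountg(0xF0F0u) == 8, "popcountg should constant-fold");
_Static_assert(__builtin_popcountg((unsigned char)0xFF) == 8, "");
```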

https://github.com/llvm/llvm-project/pull/82359
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] 0636309 - Fix MSVC "signed/unsigned mismatch" warning. NFC.

2024-02-15 Thread Simon Pilgrim via cfe-commits

Author: Simon Pilgrim
Date: 2024-02-15T10:41:09Z
New Revision: 0636309051f3b1a2b87047770bb3f7df1f3e27c3

URL: 
https://github.com/llvm/llvm-project/commit/0636309051f3b1a2b87047770bb3f7df1f3e27c3
DIFF: 
https://github.com/llvm/llvm-project/commit/0636309051f3b1a2b87047770bb3f7df1f3e27c3.diff

LOG: Fix MSVC "signed/unsigned mismatch" warning. NFC.

Added: 


Modified: 
clang/lib/AST/Interp/Function.h

Removed: 




diff  --git a/clang/lib/AST/Interp/Function.h b/clang/lib/AST/Interp/Function.h
index 6500e0126c226f..b19d64f9371e3c 100644
--- a/clang/lib/AST/Interp/Function.h
+++ b/clang/lib/AST/Interp/Function.h
@@ -186,7 +186,7 @@ class Function final {
   /// Returns the number of parameter this function takes when it's called,
   /// i.e excluding the instance pointer and the RVO pointer.
   unsigned getNumWrittenParams() const {
-assert(getNumParams() >= (hasThisPointer() + hasRVO()));
+assert(getNumParams() >= (unsigned)(hasThisPointer() + hasRVO()));
 return getNumParams() - hasThisPointer() - hasRVO();
   }
   unsigned getWrittenArgSize() const {



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-14 Thread Simon Pilgrim via cfe-commits

RKSimon wrote:

Why have the x86 tests been placed in test\CodeGen\X86 instead of something 
like test\Transforms\ExpandVariadics\X86 ?

https://github.com/llvm/llvm-project/pull/81058
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [LLVM] Add `__builtin_readsteadycounter` intrinsic and builtin for realtime clocks (PR #81331)

2024-02-12 Thread Simon Pilgrim via cfe-commits

RKSimon wrote:

Are we assuming any particular relationship to __builtin_readcyclecounter in 
terms of scales etc? 

__builtin_readsteadycounter could be used to access x86 MPERF clock counters, 
but to access the corresponding APERF clock we'd then need a 
__builtin_readvariablecounter equivalent (__builtin_readcyclecounter gives the 
separate RDTSC clock value)
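
For reference, a minimal sketch of how the two builtins would sit side by side (assuming `__builtin_readsteadycounter` lands with the same `unsigned long long` return type as `__builtin_readcyclecounter`; the frequency relationship between the two clocks is left target-defined):

```c
// Processor cycle counter (RDTSC on x86).
unsigned long long cycles(void) { return __builtin_readcyclecounter(); }

// Fixed-frequency "steady" counter proposed by this PR (hypothetical until it lands).
unsigned long long steady(void) { return __builtin_readsteadycounter(); }
```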

https://github.com/llvm/llvm-project/pull/81331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[llvm] [clang] [clang-tools-extra] [X86] Use plain load/store instead of cmpxchg16b for atomics with AVX (PR #74275)

2024-02-06 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon approved this pull request.

LGTM. I'd still like to ensure we have unaligned x86 test coverage.

https://github.com/llvm/llvm-project/pull/74275
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[llvm] [clang] [polly] [X86] Remove Intel Xeon Phi Supports. (PR #76383)

2024-02-02 Thread Simon Pilgrim via cfe-commits

RKSimon wrote:

@FreddyLeaf Can this be abandoned now?

https://github.com/llvm/llvm-project/pull/76383
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang-tools-extra] [clang] [llvm] [X86] Use plain load/store instead of cmpxchg16b for atomics with AVX (PR #74275)

2024-02-02 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon commented:

Please can you confirm we have tests for underaligned pointers?
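
As a concrete illustration (a sketch only — the names and exact shape are mine): a 16-byte atomic load through a pointer that is only 8-byte aligned at run time, to see whether codegen still takes the plain AVX load/`cmpxchg16b` path or falls back to a libatomic call:

```c
// Caller guarantees p points at 16 readable bytes but only 8-byte alignment.
__int128 load_underaligned(__int128 *p) {
  __int128 v;
  __atomic_load(p, &v, __ATOMIC_SEQ_CST);
  return v;
}
```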

https://github.com/llvm/llvm-project/pull/74275
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[compiler-rt] [clang] [flang] [llvm] [clang-tools-extra] [TTI]Fallback to SingleSrcPermute shuffle kind, if no direct estimation for (PR #79837)

2024-02-01 Thread Simon Pilgrim via cfe-commits


@@ -2,15 +2,15 @@
 ; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -passes="print" 
2>&1 -disable-output -cost-kind=latency -mattr=+sse2 | FileCheck %s 
-check-prefixes=SSE,SSE2
 ; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -passes="print" 
2>&1 -disable-output -cost-kind=latency -mattr=+ssse3 | FileCheck %s 
-check-prefixes=SSE,SSSE3
 ; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -passes="print" 
2>&1 -disable-output -cost-kind=latency -mattr=+sse4.2 | FileCheck %s 
-check-prefixes=SSE,SSE42
-; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -passes="print" 
2>&1 -disable-output -cost-kind=latency -mattr=+avx | FileCheck %s 
-check-prefixes=AVX
-; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -passes="print" 
2>&1 -disable-output -cost-kind=latency -mattr=+avx2 | FileCheck %s 
-check-prefixes=AVX
-; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -passes="print" 
2>&1 -disable-output -cost-kind=latency -mattr=+avx512f | FileCheck %s 
--check-prefixes=AVX512
-; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -passes="print" 
2>&1 -disable-output -cost-kind=latency -mattr=+avx512f,+avx512bw | FileCheck 
%s --check-prefixes=AVX512
-; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -passes="print" 
2>&1 -disable-output -cost-kind=latency -mattr=+avx512f,+avx512bw,+avx512vbmi | 
FileCheck %s --check-prefixes=AVX512
+; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -passes="print" 
2>&1 -disable-output -cost-kind=latency -mattr=+avx | FileCheck %s 
-check-prefixes=AVX,AVX1
+; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -passes="print" 
2>&1 -disable-output -cost-kind=latency -mattr=+avx2 | FileCheck %s 
-check-prefixes=AVX,AVX2
+; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -passes="print" 
2>&1 -disable-output -cost-kind=latency -mattr=+avx512f | FileCheck %s 
--check-prefixes=AVX512,AVX512F
+; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -passes="print" 
2>&1 -disable-output -cost-kind=latency -mattr=+avx512f,+avx512bw | FileCheck 
%s --check-prefixes=AVX512,AVX512BW
+; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -passes="print" 
2>&1 -disable-output -cost-kind=latency -mattr=+avx512f,+avx512bw,+avx512vbmi | 
FileCheck %s --check-prefixes=AVX512,AVX512VBMI
 ;
 ; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -passes="print" 
2>&1 -disable-output -cost-kind=latency -mcpu=slm | FileCheck %s 
--check-prefixes=SSE,SLM
 ; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -passes="print" 
2>&1 -disable-output -cost-kind=latency -mcpu=goldmont | FileCheck %s 
--check-prefixes=SSE,GLM
-; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -passes="print" 
2>&1 -disable-output -cost-kind=latency -mcpu=btver2 | FileCheck %s 
--check-prefixes=AVX
+; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -passes="print" 
2>&1 -disable-output -cost-kind=latency -mcpu=btver2 | FileCheck %s 
--check-prefixes=AVX,BTVER2

RKSimon wrote:

Use AVX1 instead of BTVER2?

https://github.com/llvm/llvm-project/pull/79837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[llvm] [clang] [compiler-rt] [X86] Support more ISAs to enable __builtin_cpu_supports (PR #79086)

2024-02-01 Thread Simon Pilgrim via cfe-commits


@@ -139,20 +139,77 @@ enum ProcessorFeatures {
   FEATURE_AVX512BITALG,
   FEATURE_AVX512BF16,
   FEATURE_AVX512VP2INTERSECT,
+  FEATURE_3DNOW,
+  FEATURE_ADX = 40,
+  FEATURE_CLDEMOTE = 42,

RKSimon wrote:

Maybe leave in a commented out entry to make that clear?
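
For what it's worth, the consumer side only ever sees the string names, e.g. (the `"adx"` entry assumes this PR's additions; `"avx512bitalg"` is already in the list above):

```c
void pick_kernel(void) {
  __builtin_cpu_init();
  if (__builtin_cpu_supports("avx512bitalg")) {
    /* use the AVX512BITALG path */
  } else if (__builtin_cpu_supports("adx")) { /* new in this PR */
    /* use the ADX path */
  }
}
```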

https://github.com/llvm/llvm-project/pull/79086
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[llvm] [clang] Adding support of AMDLIBM vector library (PR #78560)

2024-01-31 Thread Simon Pilgrim via cfe-commits


@@ -0,0 +1,332 @@
+; RUN: opt -vector-library=AMDLIBM -passes=inject-tli-mappings,loop-vectorize 
-S < %s | FileCheck %s
+
+; Test to verify that when math headers are built with
+; __FINITE_MATH_ONLY__ enabled, causing use of ___finite
+; function versions, vectorization can map these to vector versions.
+
+target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+declare float @__expf_finite(float) #0
+
+; CHECK-LABEL: @exp_f32
+; CHECK: <4 x float> @amd_vrs4_expf

RKSimon wrote:

I'm still not seeing cases where the vector width is varying depending on cpu 
abilities?

https://github.com/llvm/llvm-project/pull/78560
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[llvm] [clang] [clang-tools-extra] [SLP]Improve findReusedOrderedScalars and graph rotation. (PR #77529)

2024-01-30 Thread Simon Pilgrim via cfe-commits


@@ -858,7 +858,7 @@ static void addMask(SmallVectorImpl , 
ArrayRef SubMask,
 /// values 3 and 7 respectively:
 /// before:  6 9 5 4 9 2 1 0
 /// after:   6 3 5 4 7 2 1 0
-static void fixupOrderingIndices(SmallVectorImpl ) {
+static void fixupOrderingIndices(MutableArrayRef Order) {

RKSimon wrote:

Pull out this NFC change

https://github.com/llvm/llvm-project/pull/77529
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [clang-tools-extra] [SLP]Improve findReusedOrderedScalars and graph rotation. (PR #77529)

2024-01-30 Thread Simon Pilgrim via cfe-commits


@@ -3779,65 +3780,169 @@ static void reorderOrder(SmallVectorImpl 
, ArrayRef Mask,
 std::optional
 BoUpSLP::findReusedOrderedScalars(const BoUpSLP::TreeEntry ) {
   assert(TE.State == TreeEntry::NeedToGather && "Expected gather node only.");
-  unsigned NumScalars = TE.Scalars.size();
+  // Try to find subvector extract/insert patterns and reorder only such
+  // patterns.
+  SmallVector GatheredScalars(TE.Scalars.begin(), TE.Scalars.end());
+  Type *ScalarTy = GatheredScalars.front()->getType();
+  int NumScalars = GatheredScalars.size();
+  if (!isValidElementType(ScalarTy))
+return std::nullopt;
+  auto *VecTy = FixedVectorType::get(ScalarTy, NumScalars);
+  int NumParts = TTI->getNumberOfParts(VecTy);
+  if (NumParts == 0 || NumParts >= NumScalars)
+NumParts = 1;
+  SmallVector ExtractMask;
+  SmallVector Mask;
+  SmallVector> Entries;
+  SmallVector> ExtractShuffles 
=
+  tryToGatherExtractElements(GatheredScalars, ExtractMask, NumParts);
+  SmallVector> GatherShuffles =
+  isGatherShuffledEntry(, GatheredScalars, Mask, Entries, NumParts,
+/*ForOrder=*/true);
+  // No shuffled operands - ignore.
+  if (GatherShuffles.empty() && ExtractShuffles.empty())
+return std::nullopt;
   OrdersType CurrentOrder(NumScalars, NumScalars);
-  SmallVector Positions;
-  SmallBitVector UsedPositions(NumScalars);
-  const TreeEntry *STE = nullptr;
-  // Try to find all gathered scalars that are gets vectorized in other
-  // vectorize node. Here we can have only one single tree vector node to
-  // correctly identify order of the gathered scalars.
-  for (unsigned I = 0; I < NumScalars; ++I) {
-Value *V = TE.Scalars[I];
-if (!isa(V))
-  continue;
-if (const auto *LocalSTE = getTreeEntry(V)) {
-  if (!STE)
-STE = LocalSTE;
-  else if (STE != LocalSTE)
-// Take the order only from the single vector node.
-return std::nullopt;
-  unsigned Lane =
-  std::distance(STE->Scalars.begin(), find(STE->Scalars, V));
-  if (Lane >= NumScalars)
-return std::nullopt;
-  if (CurrentOrder[Lane] != NumScalars) {
-if (Lane != I)
+  if (GatherShuffles.size() == 1 &&
+  *GatherShuffles.front() == TTI::SK_PermuteSingleSrc &&
+  Entries.front().front()->isSame(TE.Scalars)) {
+// Exclude nodes for strided geps from analysis, better to reorder them.
+if (!TE.UserTreeIndices.empty() &&
+TE.UserTreeIndices.front().UserTE->State ==
+TreeEntry::PossibleStridedVectorize &&
+Entries.front().front()->State == TreeEntry::NeedToGather)
+  return std::nullopt;
+// Perfect match in the graph, will reuse the previously vectorized
+// node. Cost is 0.
+std::iota(CurrentOrder.begin(), CurrentOrder.end(), 0);
+return CurrentOrder;
+  }
+  auto IsBroadcastMask = [](ArrayRef Mask) {
+int SingleElt = PoisonMaskElem;
+return all_of(Mask, [&](int I) {
+  if (SingleElt == PoisonMaskElem && I != PoisonMaskElem)
+SingleElt = I;
+  return I == PoisonMaskElem || I == SingleElt;
+});
+  };

RKSimon wrote:

Don't we have this anywhere else that we can reuse?

https://github.com/llvm/llvm-project/pull/77529
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang-tools-extra] [llvm] [clang] [SLP]Improve findReusedOrderedScalars and graph rotation. (PR #77529)

2024-01-30 Thread Simon Pilgrim via cfe-commits


@@ -2418,7 +2418,8 @@ class BoUpSLP {
   std::optional
   isGatherShuffledSingleRegisterEntry(
   const TreeEntry *TE, ArrayRef VL, MutableArrayRef Mask,
-  SmallVectorImpl , unsigned Part);
+  SmallVectorImpl , unsigned Part,
+  bool ForOrder);

RKSimon wrote:

Add ForOrder to the doxygen description

https://github.com/llvm/llvm-project/pull/77529
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang-tools-extra] [llvm] [SLP]Improve findReusedOrderedScalars and graph rotation. (PR #77529)

2024-01-30 Thread Simon Pilgrim via cfe-commits


@@ -3779,65 +3780,169 @@ static void reorderOrder(SmallVectorImpl 
, ArrayRef Mask,
 std::optional
 BoUpSLP::findReusedOrderedScalars(const BoUpSLP::TreeEntry ) {
   assert(TE.State == TreeEntry::NeedToGather && "Expected gather node only.");
-  unsigned NumScalars = TE.Scalars.size();
+  // Try to find subvector extract/insert patterns and reorder only such
+  // patterns.
+  SmallVector GatheredScalars(TE.Scalars.begin(), TE.Scalars.end());
+  Type *ScalarTy = GatheredScalars.front()->getType();
+  int NumScalars = GatheredScalars.size();
+  if (!isValidElementType(ScalarTy))
+return std::nullopt;
+  auto *VecTy = FixedVectorType::get(ScalarTy, NumScalars);
+  int NumParts = TTI->getNumberOfParts(VecTy);
+  if (NumParts == 0 || NumParts >= NumScalars)
+NumParts = 1;
+  SmallVector ExtractMask;
+  SmallVector Mask;
+  SmallVector> Entries;
+  SmallVector> ExtractShuffles 
=
+  tryToGatherExtractElements(GatheredScalars, ExtractMask, NumParts);
+  SmallVector> GatherShuffles =
+  isGatherShuffledEntry(, GatheredScalars, Mask, Entries, NumParts,
+/*ForOrder=*/true);
+  // No shuffled operands - ignore.
+  if (GatherShuffles.empty() && ExtractShuffles.empty())
+return std::nullopt;
   OrdersType CurrentOrder(NumScalars, NumScalars);
-  SmallVector Positions;
-  SmallBitVector UsedPositions(NumScalars);
-  const TreeEntry *STE = nullptr;
-  // Try to find all gathered scalars that are gets vectorized in other
-  // vectorize node. Here we can have only one single tree vector node to
-  // correctly identify order of the gathered scalars.
-  for (unsigned I = 0; I < NumScalars; ++I) {
-Value *V = TE.Scalars[I];
-if (!isa(V))
-  continue;
-if (const auto *LocalSTE = getTreeEntry(V)) {
-  if (!STE)
-STE = LocalSTE;
-  else if (STE != LocalSTE)
-// Take the order only from the single vector node.
-return std::nullopt;
-  unsigned Lane =
-  std::distance(STE->Scalars.begin(), find(STE->Scalars, V));
-  if (Lane >= NumScalars)
-return std::nullopt;
-  if (CurrentOrder[Lane] != NumScalars) {
-if (Lane != I)
+  if (GatherShuffles.size() == 1 &&
+  *GatherShuffles.front() == TTI::SK_PermuteSingleSrc &&
+  Entries.front().front()->isSame(TE.Scalars)) {
+// Exclude nodes for strided geps from analysis, better to reorder them.
+if (!TE.UserTreeIndices.empty() &&
+TE.UserTreeIndices.front().UserTE->State ==
+TreeEntry::PossibleStridedVectorize &&
+Entries.front().front()->State == TreeEntry::NeedToGather)
+  return std::nullopt;
+// Perfect match in the graph, will reuse the previously vectorized
+// node. Cost is 0.
+std::iota(CurrentOrder.begin(), CurrentOrder.end(), 0);
+return CurrentOrder;
+  }
+  auto IsBroadcastMask = [](ArrayRef Mask) {
+int SingleElt = PoisonMaskElem;
+return all_of(Mask, [&](int I) {
+  if (SingleElt == PoisonMaskElem && I != PoisonMaskElem)
+SingleElt = I;
+  return I == PoisonMaskElem || I == SingleElt;
+});
+  };
+  // Exclusive broadcast mask - ignore.
+  if ((ExtractShuffles.empty() && IsBroadcastMask(Mask) &&
+   (Entries.size() != 1 ||
+Entries.front().front()->ReorderIndices.empty())) ||
+  (GatherShuffles.empty() && IsBroadcastMask(ExtractMask)))
+return std::nullopt;
+  SmallBitVector ShuffledSubMasks(NumParts);
+  auto TransformMaskToOrder = [&](MutableArrayRef CurrentOrder,
+  ArrayRef Mask, int PartSz, int NumParts,
+  function_ref GetVF) {
+for (int I : seq(0, NumParts)) {

RKSimon wrote:

why use seq instead of just a basic for loop?

https://github.com/llvm/llvm-project/pull/77529
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[llvm] [clang] [clang-tools-extra] [SLP]Improve findReusedOrderedScalars and graph rotation. (PR #77529)

2024-01-30 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon edited 
https://github.com/llvm/llvm-project/pull/77529
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang-tools-extra] [clang] [llvm] [SLP]Improve findReusedOrderedScalars and graph rotation. (PR #77529)

2024-01-30 Thread Simon Pilgrim via cfe-commits


@@ -2432,7 +2433,7 @@ class BoUpSLP {
   isGatherShuffledEntry(
   const TreeEntry *TE, ArrayRef VL, SmallVectorImpl ,
   SmallVectorImpl> ,
-  unsigned NumParts);
+  unsigned NumParts, bool ForOrder = false);

RKSimon wrote:

Add ForOrder to the doxygen description

https://github.com/llvm/llvm-project/pull/77529
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[llvm] [clang-tools-extra] [clang] [SLP]Improve findReusedOrderedScalars and graph rotation. (PR #77529)

2024-01-30 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon commented:

Please can you add better comments explaining the process

https://github.com/llvm/llvm-project/pull/77529
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[libcxx] [llvm] [libc] [compiler-rt] [lldb] [clang-tools-extra] [mlir] [clang] [flang] [AArch64] add intrinsic to generate a bfi instruction (PR #79672)

2024-01-29 Thread Simon Pilgrim via cfe-commits

RKSimon wrote:

@RamaMalladiAWS Do you have examples of the IR that fails to lower to BFI? 
These things often turn out to be either a missing middle-end canonicalization 
or maybe a case that could be added to existing pattern matching in the 
back-end.
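
For reference, the kind of source pattern I'd expect to already lower to BFI today (my own example, not taken from the PR):

```c
// Insert the low 8 bits of y into bits [11:4] of x; AArch64 normally
// selects a single BFI for this masked-or form.
unsigned bitfield_insert(unsigned x, unsigned y) {
  return (x & ~0xFF0u) | ((y << 4) & 0xFF0u);
}
```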

https://github.com/llvm/llvm-project/pull/79672
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[libcxx] [flang] [libc] [clang-tools-extra] [clang] [llvm] [compiler-rt] [libunwind] [lld] [lldb] [X86] Use RORX over SHR imm (PR #77964)

2024-01-28 Thread Simon Pilgrim via cfe-commits


@@ -4216,6 +4217,95 @@ MachineSDNode *X86DAGToDAGISel::emitPCMPESTR(unsigned 
ROpc, unsigned MOpc,
   return CNode;
 }
 
+// When the consumer of a right shift (arithmetic or logical) wouldn't notice
+// the difference if the instruction was a rotate right instead (because the
+// bits shifted in are truncated away), the shift can be replaced by the RORX
+// instruction from BMI2. This doesn't set flags and can output to a different
+// register. However, this increases code size in most cases, and doesn't leave
+// the high bits in a useful state. There may be other situations where this
+// transformation is profitable given those conditions, but currently the
+// transformation is only made when it likely avoids spilling flags.
+bool X86DAGToDAGISel::rightShiftUnclobberFlags(SDNode *N) {
+  EVT VT = N->getValueType(0);
+
+  // Target has to have BMI2 for RORX
+  if (!Subtarget->hasBMI2())
+return false;
+
+  // Only handle scalar shifts.
+  if (VT.isVector())
+return false;
+
+  unsigned OpSize;
+  if (VT == MVT::i64)
+OpSize = 64;
+  else if (VT == MVT::i32)
+OpSize = 32;
+  else if (VT == MVT::i16)
+OpSize = 16;
+  else if (VT == MVT::i8)
+return false; // i8 shift can't be truncated.
+  else
+llvm_unreachable("Unexpected shift size");
+
+  unsigned TruncateSize = 0;
+  // This only works when the result is truncated.
+  for (const SDNode *User : N->uses()) {
+if (!User->isMachineOpcode() ||
+User->getMachineOpcode() != TargetOpcode::EXTRACT_SUBREG)
+  return false;
+EVT TuncateType = User->getValueType(0);
+if (TuncateType == MVT::i32)
+  TruncateSize = std::max(TruncateSize, 32U);
+else if (TuncateType == MVT::i16)
+  TruncateSize = std::max(TruncateSize, 16U);
+else if (TuncateType == MVT::i8)
+  TruncateSize = std::max(TruncateSize, 8U);
+else
+  return false;
+  }
+  if (TruncateSize >= OpSize)
+return false;
+
+  // The shift must be by an immediate that wouldn't expose the zero or sign
+  // extended result.
+  auto *ShiftAmount = dyn_cast(N->getOperand(1));
+  if (!ShiftAmount || ShiftAmount->getZExtValue() > OpSize - TruncateSize)
+return false;
+
+  // If the shift argument has non-dead EFLAGS, then this shift probably
+  // clobbers those flags making the transformation to RORX useful. This may
+  // have false negatives or positives so ideally this transformation is made
+  // later on.
+  bool ArgProducesFlags = false;
+  SDNode *Input = N->getOperand(0).getNode();
+  for (auto Use : Input->uses()) {
+if (Use->getOpcode() == ISD::CopyToReg) {
+  auto *RegisterNode =
+  dyn_cast(Use->getOperand(1).getNode());

RKSimon wrote:

```dyn_cast(Use->getOperand(1))```

https://github.com/llvm/llvm-project/pull/77964
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang-tools-extra] [libcxx] [flang] [lldb] [llvm] [libunwind] [compiler-rt] [lld] [libc] [clang] [X86] Use RORX over SHR imm (PR #77964)

2024-01-25 Thread Simon Pilgrim via cfe-commits


@@ -4216,6 +4217,97 @@ MachineSDNode *X86DAGToDAGISel::emitPCMPESTR(unsigned 
ROpc, unsigned MOpc,
   return CNode;
 }
 
+// When the consumer of a right shift (arithmetic or logical) wouldn't notice
+// the difference if the instruction was a rotate right instead (because the
+// bits shifted in are truncated away), the shift can be replaced by the RORX
+// instruction from BMI2. This doesn't set flags and can output to a different
+// register. However, this increases code size in most cases, and doesn't leave
+// the high bits in a useful state. There may be other situations where this
+// transformation is profitable given those conditions, but currently the
+// transformation is only made when it likely avoids spilling flags.
+bool X86DAGToDAGISel::rightShiftUncloberFlags(SDNode *N) {

RKSimon wrote:

typo: rightShiftUncloberFlags -> rightShiftUnclobberFlags

https://github.com/llvm/llvm-project/pull/77964
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[libc] [lld] [clang-tools-extra] [libcxx] [libunwind] [compiler-rt] [lldb] [flang] [llvm] [clang] [X86] Use RORX over SHR imm (PR #77964)

2024-01-25 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon requested changes to this pull request.


https://github.com/llvm/llvm-project/pull/77964
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[libunwind] [clang-tools-extra] [libc] [flang] [lldb] [lld] [compiler-rt] [libcxx] [llvm] [clang] [X86] Use RORX over SHR imm (PR #77964)

2024-01-25 Thread Simon Pilgrim via cfe-commits


@@ -4216,6 +4217,97 @@ MachineSDNode *X86DAGToDAGISel::emitPCMPESTR(unsigned 
ROpc, unsigned MOpc,
   return CNode;
 }
 
+// When the consumer of a right shift (arithmetic or logical) wouldn't notice
+// the difference if the instruction was a rotate right instead (because the
+// bits shifted in are truncated away), the shift can be replaced by the RORX
+// instruction from BMI2. This doesn't set flags and can output to a different
+// register. However, this increases code size in most cases, and doesn't leave
+// the high bits in a useful state. There may be other situations where this
+// transformation is profitable given those conditions, but currently the
+// transformation is only made when it likely avoids spilling flags.
+bool X86DAGToDAGISel::rightShiftUncloberFlags(SDNode *N) {
+  EVT VT = N->getValueType(0);
+
+  // Target has to have BMI2 for RORX
+  if (!Subtarget->hasBMI2())
+return false;
+
+  // Only handle scalar shifts.
+  if (VT.isVector())
+return false;
+
+  unsigned OpSize;
+  if (VT == MVT::i64)
+OpSize = 64;
+  else if (VT == MVT::i32)
+OpSize = 32;
+  else if (VT == MVT::i16)
+OpSize = 16;
+  else if (VT == MVT::i8)
+return false; // i8 shift can't be truncated.
+  else
+llvm_unreachable("Unexpected shift size");
+
+  unsigned TruncateSize = 0;
+  // This only works when the result is truncated.
+  for (const SDNode *User : N->uses()) {
+auto name = User->getOperationName(CurDAG);

RKSimon wrote:

unused variable

https://github.com/llvm/llvm-project/pull/77964
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [libunwind] [libcxx] [flang] [compiler-rt] [clang-tools-extra] [libc] [lldb] [llvm] [lld] [X86] Use RORX over SHR imm (PR #77964)

2024-01-25 Thread Simon Pilgrim via cfe-commits


@@ -4216,6 +4217,97 @@ MachineSDNode *X86DAGToDAGISel::emitPCMPESTR(unsigned 
ROpc, unsigned MOpc,
   return CNode;
 }
 
+// When the consumer of a right shift (arithmetic or logical) wouldn't notice
+// the difference if the instruction was a rotate right instead (because the
+// bits shifted in are truncated away), the shift can be replaced by the RORX
+// instruction from BMI2. This doesn't set flags and can output to a different
+// register. However, this increases code size in most cases, and doesn't leave
+// the high bits in a useful state. There may be other situations where this
+// transformation is profitable given those conditions, but currently the
+// transformation is only made when it likely avoids spilling flags.
+bool X86DAGToDAGISel::rightShiftUncloberFlags(SDNode *N) {
+  EVT VT = N->getValueType(0);
+
+  // Target has to have BMI2 for RORX
+  if (!Subtarget->hasBMI2())
+return false;
+
+  // Only handle scalar shifts.
+  if (VT.isVector())
+return false;
+
+  unsigned OpSize;
+  if (VT == MVT::i64)
+OpSize = 64;
+  else if (VT == MVT::i32)
+OpSize = 32;
+  else if (VT == MVT::i16)
+OpSize = 16;
+  else if (VT == MVT::i8)
+return false; // i8 shift can't be truncated.
+  else
+llvm_unreachable("Unexpected shift size");
+
+  unsigned TruncateSize = 0;
+  // This only works when the result is truncated.
+  for (const SDNode *User : N->uses()) {
+auto name = User->getOperationName(CurDAG);
+if (!User->isMachineOpcode() ||
+User->getMachineOpcode() != TargetOpcode::EXTRACT_SUBREG)
+  return false;
+EVT TuncateType = User->getValueType(0);
+if (TuncateType == MVT::i32)
+  TruncateSize = std::max(TruncateSize, 32U);
+else if (TuncateType == MVT::i16)
+  TruncateSize = std::max(TruncateSize, 16U);
+else if (TuncateType == MVT::i8)
+  TruncateSize = std::max(TruncateSize, 8U);
+else
+  return false;
+  }
+  if (TruncateSize >= OpSize)
+return false;
+
+  // The shift must be by an immediate that wouldn't expose the zero or sign
+  // extended result.
+  auto *ShiftAmount = dyn_cast(N->getOperand(1));
+  if (!ShiftAmount || ShiftAmount->getZExtValue() > OpSize - TruncateSize)
+return false;
+
+  // Only make the replacement when it avoids clobbering used flags. This is a
+  // similar heuristic as used in the conversion to LEA, namely looking at the
+  // operand for an instruction that creates flags where those flags are used.
+  // This will have both false positives and false negatives. Ideally, both of
+  // these happen later on. Perhaps in copy to flags lowering or in register
+  // allocation.
+  bool MightClobberFlags = false;
+  SDNode *Input = N->getOperand(0).getNode();
+  for (auto Use : Input->uses()) {
+if (Use->getOpcode() == ISD::CopyToReg) {
+  auto *RegisterNode =
+  dyn_cast(Use->getOperand(1).getNode());
+  if (RegisterNode && RegisterNode->getReg() == X86::EFLAGS) {
+MightClobberFlags = true;
+break;
+  }
+}
+  }
+  if (!MightClobberFlags)
+return false;

RKSimon wrote:

Is this correct? The logic appears to be flipped.

https://github.com/llvm/llvm-project/pull/77964
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[libc] [clang-tools-extra] [llvm] [compiler-rt] [clang] [lldb] [lld] [flang] [libcxx] [libunwind] [X86] Use RORX over SHR imm (PR #77964)

2024-01-25 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon edited 
https://github.com/llvm/llvm-project/pull/77964
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Amend SME attributes with support for ZT0. (PR #77941)

2024-01-23 Thread Simon Pilgrim via cfe-commits

RKSimon wrote:

@sdesmalen-arm This appears to be failing on some buildbots: 
https://lab.llvm.org/buildbot/#/builders/176/builds/8232
```
llvm-lit: 
/home/tcwg-buildbot/worker/clang-aarch64-sve-vls-2stage/llvm/llvm/utils/lit/lit/TestingConfig.py:152:
 fatal: unable to parse config file 
'/home/tcwg-buildbot/worker/clang-aarch64-sve-vls-2stage/stage2/runtimes/runtimes-bins/compiler-rt/unittests/lit.common.unit.configured',
 traceback: Traceback (most recent call last):
  File 
"/home/tcwg-buildbot/worker/clang-aarch64-sve-vls-2stage/llvm/llvm/utils/lit/lit/TestingConfig.py",
 line 140, in load_from_path
exec(compile(data, path, "exec"), cfg_globals, None)
  File 
"/home/tcwg-buildbot/worker/clang-aarch64-sve-vls-2stage/stage2/runtimes/runtimes-bins/compiler-rt/unittests/lit.common.unit.configured",
 line 23
config.aarch64_sme =
^
SyntaxError: invalid syntax
```

https://github.com/llvm/llvm-project/pull/77941
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Adding support of AMDLIBM vector library (PR #78560)

2024-01-19 Thread Simon Pilgrim via cfe-commits


@@ -0,0 +1,332 @@
+; RUN: opt -vector-library=AMDLIBM -passes=inject-tli-mappings,loop-vectorize 
-S < %s | FileCheck %s
+
+; Test to verify that when math headers are built with
+; __FINITE_MATH_ONLY__ enabled, causing use of ___finite
+; function versions, vectorization can map these to vector versions.
+
+target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+declare float @__expf_finite(float) #0
+
+; CHECK-LABEL: @exp_f32
+; CHECK: <4 x float> @amd_vrs4_expf

RKSimon wrote:

Add RUNs for avx2 / avx512 capable targets to ensure amd_vrs8_expf / 
amd_vrs16_expf etc. are used when appropriate? 

https://github.com/llvm/llvm-project/pull/78560
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang-tools-extra] [lldb] [libc] [llvm] [clang] [libcxxabi] [libunwind] [libcxx] [flang] [lld] [compiler-rt] Fix a bug in Smith's algorithm used in complex div/mul. (PR #78330)

2024-01-18 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon updated 
https://github.com/llvm/llvm-project/pull/78330

>From 8f8917528e30d2ba67f669cfd1a893bc85c21121 Mon Sep 17 00:00:00 2001
From: Ammarguellat 
Date: Tue, 16 Jan 2024 11:24:03 -0800
Subject: [PATCH 1/4] Fixed a bug in Smith's algorithm and made sure last
 option in command line rules.

---
 clang/lib/CodeGen/CGExprComplex.cpp   |  8 ++--
 clang/lib/Driver/ToolChains/Clang.cpp | 19 ++-
 clang/test/CodeGen/cx-complex-range.c | 16 
 clang/test/Driver/range.c |  7 +++
 4 files changed, 39 insertions(+), 11 deletions(-)

diff --git a/clang/lib/CodeGen/CGExprComplex.cpp 
b/clang/lib/CodeGen/CGExprComplex.cpp
index e532794b71bdb4a..6fbd8f19eeb50a4 100644
--- a/clang/lib/CodeGen/CGExprComplex.cpp
+++ b/clang/lib/CodeGen/CGExprComplex.cpp
@@ -936,7 +936,7 @@ ComplexPairTy 
ComplexExprEmitter::EmitRangeReductionDiv(llvm::Value *LHSr,
   llvm::Value *RC = Builder.CreateFMul(CdD, RHSr);  // rc
   llvm::Value *DpRC = Builder.CreateFAdd(RHSi, RC); // tmp=d+rc
 
-  llvm::Value *T7 = Builder.CreateFMul(LHSr, RC);// ar
+  llvm::Value *T7 = Builder.CreateFMul(LHSr, CdD);   // ar
   llvm::Value *T8 = Builder.CreateFAdd(T7, LHSi);// ar+b
   llvm::Value *DSTFr = Builder.CreateFDiv(T8, DpRC); // (ar+b)/tmp
 
@@ -978,7 +978,11 @@ ComplexPairTy ComplexExprEmitter::EmitBinDiv(const 
BinOpInfo ) {
   return EmitRangeReductionDiv(LHSr, LHSi, RHSr, RHSi);
 else if (Op.FPFeatures.getComplexRange() == LangOptions::CX_Limited)
   return EmitAlgebraicDiv(LHSr, LHSi, RHSr, RHSi);
-else if (!CGF.getLangOpts().FastMath) {
+else if (!CGF.getLangOpts().FastMath ||
+ // '-ffast-math' is used in the command line but followed by an
+ // '-fno-cx-limited-range'.
+ (CGF.getLangOpts().FastMath &&
+  Op.FPFeatures.getComplexRange() == LangOptions::CX_Full)) {
   LHSi = OrigLHSi;
   // If we have a complex operand on the RHS and FastMath is not allowed, 
we
   // delegate to a libcall to handle all of the complexities and minimize
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 9edae3fec91a87f..0c05d1db3960718 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -2753,6 +2753,7 @@ static void RenderFloatingPointOptions(const ToolChain 
, const Driver ,
   StringRef Float16ExcessPrecision = "";
   StringRef BFloat16ExcessPrecision = "";
   LangOptions::ComplexRangeKind Range = LangOptions::ComplexRangeKind::CX_Full;
+  std::string ComplexRangeStr = "";
 
   if (const Arg *A = Args.getLastArg(options::OPT_flimited_precision_EQ)) {
 CmdArgs.push_back("-mlimit-float-precision");
@@ -2768,24 +2769,24 @@ static void RenderFloatingPointOptions(const ToolChain 
, const Driver ,
 case options::OPT_fcx_limited_range: {
   EmitComplexRangeDiag(D, Range, 
LangOptions::ComplexRangeKind::CX_Limited);
   Range = LangOptions::ComplexRangeKind::CX_Limited;
-  std::string ComplexRangeStr = RenderComplexRangeOption("limited");
-  if (!ComplexRangeStr.empty())
-CmdArgs.push_back(Args.MakeArgString(ComplexRangeStr));
+  ComplexRangeStr = RenderComplexRangeOption("limited");
   break;
 }
 case options::OPT_fno_cx_limited_range:
+  EmitComplexRangeDiag(D, Range, LangOptions::ComplexRangeKind::CX_Full);
   Range = LangOptions::ComplexRangeKind::CX_Full;
+  ComplexRangeStr = RenderComplexRangeOption("full");
   break;
 case options::OPT_fcx_fortran_rules: {
   EmitComplexRangeDiag(D, Range, 
LangOptions::ComplexRangeKind::CX_Fortran);
   Range = LangOptions::ComplexRangeKind::CX_Fortran;
-  std::string ComplexRangeStr = RenderComplexRangeOption("fortran");
-  if (!ComplexRangeStr.empty())
-CmdArgs.push_back(Args.MakeArgString(ComplexRangeStr));
+  ComplexRangeStr = RenderComplexRangeOption("fortran");
   break;
 }
 case options::OPT_fno_cx_fortran_rules:
+  EmitComplexRangeDiag(D, Range, LangOptions::ComplexRangeKind::CX_Full);
   Range = LangOptions::ComplexRangeKind::CX_Full;
+  ComplexRangeStr = RenderComplexRangeOption("full");
   break;
 case options::OPT_ffp_model_EQ: {
   // If -ffp-model= is seen, reset to fno-fast-math
@@ -3056,9 +3057,7 @@ static void RenderFloatingPointOptions(const ToolChain 
, const Driver ,
   SeenUnsafeMathModeOption = true;
   // ffast-math enables fortran rules for complex multiplication and
   // division.
-  std::string ComplexRangeStr = RenderComplexRangeOption("limited");
-  if (!ComplexRangeStr.empty())
-CmdArgs.push_back(Args.MakeArgString(ComplexRangeStr));
+  ComplexRangeStr = RenderComplexRangeOption("limited");
   break;
 }
 case options::OPT_fno_fast_math:
@@ -3215,6 +3214,8 @@ static void RenderFloatingPointOptions(const ToolChain 
, const Driver ,

[clang] [clang-tools-extra] [llvm] DAG: Fix ABI lowering with FP promote in strictfp functions (PR #74405)

2024-01-17 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/74405
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[llvm] [clang] [X86] Use vXi1 for `k` constraint in inline asm (PR #77733)

2024-01-12 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon commented:

Should we add test coverage for the gpr <-> mask transfers?

https://github.com/llvm/llvm-project/pull/77733
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [AVX10][Doc] Add documentation about AVX10 options and their attentions (PR #77925)

2024-01-12 Thread Simon Pilgrim via cfe-commits


@@ -3963,6 +3963,60 @@ implicitly included in later levels.
 - ``-march=x86-64-v3``: (close to Haswell) AVX, AVX2, BMI1, BMI2, F16C, FMA, 
LZCNT, MOVBE, XSAVE
 - ``-march=x86-64-v4``: AVX512F, AVX512BW, AVX512CD, AVX512DQ, AVX512VL
 
+`Intel AVX10 ISA `_ is
+a major new vector ISA incorporating the modern vectorization aspects of
+Intel AVX-512. This ISA will be supported on all future Intel processor.
+Users are supposed to use the new options ``-mavx10.N`` and ``-mavx10.N-512``
+on these processors and do not use traditional AVX512 options anymore.
+
+The ``N`` in ``-mavx10.N`` represents a continuous integer number starting
+from ``1``. ``-mavx10.N`` is an alias of ``-mavx10.N-256``, which means to
+enable all instructions within AVX10 version N at a maximum vector length of
+256 bits. ``-mavx10.N-512`` enables all instructions at a maximum vector
+length of 512 bits, which is a superset of instructions ``-mavx10.N`` enabled.
+
+Current binaries built with AVX512 features can run on Intel AVX10/512 capable
+processor without re-compile, but cannot run on AVX10/256 capable processor.
+Users need to re-compile their code with ``-mavx10.N``, and maybe update some
+code that calling to 512-bit X86 specific intrinsics and passing or returning
+512-bit vector types in function call, if they want to run on AVX10/256 capable
+processor. Binaries built with ``-mavx10.N`` can run on both AVX10/256 and
+AVX10/512 capable processor.
+
+Users can add a ``-mno-evex512`` in the command line with AVX512 options if
+they want run the binary on both legacy AVX512 processor and new AVX10/256

RKSimon wrote:

"they want to run"
processor -> processors

https://github.com/llvm/llvm-project/pull/77925
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [AVX10][Doc] Add documentation about AVX10 options and their attentions (PR #77925)

2024-01-12 Thread Simon Pilgrim via cfe-commits


@@ -3963,6 +3963,60 @@ implicitly included in later levels.
 - ``-march=x86-64-v3``: (close to Haswell) AVX, AVX2, BMI1, BMI2, F16C, FMA, 
LZCNT, MOVBE, XSAVE
 - ``-march=x86-64-v4``: AVX512F, AVX512BW, AVX512CD, AVX512DQ, AVX512VL
 
+`Intel AVX10 ISA `_ is
+a major new vector ISA incorporating the modern vectorization aspects of
+Intel AVX-512. This ISA will be supported on all future Intel processor.
+Users are supposed to use the new options ``-mavx10.N`` and ``-mavx10.N-512``
+on these processors and do not use traditional AVX512 options anymore.
+
+The ``N`` in ``-mavx10.N`` represents a continuous integer number starting
+from ``1``. ``-mavx10.N`` is an alias of ``-mavx10.N-256``, which means to
+enable all instructions within AVX10 version N at a maximum vector length of
+256 bits. ``-mavx10.N-512`` enables all instructions at a maximum vector
+length of 512 bits, which is a superset of instructions ``-mavx10.N`` enabled.
+
+Current binaries built with AVX512 features can run on Intel AVX10/512 capable
+processor without re-compile, but cannot run on AVX10/256 capable processor.
+Users need to re-compile their code with ``-mavx10.N``, and maybe update some
+code that calling to 512-bit X86 specific intrinsics and passing or returning
+512-bit vector types in function call, if they want to run on AVX10/256 capable
+processor. Binaries built with ``-mavx10.N`` can run on both AVX10/256 and
+AVX10/512 capable processor.
+
+Users can add a ``-mno-evex512`` in the command line with AVX512 options if
+they want run the binary on both legacy AVX512 processor and new AVX10/256
+capable processor. The option has the same constraints as ``-mavx10.N``, i.e.,
+cannot call to 512-bit X86 specific intrinsics and pass or return 512-bit 
vector
+types in function call.
+
+Users should avoid to use AVX512 features in function target attributes when
+develop code for AVX10. If they have to do so, they need to add an explicit

RKSimon wrote:

"Users should avoid using AVX512 features in function target attributes when 
developing code for AVX10."
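
To illustrate the attribute interaction the quoted paragraph is getting at, a sketch (assuming `evex512`/`no-evex512` are spelled as target-attribute features, which is how I read the PR):

```c
#include <immintrin.h>

// In a TU built with -mavx10.1-256, 512-bit vectors are off by default;
// a function has to opt back in explicitly to use them.
__attribute__((target("avx512f,evex512")))
__m512d add_zmm(__m512d a, __m512d b) {
  return _mm512_add_pd(a, b);
}
```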

https://github.com/llvm/llvm-project/pull/77925
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

