[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-08 Thread Phoebe Wang via cfe-commits

phoebewang wrote:

Thanks all! Fixed in https://github.com/llvm/llvm-project/pull/115581

https://github.com/llvm/llvm-project/pull/114070
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-08 Thread Alan Zhao via cfe-commits

alanzhao1 wrote:

Figured out a repro - `immintrin.h` doesn't compile if both `-msse` and 
`-mno-sse2` are passed to clang:

```
$ cat ~/src/test-mac.c
#include

$ bin/clang -msse -mno-sse2 -o /dev/null -c ~/src/test-mac.c
In file included from /usr/local/google/home/ayzhao/src/test-mac.c:1:
In file included from 
/usr/local/google/home/ayzhao/src/llvm-project/build/lib/clang/20/include/immintrin.h:660:
/usr/local/google/home/ayzhao/src/llvm-project/build/lib/clang/20/include/amxavx512intrin.h:240:19:
 error: unknown type name '__m512bh'
  240 | static __inline__ __m512bh __DEFAULT_FN_ATTRS_AVX512
  |   ^
/usr/local/google/home/ayzhao/src/llvm-project/build/lib/clang/20/include/amxavx512intrin.h:243:10:
 error: returning '__attribute__((__vector_size__(32 * sizeof(__bf16 
__bf16' (vector of 32 '__bf16' values) from a function with incompatible result 
type 'int'
  243 |   return __builtin_ia32_tcvtrowps2pbf16h_internal(m, n, src, u);
  |  ^~
/usr/local/google/home/ayzhao/src/llvm-project/build/lib/clang/20/include/amxavx512intrin.h:246:19:
 error: unknown type name '__m512bh'
  246 | static __inline__ __m512bh __DEFAULT_FN_ATTRS_AVX512
  |   ^
/usr/local/google/home/ayzhao/src/llvm-project/build/lib/clang/20/include/amxavx512intrin.h:249:10:
 error: returning '__attribute__((__vector_size__(32 * sizeof(__bf16 
__bf16' (vector of 32 '__bf16' values) from a function with incompatible result 
type 'int'
  249 |   return __builtin_ia32_tcvtrowps2pbf16l_internal(m, n, src, u);
  |  ^~
/usr/local/google/home/ayzhao/src/llvm-project/build/lib/clang/20/include/amxavx512intrin.h:252:19:
 error: unknown type name '__m512h'
  252 | static __inline__ __m512h __DEFAULT_FN_ATTRS_AVX512 
_tile_cvtrowps2phh_internal(
  |   ^
/usr/local/google/home/ayzhao/src/llvm-project/build/lib/clang/20/include/amxavx512intrin.h:254:10:
 error: returning '__attribute__((__vector_size__(32 * sizeof(_Float16 
_Float16' (vector of 32 '_Float16' values) from a function with incompatible 
result type 'int'
  254 |   return __builtin_ia32_tcvtrowps2phh_internal(m, n, src, u);
  |  ^~~
/usr/local/google/home/ayzhao/src/llvm-project/build/lib/clang/20/include/amxavx512intrin.h:257:19:
 error: unknown type name '__m512h'
  257 | static __inline__ __m512h __DEFAULT_FN_ATTRS_AVX512 
_tile_cvtrowps2phl_internal(
  |   ^
/usr/local/google/home/ayzhao/src/llvm-project/build/lib/clang/20/include/amxavx512intrin.h:259:10:
 error: returning '__attribute__((__vector_size__(32 * sizeof(_Float16 
_Float16' (vector of 32 '_Float16' values) from a function with incompatible 
result type 'int'
  259 |   return __builtin_ia32_tcvtrowps2phl_internal(m, n, src, u);
  |  ^~~
/usr/local/google/home/ayzhao/src/llvm-project/build/lib/clang/20/include/amxavx512intrin.h:302:8:
 error: unknown type name '__m512bh'
  302 | static __m512bh __tile_cvtrowps2pbf16h(__tile1024i src0, unsigned src1) 
{
  |^
/usr/local/google/home/ayzhao/src/llvm-project/build/lib/clang/20/include/amxavx512intrin.h:321:8:
 error: unknown type name '__m512bh'
  321 | static __m512bh __tile_cvtrowps2pbf16l(__tile1024i src0, unsigned src1) 
{
  |^
/usr/local/google/home/ayzhao/src/llvm-project/build/lib/clang/20/include/amxavx512intrin.h:340:8:
 error: unknown type name '__m512h'
  340 | static __m512h __tile_cvtrowps2phh(__tile1024i src0, unsigned src1) {
  |^
/usr/local/google/home/ayzhao/src/llvm-project/build/lib/clang/20/include/amxavx512intrin.h:359:8:
 error: unknown type name '__m512h'
  359 | static __m512h __tile_cvtrowps2phl(__tile1024i src0, unsigned src1) {
  |^
12 errors generated.
```

Test was on an x64 linux system.

https://github.com/llvm/llvm-project/pull/114070
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-08 Thread Pranav Kant via cfe-commits

pranavk wrote:

This is causing similar errors for us as well.

https://github.com/llvm/llvm-project/pull/114070
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-08 Thread Alan Zhao via cfe-commits

alanzhao1 wrote:

> FYI this is causing Chrome X86 MacOS builds to fail due to `error: unknown 
> type name '__m512bh'`: https://crbug.com/378111077

As I mentioned in https://crbug.com/378111077#comment3, the issue is that we 
pull in avx512bf16intrin.h because `__SCE__` is not defined, but `__SSE2__` is 
also not defined, so we don't pull in the `typedef` for `__m512bh` (since 
they're guarded by macros that look for `__SSE2__`) and so on. We then pull in 
amxavx512intrin.h which was introduced in this PR which then fails to compile 
because we don't have the `typedef`s.

https://github.com/llvm/llvm-project/pull/114070
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-08 Thread Alan Zhao via cfe-commits

alanzhao1 wrote:

FYI this is causing Chrome X86 MacOS builds to fail due to `error: unknown type 
name '__m512bh'`: https://crbug.com/378111077

https://github.com/llvm/llvm-project/pull/114070
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-08 Thread LLVM Continuous Integration via cfe-commits

llvm-ci wrote:

LLVM Buildbot has detected a new failure on builder `clang-s390x-linux` running 
on `systemz-1` while building `clang,llvm` at step 5 "ninja check 1".

Full details are available at: 
https://lab.llvm.org/buildbot/#/builders/42/builds/1812


Here is the relevant piece of the build log for the reference

```
Step 5 (ninja check 1) failure: stage 1 checked (failure)
 TEST 'libFuzzer-s390x-default-Linux :: 
fuzzer-timeout.test' FAILED 
Exit Code: 1

Command Output (stderr):
--
RUN: at line 1: 
/home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/./bin/clang
-Wthread-safety -Wthread-safety-reference -Wthread-safety-beta   
--driver-mode=g++ -O2 -gline-tables-only -fsanitize=address,fuzzer 
-I/home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/lib/fuzzer 
 
/home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/test/fuzzer/TimeoutTest.cpp
 -o 
/home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest
+ /home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/./bin/clang 
-Wthread-safety -Wthread-safety-reference -Wthread-safety-beta 
--driver-mode=g++ -O2 -gline-tables-only -fsanitize=address,fuzzer 
-I/home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/lib/fuzzer 
/home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/test/fuzzer/TimeoutTest.cpp
 -o 
/home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest
RUN: at line 2: 
/home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/./bin/clang
-Wthread-safety -Wthread-safety-reference -Wthread-safety-beta   
--driver-mode=g++ -O2 -gline-tables-only -fsanitize=address,fuzzer 
-I/home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/lib/fuzzer 
 
/home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/test/fuzzer/TimeoutEmptyTest.cpp
 -o 
/home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutEmptyTest
+ /home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/./bin/clang 
-Wthread-safety -Wthread-safety-reference -Wthread-safety-beta 
--driver-mode=g++ -O2 -gline-tables-only -fsanitize=address,fuzzer 
-I/home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/lib/fuzzer 
/home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/test/fuzzer/TimeoutEmptyTest.cpp
 -o 
/home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutEmptyTest
RUN: at line 3: not  
/home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest
 -timeout=1 2>&1 | FileCheck 
/home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/test/fuzzer/fuzzer-timeout.test
 --check-prefix=TimeoutTest
+ not 
/home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest
 -timeout=1
+ FileCheck 
/home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/test/fuzzer/fuzzer-timeout.test
 --check-prefix=TimeoutTest
/home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/test/fuzzer/fuzzer-timeout.test:7:14:
 error: TimeoutTest: expected string not found in input
TimeoutTest: #0
 ^
:19:44: note: scanning from here
==3335512== ERROR: libFuzzer: timeout after 1 seconds
   ^
:24:104: note: possible intended match here
AddressSanitizer: CHECK failed: asan_report.cpp:199 "((current_error_.kind)) == 
((kErrorKindInvalid))" (0x1, 0x0) (tid=3335512)

   ^

Input file: 
Check file: 
/home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/test/fuzzer/fuzzer-timeout.test

-dump-input=help explains the following input dump.

Input was:
<<
   .
   .
   .
  14: MS: 1 InsertByte-; base unit: 
bf397014ecbce0b1be8d9011c77f6181927a357f 
  15: 0x48,0x69,0x21,0x48, 
  16: Hi!H 
  17: artifact_prefix='./'; Test unit written to 
./timeout-c9f7ef19d5ac7565f3dcaf7a3221ae711a187db5 
  18: Base64: SGkhSA== 
  19: ==3335512== ERROR: libFuzzer: timeout after 1 seconds 
check:7'0X~~ error: no 
match found
  20: AddressSanitizer:DEADLYSIGNAL 
check:7'0 ~~
  21: = 
check:7'0 

[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-08 Thread LLVM Continuous Integration via cfe-commits

llvm-ci wrote:

LLVM Buildbot has detected a new failure on builder 
`sanitizer-x86_64-linux-bootstrap-asan` running on `sanitizer-buildbot2` while 
building `clang,llvm` at step 2 "annotate".

Full details are available at: 
https://lab.llvm.org/buildbot/#/builders/52/builds/3564


Here is the relevant piece of the build log for the reference

```
Step 2 (annotate) failure: 'python 
../sanitizer_buildbot/sanitizers/zorg/buildbot/builders/sanitizers/buildbot_selector.py'
 (failure)
...
llvm-lit: 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506:
 note: using lld-link: 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/lld-link
llvm-lit: 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506:
 note: using ld64.lld: 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/ld64.lld
llvm-lit: 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506:
 note: using wasm-ld: 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/wasm-ld
llvm-lit: 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506:
 note: using ld.lld: 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/ld.lld
llvm-lit: 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506:
 note: using lld-link: 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/lld-link
llvm-lit: 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506:
 note: using ld64.lld: 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/ld64.lld
llvm-lit: 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506:
 note: using wasm-ld: 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/wasm-ld
llvm-lit: 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/main.py:72:
 note: The test suite configuration requested an individual test timeout of 0 
seconds but a timeout of 900 seconds was requested on the command line. Forcing 
timeout to be 900 seconds.
-- Testing: 87001 of 87002 tests, 88 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90.
FAIL: lld :: ELF/allow-shlib-undefined.s (84780 of 87001)
 TEST 'lld :: ELF/allow-shlib-undefined.s' FAILED 

Exit Code: 1

Command Output (stderr):
--
RUN: at line 3: rm -rf 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/tools/lld/test/ELF/Output/allow-shlib-undefined.s.tmp
 && split-file 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/lld/test/ELF/allow-shlib-undefined.s
 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/tools/lld/test/ELF/Output/allow-shlib-undefined.s.tmp
 && cd 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/tools/lld/test/ELF/Output/allow-shlib-undefined.s.tmp
+ rm -rf 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/tools/lld/test/ELF/Output/allow-shlib-undefined.s.tmp
+ split-file 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/lld/test/ELF/allow-shlib-undefined.s
 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/tools/lld/test/ELF/Output/allow-shlib-undefined.s.tmp
+ cd 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/tools/lld/test/ELF/Output/allow-shlib-undefined.s.tmp
RUN: at line 4: 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/llvm-mc 
-filetype=obj -triple=x86_64 main.s -o main.o
+ 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/llvm-mc 
-filetype=obj -triple=x86_64 main.s -o main.o
RUN: at line 5: 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/llvm-mc 
-filetype=obj -triple=x86_64 def.s -o def.o
+ 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/llvm-mc 
-filetype=obj -triple=x86_64 def.s -o def.o
RUN: at line 6: 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/llvm-mc 
-filetype=obj -triple=x86_64 def-hidden.s -o def-hidden.o
+ 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/llvm-mc 
-filetype=obj -triple=x86_64 def-hidden.s -o def-hidden.o
RUN: at line 7: 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/llvm-mc 
-filetype=obj -triple=x86_64 ref.s -o ref.o
+ 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/llvm-mc 
-filetype=obj -triple=x86_64 ref.s -o ref.o
RUN: at line 8: 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/llvm-mc 
-filetype=obj -triple=x86_64 a.s -o a.o && 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/ld.lld 
-shared a.o -o a.so
+ 
/home/b/

[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-08 Thread Feng Zou via cfe-commits

https://github.com/fzou1 approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/114070
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-07 Thread Phoebe Wang via cfe-commits


@@ -369,3 +369,150 @@ let Predicates = [HasAMXTRANSPOSE, In64BitMode] in {
 }
   }
 } // HasAMXTILE, HasAMXTRANSPOSE
+
+multiclass m_tcvtrowd2ps {
+  let Predicates = [HasAMXAVX512, In64BitMode] in {

phoebewang wrote:

Done, thanks!

https://github.com/llvm/llvm-project/pull/114070
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-07 Thread Phoebe Wang via cfe-commits

https://github.com/phoebewang updated 
https://github.com/llvm/llvm-project/pull/114070

>From 587d0105e7724db0f35fc5c8179519fa6319e5c8 Mon Sep 17 00:00:00 2001
From: "Wang, Phoebe" 
Date: Tue, 29 Oct 2024 22:29:25 +0800
Subject: [PATCH 1/6] [X86][AMX] Support AMX-AVX512

---
 clang/docs/ReleaseNotes.rst   |   2 +
 clang/include/clang/Basic/BuiltinsX86_64.def  |  13 +
 clang/include/clang/Driver/Options.td |   2 +
 clang/lib/Basic/Targets/X86.cpp   |   6 +
 clang/lib/Basic/Targets/X86.h |   1 +
 clang/lib/Headers/CMakeLists.txt  |   1 +
 clang/lib/Headers/amxavx512intrin.h   | 381 ++
 clang/lib/Headers/immintrin.h |   4 +
 clang/lib/Sema/SemaX86.cpp|   6 +
 clang/test/CodeGen/X86/amx_avx512_api.c   |  52 +++
 clang/test/CodeGen/X86/amxavx512-builtins.c   |  41 ++
 clang/test/CodeGen/attr-target-x86.c  |   8 +-
 clang/test/Driver/x86-target-features.c   |   7 +
 clang/test/Preprocessor/x86_target_features.c |   7 +
 llvm/include/llvm/IR/IntrinsicsX86.td |  50 +++
 .../llvm/TargetParser/X86TargetParser.def |   1 +
 llvm/lib/Target/X86/X86.td|   4 +
 llvm/lib/Target/X86/X86ExpandPseudo.cpp   |  64 ++-
 llvm/lib/Target/X86/X86ISelLowering.cpp   |  76 
 llvm/lib/Target/X86/X86InstrAMX.td| 147 +++
 llvm/lib/Target/X86/X86InstrPredicates.td |   1 +
 llvm/lib/Target/X86/X86LowerAMXType.cpp   |  11 +
 llvm/lib/Target/X86/X86PreTileConfig.cpp  |  19 +-
 llvm/lib/TargetParser/Host.cpp|   4 +
 llvm/lib/TargetParser/X86TargetParser.cpp |   2 +
 .../CodeGen/X86/amx-across-func-tilemovrow.ll | 171 
 .../test/CodeGen/X86/amx-avx512-intrinsics.ll | 116 ++
 .../CodeGen/X86/amx-tile-avx512-internals.ll  |  61 +++
 llvm/test/MC/Disassembler/X86/amx-avx512.txt  | 106 +
 llvm/test/MC/X86/amx-avx512-att.s | 105 +
 llvm/test/MC/X86/amx-avx512-intel.s   | 105 +
 31 files changed, 1564 insertions(+), 10 deletions(-)
 create mode 100644 clang/lib/Headers/amxavx512intrin.h
 create mode 100644 clang/test/CodeGen/X86/amx_avx512_api.c
 create mode 100644 clang/test/CodeGen/X86/amxavx512-builtins.c
 create mode 100644 llvm/test/CodeGen/X86/amx-across-func-tilemovrow.ll
 create mode 100644 llvm/test/CodeGen/X86/amx-avx512-intrinsics.ll
 create mode 100644 llvm/test/CodeGen/X86/amx-tile-avx512-internals.ll
 create mode 100644 llvm/test/MC/Disassembler/X86/amx-avx512.txt
 create mode 100644 llvm/test/MC/X86/amx-avx512-att.s
 create mode 100644 llvm/test/MC/X86/amx-avx512-intel.s

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index ce046a305c89b6..d45bd1240dd173 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -611,6 +611,8 @@ X86 Support
   * Supported MINMAX intrinsics of ``*_(mask(z)))_minmax(ne)_p[s|d|h|bh]`` and
   ``*_(mask(z)))_minmax_s[s|d|h]``.
 
+- Support ISA of ``AMX-AVX512``.
+
 - All intrinsics in adcintrin.h can now be used in constant expressions.
 
 - All intrinsics in adxintrin.h can now be used in constant expressions.
diff --git a/clang/include/clang/Basic/BuiltinsX86_64.def 
b/clang/include/clang/Basic/BuiltinsX86_64.def
index 2c591edb2835cd..70644f3f6b6054 100644
--- a/clang/include/clang/Basic/BuiltinsX86_64.def
+++ b/clang/include/clang/Basic/BuiltinsX86_64.def
@@ -128,6 +128,12 @@ TARGET_BUILTIN(__builtin_ia32_tdpbf16ps_internal, 
"V256iUsUsUsV256iV256iV256i",
 TARGET_BUILTIN(__builtin_ia32_tdpfp16ps_internal, 
"V256iUsUsUsV256iV256iV256i", "n", "amx-fp16")
 TARGET_BUILTIN(__builtin_ia32_tcmmimfp16ps_internal, 
"V256iUsUsUsV256iV256iV256i", "n", "amx-complex")
 TARGET_BUILTIN(__builtin_ia32_tcmmrlfp16ps_internal, 
"V256iUsUsUsV256iV256iV256i", "n", "amx-complex")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowd2ps_internal, "V16fUsUsV256iUi", "n", 
"amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2pbf16h_internal, "V32yUsUsV256iUi", 
"n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2pbf16l_internal, "V32yUsUsV256iUi", 
"n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2phh_internal, "V32xUsUsV256iUi", "n", 
"amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2phl_internal, "V32xUsUsV256iUi", "n", 
"amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tilemovrow_internal, "V16iUsUsV256iUi", "n", 
"amx-avx512")
 // AMX
 TARGET_BUILTIN(__builtin_ia32_tile_loadconfig, "vvC*", "n", "amx-tile")
 TARGET_BUILTIN(__builtin_ia32_tile_storeconfig, "vvC*", "n", "amx-tile")
@@ -148,6 +154,13 @@ TARGET_BUILTIN(__builtin_ia32_ptwrite64, "vUOi", "n", 
"ptwrite")
 TARGET_BUILTIN(__builtin_ia32_tcmmimfp16ps, "vIUcIUcIUc", "n", "amx-complex")
 TARGET_BUILTIN(__builtin_ia32_tcmmrlfp16ps, "vIUcIUcIUc", "n", "amx-complex")
 
+TARGET_BUILTIN(__builtin_ia32_tcvtrowd2ps, "V16fIUcUi", "n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2pbf16h, "V32yIUcUi", "n", "amx-avx512")
+TARGET_BUILTIN(__builtin_i

[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-07 Thread Feng Zou via cfe-commits


@@ -369,3 +369,150 @@ let Predicates = [HasAMXTRANSPOSE, In64BitMode] in {
 }
   }
 } // HasAMXTILE, HasAMXTRANSPOSE
+
+multiclass m_tcvtrowd2ps {
+  let Predicates = [HasAMXAVX512, In64BitMode] in {

fzou1 wrote:

Should add HasAVX10_2_512 in line 374, 390 and 454?

https://github.com/llvm/llvm-project/pull/114070
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-07 Thread Feng Zou via cfe-commits

https://github.com/fzou1 edited https://github.com/llvm/llvm-project/pull/114070
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-07 Thread Feng Zou via cfe-commits

https://github.com/fzou1 commented:

LGTM except the last place probably missing avx10.2-512 dependency.

https://github.com/llvm/llvm-project/pull/114070
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-07 Thread Phoebe Wang via cfe-commits


@@ -0,0 +1,381 @@
+/*===- amxavx512intrin.h - AMXAVX512 
===
+ *
+ * Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+ * See https://llvm.org/LICENSE.txt for license information.
+ * SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+ *
+ 
*======
+ */
+#ifndef __IMMINTRIN_H
+#error "Never use  directly; include  instead."
+#endif // __IMMINTRIN_H
+
+#ifndef __AMX_AVX512INTRIN_H
+#define __AMX_AVX512INTRIN_H
+#ifdef __x86_64__
+
+#define __DEFAULT_FN_ATTRS_AVX512  
\
+  __attribute__((__always_inline__, __nodebug__, __target__("amx-avx512")))

phoebewang wrote:

Yes, we need them all. Good catch!

https://github.com/llvm/llvm-project/pull/114070
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-07 Thread Phoebe Wang via cfe-commits

https://github.com/phoebewang updated 
https://github.com/llvm/llvm-project/pull/114070

>From 587d0105e7724db0f35fc5c8179519fa6319e5c8 Mon Sep 17 00:00:00 2001
From: "Wang, Phoebe" 
Date: Tue, 29 Oct 2024 22:29:25 +0800
Subject: [PATCH 1/5] [X86][AMX] Support AMX-AVX512

---
 clang/docs/ReleaseNotes.rst   |   2 +
 clang/include/clang/Basic/BuiltinsX86_64.def  |  13 +
 clang/include/clang/Driver/Options.td |   2 +
 clang/lib/Basic/Targets/X86.cpp   |   6 +
 clang/lib/Basic/Targets/X86.h |   1 +
 clang/lib/Headers/CMakeLists.txt  |   1 +
 clang/lib/Headers/amxavx512intrin.h   | 381 ++
 clang/lib/Headers/immintrin.h |   4 +
 clang/lib/Sema/SemaX86.cpp|   6 +
 clang/test/CodeGen/X86/amx_avx512_api.c   |  52 +++
 clang/test/CodeGen/X86/amxavx512-builtins.c   |  41 ++
 clang/test/CodeGen/attr-target-x86.c  |   8 +-
 clang/test/Driver/x86-target-features.c   |   7 +
 clang/test/Preprocessor/x86_target_features.c |   7 +
 llvm/include/llvm/IR/IntrinsicsX86.td |  50 +++
 .../llvm/TargetParser/X86TargetParser.def |   1 +
 llvm/lib/Target/X86/X86.td|   4 +
 llvm/lib/Target/X86/X86ExpandPseudo.cpp   |  64 ++-
 llvm/lib/Target/X86/X86ISelLowering.cpp   |  76 
 llvm/lib/Target/X86/X86InstrAMX.td| 147 +++
 llvm/lib/Target/X86/X86InstrPredicates.td |   1 +
 llvm/lib/Target/X86/X86LowerAMXType.cpp   |  11 +
 llvm/lib/Target/X86/X86PreTileConfig.cpp  |  19 +-
 llvm/lib/TargetParser/Host.cpp|   4 +
 llvm/lib/TargetParser/X86TargetParser.cpp |   2 +
 .../CodeGen/X86/amx-across-func-tilemovrow.ll | 171 
 .../test/CodeGen/X86/amx-avx512-intrinsics.ll | 116 ++
 .../CodeGen/X86/amx-tile-avx512-internals.ll  |  61 +++
 llvm/test/MC/Disassembler/X86/amx-avx512.txt  | 106 +
 llvm/test/MC/X86/amx-avx512-att.s | 105 +
 llvm/test/MC/X86/amx-avx512-intel.s   | 105 +
 31 files changed, 1564 insertions(+), 10 deletions(-)
 create mode 100644 clang/lib/Headers/amxavx512intrin.h
 create mode 100644 clang/test/CodeGen/X86/amx_avx512_api.c
 create mode 100644 clang/test/CodeGen/X86/amxavx512-builtins.c
 create mode 100644 llvm/test/CodeGen/X86/amx-across-func-tilemovrow.ll
 create mode 100644 llvm/test/CodeGen/X86/amx-avx512-intrinsics.ll
 create mode 100644 llvm/test/CodeGen/X86/amx-tile-avx512-internals.ll
 create mode 100644 llvm/test/MC/Disassembler/X86/amx-avx512.txt
 create mode 100644 llvm/test/MC/X86/amx-avx512-att.s
 create mode 100644 llvm/test/MC/X86/amx-avx512-intel.s

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index ce046a305c89b6..d45bd1240dd173 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -611,6 +611,8 @@ X86 Support
   * Supported MINMAX intrinsics of ``*_(mask(z)))_minmax(ne)_p[s|d|h|bh]`` and
   ``*_(mask(z)))_minmax_s[s|d|h]``.
 
+- Support ISA of ``AMX-AVX512``.
+
 - All intrinsics in adcintrin.h can now be used in constant expressions.
 
 - All intrinsics in adxintrin.h can now be used in constant expressions.
diff --git a/clang/include/clang/Basic/BuiltinsX86_64.def 
b/clang/include/clang/Basic/BuiltinsX86_64.def
index 2c591edb2835cd..70644f3f6b6054 100644
--- a/clang/include/clang/Basic/BuiltinsX86_64.def
+++ b/clang/include/clang/Basic/BuiltinsX86_64.def
@@ -128,6 +128,12 @@ TARGET_BUILTIN(__builtin_ia32_tdpbf16ps_internal, 
"V256iUsUsUsV256iV256iV256i",
 TARGET_BUILTIN(__builtin_ia32_tdpfp16ps_internal, 
"V256iUsUsUsV256iV256iV256i", "n", "amx-fp16")
 TARGET_BUILTIN(__builtin_ia32_tcmmimfp16ps_internal, 
"V256iUsUsUsV256iV256iV256i", "n", "amx-complex")
 TARGET_BUILTIN(__builtin_ia32_tcmmrlfp16ps_internal, 
"V256iUsUsUsV256iV256iV256i", "n", "amx-complex")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowd2ps_internal, "V16fUsUsV256iUi", "n", 
"amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2pbf16h_internal, "V32yUsUsV256iUi", 
"n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2pbf16l_internal, "V32yUsUsV256iUi", 
"n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2phh_internal, "V32xUsUsV256iUi", "n", 
"amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2phl_internal, "V32xUsUsV256iUi", "n", 
"amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tilemovrow_internal, "V16iUsUsV256iUi", "n", 
"amx-avx512")
 // AMX
 TARGET_BUILTIN(__builtin_ia32_tile_loadconfig, "vvC*", "n", "amx-tile")
 TARGET_BUILTIN(__builtin_ia32_tile_storeconfig, "vvC*", "n", "amx-tile")
@@ -148,6 +154,13 @@ TARGET_BUILTIN(__builtin_ia32_ptwrite64, "vUOi", "n", 
"ptwrite")
 TARGET_BUILTIN(__builtin_ia32_tcmmimfp16ps, "vIUcIUcIUc", "n", "amx-complex")
 TARGET_BUILTIN(__builtin_ia32_tcmmrlfp16ps, "vIUcIUcIUc", "n", "amx-complex")
 
+TARGET_BUILTIN(__builtin_ia32_tcvtrowd2ps, "V16fIUcUi", "n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2pbf16h, "V32yIUcUi", "n", "amx-avx512")
+TARGET_BUILTIN(__builtin_i

[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-07 Thread Phoebe Wang via cfe-commits


@@ -133,6 +133,12 @@ TARGET_BUILTIN(__builtin_ia32_t2rpntlvwz0t1_internal, 
"vUsUsUsV256i*V256i*vC*z",
 TARGET_BUILTIN(__builtin_ia32_t2rpntlvwz1_internal, "vUsUsUsV256i*V256i*vC*z", 
"n", "amx-transpose")
 TARGET_BUILTIN(__builtin_ia32_t2rpntlvwz1t1_internal, 
"vUsUsUsV256i*V256i*vC*z", "n", "amx-transpose")
 TARGET_BUILTIN(__builtin_ia32_ttransposed_internal, "V256iUsUsV256i", "n", 
"amx-transpose")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowd2ps_internal, "V16fUsUsV256iUi", "n", 
"amx-avx512")

phoebewang wrote:

Yes, sorry I missed that parts.

https://github.com/llvm/llvm-project/pull/114070
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-07 Thread Phoebe Wang via cfe-commits

https://github.com/phoebewang updated 
https://github.com/llvm/llvm-project/pull/114070

>From 587d0105e7724db0f35fc5c8179519fa6319e5c8 Mon Sep 17 00:00:00 2001
From: "Wang, Phoebe" 
Date: Tue, 29 Oct 2024 22:29:25 +0800
Subject: [PATCH 1/4] [X86][AMX] Support AMX-AVX512

---
 clang/docs/ReleaseNotes.rst   |   2 +
 clang/include/clang/Basic/BuiltinsX86_64.def  |  13 +
 clang/include/clang/Driver/Options.td |   2 +
 clang/lib/Basic/Targets/X86.cpp   |   6 +
 clang/lib/Basic/Targets/X86.h |   1 +
 clang/lib/Headers/CMakeLists.txt  |   1 +
 clang/lib/Headers/amxavx512intrin.h   | 381 ++
 clang/lib/Headers/immintrin.h |   4 +
 clang/lib/Sema/SemaX86.cpp|   6 +
 clang/test/CodeGen/X86/amx_avx512_api.c   |  52 +++
 clang/test/CodeGen/X86/amxavx512-builtins.c   |  41 ++
 clang/test/CodeGen/attr-target-x86.c  |   8 +-
 clang/test/Driver/x86-target-features.c   |   7 +
 clang/test/Preprocessor/x86_target_features.c |   7 +
 llvm/include/llvm/IR/IntrinsicsX86.td |  50 +++
 .../llvm/TargetParser/X86TargetParser.def |   1 +
 llvm/lib/Target/X86/X86.td|   4 +
 llvm/lib/Target/X86/X86ExpandPseudo.cpp   |  64 ++-
 llvm/lib/Target/X86/X86ISelLowering.cpp   |  76 
 llvm/lib/Target/X86/X86InstrAMX.td| 147 +++
 llvm/lib/Target/X86/X86InstrPredicates.td |   1 +
 llvm/lib/Target/X86/X86LowerAMXType.cpp   |  11 +
 llvm/lib/Target/X86/X86PreTileConfig.cpp  |  19 +-
 llvm/lib/TargetParser/Host.cpp|   4 +
 llvm/lib/TargetParser/X86TargetParser.cpp |   2 +
 .../CodeGen/X86/amx-across-func-tilemovrow.ll | 171 
 .../test/CodeGen/X86/amx-avx512-intrinsics.ll | 116 ++
 .../CodeGen/X86/amx-tile-avx512-internals.ll  |  61 +++
 llvm/test/MC/Disassembler/X86/amx-avx512.txt  | 106 +
 llvm/test/MC/X86/amx-avx512-att.s | 105 +
 llvm/test/MC/X86/amx-avx512-intel.s   | 105 +
 31 files changed, 1564 insertions(+), 10 deletions(-)
 create mode 100644 clang/lib/Headers/amxavx512intrin.h
 create mode 100644 clang/test/CodeGen/X86/amx_avx512_api.c
 create mode 100644 clang/test/CodeGen/X86/amxavx512-builtins.c
 create mode 100644 llvm/test/CodeGen/X86/amx-across-func-tilemovrow.ll
 create mode 100644 llvm/test/CodeGen/X86/amx-avx512-intrinsics.ll
 create mode 100644 llvm/test/CodeGen/X86/amx-tile-avx512-internals.ll
 create mode 100644 llvm/test/MC/Disassembler/X86/amx-avx512.txt
 create mode 100644 llvm/test/MC/X86/amx-avx512-att.s
 create mode 100644 llvm/test/MC/X86/amx-avx512-intel.s

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index ce046a305c89b6..d45bd1240dd173 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -611,6 +611,8 @@ X86 Support
   * Supported MINMAX intrinsics of ``*_(mask(z)))_minmax(ne)_p[s|d|h|bh]`` and
   ``*_(mask(z)))_minmax_s[s|d|h]``.
 
+- Support ISA of ``AMX-AVX512``.
+
 - All intrinsics in adcintrin.h can now be used in constant expressions.
 
 - All intrinsics in adxintrin.h can now be used in constant expressions.
diff --git a/clang/include/clang/Basic/BuiltinsX86_64.def 
b/clang/include/clang/Basic/BuiltinsX86_64.def
index 2c591edb2835cd..70644f3f6b6054 100644
--- a/clang/include/clang/Basic/BuiltinsX86_64.def
+++ b/clang/include/clang/Basic/BuiltinsX86_64.def
@@ -128,6 +128,12 @@ TARGET_BUILTIN(__builtin_ia32_tdpbf16ps_internal, 
"V256iUsUsUsV256iV256iV256i",
 TARGET_BUILTIN(__builtin_ia32_tdpfp16ps_internal, 
"V256iUsUsUsV256iV256iV256i", "n", "amx-fp16")
 TARGET_BUILTIN(__builtin_ia32_tcmmimfp16ps_internal, 
"V256iUsUsUsV256iV256iV256i", "n", "amx-complex")
 TARGET_BUILTIN(__builtin_ia32_tcmmrlfp16ps_internal, 
"V256iUsUsUsV256iV256iV256i", "n", "amx-complex")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowd2ps_internal, "V16fUsUsV256iUi", "n", 
"amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2pbf16h_internal, "V32yUsUsV256iUi", 
"n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2pbf16l_internal, "V32yUsUsV256iUi", 
"n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2phh_internal, "V32xUsUsV256iUi", "n", 
"amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2phl_internal, "V32xUsUsV256iUi", "n", 
"amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tilemovrow_internal, "V16iUsUsV256iUi", "n", 
"amx-avx512")
 // AMX
 TARGET_BUILTIN(__builtin_ia32_tile_loadconfig, "vvC*", "n", "amx-tile")
 TARGET_BUILTIN(__builtin_ia32_tile_storeconfig, "vvC*", "n", "amx-tile")
@@ -148,6 +154,13 @@ TARGET_BUILTIN(__builtin_ia32_ptwrite64, "vUOi", "n", 
"ptwrite")
 TARGET_BUILTIN(__builtin_ia32_tcmmimfp16ps, "vIUcIUcIUc", "n", "amx-complex")
 TARGET_BUILTIN(__builtin_ia32_tcmmrlfp16ps, "vIUcIUcIUc", "n", "amx-complex")
 
+TARGET_BUILTIN(__builtin_ia32_tcvtrowd2ps, "V16fIUcUi", "n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2pbf16h, "V32yIUcUi", "n", "amx-avx512")
+TARGET_BUILTIN(__builtin_i

[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-07 Thread Feng Zou via cfe-commits


@@ -0,0 +1,381 @@
+/*===- amxavx512intrin.h - AMXAVX512 
===
+ *
+ * Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+ * See https://llvm.org/LICENSE.txt for license information.
+ * SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+ *
+ 
*======
+ */
+#ifndef __IMMINTRIN_H
+#error "Never use  directly; include  instead."
+#endif // __IMMINTRIN_H
+
+#ifndef __AMX_AVX512INTRIN_H
+#define __AMX_AVX512INTRIN_H
+#ifdef __x86_64__
+
+#define __DEFAULT_FN_ATTRS_AVX512  
\
+  __attribute__((__always_inline__, __nodebug__, __target__("amx-avx512")))

fzou1 wrote:

If AVX10.2-512 feature dependency is needed for internal APIs, we may create 
another attribute with AVX10.2-512 and add it to internal APIs.

https://github.com/llvm/llvm-project/pull/114070
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-07 Thread Feng Zou via cfe-commits


@@ -133,6 +133,12 @@ TARGET_BUILTIN(__builtin_ia32_t2rpntlvwz0t1_internal, 
"vUsUsUsV256i*V256i*vC*z",
 TARGET_BUILTIN(__builtin_ia32_t2rpntlvwz1_internal, "vUsUsUsV256i*V256i*vC*z", 
"n", "amx-transpose")
 TARGET_BUILTIN(__builtin_ia32_t2rpntlvwz1t1_internal, 
"vUsUsUsV256i*V256i*vC*z", "n", "amx-transpose")
 TARGET_BUILTIN(__builtin_ia32_ttransposed_internal, "V256iUsUsV256i", "n", 
"amx-transpose")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowd2ps_internal, "V16fUsUsV256iUi", "n", 
"amx-avx512")

fzou1 wrote:

Is it necessary to add avx10.2-512 feature for these internal APIs? With that, 
we may detect errors for new APIs if there is no AVX10.2-512 target feature.

https://github.com/llvm/llvm-project/pull/114070
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-06 Thread Phoebe Wang via cfe-commits


@@ -133,6 +133,12 @@ TARGET_BUILTIN(__builtin_ia32_t2rpntlvwz0t1_internal, 
"vUsUsUsV256i*V256i*vC*z",
 TARGET_BUILTIN(__builtin_ia32_t2rpntlvwz1_internal, "vUsUsUsV256i*V256i*vC*z", 
"n", "amx-transpose")
 TARGET_BUILTIN(__builtin_ia32_t2rpntlvwz1t1_internal, 
"vUsUsUsV256i*V256i*vC*z", "n", "amx-transpose")
 TARGET_BUILTIN(__builtin_ia32_ttransposed_internal, "V256iUsUsV256i", "n", 
"amx-transpose")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowd2ps_internal, "V16fUsUsV256iUi", "n", 
"amx-avx512")

phoebewang wrote:

Good catch! Done.

https://github.com/llvm/llvm-project/pull/114070
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-06 Thread Phoebe Wang via cfe-commits

https://github.com/phoebewang updated 
https://github.com/llvm/llvm-project/pull/114070

>From 587d0105e7724db0f35fc5c8179519fa6319e5c8 Mon Sep 17 00:00:00 2001
From: "Wang, Phoebe" 
Date: Tue, 29 Oct 2024 22:29:25 +0800
Subject: [PATCH 1/3] [X86][AMX] Support AMX-AVX512

---
 clang/docs/ReleaseNotes.rst   |   2 +
 clang/include/clang/Basic/BuiltinsX86_64.def  |  13 +
 clang/include/clang/Driver/Options.td |   2 +
 clang/lib/Basic/Targets/X86.cpp   |   6 +
 clang/lib/Basic/Targets/X86.h |   1 +
 clang/lib/Headers/CMakeLists.txt  |   1 +
 clang/lib/Headers/amxavx512intrin.h   | 381 ++
 clang/lib/Headers/immintrin.h |   4 +
 clang/lib/Sema/SemaX86.cpp|   6 +
 clang/test/CodeGen/X86/amx_avx512_api.c   |  52 +++
 clang/test/CodeGen/X86/amxavx512-builtins.c   |  41 ++
 clang/test/CodeGen/attr-target-x86.c  |   8 +-
 clang/test/Driver/x86-target-features.c   |   7 +
 clang/test/Preprocessor/x86_target_features.c |   7 +
 llvm/include/llvm/IR/IntrinsicsX86.td |  50 +++
 .../llvm/TargetParser/X86TargetParser.def |   1 +
 llvm/lib/Target/X86/X86.td|   4 +
 llvm/lib/Target/X86/X86ExpandPseudo.cpp   |  64 ++-
 llvm/lib/Target/X86/X86ISelLowering.cpp   |  76 
 llvm/lib/Target/X86/X86InstrAMX.td| 147 +++
 llvm/lib/Target/X86/X86InstrPredicates.td |   1 +
 llvm/lib/Target/X86/X86LowerAMXType.cpp   |  11 +
 llvm/lib/Target/X86/X86PreTileConfig.cpp  |  19 +-
 llvm/lib/TargetParser/Host.cpp|   4 +
 llvm/lib/TargetParser/X86TargetParser.cpp |   2 +
 .../CodeGen/X86/amx-across-func-tilemovrow.ll | 171 
 .../test/CodeGen/X86/amx-avx512-intrinsics.ll | 116 ++
 .../CodeGen/X86/amx-tile-avx512-internals.ll  |  61 +++
 llvm/test/MC/Disassembler/X86/amx-avx512.txt  | 106 +
 llvm/test/MC/X86/amx-avx512-att.s | 105 +
 llvm/test/MC/X86/amx-avx512-intel.s   | 105 +
 31 files changed, 1564 insertions(+), 10 deletions(-)
 create mode 100644 clang/lib/Headers/amxavx512intrin.h
 create mode 100644 clang/test/CodeGen/X86/amx_avx512_api.c
 create mode 100644 clang/test/CodeGen/X86/amxavx512-builtins.c
 create mode 100644 llvm/test/CodeGen/X86/amx-across-func-tilemovrow.ll
 create mode 100644 llvm/test/CodeGen/X86/amx-avx512-intrinsics.ll
 create mode 100644 llvm/test/CodeGen/X86/amx-tile-avx512-internals.ll
 create mode 100644 llvm/test/MC/Disassembler/X86/amx-avx512.txt
 create mode 100644 llvm/test/MC/X86/amx-avx512-att.s
 create mode 100644 llvm/test/MC/X86/amx-avx512-intel.s

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index ce046a305c89b6..d45bd1240dd173 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -611,6 +611,8 @@ X86 Support
   * Supported MINMAX intrinsics of ``*_(mask(z)))_minmax(ne)_p[s|d|h|bh]`` and
   ``*_(mask(z)))_minmax_s[s|d|h]``.
 
+- Support ISA of ``AMX-AVX512``.
+
 - All intrinsics in adcintrin.h can now be used in constant expressions.
 
 - All intrinsics in adxintrin.h can now be used in constant expressions.
diff --git a/clang/include/clang/Basic/BuiltinsX86_64.def 
b/clang/include/clang/Basic/BuiltinsX86_64.def
index 2c591edb2835cd..70644f3f6b6054 100644
--- a/clang/include/clang/Basic/BuiltinsX86_64.def
+++ b/clang/include/clang/Basic/BuiltinsX86_64.def
@@ -128,6 +128,12 @@ TARGET_BUILTIN(__builtin_ia32_tdpbf16ps_internal, 
"V256iUsUsUsV256iV256iV256i",
 TARGET_BUILTIN(__builtin_ia32_tdpfp16ps_internal, 
"V256iUsUsUsV256iV256iV256i", "n", "amx-fp16")
 TARGET_BUILTIN(__builtin_ia32_tcmmimfp16ps_internal, 
"V256iUsUsUsV256iV256iV256i", "n", "amx-complex")
 TARGET_BUILTIN(__builtin_ia32_tcmmrlfp16ps_internal, 
"V256iUsUsUsV256iV256iV256i", "n", "amx-complex")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowd2ps_internal, "V16fUsUsV256iUi", "n", 
"amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2pbf16h_internal, "V32yUsUsV256iUi", 
"n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2pbf16l_internal, "V32yUsUsV256iUi", 
"n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2phh_internal, "V32xUsUsV256iUi", "n", 
"amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2phl_internal, "V32xUsUsV256iUi", "n", 
"amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tilemovrow_internal, "V16iUsUsV256iUi", "n", 
"amx-avx512")
 // AMX
 TARGET_BUILTIN(__builtin_ia32_tile_loadconfig, "vvC*", "n", "amx-tile")
 TARGET_BUILTIN(__builtin_ia32_tile_storeconfig, "vvC*", "n", "amx-tile")
@@ -148,6 +154,13 @@ TARGET_BUILTIN(__builtin_ia32_ptwrite64, "vUOi", "n", 
"ptwrite")
 TARGET_BUILTIN(__builtin_ia32_tcmmimfp16ps, "vIUcIUcIUc", "n", "amx-complex")
 TARGET_BUILTIN(__builtin_ia32_tcmmrlfp16ps, "vIUcIUcIUc", "n", "amx-complex")
 
+TARGET_BUILTIN(__builtin_ia32_tcvtrowd2ps, "V16fIUcUi", "n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2pbf16h, "V32yIUcUi", "n", "amx-avx512")
+TARGET_BUILTIN(__builtin_i

[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-06 Thread Feng Zou via cfe-commits


@@ -133,6 +133,12 @@ TARGET_BUILTIN(__builtin_ia32_t2rpntlvwz0t1_internal, 
"vUsUsUsV256i*V256i*vC*z",
 TARGET_BUILTIN(__builtin_ia32_t2rpntlvwz1_internal, "vUsUsUsV256i*V256i*vC*z", 
"n", "amx-transpose")
 TARGET_BUILTIN(__builtin_ia32_t2rpntlvwz1t1_internal, 
"vUsUsUsV256i*V256i*vC*z", "n", "amx-transpose")
 TARGET_BUILTIN(__builtin_ia32_ttransposed_internal, "V256iUsUsV256i", "n", 
"amx-transpose")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowd2ps_internal, "V16fUsUsV256iUi", "n", 
"amx-avx512")

fzou1 wrote:

Is "avx10.2-512" feature needed to be added for the intrinsics here and there?

https://github.com/llvm/llvm-project/pull/114070
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-06 Thread via cfe-commits

github-actions[bot] wrote:




:warning: C/C++ code formatter, clang-format found issues in your code. 
:warning:



You can test this locally with the following command:


``bash
git-clang-format --diff 37ce18951fded6be1de319b05b968918cb45c00b 
c38da4e614434b02158444f31f50aee61f9879f6 --extensions cpp,c,h -- 
clang/lib/Headers/amxavx512intrin.h clang/test/CodeGen/X86/amx_avx512_api.c 
clang/test/CodeGen/X86/amxavx512-builtins.c clang/lib/Basic/Targets/X86.cpp 
clang/lib/Basic/Targets/X86.h clang/lib/Headers/immintrin.h 
clang/lib/Sema/SemaX86.cpp clang/test/CodeGen/attr-target-x86.c 
clang/test/Driver/x86-target-features.c 
clang/test/Preprocessor/x86_target_features.c 
llvm/lib/Target/X86/X86ExpandPseudo.cpp llvm/lib/Target/X86/X86ISelLowering.cpp 
llvm/lib/Target/X86/X86LowerAMXType.cpp 
llvm/lib/Target/X86/X86PreTileConfig.cpp llvm/lib/TargetParser/Host.cpp 
llvm/lib/TargetParser/X86TargetParser.cpp
``





View the diff from clang-format here.


``diff
diff --git a/clang/lib/Headers/amxavx512intrin.h 
b/clang/lib/Headers/amxavx512intrin.h
index 9bfa868cf4..1e6ee35dea 100644
--- a/clang/lib/Headers/amxavx512intrin.h
+++ b/clang/lib/Headers/amxavx512intrin.h
@@ -36,7 +36,8 @@
 /// IF i + row_chunk / 4 >= tsrc.colsb / 4
 /// dst.dword[i] := 0
 /// ELSE
-/// dst.f32[i] := 
CONVERT_INT32_TO_FP32(tsrc.row[row_index].dword[row_chunk/4+i], RNE)
+/// dst.f32[i] :=
+/// CONVERT_INT32_TO_FP32(tsrc.row[row_index].dword[row_chunk/4+i], 
RNE)
 /// FI
 /// ENDFOR
 /// dst[MAX_VL-1:VL] := 0
@@ -72,7 +73,8 @@
 /// dst.dword[i] := 0
 /// ELSE
 /// dst.word[2*i+0] := 0
-/// dst.bf16[2*i+1] := 
CONVERT_FP32_TO_BF16(tsrc.row[row_index].fp32[row_chunk/4+i], RNE)
+/// dst.bf16[2*i+1] :=
+/// CONVERT_FP32_TO_BF16(tsrc.row[row_index].fp32[row_chunk/4+i], RNE)
 /// FI
 /// ENDFOR
 /// dst[MAX_VL-1:VL] := 0
@@ -109,7 +111,8 @@
 /// dst.dword[i] := 0
 /// ELSE
 /// dst.word[2*i+1] := 0
-/// dst.bf16[2*i+0] := 
CONVERT_FP32_TO_BF16(tsrc.row[row_index].fp32[row_chunk/4+i], RNE)
+/// dst.bf16[2*i+0] :=
+/// CONVERT_FP32_TO_BF16(tsrc.row[row_index].fp32[row_chunk/4+i], RNE)
 /// FI
 /// ENDFOR
 /// dst[MAX_VL-1:VL] := 0
@@ -146,7 +149,8 @@
 /// dst.dword[i] := 0
 /// ELSE
 /// dst.word[2*i+0] := 0
-/// dst.fp16[2*i+1] := 
CONVERT_FP32_TO_FP16(tsrc.row[row_index].fp32[row_chunk/4+i], RNE)
+/// dst.fp16[2*i+1] :=
+/// CONVERT_FP32_TO_FP16(tsrc.row[row_index].fp32[row_chunk/4+i], RNE)
 /// FI
 /// ENDFOR
 /// dst[MAX_VL-1:VL] := 0
@@ -182,7 +186,8 @@
 /// dst.dword[i] := 0
 /// ELSE
 /// dst.word[2*i+1] := 0
-/// dst.fp16[2*i+0] := 
CONVERT_FP32_TO_FP16(tsrc.row[row_index].fp32[row_chunk/4+i], RNE)
+/// dst.fp16[2*i+0] :=
+/// CONVERT_FP32_TO_FP16(tsrc.row[row_index].fp32[row_chunk/4+i], RNE)
 /// FI
 /// ENDFOR
 /// dst[MAX_VL-1:VL] := 0
diff --git a/llvm/lib/Target/X86/X86ExpandPseudo.cpp 
b/llvm/lib/Target/X86/X86ExpandPseudo.cpp
index 52519f49e7..9511a82f0e 100644
--- a/llvm/lib/Target/X86/X86ExpandPseudo.cpp
+++ b/llvm/lib/Target/X86/X86ExpandPseudo.cpp
@@ -770,7 +770,8 @@ bool X86ExpandPseudo::expandMI(MachineBasicBlock &MBB,
 case X86::PTDPBUUDV:   Opc = X86::TDPBUUD; break;
 case X86::PTDPBF16PSV: Opc = X86::TDPBF16PS; break;
 case X86::PTDPFP16PSV: Opc = X86::TDPFP16PS; break;
-default: llvm_unreachable("Unexpected Opcode");
+default:
+  llvm_unreachable("Unexpected Opcode");
 }
 MI.setDesc(TII->get(Opc));
 MI.tieOperands(0, 1);

``




https://github.com/llvm/llvm-project/pull/114070
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-06 Thread Phoebe Wang via cfe-commits


@@ -559,12 +559,68 @@ bool X86ExpandPseudo::expandMI(MachineBasicBlock &MBB,
 return true;
   }
   case X86::PTILELOADDV:
-  case X86::PTILELOADDT1V: {
+  case X86::PTILELOADDT1V:
+  case X86::PTCVTROWD2PSrreV:
+  case X86::PTCVTROWD2PSrriV:
+  case X86::PTCVTROWPS2PBF16HrreV:
+  case X86::PTCVTROWPS2PBF16HrriV:
+  case X86::PTCVTROWPS2PBF16LrreV:
+  case X86::PTCVTROWPS2PBF16LrriV:
+  case X86::PTCVTROWPS2PHHrreV:
+  case X86::PTCVTROWPS2PHHrriV:
+  case X86::PTCVTROWPS2PHLrreV:
+  case X86::PTCVTROWPS2PHLrriV:
+  case X86::PTILEMOVROWrreV:
+  case X86::PTILEMOVROWrriV: {
 for (unsigned i = 2; i > 0; --i)
   MI.removeOperand(i);
-unsigned Opc = Opcode == X86::PTILELOADDV
-   ? GET_EGPR_IF_ENABLED(X86::TILELOADD)
-   : GET_EGPR_IF_ENABLED(X86::TILELOADDT1);
+unsigned Opc;
+switch (Opcode) {
+case X86::PTILELOADDV:
+  Opc = GET_EGPR_IF_ENABLED(X86::TILELOADD);
+  break;
+case X86::PTILELOADDT1V:
+  Opc = GET_EGPR_IF_ENABLED(X86::TILELOADDT1);
+  break;
+case X86::PTCVTROWD2PSrreV:
+  Opc = X86::TCVTROWD2PSrre;
+  break;
+case X86::PTCVTROWD2PSrriV:
+  Opc = X86::TCVTROWD2PSrri;
+  break;
+case X86::PTCVTROWPS2PBF16HrreV:
+  Opc = X86::TCVTROWPS2PBF16Hrre;
+  break;
+case X86::PTCVTROWPS2PBF16HrriV:
+  Opc = X86::TCVTROWPS2PBF16Hrri;
+  break;
+case X86::PTCVTROWPS2PBF16LrreV:
+  Opc = X86::TCVTROWPS2PBF16Lrre;
+  break;
+case X86::PTCVTROWPS2PBF16LrriV:
+  Opc = X86::TCVTROWPS2PBF16Lrri;
+  break;
+case X86::PTCVTROWPS2PHHrreV:
+  Opc = X86::TCVTROWPS2PHHrre;
+  break;
+case X86::PTCVTROWPS2PHHrriV:
+  Opc = X86::TCVTROWPS2PHHrri;
+  break;
+case X86::PTCVTROWPS2PHLrreV:
+  Opc = X86::TCVTROWPS2PHLrre;
+  break;
+case X86::PTCVTROWPS2PHLrriV:
+  Opc = X86::TCVTROWPS2PHLrri;
+  break;
+case X86::PTILEMOVROWrreV:
+  Opc = X86::TILEMOVROWrre;
+  break;
+case X86::PTILEMOVROWrriV:
+  Opc = X86::TILEMOVROWrri;
+  break;
+default:
+  llvm_unreachable("Impossible Opcode!");

phoebewang wrote:

Done.

https://github.com/llvm/llvm-project/pull/114070
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-06 Thread Phoebe Wang via cfe-commits


@@ -0,0 +1,381 @@
+/*===- amxavx512intrin.h - AMXAVX512 
===
+ *
+ * Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+ * See https://llvm.org/LICENSE.txt for license information.
+ * SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+ *
+ 
*======
+ */
+#ifndef __IMMINTRIN_H
+#error "Never use  directly; include  instead."
+#endif // __IMMINTRIN_H
+
+#ifndef __AMX_AVX512INTRIN_H
+#define __AMX_AVX512INTRIN_H
+#ifdef __x86_64__
+
+#define __DEFAULT_FN_ATTRS_AVX512  
\
+  __attribute__((__always_inline__, __nodebug__, __target__("amx-avx512")))
+
+/// Moves a row from a tile register to a zmm destination register, converting
+///the int32 source elements to fp32. The row of the tile is selected by an
+///32b GPR.
+///
+/// \headerfile 
+///
+/// \code
+/// __m512i _tile_cvtrowd2ps(__tile tsrc, unsigned int row);
+/// \endcode
+///
+/// \code{.operation}
+/// VL := 512
+/// VL_bytes := VL >> 3
+/// row_index := row & 0x
+/// row_chunk := ((row >> 16) & 0x) * VL_bytes
+/// FOR i := 0 TO (VL_bytes / 4) - 1
+/// IF i + row_chunk / 4 >= tsrc.colsb / 4
+/// dst.dword[i] := 0
+/// ELSE
+/// dst.f32[i] := 
CONVERT_INT32_TO_FP32(tsrc.row[row_index].dword[row_chunk/4+i], RNE)
+/// FI
+/// ENDFOR
+/// dst[MAX_VL-1:VL] := 0
+/// zero_tileconfig_start()
+/// \endcode
+///
+/// This intrinsic corresponds to the \c TCVTROWD2PS instruction.
+///
+/// \param tsrc
+///The 1st source tile. Max size is 1024 Bytes.
+/// \param row
+///The row of the source tile
+#define _tile_cvtrowd2ps(tsrc, row) __builtin_ia32_tcvtrowd2ps(tsrc, row)
+
+/// Moves a row from a tile register to a zmm destination register, converting
+///the fp32 source elements to bf16. It places the resulting bf16 elements
+///in the high 16 bits within each dword. The row of the tile is selected
+///by an 32b GPR.

phoebewang wrote:

Done.

https://github.com/llvm/llvm-project/pull/114070
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-06 Thread Phoebe Wang via cfe-commits


@@ -0,0 +1,381 @@
+/*===- amxavx512intrin.h - AMXAVX512 
===
+ *
+ * Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+ * See https://llvm.org/LICENSE.txt for license information.
+ * SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+ *
+ 
*======
+ */
+#ifndef __IMMINTRIN_H
+#error "Never use  directly; include  instead."
+#endif // __IMMINTRIN_H
+
+#ifndef __AMX_AVX512INTRIN_H
+#define __AMX_AVX512INTRIN_H
+#ifdef __x86_64__
+
+#define __DEFAULT_FN_ATTRS_AVX512  
\
+  __attribute__((__always_inline__, __nodebug__, __target__("amx-avx512")))
+
+/// Moves a row from a tile register to a zmm destination register, converting
+///the int32 source elements to fp32. The row of the tile is selected by an
+///32b GPR.
+///
+/// \headerfile 
+///
+/// \code
+/// __m512i _tile_cvtrowd2ps(__tile tsrc, unsigned int row);
+/// \endcode
+///
+/// \code{.operation}
+/// VL := 512
+/// VL_bytes := VL >> 3
+/// row_index := row & 0x
+/// row_chunk := ((row >> 16) & 0x) * VL_bytes
+/// FOR i := 0 TO (VL_bytes / 4) - 1
+/// IF i + row_chunk / 4 >= tsrc.colsb / 4
+/// dst.dword[i] := 0
+/// ELSE
+/// dst.f32[i] := 
CONVERT_INT32_TO_FP32(tsrc.row[row_index].dword[row_chunk/4+i], RNE)
+/// FI
+/// ENDFOR
+/// dst[MAX_VL-1:VL] := 0
+/// zero_tileconfig_start()
+/// \endcode
+///
+/// This intrinsic corresponds to the \c TCVTROWD2PS instruction.
+///
+/// \param tsrc
+///The 1st source tile. Max size is 1024 Bytes.

phoebewang wrote:

Done.

https://github.com/llvm/llvm-project/pull/114070
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-06 Thread Phoebe Wang via cfe-commits


@@ -0,0 +1,381 @@
+/*===- amxavx512intrin.h - AMXAVX512 
===
+ *
+ * Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+ * See https://llvm.org/LICENSE.txt for license information.
+ * SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+ *
+ 
*======
+ */
+#ifndef __IMMINTRIN_H
+#error "Never use  directly; include  instead."
+#endif // __IMMINTRIN_H
+
+#ifndef __AMX_AVX512INTRIN_H
+#define __AMX_AVX512INTRIN_H
+#ifdef __x86_64__
+
+#define __DEFAULT_FN_ATTRS_AVX512  
\
+  __attribute__((__always_inline__, __nodebug__, __target__("amx-avx512")))
+
+/// Moves a row from a tile register to a zmm destination register, converting
+///the int32 source elements to fp32. The row of the tile is selected by an

phoebewang wrote:

Done.

https://github.com/llvm/llvm-project/pull/114070
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-06 Thread Phoebe Wang via cfe-commits

https://github.com/phoebewang updated 
https://github.com/llvm/llvm-project/pull/114070

>From 587d0105e7724db0f35fc5c8179519fa6319e5c8 Mon Sep 17 00:00:00 2001
From: "Wang, Phoebe" 
Date: Tue, 29 Oct 2024 22:29:25 +0800
Subject: [PATCH 1/2] [X86][AMX] Support AMX-AVX512

---
 clang/docs/ReleaseNotes.rst   |   2 +
 clang/include/clang/Basic/BuiltinsX86_64.def  |  13 +
 clang/include/clang/Driver/Options.td |   2 +
 clang/lib/Basic/Targets/X86.cpp   |   6 +
 clang/lib/Basic/Targets/X86.h |   1 +
 clang/lib/Headers/CMakeLists.txt  |   1 +
 clang/lib/Headers/amxavx512intrin.h   | 381 ++
 clang/lib/Headers/immintrin.h |   4 +
 clang/lib/Sema/SemaX86.cpp|   6 +
 clang/test/CodeGen/X86/amx_avx512_api.c   |  52 +++
 clang/test/CodeGen/X86/amxavx512-builtins.c   |  41 ++
 clang/test/CodeGen/attr-target-x86.c  |   8 +-
 clang/test/Driver/x86-target-features.c   |   7 +
 clang/test/Preprocessor/x86_target_features.c |   7 +
 llvm/include/llvm/IR/IntrinsicsX86.td |  50 +++
 .../llvm/TargetParser/X86TargetParser.def |   1 +
 llvm/lib/Target/X86/X86.td|   4 +
 llvm/lib/Target/X86/X86ExpandPseudo.cpp   |  64 ++-
 llvm/lib/Target/X86/X86ISelLowering.cpp   |  76 
 llvm/lib/Target/X86/X86InstrAMX.td| 147 +++
 llvm/lib/Target/X86/X86InstrPredicates.td |   1 +
 llvm/lib/Target/X86/X86LowerAMXType.cpp   |  11 +
 llvm/lib/Target/X86/X86PreTileConfig.cpp  |  19 +-
 llvm/lib/TargetParser/Host.cpp|   4 +
 llvm/lib/TargetParser/X86TargetParser.cpp |   2 +
 .../CodeGen/X86/amx-across-func-tilemovrow.ll | 171 
 .../test/CodeGen/X86/amx-avx512-intrinsics.ll | 116 ++
 .../CodeGen/X86/amx-tile-avx512-internals.ll  |  61 +++
 llvm/test/MC/Disassembler/X86/amx-avx512.txt  | 106 +
 llvm/test/MC/X86/amx-avx512-att.s | 105 +
 llvm/test/MC/X86/amx-avx512-intel.s   | 105 +
 31 files changed, 1564 insertions(+), 10 deletions(-)
 create mode 100644 clang/lib/Headers/amxavx512intrin.h
 create mode 100644 clang/test/CodeGen/X86/amx_avx512_api.c
 create mode 100644 clang/test/CodeGen/X86/amxavx512-builtins.c
 create mode 100644 llvm/test/CodeGen/X86/amx-across-func-tilemovrow.ll
 create mode 100644 llvm/test/CodeGen/X86/amx-avx512-intrinsics.ll
 create mode 100644 llvm/test/CodeGen/X86/amx-tile-avx512-internals.ll
 create mode 100644 llvm/test/MC/Disassembler/X86/amx-avx512.txt
 create mode 100644 llvm/test/MC/X86/amx-avx512-att.s
 create mode 100644 llvm/test/MC/X86/amx-avx512-intel.s

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index ce046a305c89b6..d45bd1240dd173 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -611,6 +611,8 @@ X86 Support
   * Supported MINMAX intrinsics of ``*_(mask(z)))_minmax(ne)_p[s|d|h|bh]`` and
   ``*_(mask(z)))_minmax_s[s|d|h]``.
 
+- Support ISA of ``AMX-AVX512``.
+
 - All intrinsics in adcintrin.h can now be used in constant expressions.
 
 - All intrinsics in adxintrin.h can now be used in constant expressions.
diff --git a/clang/include/clang/Basic/BuiltinsX86_64.def 
b/clang/include/clang/Basic/BuiltinsX86_64.def
index 2c591edb2835cd..70644f3f6b6054 100644
--- a/clang/include/clang/Basic/BuiltinsX86_64.def
+++ b/clang/include/clang/Basic/BuiltinsX86_64.def
@@ -128,6 +128,12 @@ TARGET_BUILTIN(__builtin_ia32_tdpbf16ps_internal, 
"V256iUsUsUsV256iV256iV256i",
 TARGET_BUILTIN(__builtin_ia32_tdpfp16ps_internal, 
"V256iUsUsUsV256iV256iV256i", "n", "amx-fp16")
 TARGET_BUILTIN(__builtin_ia32_tcmmimfp16ps_internal, 
"V256iUsUsUsV256iV256iV256i", "n", "amx-complex")
 TARGET_BUILTIN(__builtin_ia32_tcmmrlfp16ps_internal, 
"V256iUsUsUsV256iV256iV256i", "n", "amx-complex")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowd2ps_internal, "V16fUsUsV256iUi", "n", 
"amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2pbf16h_internal, "V32yUsUsV256iUi", 
"n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2pbf16l_internal, "V32yUsUsV256iUi", 
"n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2phh_internal, "V32xUsUsV256iUi", "n", 
"amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2phl_internal, "V32xUsUsV256iUi", "n", 
"amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tilemovrow_internal, "V16iUsUsV256iUi", "n", 
"amx-avx512")
 // AMX
 TARGET_BUILTIN(__builtin_ia32_tile_loadconfig, "vvC*", "n", "amx-tile")
 TARGET_BUILTIN(__builtin_ia32_tile_storeconfig, "vvC*", "n", "amx-tile")
@@ -148,6 +154,13 @@ TARGET_BUILTIN(__builtin_ia32_ptwrite64, "vUOi", "n", 
"ptwrite")
 TARGET_BUILTIN(__builtin_ia32_tcmmimfp16ps, "vIUcIUcIUc", "n", "amx-complex")
 TARGET_BUILTIN(__builtin_ia32_tcmmrlfp16ps, "vIUcIUcIUc", "n", "amx-complex")
 
+TARGET_BUILTIN(__builtin_ia32_tcvtrowd2ps, "V16fIUcUi", "n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2pbf16h, "V32yIUcUi", "n", "amx-avx512")
+TARGET_BUILTIN(__builtin_i

[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-06 Thread Phoebe Wang via cfe-commits

https://github.com/phoebewang updated 
https://github.com/llvm/llvm-project/pull/114070

>From 587d0105e7724db0f35fc5c8179519fa6319e5c8 Mon Sep 17 00:00:00 2001
From: "Wang, Phoebe" 
Date: Tue, 29 Oct 2024 22:29:25 +0800
Subject: [PATCH] [X86][AMX] Support AMX-AVX512

---
 clang/docs/ReleaseNotes.rst   |   2 +
 clang/include/clang/Basic/BuiltinsX86_64.def  |  13 +
 clang/include/clang/Driver/Options.td |   2 +
 clang/lib/Basic/Targets/X86.cpp   |   6 +
 clang/lib/Basic/Targets/X86.h |   1 +
 clang/lib/Headers/CMakeLists.txt  |   1 +
 clang/lib/Headers/amxavx512intrin.h   | 381 ++
 clang/lib/Headers/immintrin.h |   4 +
 clang/lib/Sema/SemaX86.cpp|   6 +
 clang/test/CodeGen/X86/amx_avx512_api.c   |  52 +++
 clang/test/CodeGen/X86/amxavx512-builtins.c   |  41 ++
 clang/test/CodeGen/attr-target-x86.c  |   8 +-
 clang/test/Driver/x86-target-features.c   |   7 +
 clang/test/Preprocessor/x86_target_features.c |   7 +
 llvm/include/llvm/IR/IntrinsicsX86.td |  50 +++
 .../llvm/TargetParser/X86TargetParser.def |   1 +
 llvm/lib/Target/X86/X86.td|   4 +
 llvm/lib/Target/X86/X86ExpandPseudo.cpp   |  64 ++-
 llvm/lib/Target/X86/X86ISelLowering.cpp   |  76 
 llvm/lib/Target/X86/X86InstrAMX.td| 147 +++
 llvm/lib/Target/X86/X86InstrPredicates.td |   1 +
 llvm/lib/Target/X86/X86LowerAMXType.cpp   |  11 +
 llvm/lib/Target/X86/X86PreTileConfig.cpp  |  19 +-
 llvm/lib/TargetParser/Host.cpp|   4 +
 llvm/lib/TargetParser/X86TargetParser.cpp |   2 +
 .../CodeGen/X86/amx-across-func-tilemovrow.ll | 171 
 .../test/CodeGen/X86/amx-avx512-intrinsics.ll | 116 ++
 .../CodeGen/X86/amx-tile-avx512-internals.ll  |  61 +++
 llvm/test/MC/Disassembler/X86/amx-avx512.txt  | 106 +
 llvm/test/MC/X86/amx-avx512-att.s | 105 +
 llvm/test/MC/X86/amx-avx512-intel.s   | 105 +
 31 files changed, 1564 insertions(+), 10 deletions(-)
 create mode 100644 clang/lib/Headers/amxavx512intrin.h
 create mode 100644 clang/test/CodeGen/X86/amx_avx512_api.c
 create mode 100644 clang/test/CodeGen/X86/amxavx512-builtins.c
 create mode 100644 llvm/test/CodeGen/X86/amx-across-func-tilemovrow.ll
 create mode 100644 llvm/test/CodeGen/X86/amx-avx512-intrinsics.ll
 create mode 100644 llvm/test/CodeGen/X86/amx-tile-avx512-internals.ll
 create mode 100644 llvm/test/MC/Disassembler/X86/amx-avx512.txt
 create mode 100644 llvm/test/MC/X86/amx-avx512-att.s
 create mode 100644 llvm/test/MC/X86/amx-avx512-intel.s

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index ce046a305c89b6..d45bd1240dd173 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -611,6 +611,8 @@ X86 Support
   * Supported MINMAX intrinsics of ``*_(mask(z)))_minmax(ne)_p[s|d|h|bh]`` and
   ``*_(mask(z)))_minmax_s[s|d|h]``.
 
+- Support ISA of ``AMX-AVX512``.
+
 - All intrinsics in adcintrin.h can now be used in constant expressions.
 
 - All intrinsics in adxintrin.h can now be used in constant expressions.
diff --git a/clang/include/clang/Basic/BuiltinsX86_64.def 
b/clang/include/clang/Basic/BuiltinsX86_64.def
index 2c591edb2835cd..70644f3f6b6054 100644
--- a/clang/include/clang/Basic/BuiltinsX86_64.def
+++ b/clang/include/clang/Basic/BuiltinsX86_64.def
@@ -128,6 +128,12 @@ TARGET_BUILTIN(__builtin_ia32_tdpbf16ps_internal, 
"V256iUsUsUsV256iV256iV256i",
 TARGET_BUILTIN(__builtin_ia32_tdpfp16ps_internal, 
"V256iUsUsUsV256iV256iV256i", "n", "amx-fp16")
 TARGET_BUILTIN(__builtin_ia32_tcmmimfp16ps_internal, 
"V256iUsUsUsV256iV256iV256i", "n", "amx-complex")
 TARGET_BUILTIN(__builtin_ia32_tcmmrlfp16ps_internal, 
"V256iUsUsUsV256iV256iV256i", "n", "amx-complex")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowd2ps_internal, "V16fUsUsV256iUi", "n", 
"amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2pbf16h_internal, "V32yUsUsV256iUi", 
"n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2pbf16l_internal, "V32yUsUsV256iUi", 
"n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2phh_internal, "V32xUsUsV256iUi", "n", 
"amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2phl_internal, "V32xUsUsV256iUi", "n", 
"amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tilemovrow_internal, "V16iUsUsV256iUi", "n", 
"amx-avx512")
 // AMX
 TARGET_BUILTIN(__builtin_ia32_tile_loadconfig, "vvC*", "n", "amx-tile")
 TARGET_BUILTIN(__builtin_ia32_tile_storeconfig, "vvC*", "n", "amx-tile")
@@ -148,6 +154,13 @@ TARGET_BUILTIN(__builtin_ia32_ptwrite64, "vUOi", "n", 
"ptwrite")
 TARGET_BUILTIN(__builtin_ia32_tcmmimfp16ps, "vIUcIUcIUc", "n", "amx-complex")
 TARGET_BUILTIN(__builtin_ia32_tcmmrlfp16ps, "vIUcIUcIUc", "n", "amx-complex")
 
+TARGET_BUILTIN(__builtin_ia32_tcvtrowd2ps, "V16fIUcUi", "n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2pbf16h, "V32yIUcUi", "n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_

[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-06 Thread Feng Zou via cfe-commits


@@ -0,0 +1,381 @@
+/*===- amxavx512intrin.h - AMXAVX512 
===
+ *
+ * Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+ * See https://llvm.org/LICENSE.txt for license information.
+ * SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+ *
+ 
*======
+ */
+#ifndef __IMMINTRIN_H
+#error "Never use  directly; include  instead."
+#endif // __IMMINTRIN_H
+
+#ifndef __AMX_AVX512INTRIN_H
+#define __AMX_AVX512INTRIN_H
+#ifdef __x86_64__
+
+#define __DEFAULT_FN_ATTRS_AVX512  
\
+  __attribute__((__always_inline__, __nodebug__, __target__("amx-avx512")))
+
+/// Moves a row from a tile register to a zmm destination register, converting
+///the int32 source elements to fp32. The row of the tile is selected by an

fzou1 wrote:

an -> a.

https://github.com/llvm/llvm-project/pull/114070
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-06 Thread Feng Zou via cfe-commits


@@ -0,0 +1,381 @@
+/*===- amxavx512intrin.h - AMXAVX512 
===
+ *
+ * Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+ * See https://llvm.org/LICENSE.txt for license information.
+ * SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+ *
+ 
*======
+ */
+#ifndef __IMMINTRIN_H
+#error "Never use  directly; include  instead."
+#endif // __IMMINTRIN_H
+
+#ifndef __AMX_AVX512INTRIN_H
+#define __AMX_AVX512INTRIN_H
+#ifdef __x86_64__
+
+#define __DEFAULT_FN_ATTRS_AVX512  
\
+  __attribute__((__always_inline__, __nodebug__, __target__("amx-avx512")))
+
+/// Moves a row from a tile register to a zmm destination register, converting
+///the int32 source elements to fp32. The row of the tile is selected by an
+///32b GPR.
+///
+/// \headerfile 
+///
+/// \code
+/// __m512i _tile_cvtrowd2ps(__tile tsrc, unsigned int row);
+/// \endcode
+///
+/// \code{.operation}
+/// VL := 512
+/// VL_bytes := VL >> 3
+/// row_index := row & 0x
+/// row_chunk := ((row >> 16) & 0x) * VL_bytes
+/// FOR i := 0 TO (VL_bytes / 4) - 1
+/// IF i + row_chunk / 4 >= tsrc.colsb / 4
+/// dst.dword[i] := 0
+/// ELSE
+/// dst.f32[i] := 
CONVERT_INT32_TO_FP32(tsrc.row[row_index].dword[row_chunk/4+i], RNE)
+/// FI
+/// ENDFOR
+/// dst[MAX_VL-1:VL] := 0
+/// zero_tileconfig_start()
+/// \endcode
+///
+/// This intrinsic corresponds to the \c TCVTROWD2PS instruction.
+///
+/// \param tsrc
+///The 1st source tile. Max size is 1024 Bytes.

fzou1 wrote:

Remove "1st" since there is only one source tile. The following should be 
updated as this.

https://github.com/llvm/llvm-project/pull/114070
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-06 Thread Feng Zou via cfe-commits


@@ -0,0 +1,381 @@
+/*===- amxavx512intrin.h - AMXAVX512 
===
+ *
+ * Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+ * See https://llvm.org/LICENSE.txt for license information.
+ * SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+ *
+ 
*======
+ */
+#ifndef __IMMINTRIN_H
+#error "Never use  directly; include  instead."
+#endif // __IMMINTRIN_H
+
+#ifndef __AMX_AVX512INTRIN_H
+#define __AMX_AVX512INTRIN_H
+#ifdef __x86_64__
+
+#define __DEFAULT_FN_ATTRS_AVX512  
\
+  __attribute__((__always_inline__, __nodebug__, __target__("amx-avx512")))
+
+/// Moves a row from a tile register to a zmm destination register, converting
+///the int32 source elements to fp32. The row of the tile is selected by an
+///32b GPR.
+///
+/// \headerfile 
+///
+/// \code
+/// __m512i _tile_cvtrowd2ps(__tile tsrc, unsigned int row);
+/// \endcode
+///
+/// \code{.operation}
+/// VL := 512
+/// VL_bytes := VL >> 3
+/// row_index := row & 0x
+/// row_chunk := ((row >> 16) & 0x) * VL_bytes
+/// FOR i := 0 TO (VL_bytes / 4) - 1
+/// IF i + row_chunk / 4 >= tsrc.colsb / 4
+/// dst.dword[i] := 0
+/// ELSE
+/// dst.f32[i] := 
CONVERT_INT32_TO_FP32(tsrc.row[row_index].dword[row_chunk/4+i], RNE)
+/// FI
+/// ENDFOR
+/// dst[MAX_VL-1:VL] := 0
+/// zero_tileconfig_start()
+/// \endcode
+///
+/// This intrinsic corresponds to the \c TCVTROWD2PS instruction.
+///
+/// \param tsrc
+///The 1st source tile. Max size is 1024 Bytes.
+/// \param row
+///The row of the source tile
+#define _tile_cvtrowd2ps(tsrc, row) __builtin_ia32_tcvtrowd2ps(tsrc, row)
+
+/// Moves a row from a tile register to a zmm destination register, converting
+///the fp32 source elements to bf16. It places the resulting bf16 elements
+///in the high 16 bits within each dword. The row of the tile is selected
+///by an 32b GPR.

fzou1 wrote:

an -> a. The following "an 32b" should be updated to "a 32b" too.

https://github.com/llvm/llvm-project/pull/114070
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-11-06 Thread Feng Zou via cfe-commits


@@ -559,12 +559,68 @@ bool X86ExpandPseudo::expandMI(MachineBasicBlock &MBB,
 return true;
   }
   case X86::PTILELOADDV:
-  case X86::PTILELOADDT1V: {
+  case X86::PTILELOADDT1V:
+  case X86::PTCVTROWD2PSrreV:
+  case X86::PTCVTROWD2PSrriV:
+  case X86::PTCVTROWPS2PBF16HrreV:
+  case X86::PTCVTROWPS2PBF16HrriV:
+  case X86::PTCVTROWPS2PBF16LrreV:
+  case X86::PTCVTROWPS2PBF16LrriV:
+  case X86::PTCVTROWPS2PHHrreV:
+  case X86::PTCVTROWPS2PHHrriV:
+  case X86::PTCVTROWPS2PHLrreV:
+  case X86::PTCVTROWPS2PHLrriV:
+  case X86::PTILEMOVROWrreV:
+  case X86::PTILEMOVROWrriV: {
 for (unsigned i = 2; i > 0; --i)
   MI.removeOperand(i);
-unsigned Opc = Opcode == X86::PTILELOADDV
-   ? GET_EGPR_IF_ENABLED(X86::TILELOADD)
-   : GET_EGPR_IF_ENABLED(X86::TILELOADDT1);
+unsigned Opc;
+switch (Opcode) {
+case X86::PTILELOADDV:
+  Opc = GET_EGPR_IF_ENABLED(X86::TILELOADD);
+  break;
+case X86::PTILELOADDT1V:
+  Opc = GET_EGPR_IF_ENABLED(X86::TILELOADDT1);
+  break;
+case X86::PTCVTROWD2PSrreV:
+  Opc = X86::TCVTROWD2PSrre;
+  break;
+case X86::PTCVTROWD2PSrriV:
+  Opc = X86::TCVTROWD2PSrri;
+  break;
+case X86::PTCVTROWPS2PBF16HrreV:
+  Opc = X86::TCVTROWPS2PBF16Hrre;
+  break;
+case X86::PTCVTROWPS2PBF16HrriV:
+  Opc = X86::TCVTROWPS2PBF16Hrri;
+  break;
+case X86::PTCVTROWPS2PBF16LrreV:
+  Opc = X86::TCVTROWPS2PBF16Lrre;
+  break;
+case X86::PTCVTROWPS2PBF16LrriV:
+  Opc = X86::TCVTROWPS2PBF16Lrri;
+  break;
+case X86::PTCVTROWPS2PHHrreV:
+  Opc = X86::TCVTROWPS2PHHrre;
+  break;
+case X86::PTCVTROWPS2PHHrriV:
+  Opc = X86::TCVTROWPS2PHHrri;
+  break;
+case X86::PTCVTROWPS2PHLrreV:
+  Opc = X86::TCVTROWPS2PHLrre;
+  break;
+case X86::PTCVTROWPS2PHLrriV:
+  Opc = X86::TCVTROWPS2PHLrri;
+  break;
+case X86::PTILEMOVROWrreV:
+  Opc = X86::TILEMOVROWrre;
+  break;
+case X86::PTILEMOVROWrriV:
+  Opc = X86::TILEMOVROWrri;
+  break;
+default:
+  llvm_unreachable("Impossible Opcode!");

fzou1 wrote:

Better to change "Impossible" to "Invalid".

https://github.com/llvm/llvm-project/pull/114070
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-10-29 Thread via cfe-commits

llvmbot wrote:




@llvm/pr-subscribers-clang-driver

Author: Phoebe Wang (phoebewang)


Changes



---

Patch is 81.89 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/114070.diff


31 Files Affected:

- (modified) clang/docs/ReleaseNotes.rst (+2) 
- (modified) clang/include/clang/Basic/BuiltinsX86_64.def (+13) 
- (modified) clang/include/clang/Driver/Options.td (+2) 
- (modified) clang/lib/Basic/Targets/X86.cpp (+6) 
- (modified) clang/lib/Basic/Targets/X86.h (+1) 
- (modified) clang/lib/Headers/CMakeLists.txt (+1) 
- (added) clang/lib/Headers/amxavx512intrin.h (+381) 
- (modified) clang/lib/Headers/immintrin.h (+4) 
- (modified) clang/lib/Sema/SemaX86.cpp (+6) 
- (added) clang/test/CodeGen/X86/amx_avx512_api.c (+52) 
- (added) clang/test/CodeGen/X86/amxavx512-builtins.c (+41) 
- (modified) clang/test/CodeGen/attr-target-x86.c (+4-4) 
- (modified) clang/test/Driver/x86-target-features.c (+7) 
- (modified) clang/test/Preprocessor/x86_target_features.c (+7) 
- (modified) llvm/include/llvm/IR/IntrinsicsX86.td (+50) 
- (modified) llvm/include/llvm/TargetParser/X86TargetParser.def (+1) 
- (modified) llvm/lib/Target/X86/X86.td (+4) 
- (modified) llvm/lib/Target/X86/X86ExpandPseudo.cpp (+60-4) 
- (modified) llvm/lib/Target/X86/X86ISelLowering.cpp (+76) 
- (modified) llvm/lib/Target/X86/X86InstrAMX.td (+147) 
- (modified) llvm/lib/Target/X86/X86InstrPredicates.td (+1) 
- (modified) llvm/lib/Target/X86/X86LowerAMXType.cpp (+11) 
- (modified) llvm/lib/Target/X86/X86PreTileConfig.cpp (+17-2) 
- (modified) llvm/lib/TargetParser/Host.cpp (+4) 
- (modified) llvm/lib/TargetParser/X86TargetParser.cpp (+2) 
- (added) llvm/test/CodeGen/X86/amx-across-func-tilemovrow.ll (+171) 
- (added) llvm/test/CodeGen/X86/amx-avx512-intrinsics.ll (+116) 
- (added) llvm/test/CodeGen/X86/amx-tile-avx512-internals.ll (+61) 
- (added) llvm/test/MC/Disassembler/X86/amx-avx512.txt (+106) 
- (added) llvm/test/MC/X86/amx-avx512-att.s (+105) 
- (added) llvm/test/MC/X86/amx-avx512-intel.s (+105) 


``diff
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index ce046a305c89b6..d45bd1240dd173 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -611,6 +611,8 @@ X86 Support
   * Supported MINMAX intrinsics of ``*_(mask(z)))_minmax(ne)_p[s|d|h|bh]`` and
   ``*_(mask(z)))_minmax_s[s|d|h]``.
 
+- Support ISA of ``AMX-AVX512``.
+
 - All intrinsics in adcintrin.h can now be used in constant expressions.
 
 - All intrinsics in adxintrin.h can now be used in constant expressions.
diff --git a/clang/include/clang/Basic/BuiltinsX86_64.def 
b/clang/include/clang/Basic/BuiltinsX86_64.def
index 2c591edb2835cd..70644f3f6b6054 100644
--- a/clang/include/clang/Basic/BuiltinsX86_64.def
+++ b/clang/include/clang/Basic/BuiltinsX86_64.def
@@ -128,6 +128,12 @@ TARGET_BUILTIN(__builtin_ia32_tdpbf16ps_internal, 
"V256iUsUsUsV256iV256iV256i",
 TARGET_BUILTIN(__builtin_ia32_tdpfp16ps_internal, 
"V256iUsUsUsV256iV256iV256i", "n", "amx-fp16")
 TARGET_BUILTIN(__builtin_ia32_tcmmimfp16ps_internal, 
"V256iUsUsUsV256iV256iV256i", "n", "amx-complex")
 TARGET_BUILTIN(__builtin_ia32_tcmmrlfp16ps_internal, 
"V256iUsUsUsV256iV256iV256i", "n", "amx-complex")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowd2ps_internal, "V16fUsUsV256iUi", "n", 
"amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2pbf16h_internal, "V32yUsUsV256iUi", 
"n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2pbf16l_internal, "V32yUsUsV256iUi", 
"n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2phh_internal, "V32xUsUsV256iUi", "n", 
"amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2phl_internal, "V32xUsUsV256iUi", "n", 
"amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tilemovrow_internal, "V16iUsUsV256iUi", "n", 
"amx-avx512")
 // AMX
 TARGET_BUILTIN(__builtin_ia32_tile_loadconfig, "vvC*", "n", "amx-tile")
 TARGET_BUILTIN(__builtin_ia32_tile_storeconfig, "vvC*", "n", "amx-tile")
@@ -148,6 +154,13 @@ TARGET_BUILTIN(__builtin_ia32_ptwrite64, "vUOi", "n", 
"ptwrite")
 TARGET_BUILTIN(__builtin_ia32_tcmmimfp16ps, "vIUcIUcIUc", "n", "amx-complex")
 TARGET_BUILTIN(__builtin_ia32_tcmmrlfp16ps, "vIUcIUcIUc", "n", "amx-complex")
 
+TARGET_BUILTIN(__builtin_ia32_tcvtrowd2ps, "V16fIUcUi", "n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2pbf16h, "V32yIUcUi", "n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2pbf16l, "V32yIUcUi", "n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2phh, "V32xIUcUi", "n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2phl, "V32xIUcUi", "n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tilemovrow, "V16iIUcUi", "n", "amx-avx512")
+
 TARGET_BUILTIN(__builtin_ia32_prefetchi, "vvC*Ui", "nc", "prefetchi")
 TARGET_BUILTIN(__builtin_ia32_cmpccxadd32, "Siv*SiSiIi", "n", "cmpccxadd")
 TARGET_BUILTIN(__builtin_ia32_cmpccxadd64, "SLLiv*SLLiSLLiIi", "n", 
"cmpccxadd")
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
ind

[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-10-29 Thread via cfe-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-x86

Author: Phoebe Wang (phoebewang)


Changes



---

Patch is 81.89 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/114070.diff


31 Files Affected:

- (modified) clang/docs/ReleaseNotes.rst (+2) 
- (modified) clang/include/clang/Basic/BuiltinsX86_64.def (+13) 
- (modified) clang/include/clang/Driver/Options.td (+2) 
- (modified) clang/lib/Basic/Targets/X86.cpp (+6) 
- (modified) clang/lib/Basic/Targets/X86.h (+1) 
- (modified) clang/lib/Headers/CMakeLists.txt (+1) 
- (added) clang/lib/Headers/amxavx512intrin.h (+381) 
- (modified) clang/lib/Headers/immintrin.h (+4) 
- (modified) clang/lib/Sema/SemaX86.cpp (+6) 
- (added) clang/test/CodeGen/X86/amx_avx512_api.c (+52) 
- (added) clang/test/CodeGen/X86/amxavx512-builtins.c (+41) 
- (modified) clang/test/CodeGen/attr-target-x86.c (+4-4) 
- (modified) clang/test/Driver/x86-target-features.c (+7) 
- (modified) clang/test/Preprocessor/x86_target_features.c (+7) 
- (modified) llvm/include/llvm/IR/IntrinsicsX86.td (+50) 
- (modified) llvm/include/llvm/TargetParser/X86TargetParser.def (+1) 
- (modified) llvm/lib/Target/X86/X86.td (+4) 
- (modified) llvm/lib/Target/X86/X86ExpandPseudo.cpp (+60-4) 
- (modified) llvm/lib/Target/X86/X86ISelLowering.cpp (+76) 
- (modified) llvm/lib/Target/X86/X86InstrAMX.td (+147) 
- (modified) llvm/lib/Target/X86/X86InstrPredicates.td (+1) 
- (modified) llvm/lib/Target/X86/X86LowerAMXType.cpp (+11) 
- (modified) llvm/lib/Target/X86/X86PreTileConfig.cpp (+17-2) 
- (modified) llvm/lib/TargetParser/Host.cpp (+4) 
- (modified) llvm/lib/TargetParser/X86TargetParser.cpp (+2) 
- (added) llvm/test/CodeGen/X86/amx-across-func-tilemovrow.ll (+171) 
- (added) llvm/test/CodeGen/X86/amx-avx512-intrinsics.ll (+116) 
- (added) llvm/test/CodeGen/X86/amx-tile-avx512-internals.ll (+61) 
- (added) llvm/test/MC/Disassembler/X86/amx-avx512.txt (+106) 
- (added) llvm/test/MC/X86/amx-avx512-att.s (+105) 
- (added) llvm/test/MC/X86/amx-avx512-intel.s (+105) 


``diff
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index ce046a305c89b6..d45bd1240dd173 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -611,6 +611,8 @@ X86 Support
   * Supported MINMAX intrinsics of ``*_(mask(z)))_minmax(ne)_p[s|d|h|bh]`` and
   ``*_(mask(z)))_minmax_s[s|d|h]``.
 
+- Support ISA of ``AMX-AVX512``.
+
 - All intrinsics in adcintrin.h can now be used in constant expressions.
 
 - All intrinsics in adxintrin.h can now be used in constant expressions.
diff --git a/clang/include/clang/Basic/BuiltinsX86_64.def 
b/clang/include/clang/Basic/BuiltinsX86_64.def
index 2c591edb2835cd..70644f3f6b6054 100644
--- a/clang/include/clang/Basic/BuiltinsX86_64.def
+++ b/clang/include/clang/Basic/BuiltinsX86_64.def
@@ -128,6 +128,12 @@ TARGET_BUILTIN(__builtin_ia32_tdpbf16ps_internal, 
"V256iUsUsUsV256iV256iV256i",
 TARGET_BUILTIN(__builtin_ia32_tdpfp16ps_internal, 
"V256iUsUsUsV256iV256iV256i", "n", "amx-fp16")
 TARGET_BUILTIN(__builtin_ia32_tcmmimfp16ps_internal, 
"V256iUsUsUsV256iV256iV256i", "n", "amx-complex")
 TARGET_BUILTIN(__builtin_ia32_tcmmrlfp16ps_internal, 
"V256iUsUsUsV256iV256iV256i", "n", "amx-complex")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowd2ps_internal, "V16fUsUsV256iUi", "n", 
"amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2pbf16h_internal, "V32yUsUsV256iUi", 
"n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2pbf16l_internal, "V32yUsUsV256iUi", 
"n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2phh_internal, "V32xUsUsV256iUi", "n", 
"amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2phl_internal, "V32xUsUsV256iUi", "n", 
"amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tilemovrow_internal, "V16iUsUsV256iUi", "n", 
"amx-avx512")
 // AMX
 TARGET_BUILTIN(__builtin_ia32_tile_loadconfig, "vvC*", "n", "amx-tile")
 TARGET_BUILTIN(__builtin_ia32_tile_storeconfig, "vvC*", "n", "amx-tile")
@@ -148,6 +154,13 @@ TARGET_BUILTIN(__builtin_ia32_ptwrite64, "vUOi", "n", 
"ptwrite")
 TARGET_BUILTIN(__builtin_ia32_tcmmimfp16ps, "vIUcIUcIUc", "n", "amx-complex")
 TARGET_BUILTIN(__builtin_ia32_tcmmrlfp16ps, "vIUcIUcIUc", "n", "amx-complex")
 
+TARGET_BUILTIN(__builtin_ia32_tcvtrowd2ps, "V16fIUcUi", "n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2pbf16h, "V32yIUcUi", "n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2pbf16l, "V32yIUcUi", "n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2phh, "V32xIUcUi", "n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2phl, "V32xIUcUi", "n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tilemovrow, "V16iIUcUi", "n", "amx-avx512")
+
 TARGET_BUILTIN(__builtin_ia32_prefetchi, "vvC*Ui", "nc", "prefetchi")
 TARGET_BUILTIN(__builtin_ia32_cmpccxadd32, "Siv*SiSiIi", "n", "cmpccxadd")
 TARGET_BUILTIN(__builtin_ia32_cmpccxadd64, "SLLiv*SLLiSLLiIi", "n", 
"cmpccxadd")
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
inde

[clang] [llvm] [X86][AMX] Support AMX-AVX512 (PR #114070)

2024-10-29 Thread Phoebe Wang via cfe-commits

https://github.com/phoebewang created 
https://github.com/llvm/llvm-project/pull/114070

None

>From 587d0105e7724db0f35fc5c8179519fa6319e5c8 Mon Sep 17 00:00:00 2001
From: "Wang, Phoebe" 
Date: Tue, 29 Oct 2024 22:29:25 +0800
Subject: [PATCH] [X86][AMX] Support AMX-AVX512

---
 clang/docs/ReleaseNotes.rst   |   2 +
 clang/include/clang/Basic/BuiltinsX86_64.def  |  13 +
 clang/include/clang/Driver/Options.td |   2 +
 clang/lib/Basic/Targets/X86.cpp   |   6 +
 clang/lib/Basic/Targets/X86.h |   1 +
 clang/lib/Headers/CMakeLists.txt  |   1 +
 clang/lib/Headers/amxavx512intrin.h   | 381 ++
 clang/lib/Headers/immintrin.h |   4 +
 clang/lib/Sema/SemaX86.cpp|   6 +
 clang/test/CodeGen/X86/amx_avx512_api.c   |  52 +++
 clang/test/CodeGen/X86/amxavx512-builtins.c   |  41 ++
 clang/test/CodeGen/attr-target-x86.c  |   8 +-
 clang/test/Driver/x86-target-features.c   |   7 +
 clang/test/Preprocessor/x86_target_features.c |   7 +
 llvm/include/llvm/IR/IntrinsicsX86.td |  50 +++
 .../llvm/TargetParser/X86TargetParser.def |   1 +
 llvm/lib/Target/X86/X86.td|   4 +
 llvm/lib/Target/X86/X86ExpandPseudo.cpp   |  64 ++-
 llvm/lib/Target/X86/X86ISelLowering.cpp   |  76 
 llvm/lib/Target/X86/X86InstrAMX.td| 147 +++
 llvm/lib/Target/X86/X86InstrPredicates.td |   1 +
 llvm/lib/Target/X86/X86LowerAMXType.cpp   |  11 +
 llvm/lib/Target/X86/X86PreTileConfig.cpp  |  19 +-
 llvm/lib/TargetParser/Host.cpp|   4 +
 llvm/lib/TargetParser/X86TargetParser.cpp |   2 +
 .../CodeGen/X86/amx-across-func-tilemovrow.ll | 171 
 .../test/CodeGen/X86/amx-avx512-intrinsics.ll | 116 ++
 .../CodeGen/X86/amx-tile-avx512-internals.ll  |  61 +++
 llvm/test/MC/Disassembler/X86/amx-avx512.txt  | 106 +
 llvm/test/MC/X86/amx-avx512-att.s | 105 +
 llvm/test/MC/X86/amx-avx512-intel.s   | 105 +
 31 files changed, 1564 insertions(+), 10 deletions(-)
 create mode 100644 clang/lib/Headers/amxavx512intrin.h
 create mode 100644 clang/test/CodeGen/X86/amx_avx512_api.c
 create mode 100644 clang/test/CodeGen/X86/amxavx512-builtins.c
 create mode 100644 llvm/test/CodeGen/X86/amx-across-func-tilemovrow.ll
 create mode 100644 llvm/test/CodeGen/X86/amx-avx512-intrinsics.ll
 create mode 100644 llvm/test/CodeGen/X86/amx-tile-avx512-internals.ll
 create mode 100644 llvm/test/MC/Disassembler/X86/amx-avx512.txt
 create mode 100644 llvm/test/MC/X86/amx-avx512-att.s
 create mode 100644 llvm/test/MC/X86/amx-avx512-intel.s

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index ce046a305c89b6..d45bd1240dd173 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -611,6 +611,8 @@ X86 Support
   * Supported MINMAX intrinsics of ``*_(mask(z)))_minmax(ne)_p[s|d|h|bh]`` and
   ``*_(mask(z)))_minmax_s[s|d|h]``.
 
+- Support ISA of ``AMX-AVX512``.
+
 - All intrinsics in adcintrin.h can now be used in constant expressions.
 
 - All intrinsics in adxintrin.h can now be used in constant expressions.
diff --git a/clang/include/clang/Basic/BuiltinsX86_64.def 
b/clang/include/clang/Basic/BuiltinsX86_64.def
index 2c591edb2835cd..70644f3f6b6054 100644
--- a/clang/include/clang/Basic/BuiltinsX86_64.def
+++ b/clang/include/clang/Basic/BuiltinsX86_64.def
@@ -128,6 +128,12 @@ TARGET_BUILTIN(__builtin_ia32_tdpbf16ps_internal, 
"V256iUsUsUsV256iV256iV256i",
 TARGET_BUILTIN(__builtin_ia32_tdpfp16ps_internal, 
"V256iUsUsUsV256iV256iV256i", "n", "amx-fp16")
 TARGET_BUILTIN(__builtin_ia32_tcmmimfp16ps_internal, 
"V256iUsUsUsV256iV256iV256i", "n", "amx-complex")
 TARGET_BUILTIN(__builtin_ia32_tcmmrlfp16ps_internal, 
"V256iUsUsUsV256iV256iV256i", "n", "amx-complex")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowd2ps_internal, "V16fUsUsV256iUi", "n", 
"amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2pbf16h_internal, "V32yUsUsV256iUi", 
"n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2pbf16l_internal, "V32yUsUsV256iUi", 
"n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2phh_internal, "V32xUsUsV256iUi", "n", 
"amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2phl_internal, "V32xUsUsV256iUi", "n", 
"amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tilemovrow_internal, "V16iUsUsV256iUi", "n", 
"amx-avx512")
 // AMX
 TARGET_BUILTIN(__builtin_ia32_tile_loadconfig, "vvC*", "n", "amx-tile")
 TARGET_BUILTIN(__builtin_ia32_tile_storeconfig, "vvC*", "n", "amx-tile")
@@ -148,6 +154,13 @@ TARGET_BUILTIN(__builtin_ia32_ptwrite64, "vUOi", "n", 
"ptwrite")
 TARGET_BUILTIN(__builtin_ia32_tcmmimfp16ps, "vIUcIUcIUc", "n", "amx-complex")
 TARGET_BUILTIN(__builtin_ia32_tcmmrlfp16ps, "vIUcIUcIUc", "n", "amx-complex")
 
+TARGET_BUILTIN(__builtin_ia32_tcvtrowd2ps, "V16fIUcUi", "n", "amx-avx512")
+TARGET_BUILTIN(__builtin_ia32_tcvtrowps2pbf16h, "V32yIUcUi", "n", "amx-avx512")
+TARGET_BUILTIN(__builtin