[PATCH] D77871: [AArch64] Armv8.6-a Matrix Mult Assembly + Intrinsics
This revision was automatically updated to reflect the committed changes.
Closed by commit rG832cd749131b: [AArch64] Armv8.6-a Matrix Mult Assembly + Intrinsics (authored by LukeGeeson).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77871/new/

https://reviews.llvm.org/D77871

Files:
  clang/include/clang/Basic/arm_neon.td
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Basic/Targets/AArch64.h
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-matmul.cpp
  clang/test/CodeGen/aarch64-v8.6a-neon-intrinsics.c
  llvm/include/llvm/IR/IntrinsicsAArch64.td
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64InstrFormats.td
  llvm/lib/Target/AArch64/AArch64InstrInfo.td
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/test/CodeGen/AArch64/aarch64-matmul.ll
  llvm/test/MC/AArch64/armv8.6a-simd-matmul-error.s
  llvm/test/MC/AArch64/armv8.6a-simd-matmul.s
  llvm/test/MC/Disassembler/AArch64/armv8.6a-simd-matmul.txt

Index: llvm/test/MC/Disassembler/AArch64/armv8.6a-simd-matmul.txt
===
--- /dev/null
+++ llvm/test/MC/Disassembler/AArch64/armv8.6a-simd-matmul.txt
@@ -0,0 +1,34 @@
+# RUN: llvm-mc -triple=aarch64 -mattr=+i8mm -disassemble < %s | FileCheck %s
+# RUN: llvm-mc -triple=aarch64 -mattr=+v8.6a -disassemble < %s | FileCheck %s
+# RUN: not llvm-mc -triple=aarch64 -mattr=+v8.5a -disassemble < %s 2>&1 | FileCheck %s --check-prefix=NOI8MM
+
+[0x01,0xa6,0x9f,0x4e]
+[0x01,0xa6,0x9f,0x6e]
+[0x01,0xae,0x9f,0x4e]
+# CHECK: smmla v1.4s, v16.16b, v31.16b
+# CHECK: ummla v1.4s, v16.16b, v31.16b
+# CHECK: usmmla v1.4s, v16.16b, v31.16b
+# NOI8MM: [[@LINE-6]]:{{[0-9]+}}: warning: invalid instruction encoding
+# NOI8MM: [[@LINE-6]]:{{[0-9]+}}: warning: invalid instruction encoding
+# NOI8MM: [[@LINE-6]]:{{[0-9]+}}: warning: invalid instruction encoding
+
+[0xe3,0x9d,0x9e,0x0e]
+[0xe3,0x9d,0x9e,0x4e]
+# CHECK: usdot v3.2s, v15.8b, v30.8b
+# CHECK: usdot v3.4s, v15.16b, v30.16b
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
+
+[0x3f,0xf8,0xa2,0x0f]
+[0x3f,0xf8,0xa2,0x4f]
+# CHECK: usdot v31.2s, v1.8b, v2.4b[3]
+# CHECK: usdot v31.4s, v1.16b, v2.4b[3]
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
+
+[0x3f,0xf8,0x22,0x0f]
+[0x3f,0xf8,0x22,0x4f]
+# CHECK: sudot v31.2s, v1.8b, v2.4b[3]
+# CHECK: sudot v31.4s, v1.16b, v2.4b[3]
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
Index: llvm/test/MC/AArch64/armv8.6a-simd-matmul.s
===
--- /dev/null
+++ llvm/test/MC/AArch64/armv8.6a-simd-matmul.s
@@ -0,0 +1,43 @@
+// RUN: llvm-mc -triple aarch64 -show-encoding -mattr=+i8mm < %s | FileCheck %s
+// RUN: llvm-mc -triple aarch64 -show-encoding -mattr=+v8.6a < %s | FileCheck %s
+// RUN: not llvm-mc -triple aarch64 -show-encoding -mattr=+v8.6a-i8mm < %s 2>&1 | FileCheck %s --check-prefix=NOMATMUL
+
+smmla v1.4s, v16.16b, v31.16b
+ummla v1.4s, v16.16b, v31.16b
+usmmla v1.4s, v16.16b, v31.16b
+// CHECK: smmla v1.4s, v16.16b, v31.16b // encoding: [0x01,0xa6,0x9f,0x4e]
+// CHECK: ummla v1.4s, v16.16b, v31.16b // encoding: [0x01,0xa6,0x9f,0x6e]
+// CHECK: usmmla v1.4s, v16.16b, v31.16b // encoding: [0x01,0xae,0x9f,0x4e]
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: smmla v1.4s, v16.16b, v31.16b
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: ummla v1.4s, v16.16b, v31.16b
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: usmmla v1.4s, v16.16b, v31.16b
+
+usdot v3.2s, v15.8b, v30.8b
+usdot v3.4s, v15.16b, v30.16b
+// CHECK: usdot v3.2s, v15.8b, v30.8b // encoding: [0xe3,0x9d,0x9e,0x0e]
+// CHECK: usdot v3.4s, v15.16b, v30.16b // encoding: [0xe3,0x9d,0x9e,0x4e]
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: usdot v3.2s, v15.8b, v30.8b
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: usdot v3.4s, v15.16b, v30.16b
+
+usdot v31.2s, v1.8b, v2.4b[3]
+usdot v31.4s, v1.16b, v2.4b[3]
+// CHECK: usdot v31.2s, v1.8b, v2.4b[3] // encoding: [0x3f,0xf8,0xa2,0x0f]
+// CHECK: usdot v31.4s, v1.16b, v2.4b[3] // encoding: [0x3f,0xf8,0xa2,0x4f]
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: usdot v31.2s, v1.8b, v2.4b[3]
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: usdot v31.4s, v1.16b, v2.4b[3]
+
+sudot v31.2s, v1.8b, v2.4b[3]
+sudot v31.4s, v1.16b, v2.4b[3]
+// CHECK: sudot v31.2s, v1.8b, v2.4b[3] // encoding: [0x3f,0xf8,0x22,0x0f]
+// CHECK: sudot v31.4s, v1.16b, v2.4b[3] // encoding: [0x3f,0xf8,0x22,0x4f]
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: sudot v31.2s, v1.8b,
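
To make the encoding bytes in the tests above easier to read, here is a small Python sketch that pulls the register numbers, the Q bit and the lane index out of a four-byte little-endian encoding. The bit positions (Rd in bits 4:0, Rn in bits 9:5, Rm in bits 20:16, Q in bit 30, index as H:L from bits 11 and 21) are an assumption based on the usual A64 Advanced SIMD layout and are not stated anywhere in this thread; the helper names are hypothetical. Under that assumption the fields agree with the operands in the CHECK lines.

```python
def decode_fields(encoding):
    """Extract register and Q fields from four little-endian encoding bytes.

    Assumed layout (not taken from the patch): Rd = bits [4:0],
    Rn = bits [9:5], Rm = bits [20:16], Q = bit 30.
    """
    word = int.from_bytes(bytes(encoding), "little")
    return {
        "rd": word & 0x1F,
        "rn": (word >> 5) & 0x1F,
        "rm": (word >> 16) & 0x1F,
        "q": (word >> 30) & 0x1,
    }


def lane_index(encoding):
    """Lane index of the by-element forms, assumed to be H:L (bits 11 and 21)."""
    word = int.from_bytes(bytes(encoding), "little")
    high = (word >> 11) & 0x1
    low = (word >> 21) & 0x1
    return (high << 1) | low


if __name__ == "__main__":
    # usdot v3.2s, v15.8b, v30.8b and its 128-bit counterpart differ only in Q.
    print(decode_fields([0xE3, 0x9D, 0x9E, 0x0E]))
    print(decode_fields([0xE3, 0x9D, 0x9E, 0x4E]))
    # usdot v31.2s, v1.8b, v2.4b[3]
    print(lane_index([0x3F, 0xF8, 0xA2, 0x0F]))
```

Read this way, the `0x0e`/`0x4e` pairs in the tests are the 64-bit and 128-bit arrangements of the same instruction.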
[PATCH] D77871: [AArch64] Armv8.6-a Matrix Mult Assembly + Intrinsics
kmclaughlin accepted this revision.
kmclaughlin added a comment.
This revision is now accepted and ready to land.

Thanks for the updates, @LukeGeeson, LGTM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77871/new/

https://reviews.llvm.org/D77871

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D77871: [AArch64] Armv8.6-a Matrix Mult Assembly + Intrinsics
LukeGeeson updated this revision to Diff 259327.
LukeGeeson marked 5 inline comments as done.
LukeGeeson added a comment.

- fixed typos
- added sroa as mem2reg arg to reduce redundant mem accesses in tests, refactored test
- addressed other comments

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77871/new/

https://reviews.llvm.org/D77871

Files:
  clang/include/clang/Basic/arm_neon.td
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Basic/Targets/AArch64.h
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-matmul.cpp
  clang/test/CodeGen/aarch64-v8.6a-neon-intrinsics.c
  llvm/include/llvm/IR/IntrinsicsAArch64.td
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64InstrFormats.td
  llvm/lib/Target/AArch64/AArch64InstrInfo.td
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/test/CodeGen/AArch64/aarch64-matmul.ll
  llvm/test/MC/AArch64/armv8.6a-simd-matmul-error.s
  llvm/test/MC/AArch64/armv8.6a-simd-matmul.s
  llvm/test/MC/Disassembler/AArch64/armv8.6a-simd-matmul.txt

Index: llvm/test/MC/Disassembler/AArch64/armv8.6a-simd-matmul.txt
===
--- /dev/null
+++ llvm/test/MC/Disassembler/AArch64/armv8.6a-simd-matmul.txt
@@ -0,0 +1,34 @@
+# RUN: llvm-mc -triple=aarch64 -mattr=+i8mm -disassemble < %s | FileCheck %s
+# RUN: llvm-mc -triple=aarch64 -mattr=+v8.6a -disassemble < %s | FileCheck %s
+# RUN: not llvm-mc -triple=aarch64 -mattr=+v8.5a -disassemble < %s 2>&1 | FileCheck %s --check-prefix=NOI8MM
+
+[0x01,0xa6,0x9f,0x4e]
+[0x01,0xa6,0x9f,0x6e]
+[0x01,0xae,0x9f,0x4e]
+# CHECK: smmla v1.4s, v16.16b, v31.16b
+# CHECK: ummla v1.4s, v16.16b, v31.16b
+# CHECK: usmmla v1.4s, v16.16b, v31.16b
+# NOI8MM: [[@LINE-6]]:{{[0-9]+}}: warning: invalid instruction encoding
+# NOI8MM: [[@LINE-6]]:{{[0-9]+}}: warning: invalid instruction encoding
+# NOI8MM: [[@LINE-6]]:{{[0-9]+}}: warning: invalid instruction encoding
+
+[0xe3,0x9d,0x9e,0x0e]
+[0xe3,0x9d,0x9e,0x4e]
+# CHECK: usdot v3.2s, v15.8b, v30.8b
+# CHECK: usdot v3.4s, v15.16b, v30.16b
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
+
+[0x3f,0xf8,0xa2,0x0f]
+[0x3f,0xf8,0xa2,0x4f]
+# CHECK: usdot v31.2s, v1.8b, v2.4b[3]
+# CHECK: usdot v31.4s, v1.16b, v2.4b[3]
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
+
+[0x3f,0xf8,0x22,0x0f]
+[0x3f,0xf8,0x22,0x4f]
+# CHECK: sudot v31.2s, v1.8b, v2.4b[3]
+# CHECK: sudot v31.4s, v1.16b, v2.4b[3]
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
Index: llvm/test/MC/AArch64/armv8.6a-simd-matmul.s
===
--- /dev/null
+++ llvm/test/MC/AArch64/armv8.6a-simd-matmul.s
@@ -0,0 +1,43 @@
+// RUN: llvm-mc -triple aarch64 -show-encoding -mattr=+i8mm < %s | FileCheck %s
+// RUN: llvm-mc -triple aarch64 -show-encoding -mattr=+v8.6a < %s | FileCheck %s
+// RUN: not llvm-mc -triple aarch64 -show-encoding -mattr=+v8.6a-i8mm < %s 2>&1 | FileCheck %s --check-prefix=NOMATMUL
+
+smmla v1.4s, v16.16b, v31.16b
+ummla v1.4s, v16.16b, v31.16b
+usmmla v1.4s, v16.16b, v31.16b
+// CHECK: smmla v1.4s, v16.16b, v31.16b // encoding: [0x01,0xa6,0x9f,0x4e]
+// CHECK: ummla v1.4s, v16.16b, v31.16b // encoding: [0x01,0xa6,0x9f,0x6e]
+// CHECK: usmmla v1.4s, v16.16b, v31.16b // encoding: [0x01,0xae,0x9f,0x4e]
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: smmla v1.4s, v16.16b, v31.16b
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: ummla v1.4s, v16.16b, v31.16b
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: usmmla v1.4s, v16.16b, v31.16b
+
+usdot v3.2s, v15.8b, v30.8b
+usdot v3.4s, v15.16b, v30.16b
+// CHECK: usdot v3.2s, v15.8b, v30.8b // encoding: [0xe3,0x9d,0x9e,0x0e]
+// CHECK: usdot v3.4s, v15.16b, v30.16b // encoding: [0xe3,0x9d,0x9e,0x4e]
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: usdot v3.2s, v15.8b, v30.8b
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: usdot v3.4s, v15.16b, v30.16b
+
+usdot v31.2s, v1.8b, v2.4b[3]
+usdot v31.4s, v1.16b, v2.4b[3]
+// CHECK: usdot v31.2s, v1.8b, v2.4b[3] // encoding: [0x3f,0xf8,0xa2,0x0f]
+// CHECK: usdot v31.4s, v1.16b, v2.4b[3] // encoding: [0x3f,0xf8,0xa2,0x4f]
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: usdot v31.2s, v1.8b, v2.4b[3]
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: usdot v31.4s, v1.16b, v2.4b[3]
+
+sudot v31.2s, v1.8b, v2.4b[3]
+sudot v31.4s, v1.16b, v2.4b[3]
+// CHECK: sudot v31.2s, v1.8b, v2.4b[3] // encoding: [0x3f,0xf8,0x22,0x0f]
+// CHECK: sudot v31.4s, v1.16b, v2.4b[3] // encoding: [0x3f,0xf8,0x22,0x4f]
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT:
[PATCH] D77871: [AArch64] Armv8.6-a Matrix Mult Assembly + Intrinsics
kmclaughlin added inline comments.

Comment at: clang/test/CodeGen/aarch64-v8.6a-neon-intrinsics.c:3
+// RUN: -fallow-half-arguments-and-returns -S -disable-O0-optnone -emit-llvm -o - %s \
+// RUN: | opt -S -mem2reg \
+// RUN: | FileCheck %s

Is it possible to use -sroa here as you did for the tests added in D77872? If so, I think this might make some of the `_lane` tests below a bit easier to follow.

Comment at: llvm/test/MC/AArch64/armv8.6a-simd-matmul-error.s:17
+// For USDOT and SUDOT (indexed), the index is in range [0,3] (regardless of data types)
+usdot v31.2s, v1.8b, v2.4b[4]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 3].

The arrangement specifiers of the first two operands don't match for these tests, which is what the next set of tests below is checking for. It might be worth keeping these tests specific to just the index being out of range.

Comment at: llvm/test/MC/AArch64/armv8.6a-simd-matmul-error.s:26
+
+// The arrangement specifiers of the first two operands muct match.
+usdot v31.4s, v1.8b, v2.4b[0]

muct -> must :)

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77871/new/

https://reviews.llvm.org/D77871
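
The two constraints discussed in these comments (lane index in [0, 3], and matching arrangements for the first two operands) can be illustrated with a scalar reference model of the indexed USDOT form. This is only a sketch in Python under my own assumptions about the semantics: each 32-bit accumulator lane adds the dot product of four unsigned bytes from the second operand with the four signed bytes selected by the lane index. The function names are hypothetical and are not part of the patch.

```python
def wrap_i32(value):
    """Wrap an arbitrary Python int to a signed 32-bit value."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def usdot_lane(acc, a, b, lane):
    """Scalar model of `usdot Vd.<2s|4s>, Vn.<8b|16b>, Vm.4b[lane]`.

    acc:  2 or 4 signed 32-bit accumulators (the .2s / .4s operand)
    a:    8 or 16 unsigned bytes (the .8b / .16b operand)
    b:    16 signed bytes (the full register that .4b[lane] indexes into)
    """
    if not 0 <= lane <= 3:
        raise ValueError("vector lane must be an integer in range [0, 3]")
    if len(acc) not in (2, 4) or len(a) != 4 * len(acc):
        raise ValueError("arrangement specifiers of the first two operands must match")
    if len(b) != 16:
        raise ValueError("indexed operand must hold 16 bytes")
    group = b[4 * lane:4 * lane + 4]
    return [
        wrap_i32(acc[i] + sum(a[4 * i + k] * group[k] for k in range(4)))
        for i in range(len(acc))
    ]


if __name__ == "__main__":
    signed_bytes = list(range(-8, 8))
    print(usdot_lane([0, 0], [1, 2, 3, 4, 255, 255, 255, 255], signed_bytes, 3))
```

Keeping the two checks separate in the model mirrors the suggestion above: a test for an out-of-range index should use operands whose arrangements already match, so that only one diagnostic can fire.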
[PATCH] D77871: [AArch64] Armv8.6-a Matrix Mult Assembly + Intrinsics
LukeGeeson added a comment.

Removed the reliance on the parent revision; Harbormaster now builds with the unit tests passing.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77871/new/

https://reviews.llvm.org/D77871
[PATCH] D77871: [AArch64] Armv8.6-a Matrix Mult Assembly + Intrinsics
LukeGeeson created this revision.
LukeGeeson added reviewers: ostannard, t.p.northover.
Herald added subscribers: cfe-commits, danielkiss, hiraditya, kristof.beyls.
Herald added a reviewer: rengolin.
Herald added a project: clang.
LukeGeeson added a parent revision: D77540: [PATCH] [ARM]: Armv8.6-a Matrix Mul Asm and Intrinsics Support.
LukeGeeson added a child revision: D77872: [AArch32] Armv8.6-a Matrix Mult Assembly + Intrinsics.

This patch upstreams support for the Armv8.6-a Matrix Multiplication Extension. A summary of the features can be found here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

This patch includes:

- Assembly support for AArch64 only (no SVE or Neon)
- Intrinsics Support for AArch64 Armv8.6a Matrix Multiplication Instructions (No bfloat16 matrix multiplication)

No IR types or C Types are needed for this extension.

This is part of a patch series, starting with BFloat16 support and the other components in the armv8.6a extension (in previous patches linked in phabricator)

Based on work by:

- Luke Geeson
- Oliver Stannard
- Luke Cheeseman

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D77871

Files:
  clang/include/clang/Basic/arm_neon.td
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Basic/Targets/AArch64.h
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-matmul.cpp
  clang/test/CodeGen/aarch64-v8.6a-neon-intrinsics.c
  llvm/include/llvm/IR/IntrinsicsAArch64.td
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64InstrFormats.td
  llvm/lib/Target/AArch64/AArch64InstrInfo.td
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/test/CodeGen/AArch64/aarch64-matmul.ll
  llvm/test/MC/AArch64/armv8.6a-simd-matmul-error.s
  llvm/test/MC/AArch64/armv8.6a-simd-matmul.s
  llvm/test/MC/Disassembler/AArch64/armv8.6a-simd-matmul.txt

Index: llvm/test/MC/Disassembler/AArch64/armv8.6a-simd-matmul.txt
===
--- /dev/null
+++ llvm/test/MC/Disassembler/AArch64/armv8.6a-simd-matmul.txt
@@ -0,0 +1,34 @@
+# RUN: llvm-mc -triple=aarch64 -mattr=+i8mm -disassemble < %s | FileCheck %s
+# RUN: llvm-mc -triple=aarch64 -mattr=+v8.6a -disassemble < %s | FileCheck %s
+# RUN: not llvm-mc -triple=aarch64 -mattr=+v8.5a -disassemble < %s 2>&1 | FileCheck %s --check-prefix=NOI8MM
+
+[0x01,0xa6,0x9f,0x4e]
+[0x01,0xa6,0x9f,0x6e]
+[0x01,0xae,0x9f,0x4e]
+# CHECK: smmla v1.4s, v16.16b, v31.16b
+# CHECK: ummla v1.4s, v16.16b, v31.16b
+# CHECK: usmmla v1.4s, v16.16b, v31.16b
+# NOI8MM: [[@LINE-6]]:{{[0-9]+}}: warning: invalid instruction encoding
+# NOI8MM: [[@LINE-6]]:{{[0-9]+}}: warning: invalid instruction encoding
+# NOI8MM: [[@LINE-6]]:{{[0-9]+}}: warning: invalid instruction encoding
+
+[0xe3,0x9d,0x9e,0x0e]
+[0xe3,0x9d,0x9e,0x4e]
+# CHECK: usdot v3.2s, v15.8b, v30.8b
+# CHECK: usdot v3.4s, v15.16b, v30.16b
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
+
+[0x3f,0xf8,0xa2,0x0f]
+[0x3f,0xf8,0xa2,0x4f]
+# CHECK: usdot v31.2s, v1.8b, v2.4b[3]
+# CHECK: usdot v31.4s, v1.16b, v2.4b[3]
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
+
+[0x3f,0xf8,0x22,0x0f]
+[0x3f,0xf8,0x22,0x4f]
+# CHECK: sudot v31.2s, v1.8b, v2.4b[3]
+# CHECK: sudot v31.4s, v1.16b, v2.4b[3]
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
Index: llvm/test/MC/AArch64/armv8.6a-simd-matmul.s
===
--- /dev/null
+++ llvm/test/MC/AArch64/armv8.6a-simd-matmul.s
@@ -0,0 +1,43 @@
+// RUN: llvm-mc -triple aarch64 -show-encoding -mattr=+i8mm < %s | FileCheck %s
+// RUN: llvm-mc -triple aarch64 -show-encoding -mattr=+v8.6a < %s | FileCheck %s
+// RUN: not llvm-mc -triple aarch64 -show-encoding -mattr=+v8.6a-i8mm < %s 2>&1 | FileCheck %s --check-prefix=NOMATMUL
+
+smmla v1.4s, v16.16b, v31.16b
+ummla v1.4s, v16.16b, v31.16b
+usmmla v1.4s, v16.16b, v31.16b
+// CHECK: smmla v1.4s, v16.16b, v31.16b // encoding: [0x01,0xa6,0x9f,0x4e]
+// CHECK: ummla v1.4s, v16.16b, v31.16b // encoding: [0x01,0xa6,0x9f,0x6e]
+// CHECK: usmmla v1.4s, v16.16b, v31.16b // encoding: [0x01,0xae,0x9f,0x4e]
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: smmla v1.4s, v16.16b, v31.16b
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: ummla v1.4s, v16.16b, v31.16b
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: usmmla v1.4s, v16.16b, v31.16b
+
+usdot v3.2s, v15.8b, v30.8b
+usdot v3.4s, v15.16b, v30.16b
+// CHECK: usdot v3.2s, v15.8b, v30.8b // encoding: [0xe3,0x9d,0x9e,0x0e]
+// CHECK: usdot v3.4s, v15.16b, v30.16b // encoding:
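
For readers unfamiliar with the matrix multiply-accumulate instructions this patch adds, the following Python sketch models SMMLA, UMMLA and USMMLA in scalar form. It reflects my understanding of the Armv8.6-a I8MM semantics rather than anything stated in the patch: the first source register is read as a row-major 2x8 matrix of bytes, the second as another 2x8 matrix whose rows act as the columns of the right-hand operand, and the 2x2 product is added to four 32-bit accumulators. All function names are hypothetical.

```python
def wrap_i32(value):
    """Wrap an arbitrary Python int to a signed 32-bit value."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def as_signed_byte(raw):
    """Reinterpret a raw byte (0..255) as a signed 8-bit value."""
    return raw - 256 if raw >= 128 else raw


def mmla(acc, n, m):
    """acc[2*i + j] += sum over k of n[8*i + k] * m[8*j + k], for i, j in {0, 1}."""
    if len(acc) != 4 or len(n) != 16 or len(m) != 16:
        raise ValueError("expected 4 accumulators and two 16-byte sources")
    out = list(acc)
    for i in range(2):
        for j in range(2):
            dot = sum(n[8 * i + k] * m[8 * j + k] for k in range(8))
            out[2 * i + j] = wrap_i32(acc[2 * i + j] + dot)
    return out


def smmla(acc, n_raw, m_raw):
    """Both sources are treated as signed bytes."""
    return mmla(acc, [as_signed_byte(x) for x in n_raw],
                [as_signed_byte(x) for x in m_raw])


def ummla(acc, n_raw, m_raw):
    """Both sources are treated as unsigned bytes."""
    return mmla(acc, list(n_raw), list(m_raw))


def usmmla(acc, n_raw, m_raw):
    """First source unsigned, second source signed."""
    return mmla(acc, list(n_raw), [as_signed_byte(x) for x in m_raw])


if __name__ == "__main__":
    n_raw = [0xFF] * 8 + [0] * 8
    m_raw = [1] * 16
    print(smmla([0, 0, 0, 0], n_raw, m_raw))
    print(usmmla([0, 0, 0, 0], n_raw, m_raw))
```

The three variants differ only in how the raw bytes are interpreted, which is why the model shares one `mmla` core and applies the signedness in thin wrappers.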