[PATCH] D77871: [AArch64] Armv8.6-a Matrix Mult Assembly + Intrinsics

2020-04-24 Thread Luke Geeson via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG832cd749131b: [AArch64] Armv8.6-a Matrix Mult Assembly + 
Intrinsics (authored by LukeGeeson).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77871/new/

https://reviews.llvm.org/D77871

Files:
  clang/include/clang/Basic/arm_neon.td
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Basic/Targets/AArch64.h
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-matmul.cpp
  clang/test/CodeGen/aarch64-v8.6a-neon-intrinsics.c
  llvm/include/llvm/IR/IntrinsicsAArch64.td
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64InstrFormats.td
  llvm/lib/Target/AArch64/AArch64InstrInfo.td
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/test/CodeGen/AArch64/aarch64-matmul.ll
  llvm/test/MC/AArch64/armv8.6a-simd-matmul-error.s
  llvm/test/MC/AArch64/armv8.6a-simd-matmul.s
  llvm/test/MC/Disassembler/AArch64/armv8.6a-simd-matmul.txt

Index: llvm/test/MC/Disassembler/AArch64/armv8.6a-simd-matmul.txt
===
--- /dev/null
+++ llvm/test/MC/Disassembler/AArch64/armv8.6a-simd-matmul.txt
@@ -0,0 +1,34 @@
+# RUN: llvm-mc -triple=aarch64  -mattr=+i8mm -disassemble < %s   | FileCheck %s
+# RUN: llvm-mc -triple=aarch64  -mattr=+v8.6a -disassemble < %s  | FileCheck %s
+# RUN: not llvm-mc -triple=aarch64  -mattr=+v8.5a -disassemble < %s 2>&1 | FileCheck %s --check-prefix=NOI8MM
+
+[0x01,0xa6,0x9f,0x4e]
+[0x01,0xa6,0x9f,0x6e]
+[0x01,0xae,0x9f,0x4e]
+# CHECK: smmla   v1.4s, v16.16b, v31.16b
+# CHECK: ummla   v1.4s, v16.16b, v31.16b
+# CHECK: usmmla  v1.4s, v16.16b, v31.16b
+# NOI8MM: [[@LINE-6]]:{{[0-9]+}}: warning: invalid instruction encoding
+# NOI8MM: [[@LINE-6]]:{{[0-9]+}}: warning: invalid instruction encoding
+# NOI8MM: [[@LINE-6]]:{{[0-9]+}}: warning: invalid instruction encoding
+
+[0xe3,0x9d,0x9e,0x0e]
+[0xe3,0x9d,0x9e,0x4e]
+# CHECK: usdot   v3.2s, v15.8b, v30.8b
+# CHECK: usdot   v3.4s, v15.16b, v30.16b
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
+
+[0x3f,0xf8,0xa2,0x0f]
+[0x3f,0xf8,0xa2,0x4f]
+# CHECK: usdot   v31.2s, v1.8b, v2.4b[3]
+# CHECK: usdot   v31.4s, v1.16b, v2.4b[3]
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
+
+[0x3f,0xf8,0x22,0x0f]
+[0x3f,0xf8,0x22,0x4f]
+# CHECK: sudot   v31.2s, v1.8b, v2.4b[3]
+# CHECK: sudot   v31.4s, v1.16b, v2.4b[3]
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
Index: llvm/test/MC/AArch64/armv8.6a-simd-matmul.s
===
--- /dev/null
+++ llvm/test/MC/AArch64/armv8.6a-simd-matmul.s
@@ -0,0 +1,43 @@
+// RUN: llvm-mc -triple aarch64 -show-encoding -mattr=+i8mm   < %s  | FileCheck %s
+// RUN: llvm-mc -triple aarch64 -show-encoding -mattr=+v8.6a  < %s  | FileCheck %s
+// RUN: not llvm-mc -triple aarch64 -show-encoding -mattr=+v8.6a-i8mm < %s 2>&1 | FileCheck %s --check-prefix=NOMATMUL
+
+smmla  v1.4s, v16.16b, v31.16b
+ummla  v1.4s, v16.16b, v31.16b
+usmmla v1.4s, v16.16b, v31.16b
+// CHECK: smmla   v1.4s, v16.16b, v31.16b // encoding: [0x01,0xa6,0x9f,0x4e]
+// CHECK: ummla   v1.4s, v16.16b, v31.16b // encoding: [0x01,0xa6,0x9f,0x6e]
+// CHECK: usmmla  v1.4s, v16.16b, v31.16b // encoding: [0x01,0xae,0x9f,0x4e]
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: smmla  v1.4s, v16.16b, v31.16b
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: ummla  v1.4s, v16.16b, v31.16b
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: usmmla  v1.4s, v16.16b, v31.16b
+
+usdot v3.2s, v15.8b, v30.8b
+usdot v3.4s, v15.16b, v30.16b
+// CHECK: usdot   v3.2s, v15.8b, v30.8b   // encoding: [0xe3,0x9d,0x9e,0x0e]
+// CHECK: usdot   v3.4s, v15.16b, v30.16b // encoding: [0xe3,0x9d,0x9e,0x4e]
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: usdot v3.2s, v15.8b, v30.8b
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: usdot v3.4s, v15.16b, v30.16b
+
+usdot v31.2s, v1.8b,  v2.4b[3]
+usdot v31.4s, v1.16b, v2.4b[3]
+// CHECK: usdot   v31.2s, v1.8b, v2.4b[3] // encoding: [0x3f,0xf8,0xa2,0x0f]
+// CHECK: usdot   v31.4s, v1.16b, v2.4b[3] // encoding: [0x3f,0xf8,0xa2,0x4f]
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: usdot   v31.2s, v1.8b, v2.4b[3]
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: usdot   v31.4s, v1.16b, v2.4b[3]
+
+sudot v31.2s, v1.8b,  v2.4b[3]
+sudot v31.4s, v1.16b, v2.4b[3]
+// CHECK: sudot   v31.2s, v1.8b, v2.4b[3] // encoding: [0x3f,0xf8,0x22,0x0f]
+// CHECK: sudot   v31.4s, v1.16b, v2.4b[3] // encoding: [0x3f,0xf8,0x22,0x4f]
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: sudot   v31.2s, v1.8b, 

[PATCH] D77871: [AArch64] Armv8.6-a Matrix Mult Assembly + Intrinsics

2020-04-23 Thread Kerry McLaughlin via Phabricator via cfe-commits
kmclaughlin accepted this revision.
kmclaughlin added a comment.
This revision is now accepted and ready to land.

Thanks for the updates, @LukeGeeson, LGTM


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77871/new/

https://reviews.llvm.org/D77871



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77871: [AArch64] Armv8.6-a Matrix Mult Assembly + Intrinsics

2020-04-22 Thread Luke Geeson via Phabricator via cfe-commits
LukeGeeson updated this revision to Diff 259327.
LukeGeeson marked 5 inline comments as done.
LukeGeeson added a comment.

- fixed typos
- added sroa as mem2reg arg to reduce redundant mem accesses in tests, 
refactored test
- addressed other comments


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77871/new/

https://reviews.llvm.org/D77871

Files:
  clang/include/clang/Basic/arm_neon.td
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Basic/Targets/AArch64.h
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-matmul.cpp
  clang/test/CodeGen/aarch64-v8.6a-neon-intrinsics.c
  llvm/include/llvm/IR/IntrinsicsAArch64.td
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64InstrFormats.td
  llvm/lib/Target/AArch64/AArch64InstrInfo.td
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/test/CodeGen/AArch64/aarch64-matmul.ll
  llvm/test/MC/AArch64/armv8.6a-simd-matmul-error.s
  llvm/test/MC/AArch64/armv8.6a-simd-matmul.s
  llvm/test/MC/Disassembler/AArch64/armv8.6a-simd-matmul.txt

Index: llvm/test/MC/Disassembler/AArch64/armv8.6a-simd-matmul.txt
===
--- /dev/null
+++ llvm/test/MC/Disassembler/AArch64/armv8.6a-simd-matmul.txt
@@ -0,0 +1,34 @@
+# RUN: llvm-mc -triple=aarch64  -mattr=+i8mm -disassemble < %s   | FileCheck %s
+# RUN: llvm-mc -triple=aarch64  -mattr=+v8.6a -disassemble < %s  | FileCheck %s
+# RUN: not llvm-mc -triple=aarch64  -mattr=+v8.5a -disassemble < %s 2>&1 | FileCheck %s --check-prefix=NOI8MM
+
+[0x01,0xa6,0x9f,0x4e]
+[0x01,0xa6,0x9f,0x6e]
+[0x01,0xae,0x9f,0x4e]
+# CHECK: smmla   v1.4s, v16.16b, v31.16b
+# CHECK: ummla   v1.4s, v16.16b, v31.16b
+# CHECK: usmmla  v1.4s, v16.16b, v31.16b
+# NOI8MM: [[@LINE-6]]:{{[0-9]+}}: warning: invalid instruction encoding
+# NOI8MM: [[@LINE-6]]:{{[0-9]+}}: warning: invalid instruction encoding
+# NOI8MM: [[@LINE-6]]:{{[0-9]+}}: warning: invalid instruction encoding
+
+[0xe3,0x9d,0x9e,0x0e]
+[0xe3,0x9d,0x9e,0x4e]
+# CHECK: usdot   v3.2s, v15.8b, v30.8b
+# CHECK: usdot   v3.4s, v15.16b, v30.16b
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
+
+[0x3f,0xf8,0xa2,0x0f]
+[0x3f,0xf8,0xa2,0x4f]
+# CHECK: usdot   v31.2s, v1.8b, v2.4b[3]
+# CHECK: usdot   v31.4s, v1.16b, v2.4b[3]
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
+
+[0x3f,0xf8,0x22,0x0f]
+[0x3f,0xf8,0x22,0x4f]
+# CHECK: sudot   v31.2s, v1.8b, v2.4b[3]
+# CHECK: sudot   v31.4s, v1.16b, v2.4b[3]
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
Index: llvm/test/MC/AArch64/armv8.6a-simd-matmul.s
===
--- /dev/null
+++ llvm/test/MC/AArch64/armv8.6a-simd-matmul.s
@@ -0,0 +1,43 @@
+// RUN: llvm-mc -triple aarch64 -show-encoding -mattr=+i8mm   < %s  | FileCheck %s
+// RUN: llvm-mc -triple aarch64 -show-encoding -mattr=+v8.6a  < %s  | FileCheck %s
+// RUN: not llvm-mc -triple aarch64 -show-encoding -mattr=+v8.6a-i8mm < %s 2>&1 | FileCheck %s --check-prefix=NOMATMUL
+
+smmla  v1.4s, v16.16b, v31.16b
+ummla  v1.4s, v16.16b, v31.16b
+usmmla v1.4s, v16.16b, v31.16b
+// CHECK: smmla   v1.4s, v16.16b, v31.16b // encoding: [0x01,0xa6,0x9f,0x4e]
+// CHECK: ummla   v1.4s, v16.16b, v31.16b // encoding: [0x01,0xa6,0x9f,0x6e]
+// CHECK: usmmla  v1.4s, v16.16b, v31.16b // encoding: [0x01,0xae,0x9f,0x4e]
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: smmla  v1.4s, v16.16b, v31.16b
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: ummla  v1.4s, v16.16b, v31.16b
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: usmmla  v1.4s, v16.16b, v31.16b
+
+usdot v3.2s, v15.8b, v30.8b
+usdot v3.4s, v15.16b, v30.16b
+// CHECK: usdot   v3.2s, v15.8b, v30.8b   // encoding: [0xe3,0x9d,0x9e,0x0e]
+// CHECK: usdot   v3.4s, v15.16b, v30.16b // encoding: [0xe3,0x9d,0x9e,0x4e]
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: usdot v3.2s, v15.8b, v30.8b
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: usdot v3.4s, v15.16b, v30.16b
+
+usdot v31.2s, v1.8b,  v2.4b[3]
+usdot v31.4s, v1.16b, v2.4b[3]
+// CHECK: usdot   v31.2s, v1.8b, v2.4b[3] // encoding: [0x3f,0xf8,0xa2,0x0f]
+// CHECK: usdot   v31.4s, v1.16b, v2.4b[3] // encoding: [0x3f,0xf8,0xa2,0x4f]
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: usdot   v31.2s, v1.8b, v2.4b[3]
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: usdot   v31.4s, v1.16b, v2.4b[3]
+
+sudot v31.2s, v1.8b,  v2.4b[3]
+sudot v31.4s, v1.16b, v2.4b[3]
+// CHECK: sudot   v31.2s, v1.8b, v2.4b[3] // encoding: [0x3f,0xf8,0x22,0x0f]
+// CHECK: sudot   v31.4s, v1.16b, v2.4b[3] // encoding: [0x3f,0xf8,0x22,0x4f]
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: 

[PATCH] D77871: [AArch64] Armv8.6-a Matrix Mult Assembly + Intrinsics

2020-04-22 Thread Kerry McLaughlin via Phabricator via cfe-commits
kmclaughlin added inline comments.



Comment at: clang/test/CodeGen/aarch64-v8.6a-neon-intrinsics.c:3
+// RUN: -fallow-half-arguments-and-returns -S -disable-O0-optnone -emit-llvm 
-o - %s \
+// RUN: | opt -S -mem2reg \
+// RUN: | FileCheck %s

Is it possible to use -sroa here as you did for the tests added in D77872? If 
so, I think this might make some of the `_lane` tests below a bit easier to 
follow.



Comment at: llvm/test/MC/AArch64/armv8.6a-simd-matmul-error.s:17
+// For USDOT and SUDOT (indexed), the index is in range [0,3] (regardless of 
data types)
+usdot v31.2s, v1.8b,  v2.4b[4]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in 
range [0, 3].

The arrangement specifiers of the first two operands don't match for these 
tests, which is what the next set of tests below is checking for. It might be 
worth keeping these tests specific to just the index being out of range.



Comment at: llvm/test/MC/AArch64/armv8.6a-simd-matmul-error.s:26
+
+// The arrangement specifiers of the first two operands muct match.
+usdot v31.4s, v1.8b,  v2.4b[0]

muct -> must :)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77871/new/

https://reviews.llvm.org/D77871



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77871: [AArch64] Armv8.6-a Matrix Mult Assembly + Intrinsics

2020-04-10 Thread Luke Geeson via Phabricator via cfe-commits
LukeGeeson added a comment.

Removed reliance on parent revision, harbormaster now builds with unit tests 
passing


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77871/new/

https://reviews.llvm.org/D77871



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77871: [AArch64] Armv8.6-a Matrix Mult Assembly + Intrinsics

2020-04-10 Thread Luke Geeson via Phabricator via cfe-commits
LukeGeeson created this revision.
LukeGeeson added reviewers: ostannard, t.p.northover.
Herald added subscribers: cfe-commits, danielkiss, hiraditya, kristof.beyls.
Herald added a reviewer: rengolin.
Herald added a project: clang.
LukeGeeson added a parent revision: D77540: [PATCH] [ARM]: Armv8.6-a Matrix Mul 
Asm and Intrinsics Support.
LukeGeeson added a child revision: D77872: [AArch32] Armv8.6-a Matrix Mult 
Assembly + Intrinsics.

This patch upstreams support for the Armv8.6-a Matrix Multiplication
Extension. A summary of the features can be found here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

This patch includes:

- Assembly support for AArch64 only (no SVE or Neon)
- Intrinsics Support for AArch64 Armv8.6a Matrix Multiplication Instructions 
(No bfloat16 matrix multiplication)

No IR types or C Types are needed for this extension.

This is part of a patch series, starting with BFloat16 support and
the other components in the armv8.6a extension (in previous patches
linked in phabricator)

Based on work by:

- Luke Geeson
- Oliver Stannard
- Luke Cheeseman


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D77871

Files:
  clang/include/clang/Basic/arm_neon.td
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Basic/Targets/AArch64.h
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-matmul.cpp
  clang/test/CodeGen/aarch64-v8.6a-neon-intrinsics.c
  llvm/include/llvm/IR/IntrinsicsAArch64.td
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64InstrFormats.td
  llvm/lib/Target/AArch64/AArch64InstrInfo.td
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/test/CodeGen/AArch64/aarch64-matmul.ll
  llvm/test/MC/AArch64/armv8.6a-simd-matmul-error.s
  llvm/test/MC/AArch64/armv8.6a-simd-matmul.s
  llvm/test/MC/Disassembler/AArch64/armv8.6a-simd-matmul.txt

Index: llvm/test/MC/Disassembler/AArch64/armv8.6a-simd-matmul.txt
===
--- /dev/null
+++ llvm/test/MC/Disassembler/AArch64/armv8.6a-simd-matmul.txt
@@ -0,0 +1,34 @@
+# RUN: llvm-mc -triple=aarch64  -mattr=+i8mm -disassemble < %s   | FileCheck %s
+# RUN: llvm-mc -triple=aarch64  -mattr=+v8.6a -disassemble < %s  | FileCheck %s
+# RUN: not llvm-mc -triple=aarch64  -mattr=+v8.5a -disassemble < %s 2>&1 | FileCheck %s --check-prefix=NOI8MM
+
+[0x01,0xa6,0x9f,0x4e]
+[0x01,0xa6,0x9f,0x6e]
+[0x01,0xae,0x9f,0x4e]
+# CHECK: smmla   v1.4s, v16.16b, v31.16b
+# CHECK: ummla   v1.4s, v16.16b, v31.16b
+# CHECK: usmmla  v1.4s, v16.16b, v31.16b
+# NOI8MM: [[@LINE-6]]:{{[0-9]+}}: warning: invalid instruction encoding
+# NOI8MM: [[@LINE-6]]:{{[0-9]+}}: warning: invalid instruction encoding
+# NOI8MM: [[@LINE-6]]:{{[0-9]+}}: warning: invalid instruction encoding
+
+[0xe3,0x9d,0x9e,0x0e]
+[0xe3,0x9d,0x9e,0x4e]
+# CHECK: usdot   v3.2s, v15.8b, v30.8b
+# CHECK: usdot   v3.4s, v15.16b, v30.16b
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
+
+[0x3f,0xf8,0xa2,0x0f]
+[0x3f,0xf8,0xa2,0x4f]
+# CHECK: usdot   v31.2s, v1.8b, v2.4b[3]
+# CHECK: usdot   v31.4s, v1.16b, v2.4b[3]
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
+
+[0x3f,0xf8,0x22,0x0f]
+[0x3f,0xf8,0x22,0x4f]
+# CHECK: sudot   v31.2s, v1.8b, v2.4b[3]
+# CHECK: sudot   v31.4s, v1.16b, v2.4b[3]
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
+# NOI8MM: [[@LINE-4]]:{{[0-9]+}}: warning: invalid instruction encoding
Index: llvm/test/MC/AArch64/armv8.6a-simd-matmul.s
===
--- /dev/null
+++ llvm/test/MC/AArch64/armv8.6a-simd-matmul.s
@@ -0,0 +1,43 @@
+// RUN: llvm-mc -triple aarch64 -show-encoding -mattr=+i8mm   < %s  | FileCheck %s
+// RUN: llvm-mc -triple aarch64 -show-encoding -mattr=+v8.6a  < %s  | FileCheck %s
+// RUN: not llvm-mc -triple aarch64 -show-encoding -mattr=+v8.6a-i8mm < %s 2>&1 | FileCheck %s --check-prefix=NOMATMUL
+
+smmla  v1.4s, v16.16b, v31.16b
+ummla  v1.4s, v16.16b, v31.16b
+usmmla v1.4s, v16.16b, v31.16b
+// CHECK: smmla   v1.4s, v16.16b, v31.16b // encoding: [0x01,0xa6,0x9f,0x4e]
+// CHECK: ummla   v1.4s, v16.16b, v31.16b // encoding: [0x01,0xa6,0x9f,0x6e]
+// CHECK: usmmla  v1.4s, v16.16b, v31.16b // encoding: [0x01,0xae,0x9f,0x4e]
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: smmla  v1.4s, v16.16b, v31.16b
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: ummla  v1.4s, v16.16b, v31.16b
+// NOMATMUL: instruction requires: i8mm
+// NOMATMUL-NEXT: usmmla  v1.4s, v16.16b, v31.16b
+
+usdot v3.2s, v15.8b, v30.8b
+usdot v3.4s, v15.16b, v30.16b
+// CHECK: usdot   v3.2s, v15.8b, v30.8b   // encoding: [0xe3,0x9d,0x9e,0x0e]
+// CHECK: usdot   v3.4s, v15.16b, v30.16b // encoding: