[PATCH] D66302: [SVE][Inline-Asm] Support for SVE asm operands

2019-09-03 Thread Kerry McLaughlin via Phabricator via cfe-commits
kmclaughlin marked an inline comment as done.
kmclaughlin added a subscriber: ruiu.
kmclaughlin added a comment.

Thank you to @gribozavr & @ruiu for spotting the warning caused by this patch,
and for the suggestions to use -Wimplicit-fallthrough!




Comment at: llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp:5836-5837
      if (VT.getSizeInBits() == 128)
        return std::make_pair(0U, &AArch64::FPR128_loRegClass);
+    case 'y':
+      if (!Subtarget->hasFPARMv8())

gribozavr wrote:
> ```
> AArch64ISelLowering.cpp:5837:5: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
> AArch64ISelLowering.cpp:5837:5: note: insert 'LLVM_FALLTHROUGH;' to silence this warning
> AArch64ISelLowering.cpp:5837:5: note: insert 'break;' to avoid fall-through
> ```
> 
> Is the fallthrough intentional?
The fallthrough was not intentional; this should now be resolved by D67095
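
For reference, a sketch of what the fixed switch presumably looks like after D67095, assuming the fix is simply an explicit `break;` at the end of the 'x' case (register classes as in this patch; other size checks omitted):

```
    case 'x':
      if (!Subtarget->hasFPARMv8())
        break;
      if (VT.isScalableVector())
        return std::make_pair(0U, &AArch64::ZPR_4bRegClass);
      if (VT.getSizeInBits() == 128)
        return std::make_pair(0U, &AArch64::FPR128_loRegClass);
      break; // previously missing; control fell through into 'y'
    case 'y':
      if (!Subtarget->hasFPARMv8())
        break;
      if (VT.isScalableVector())
        return std::make_pair(0U, &AArch64::ZPR_3bRegClass);
      break;
```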


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66302/new/

https://reviews.llvm.org/D66302





[PATCH] D66302: [SVE][Inline-Asm] Support for SVE asm operands

2019-09-03 Thread Dmitri Gribenko via Phabricator via cfe-commits
gribozavr added inline comments.



Comment at: llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp:5836-5837
      if (VT.getSizeInBits() == 128)
        return std::make_pair(0U, &AArch64::FPR128_loRegClass);
+    case 'y':
+      if (!Subtarget->hasFPARMv8())

```
AArch64ISelLowering.cpp:5837:5: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
AArch64ISelLowering.cpp:5837:5: note: insert 'LLVM_FALLTHROUGH;' to silence this warning
AArch64ISelLowering.cpp:5837:5: note: insert 'break;' to avoid fall-through
```

Is the fallthrough intentional?


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66302/new/

https://reviews.llvm.org/D66302





[PATCH] D66302: [SVE][Inline-Asm] Support for SVE asm operands

2019-09-02 Thread Kerry McLaughlin via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL370673: [SVE][Inline-Asm] Support for SVE asm operands
(authored by kmclaughlin).

Changed prior to commit:
  https://reviews.llvm.org/D66302?vs=216655&id=218376#toc

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66302/new/

https://reviews.llvm.org/D66302

Files:
  llvm/trunk/docs/LangRef.rst
  llvm/trunk/lib/Target/AArch64/AArch64AsmPrinter.cpp
  llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp
  llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.cpp
  llvm/trunk/lib/Target/AArch64/AArch64SVEInstrInfo.td
  llvm/trunk/test/CodeGen/AArch64/aarch64-sve-asm-negative.ll
  llvm/trunk/test/CodeGen/AArch64/aarch64-sve-asm.ll
  llvm/trunk/test/CodeGen/AArch64/arm64-inline-asm.ll

Index: llvm/trunk/test/CodeGen/AArch64/arm64-inline-asm.ll
===
--- llvm/trunk/test/CodeGen/AArch64/arm64-inline-asm.ll
+++ llvm/trunk/test/CodeGen/AArch64/arm64-inline-asm.ll
@@ -138,6 +138,8 @@
   %a = alloca [2 x float], align 4
   %arraydecay = getelementptr inbounds [2 x float], [2 x float]* %a, i32 0, i32 0
   %0 = load <2 x float>, <2 x float>* %data, align 8
+  call void asm sideeffect "ldr ${1:z}, [$0]\0A", "r,w"(float* %arraydecay, <2 x float> %0) nounwind
+  ; CHECK: ldr {{z[0-9]+}}, [{{x[0-9]+}}]
   call void asm sideeffect "ldr ${1:q}, [$0]\0A", "r,w"(float* %arraydecay, <2 x float> %0) nounwind
   ; CHECK: ldr {{q[0-9]+}}, [{{x[0-9]+}}]
   call void asm sideeffect "ldr ${1:d}, [$0]\0A", "r,w"(float* %arraydecay, <2 x float> %0) nounwind
Index: llvm/trunk/test/CodeGen/AArch64/aarch64-sve-asm-negative.ll
===
--- llvm/trunk/test/CodeGen/AArch64/aarch64-sve-asm-negative.ll
+++ llvm/trunk/test/CodeGen/AArch64/aarch64-sve-asm-negative.ll
@@ -0,0 +1,12 @@
+; RUN: not llc -mtriple aarch64-none-linux-gnu -mattr=+neon -o %t.s -filetype=asm %s 2>&1 | FileCheck %s
+
+; The 'y' constraint only applies to SVE vector registers (Z0-Z7)
+; The test below ensures that we get an appropriate error should the
+; constraint be used with a Neon register.
+
+; Function Attrs: nounwind readnone
+; CHECK: error: couldn't allocate input reg for constraint 'y'
+define <4 x i32> @test_neon(<4 x i32> %in1, <4 x i32> %in2) {
+  %1 = tail call <4 x i32> asm "add $0.4s, $1.4s, $2.4s", "=w,w,y"(<4 x i32> %in1, <4 x i32> %in2)
+  ret <4 x i32> %1
+}
Index: llvm/trunk/test/CodeGen/AArch64/aarch64-sve-asm.ll
===
--- llvm/trunk/test/CodeGen/AArch64/aarch64-sve-asm.ll
+++ llvm/trunk/test/CodeGen/AArch64/aarch64-sve-asm.ll
@@ -0,0 +1,44 @@
+; RUN: llc < %s -mtriple aarch64-none-linux-gnu -mattr=+sve -stop-after=finalize-isel | FileCheck %s --check-prefix=CHECK
+
+target datalayout = "e-m:e-i64:64-i128:128-n32:64-S128"
+target triple = "aarch64-none-linux-gnu"
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_3b = COPY [[ARG1]]
+define <vscale x 16 x i8> @test_svadd_i8(<vscale x 16 x i8> %Zn, <vscale x 16 x i8> %Zm) {
+  %1 = tail call <vscale x 16 x i8> asm "add $0.b, $1.b, $2.b", "=w,w,y"(<vscale x 16 x i8> %Zn, <vscale x 16 x i8> %Zm)
+  ret <vscale x 16 x i8> %1
+}
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_4b = COPY [[ARG1]]
+define <vscale x 2 x i64> @test_svsub_i64(<vscale x 2 x i64> %Zn, <vscale x 2 x i64> %Zm) {
+  %1 = tail call <vscale x 2 x i64> asm "sub $0.d, $1.d, $2.d", "=w,w,x"(<vscale x 2 x i64> %Zn, <vscale x 2 x i64> %Zm)
+  ret <vscale x 2 x i64> %1
+}
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_3b = COPY [[ARG1]]
+define <vscale x 8 x half> @test_svfmul_f16(<vscale x 8 x half> %Zn, <vscale x 8 x half> %Zm) {
+  %1 = tail call <vscale x 8 x half> asm "fmul $0.h, $1.h, $2.h", "=w,w,y"(<vscale x 8 x half> %Zn, <vscale x 8 x half> %Zm)
+  ret <vscale x 8 x half> %1
+}
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_4b = COPY [[ARG1]]
+define <vscale x 4 x float> @test_svfmul_f(<vscale x 4 x float> %Zn, <vscale x 4 x float> %Zm) {
+  %1 = tail call <vscale x 4 x float> asm "fmul $0.s, $1.s, $2.s", "=w,w,x"(<vscale x 4 x float> %Zn, <vscale x 4 x float> %Zm)
+  ret <vscale x 4 x float> %1
+}
Index: llvm/trunk/lib/Target/AArch64/AArch64AsmPrinter.cpp
===
--- llvm/trunk/lib/Target/AArch64/AArch64AsmPrinter.cpp
+++ llvm/trunk/lib/Target/AArch64/AArch64AsmPrinter.cpp
@@ -150,7 +150,7 @@
   void printOperand(const MachineInstr *MI, unsigned OpNum, raw_ostream &O);
   bool printAsmMRegister(const MachineOperand &MO, char Mode, raw_ostream &O);
   bool printAsmRegInClass(const MachineOperand &MO,
-                          const TargetRegisterClass *RC, bool isVector,
+                          const TargetRegisterClass *RC, unsigned AltName,
                           raw_ostream &O);
 
  

[PATCH] D66302: [SVE][Inline-Asm] Support for SVE asm operands

2019-08-22 Thread Sander de Smalen via Phabricator via cfe-commits
sdesmalen accepted this revision.
sdesmalen added a comment.
This revision is now accepted and ready to land.

Thanks for making these changes @kmclaughlin, LGTM!


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66302/new/

https://reviews.llvm.org/D66302





[PATCH] D66302: [SVE][Inline-Asm] Support for SVE asm operands

2019-08-22 Thread Kerry McLaughlin via Phabricator via cfe-commits
kmclaughlin updated this revision to Diff 216655.
kmclaughlin added a comment.

- Removed a confusing comment from AArch64AsmPrinter.cpp


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66302/new/

https://reviews.llvm.org/D66302

Files:
  docs/LangRef.rst
  lib/Target/AArch64/AArch64AsmPrinter.cpp
  lib/Target/AArch64/AArch64ISelLowering.cpp
  lib/Target/AArch64/AArch64InstrInfo.cpp
  lib/Target/AArch64/AArch64SVEInstrInfo.td
  test/CodeGen/AArch64/aarch64-sve-asm-negative.ll
  test/CodeGen/AArch64/aarch64-sve-asm.ll
  test/CodeGen/AArch64/arm64-inline-asm.ll

Index: test/CodeGen/AArch64/arm64-inline-asm.ll
===
--- test/CodeGen/AArch64/arm64-inline-asm.ll
+++ test/CodeGen/AArch64/arm64-inline-asm.ll
@@ -138,6 +138,8 @@
   %a = alloca [2 x float], align 4
   %arraydecay = getelementptr inbounds [2 x float], [2 x float]* %a, i32 0, i32 0
   %0 = load <2 x float>, <2 x float>* %data, align 8
+  call void asm sideeffect "ldr ${1:z}, [$0]\0A", "r,w"(float* %arraydecay, <2 x float> %0) nounwind
+  ; CHECK: ldr {{z[0-9]+}}, [{{x[0-9]+}}]
   call void asm sideeffect "ldr ${1:q}, [$0]\0A", "r,w"(float* %arraydecay, <2 x float> %0) nounwind
   ; CHECK: ldr {{q[0-9]+}}, [{{x[0-9]+}}]
   call void asm sideeffect "ldr ${1:d}, [$0]\0A", "r,w"(float* %arraydecay, <2 x float> %0) nounwind
Index: test/CodeGen/AArch64/aarch64-sve-asm.ll
===
--- /dev/null
+++ test/CodeGen/AArch64/aarch64-sve-asm.ll
@@ -0,0 +1,44 @@
+; RUN: llc < %s -mtriple aarch64-none-linux-gnu -mattr=+sve -stop-after=finalize-isel | FileCheck %s --check-prefix=CHECK
+
+target datalayout = "e-m:e-i64:64-i128:128-n32:64-S128"
+target triple = "aarch64-none-linux-gnu"
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_3b = COPY [[ARG1]]
+define <vscale x 16 x i8> @test_svadd_i8(<vscale x 16 x i8> %Zn, <vscale x 16 x i8> %Zm) {
+  %1 = tail call <vscale x 16 x i8> asm "add $0.b, $1.b, $2.b", "=w,w,y"(<vscale x 16 x i8> %Zn, <vscale x 16 x i8> %Zm)
+  ret <vscale x 16 x i8> %1
+}
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_4b = COPY [[ARG1]]
+define <vscale x 2 x i64> @test_svsub_i64(<vscale x 2 x i64> %Zn, <vscale x 2 x i64> %Zm) {
+  %1 = tail call <vscale x 2 x i64> asm "sub $0.d, $1.d, $2.d", "=w,w,x"(<vscale x 2 x i64> %Zn, <vscale x 2 x i64> %Zm)
+  ret <vscale x 2 x i64> %1
+}
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_3b = COPY [[ARG1]]
+define <vscale x 8 x half> @test_svfmul_f16(<vscale x 8 x half> %Zn, <vscale x 8 x half> %Zm) {
+  %1 = tail call <vscale x 8 x half> asm "fmul $0.h, $1.h, $2.h", "=w,w,y"(<vscale x 8 x half> %Zn, <vscale x 8 x half> %Zm)
+  ret <vscale x 8 x half> %1
+}
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_4b = COPY [[ARG1]]
+define <vscale x 4 x float> @test_svfmul_f(<vscale x 4 x float> %Zn, <vscale x 4 x float> %Zm) {
+  %1 = tail call <vscale x 4 x float> asm "fmul $0.s, $1.s, $2.s", "=w,w,x"(<vscale x 4 x float> %Zn, <vscale x 4 x float> %Zm)
+  ret <vscale x 4 x float> %1
+}
Index: test/CodeGen/AArch64/aarch64-sve-asm-negative.ll
===
--- /dev/null
+++ test/CodeGen/AArch64/aarch64-sve-asm-negative.ll
@@ -0,0 +1,12 @@
+; RUN: not llc -mtriple aarch64-none-linux-gnu -mattr=+neon -o %t.s -filetype=asm %s 2>&1 | FileCheck %s
+
+; The 'y' constraint only applies to SVE vector registers (Z0-Z7)
+; The test below ensures that we get an appropriate error should the
+; constraint be used with a Neon register.
+
+; Function Attrs: nounwind readnone
+; CHECK: error: couldn't allocate input reg for constraint 'y'
+define <4 x i32> @test_neon(<4 x i32> %in1, <4 x i32> %in2) {
+  %1 = tail call <4 x i32> asm "add $0.4s, $1.4s, $2.4s", "=w,w,y"(<4 x i32> %in1, <4 x i32> %in2)
+  ret <4 x i32> %1
+}
Index: lib/Target/AArch64/AArch64SVEInstrInfo.td
===
--- lib/Target/AArch64/AArch64SVEInstrInfo.td
+++ lib/Target/AArch64/AArch64SVEInstrInfo.td
@@ -1020,6 +1020,56 @@
   (FCMGT_PPzZZ_S PPR32:$Zd, PPR3bAny:$Pg, ZPR32:$Zn, ZPR32:$Zm), 0>;
   def : InstAlias<"fcmlt $Zd, $Pg/z, $Zm, $Zn",
   (FCMGT_PPzZZ_D PPR64:$Zd, PPR3bAny:$Pg, ZPR64:$Zn, ZPR64:$Zm), 0>;
+
+  def : Pat<(nxv16i8 (bitconvert (nxv8i16 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv4i32 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv2i64 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv8f16 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv4f32 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv2f64 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+
+  def : Pat<(nxv8i16 (bitconvert (nxv16i8 ZPR:$src))), (nxv8i16 ZPR:$src)>;
+  def : Pat<(nxv8i16 (bitconvert (nxv4i32 ZPR:$src))), (nxv8i16 ZPR:$src)>;

[PATCH] D66302: [SVE][Inline-Asm] Support for SVE asm operands

2019-08-22 Thread Kerry McLaughlin via Phabricator via cfe-commits
kmclaughlin updated this revision to Diff 216574.
kmclaughlin added a comment.

- Changed printAsmRegInClass in AArch64AsmPrinter.cpp to accept `unsigned AltName` instead of `bool isVector`
- Added a comment to explain the test in aarch64-sve-asm-negative.ll


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66302/new/

https://reviews.llvm.org/D66302

Files:
  docs/LangRef.rst
  lib/Target/AArch64/AArch64AsmPrinter.cpp
  lib/Target/AArch64/AArch64ISelLowering.cpp
  lib/Target/AArch64/AArch64InstrInfo.cpp
  lib/Target/AArch64/AArch64SVEInstrInfo.td
  test/CodeGen/AArch64/aarch64-sve-asm-negative.ll
  test/CodeGen/AArch64/aarch64-sve-asm.ll
  test/CodeGen/AArch64/arm64-inline-asm.ll

Index: test/CodeGen/AArch64/arm64-inline-asm.ll
===
--- test/CodeGen/AArch64/arm64-inline-asm.ll
+++ test/CodeGen/AArch64/arm64-inline-asm.ll
@@ -138,6 +138,8 @@
   %a = alloca [2 x float], align 4
   %arraydecay = getelementptr inbounds [2 x float], [2 x float]* %a, i32 0, i32 0
   %0 = load <2 x float>, <2 x float>* %data, align 8
+  call void asm sideeffect "ldr ${1:z}, [$0]\0A", "r,w"(float* %arraydecay, <2 x float> %0) nounwind
+  ; CHECK: ldr {{z[0-9]+}}, [{{x[0-9]+}}]
   call void asm sideeffect "ldr ${1:q}, [$0]\0A", "r,w"(float* %arraydecay, <2 x float> %0) nounwind
   ; CHECK: ldr {{q[0-9]+}}, [{{x[0-9]+}}]
   call void asm sideeffect "ldr ${1:d}, [$0]\0A", "r,w"(float* %arraydecay, <2 x float> %0) nounwind
Index: test/CodeGen/AArch64/aarch64-sve-asm.ll
===
--- /dev/null
+++ test/CodeGen/AArch64/aarch64-sve-asm.ll
@@ -0,0 +1,44 @@
+; RUN: llc < %s -mtriple aarch64-none-linux-gnu -mattr=+sve -stop-after=finalize-isel | FileCheck %s --check-prefix=CHECK
+
+target datalayout = "e-m:e-i64:64-i128:128-n32:64-S128"
+target triple = "aarch64-none-linux-gnu"
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_3b = COPY [[ARG1]]
+define <vscale x 16 x i8> @test_svadd_i8(<vscale x 16 x i8> %Zn, <vscale x 16 x i8> %Zm) {
+  %1 = tail call <vscale x 16 x i8> asm "add $0.b, $1.b, $2.b", "=w,w,y"(<vscale x 16 x i8> %Zn, <vscale x 16 x i8> %Zm)
+  ret <vscale x 16 x i8> %1
+}
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_4b = COPY [[ARG1]]
+define <vscale x 2 x i64> @test_svsub_i64(<vscale x 2 x i64> %Zn, <vscale x 2 x i64> %Zm) {
+  %1 = tail call <vscale x 2 x i64> asm "sub $0.d, $1.d, $2.d", "=w,w,x"(<vscale x 2 x i64> %Zn, <vscale x 2 x i64> %Zm)
+  ret <vscale x 2 x i64> %1
+}
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_3b = COPY [[ARG1]]
+define <vscale x 8 x half> @test_svfmul_f16(<vscale x 8 x half> %Zn, <vscale x 8 x half> %Zm) {
+  %1 = tail call <vscale x 8 x half> asm "fmul $0.h, $1.h, $2.h", "=w,w,y"(<vscale x 8 x half> %Zn, <vscale x 8 x half> %Zm)
+  ret <vscale x 8 x half> %1
+}
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_4b = COPY [[ARG1]]
+define <vscale x 4 x float> @test_svfmul_f(<vscale x 4 x float> %Zn, <vscale x 4 x float> %Zm) {
+  %1 = tail call <vscale x 4 x float> asm "fmul $0.s, $1.s, $2.s", "=w,w,x"(<vscale x 4 x float> %Zn, <vscale x 4 x float> %Zm)
+  ret <vscale x 4 x float> %1
+}
Index: test/CodeGen/AArch64/aarch64-sve-asm-negative.ll
===
--- /dev/null
+++ test/CodeGen/AArch64/aarch64-sve-asm-negative.ll
@@ -0,0 +1,12 @@
+; RUN: not llc -mtriple aarch64-none-linux-gnu -mattr=+neon -o %t.s -filetype=asm %s 2>&1 | FileCheck %s
+
+; The 'y' constraint only applies to SVE vector registers (Z0-Z7)
+; The test below ensures that we get an appropriate error should the
+; constraint be used with a Neon register.
+
+; Function Attrs: nounwind readnone
+; CHECK: error: couldn't allocate input reg for constraint 'y'
+define <4 x i32> @test_neon(<4 x i32> %in1, <4 x i32> %in2) {
+  %1 = tail call <4 x i32> asm "add $0.4s, $1.4s, $2.4s", "=w,w,y"(<4 x i32> %in1, <4 x i32> %in2)
+  ret <4 x i32> %1
+}
Index: lib/Target/AArch64/AArch64SVEInstrInfo.td
===
--- lib/Target/AArch64/AArch64SVEInstrInfo.td
+++ lib/Target/AArch64/AArch64SVEInstrInfo.td
@@ -1020,6 +1020,56 @@
   (FCMGT_PPzZZ_S PPR32:$Zd, PPR3bAny:$Pg, ZPR32:$Zn, ZPR32:$Zm), 0>;
   def : InstAlias<"fcmlt $Zd, $Pg/z, $Zm, $Zn",
   (FCMGT_PPzZZ_D PPR64:$Zd, PPR3bAny:$Pg, ZPR64:$Zn, ZPR64:$Zm), 0>;
+
+  def : Pat<(nxv16i8 (bitconvert (nxv8i16 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv4i32 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv2i64 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv8f16 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv4f32 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv2f64 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+
+  def : Pat<(nxv8i16 (bitconvert (nxv16i8 ZPR:$src))), (nxv8i16 ZPR:$src)>;

[PATCH] D66302: [SVE][Inline-Asm] Support for SVE asm operands

2019-08-21 Thread Sander de Smalen via Phabricator via cfe-commits
sdesmalen added inline comments.



Comment at: lib/Target/AArch64/AArch64AsmPrinter.cpp:618
 
+bool hasAltName;
+const TargetRegisterClass *RegClass;

The use of `hasAltName` is confusing me here. When I look at the declaration 
and definition of `printAsmRegInClass`, the parameter is called `bool 
isVector`, and a SVE vector is still a vector, so passing `false` is odd. I 
think it makes more sense to change `printAsmRegInClass` to accept an `unsigned 
AltName`, and just pass `AArch64::vreg` for Neon Vectors directly (or 
`AArch64::NoRegAltName` otherwise).
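
A minimal sketch of the suggested shape; `IsVector` is a hypothetical local standing in for however the caller distinguishes Neon vectors, and the real code may structure this differently:

```
// Declaration: take an alt-name index rather than a bool.
bool printAsmRegInClass(const MachineOperand &MO,
                        const TargetRegisterClass *RC, unsigned AltName,
                        raw_ostream &O);

// Call site: choose the alt-name explicitly.
unsigned AltName = IsVector ? AArch64::vreg : AArch64::NoRegAltName;
printAsmRegInClass(MO, RegClass, AltName, O);
```

This is the shape that ultimately landed; compare the printAsmRegInClass hunk in the committed diff above.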



Comment at: test/CodeGen/AArch64/aarch64-sve-asm-negative.ll:2
+; RUN: not llc -mtriple aarch64-none-linux-gnu -mattr=+neon -o %t.s -filetype=asm %s 2>&1 | FileCheck %s
+
+; Function Attrs: nounwind readnone

Can you add a comment explaining what this is testing (and why the inline asm 
below is not valid)?



Comment at: test/CodeGen/AArch64/aarch64-sve-asm.ll:46
+
+!0 = !{i32 188, i32 210}

nit: what is this line doing here?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66302/new/

https://reviews.llvm.org/D66302





[PATCH] D66302: [SVE][Inline-Asm] Support for SVE asm operands

2019-08-20 Thread Kerry McLaughlin via Phabricator via cfe-commits
kmclaughlin updated this revision to Diff 216178.
kmclaughlin added a comment.

- Added a new test file, aarch64-sve-asm-negative.ll
- Updated description of the 'y' constraint in LangRef.rst




CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66302/new/

https://reviews.llvm.org/D66302

Files:
  docs/LangRef.rst
  lib/Target/AArch64/AArch64AsmPrinter.cpp
  lib/Target/AArch64/AArch64ISelLowering.cpp
  lib/Target/AArch64/AArch64InstrInfo.cpp
  lib/Target/AArch64/AArch64SVEInstrInfo.td
  test/CodeGen/AArch64/aarch64-sve-asm-negative.ll
  test/CodeGen/AArch64/aarch64-sve-asm.ll
  test/CodeGen/AArch64/arm64-inline-asm.ll

Index: test/CodeGen/AArch64/arm64-inline-asm.ll
===
--- test/CodeGen/AArch64/arm64-inline-asm.ll
+++ test/CodeGen/AArch64/arm64-inline-asm.ll
@@ -138,6 +138,8 @@
   %a = alloca [2 x float], align 4
   %arraydecay = getelementptr inbounds [2 x float], [2 x float]* %a, i32 0, i32 0
   %0 = load <2 x float>, <2 x float>* %data, align 8
+  call void asm sideeffect "ldr ${1:z}, [$0]\0A", "r,w"(float* %arraydecay, <2 x float> %0) nounwind
+  ; CHECK: ldr {{z[0-9]+}}, [{{x[0-9]+}}]
   call void asm sideeffect "ldr ${1:q}, [$0]\0A", "r,w"(float* %arraydecay, <2 x float> %0) nounwind
   ; CHECK: ldr {{q[0-9]+}}, [{{x[0-9]+}}]
   call void asm sideeffect "ldr ${1:d}, [$0]\0A", "r,w"(float* %arraydecay, <2 x float> %0) nounwind
Index: test/CodeGen/AArch64/aarch64-sve-asm.ll
===
--- /dev/null
+++ test/CodeGen/AArch64/aarch64-sve-asm.ll
@@ -0,0 +1,46 @@
+; RUN: llc < %s -mtriple aarch64-none-linux-gnu -mattr=+sve -stop-after=finalize-isel | FileCheck %s --check-prefix=CHECK
+
+target datalayout = "e-m:e-i64:64-i128:128-n32:64-S128"
+target triple = "aarch64-none-linux-gnu"
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_3b = COPY [[ARG1]]
+define <vscale x 16 x i8> @test_svadd_i8(<vscale x 16 x i8> %Zn, <vscale x 16 x i8> %Zm) {
+  %1 = tail call <vscale x 16 x i8> asm "add $0.b, $1.b, $2.b", "=w,w,y"(<vscale x 16 x i8> %Zn, <vscale x 16 x i8> %Zm)
+  ret <vscale x 16 x i8> %1
+}
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_4b = COPY [[ARG1]]
+define <vscale x 2 x i64> @test_svsub_i64(<vscale x 2 x i64> %Zn, <vscale x 2 x i64> %Zm) {
+  %1 = tail call <vscale x 2 x i64> asm "sub $0.d, $1.d, $2.d", "=w,w,x"(<vscale x 2 x i64> %Zn, <vscale x 2 x i64> %Zm)
+  ret <vscale x 2 x i64> %1
+}
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_3b = COPY [[ARG1]]
+define <vscale x 8 x half> @test_svfmul_f16(<vscale x 8 x half> %Zn, <vscale x 8 x half> %Zm) {
+  %1 = tail call <vscale x 8 x half> asm "fmul $0.h, $1.h, $2.h", "=w,w,y"(<vscale x 8 x half> %Zn, <vscale x 8 x half> %Zm)
+  ret <vscale x 8 x half> %1
+}
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_4b = COPY [[ARG1]]
+define <vscale x 4 x float> @test_svfmul_f(<vscale x 4 x float> %Zn, <vscale x 4 x float> %Zm) {
+  %1 = tail call <vscale x 4 x float> asm "fmul $0.s, $1.s, $2.s", "=w,w,x"(<vscale x 4 x float> %Zn, <vscale x 4 x float> %Zm)
+  ret <vscale x 4 x float> %1
+}
+
+!0 = !{i32 188, i32 210}
Index: test/CodeGen/AArch64/aarch64-sve-asm-negative.ll
===
--- /dev/null
+++ test/CodeGen/AArch64/aarch64-sve-asm-negative.ll
@@ -0,0 +1,8 @@
+; RUN: not llc -mtriple aarch64-none-linux-gnu -mattr=+neon -o %t.s -filetype=asm %s 2>&1 | FileCheck %s
+
+; Function Attrs: nounwind readnone
+; CHECK: error: couldn't allocate input reg for constraint 'y'
+define <4 x i32> @test_neon(<4 x i32> %in1, <4 x i32> %in2) {
+  %1 = tail call <4 x i32> asm "add $0.4s, $1.4s, $2.4s", "=w,w,y"(<4 x i32> %in1, <4 x i32> %in2)
+  ret <4 x i32> %1
+}
Index: lib/Target/AArch64/AArch64SVEInstrInfo.td
===
--- lib/Target/AArch64/AArch64SVEInstrInfo.td
+++ lib/Target/AArch64/AArch64SVEInstrInfo.td
@@ -1020,6 +1020,56 @@
   (FCMGT_PPzZZ_S PPR32:$Zd, PPR3bAny:$Pg, ZPR32:$Zn, ZPR32:$Zm), 0>;
   def : InstAlias<"fcmlt $Zd, $Pg/z, $Zm, $Zn",
   (FCMGT_PPzZZ_D PPR64:$Zd, PPR3bAny:$Pg, ZPR64:$Zn, ZPR64:$Zm), 0>;
+
+  def : Pat<(nxv16i8 (bitconvert (nxv8i16 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv4i32 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv2i64 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv8f16 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv4f32 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv2f64 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+
+  def : Pat<(nxv8i16 (bitconvert (nxv16i8 ZPR:$src))), (nxv8i16 ZPR:$src)>;
+  def : Pat<(nxv8i16 (bitconvert (nxv4i32 ZPR:$src))), (nxv8i16 ZPR:$src)>;
+  def : Pat<(nxv8i16 (bitconvert (nxv2i64 ZPR:$src))), (nxv8i16 ZPR:$src)>;
+  def : Pat<(nxv8i16 (bitconvert (nxv8f16 ZPR:$src))), (nxv8i16 ZPR:$src)>;

[PATCH] D66302: [SVE][Inline-Asm] Support for SVE asm operands

2019-08-16 Thread Sander de Smalen via Phabricator via cfe-commits
sdesmalen added a comment.

Thanks for this change @kmclaughlin.




Comment at: docs/LangRef.rst:3816
+- ``x``: Like w, but restricted to registers 0 to 15 inclusive.
+- ``y``: Like w, but restricted to registers 0 to 7 inclusive.
 

I noticed this comment does not match the code below, since `y` only seems to 
work for scalable vectors, which probably shows this case is missing a test.


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66302/new/

https://reviews.llvm.org/D66302





[PATCH] D66302: [SVE][Inline-Asm] Support for SVE asm operands

2019-08-15 Thread Kerry McLaughlin via Phabricator via cfe-commits
kmclaughlin created this revision.
kmclaughlin added reviewers: t.p.northover, sdesmalen, rovka, momchil.velikov.
Herald added subscribers: psnobl, rkruppe, tschuett, javed.absar.
Herald added a reviewer: rengolin.
Herald added a project: LLVM.

Adds the following inline asm constraints for SVE:

- w: SVE vector register with full range, Z0 to Z31
- x: Restricted to registers Z0 to Z15 inclusive.
- y: Restricted to registers Z0 to Z7 inclusive.

This change also adds the "z" modifier to interpret a register as an SVE 
register.

Not all of the bitconvert patterns added by this patch are used, but they have 
been included here for completeness.
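
As an illustration only (not part of this patch), the new constraints could be used from C/C++ source along these lines, assuming a compiler with SVE ACLE support; the function itself is hypothetical:

```
#include <arm_sve.h>

svint8_t add_bytes(svint8_t a, svint8_t b) {
  svint8_t r;
  // "=w": result may use any SVE vector register, Z0-Z31.
  // "w" : first input is likewise unrestricted.
  // "y" : second input is limited to Z0-Z7, e.g. for instructions whose
  //       encoding has only a 3-bit register field for that operand.
  asm("add %0.b, %1.b, %2.b" : "=w"(r) : "w"(a), "y"(b));
  return r;
}
```

At the IR level the new "z" modifier is written `${N:z}`, as exercised by the `ldr ${1:z}, [$0]` line added to arm64-inline-asm.ll below.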


Repository:
  rL LLVM

https://reviews.llvm.org/D66302

Files:
  docs/LangRef.rst
  lib/Target/AArch64/AArch64AsmPrinter.cpp
  lib/Target/AArch64/AArch64ISelLowering.cpp
  lib/Target/AArch64/AArch64InstrInfo.cpp
  lib/Target/AArch64/AArch64SVEInstrInfo.td
  test/CodeGen/AArch64/aarch64-sve-asm.ll
  test/CodeGen/AArch64/arm64-inline-asm.ll

Index: test/CodeGen/AArch64/arm64-inline-asm.ll
===
--- test/CodeGen/AArch64/arm64-inline-asm.ll
+++ test/CodeGen/AArch64/arm64-inline-asm.ll
@@ -138,6 +138,8 @@
   %a = alloca [2 x float], align 4
   %arraydecay = getelementptr inbounds [2 x float], [2 x float]* %a, i32 0, i32 0
   %0 = load <2 x float>, <2 x float>* %data, align 8
+  call void asm sideeffect "ldr ${1:z}, [$0]\0A", "r,w"(float* %arraydecay, <2 x float> %0) nounwind
+  ; CHECK: ldr {{z[0-9]+}}, [{{x[0-9]+}}]
   call void asm sideeffect "ldr ${1:q}, [$0]\0A", "r,w"(float* %arraydecay, <2 x float> %0) nounwind
   ; CHECK: ldr {{q[0-9]+}}, [{{x[0-9]+}}]
   call void asm sideeffect "ldr ${1:d}, [$0]\0A", "r,w"(float* %arraydecay, <2 x float> %0) nounwind
Index: test/CodeGen/AArch64/aarch64-sve-asm.ll
===
--- /dev/null
+++ test/CodeGen/AArch64/aarch64-sve-asm.ll
@@ -0,0 +1,46 @@
+; RUN: llc < %s -mtriple aarch64-none-linux-gnu -mattr=+sve -stop-after=finalize-isel | FileCheck %s --check-prefix=CHECK
+
+target datalayout = "e-m:e-i64:64-i128:128-n32:64-S128"
+target triple = "aarch64-none-linux-gnu"
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_3b = COPY [[ARG1]]
+define <vscale x 16 x i8> @test_svadd_i8(<vscale x 16 x i8> %Zn, <vscale x 16 x i8> %Zm) {
+  %1 = tail call <vscale x 16 x i8> asm "add $0.b, $1.b, $2.b", "=w,w,y"(<vscale x 16 x i8> %Zn, <vscale x 16 x i8> %Zm)
+  ret <vscale x 16 x i8> %1
+}
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_4b = COPY [[ARG1]]
+define <vscale x 2 x i64> @test_svsub_i64(<vscale x 2 x i64> %Zn, <vscale x 2 x i64> %Zm) {
+  %1 = tail call <vscale x 2 x i64> asm "sub $0.d, $1.d, $2.d", "=w,w,x"(<vscale x 2 x i64> %Zn, <vscale x 2 x i64> %Zm)
+  ret <vscale x 2 x i64> %1
+}
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_3b = COPY [[ARG1]]
+define <vscale x 8 x half> @test_svfmul_f16(<vscale x 8 x half> %Zn, <vscale x 8 x half> %Zm) {
+  %1 = tail call <vscale x 8 x half> asm "fmul $0.h, $1.h, $2.h", "=w,w,y"(<vscale x 8 x half> %Zn, <vscale x 8 x half> %Zm)
+  ret <vscale x 8 x half> %1
+}
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_4b = COPY [[ARG1]]
+define <vscale x 4 x float> @test_svfmul_f(<vscale x 4 x float> %Zn, <vscale x 4 x float> %Zm) {
+  %1 = tail call <vscale x 4 x float> asm "fmul $0.s, $1.s, $2.s", "=w,w,x"(<vscale x 4 x float> %Zn, <vscale x 4 x float> %Zm)
+  ret <vscale x 4 x float> %1
+}
+
+!0 = !{i32 188, i32 210}
Index: lib/Target/AArch64/AArch64SVEInstrInfo.td
===
--- lib/Target/AArch64/AArch64SVEInstrInfo.td
+++ lib/Target/AArch64/AArch64SVEInstrInfo.td
@@ -1020,6 +1020,56 @@
   (FCMGT_PPzZZ_S PPR32:$Zd, PPR3bAny:$Pg, ZPR32:$Zn, ZPR32:$Zm), 0>;
   def : InstAlias<"fcmlt $Zd, $Pg/z, $Zm, $Zn",
   (FCMGT_PPzZZ_D PPR64:$Zd, PPR3bAny:$Pg, ZPR64:$Zn, ZPR64:$Zm), 0>;
+
+  def : Pat<(nxv16i8 (bitconvert (nxv8i16 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv4i32 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv2i64 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv8f16 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv4f32 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv2f64 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+
+  def : Pat<(nxv8i16 (bitconvert (nxv16i8 ZPR:$src))), (nxv8i16 ZPR:$src)>;
+  def : Pat<(nxv8i16 (bitconvert (nxv4i32 ZPR:$src))), (nxv8i16 ZPR:$src)>;
+  def : Pat<(nxv8i16 (bitconvert (nxv2i64 ZPR:$src))), (nxv8i16 ZPR:$src)>;
+  def : Pat<(nxv8i16 (bitconvert (nxv8f16 ZPR:$src))), (nxv8i16 ZPR:$src)>;
+  def : Pat<(nxv8i16 (bitconvert (nxv4f32 ZPR:$src))), (nxv8i16 ZPR:$src)>;
+  def : Pat<(nxv8i16 (bitconvert (nxv2f64 ZPR:$src))), (nxv8i16 ZPR:$src)>;
+
+  def : Pat<(nxv4i32 (bitconvert (nxv16i8 ZPR:$src))), (nxv4i32 ZPR:$src)>;