[PATCH] D66302: [SVE][Inline-Asm] Support for SVE asm operands
kmclaughlin marked an inline comment as done.
kmclaughlin added a subscriber: ruiu.
kmclaughlin added a comment.

Thank you to @gribozavr & @ruiu for spotting the warning caused by this patch, and the suggestions to use -Wimplicit-fallthrough!

Comment at: llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp:5836-5837
   if (VT.getSizeInBits() == 128)
     return std::make_pair(0U, &AArch64::FPR128_loRegClass);
+case 'y':
+  if (!Subtarget->hasFPARMv8())

gribozavr wrote:
> ```
> AArch64ISelLowering.cpp:5837:5: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
> AArch64ISelLowering.cpp:5837:5: note: insert 'LLVM_FALLTHROUGH;' to silence this warning
> AArch64ISelLowering.cpp:5837:5: note: insert 'break;' to avoid fall-through
> ```
>
> Is the fallthrough intentional?

The fallthrough was not intentional; this should now be resolved by D67095.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66302/new/

https://reviews.llvm.org/D66302

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
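For readers unfamiliar with the warning quoted above, here is a minimal standalone sketch of the general hazard. It does not use LLVM: `visitBuggy` and `visitFixed` are hypothetical names, the trace strings exist only to make the control flow visible, and the actual fix (D67095) is not reproduced here. It only shows how a `case` label added directly after another case's body is entered by fall-through when that body has no terminating `break`.

```cpp
#include <cassert>
#include <string>

// Hypothetical example: records which case bodies run for a constraint letter.
// The 'x' body has no 'break', so control continues into the 'y' body.
std::string visitBuggy(char Constraint) {
  std::string Trace;
  switch (Constraint) {
  case 'x':
    Trace += "x";
    // missing 'break' here: this is what -Wimplicit-fallthrough reports
  case 'y':
    Trace += "y";
    break;
  }
  return Trace;
}

// Same switch with the missing 'break' added, as the compiler note suggests.
std::string visitFixed(char Constraint) {
  std::string Trace;
  switch (Constraint) {
  case 'x':
    Trace += "x";
    break;
  case 'y':
    Trace += "y";
    break;
  }
  return Trace;
}
```

With these definitions, `visitBuggy('x')` runs both bodies while `visitFixed('x')` runs only the first. Whether an accidental fall-through changes observable behaviour depends on what the second body does, which is why the warning is worth enabling even when tests pass.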
[PATCH] D66302: [SVE][Inline-Asm] Support for SVE asm operands
gribozavr added inline comments.

Comment at: llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp:5836-5837
   if (VT.getSizeInBits() == 128)
     return std::make_pair(0U, &AArch64::FPR128_loRegClass);
+case 'y':
+  if (!Subtarget->hasFPARMv8())

```
AArch64ISelLowering.cpp:5837:5: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
AArch64ISelLowering.cpp:5837:5: note: insert 'LLVM_FALLTHROUGH;' to silence this warning
AArch64ISelLowering.cpp:5837:5: note: insert 'break;' to avoid fall-through
```

Is the fallthrough intentional?

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66302/new/

https://reviews.llvm.org/D66302
[PATCH] D66302: [SVE][Inline-Asm] Support for SVE asm operands
This revision was automatically updated to reflect the committed changes.
Closed by commit rL370673: [SVE][Inline-Asm] Support for SVE asm operands (authored by kmclaughlin, committed by ).

Changed prior to commit:
  https://reviews.llvm.org/D66302?vs=216655&id=218376#toc

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66302/new/

https://reviews.llvm.org/D66302

Files:
  llvm/trunk/docs/LangRef.rst
  llvm/trunk/lib/Target/AArch64/AArch64AsmPrinter.cpp
  llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp
  llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.cpp
  llvm/trunk/lib/Target/AArch64/AArch64SVEInstrInfo.td
  llvm/trunk/test/CodeGen/AArch64/aarch64-sve-asm-negative.ll
  llvm/trunk/test/CodeGen/AArch64/aarch64-sve-asm.ll
  llvm/trunk/test/CodeGen/AArch64/arm64-inline-asm.ll

Index: llvm/trunk/test/CodeGen/AArch64/arm64-inline-asm.ll
===================================================================
--- llvm/trunk/test/CodeGen/AArch64/arm64-inline-asm.ll
+++ llvm/trunk/test/CodeGen/AArch64/arm64-inline-asm.ll
@@ -138,6 +138,8 @@
   %a = alloca [2 x float], align 4
   %arraydecay = getelementptr inbounds [2 x float], [2 x float]* %a, i32 0, i32 0
   %0 = load <2 x float>, <2 x float>* %data, align 8
+  call void asm sideeffect "ldr ${1:z}, [$0]\0A", "r,w"(float* %arraydecay, <2 x float> %0) nounwind
+  ; CHECK: ldr {{z[0-9]+}}, [{{x[0-9]+}}]
   call void asm sideeffect "ldr ${1:q}, [$0]\0A", "r,w"(float* %arraydecay, <2 x float> %0) nounwind
   ; CHECK: ldr {{q[0-9]+}}, [{{x[0-9]+}}]
   call void asm sideeffect "ldr ${1:d}, [$0]\0A", "r,w"(float* %arraydecay, <2 x float> %0) nounwind
Index: llvm/trunk/test/CodeGen/AArch64/aarch64-sve-asm-negative.ll
===================================================================
--- llvm/trunk/test/CodeGen/AArch64/aarch64-sve-asm-negative.ll
+++ llvm/trunk/test/CodeGen/AArch64/aarch64-sve-asm-negative.ll
@@ -0,0 +1,12 @@
+; RUN: not llc -mtriple aarch64-none-linux-gnu -mattr=+neon -o %t.s -filetype=asm %s 2>&1 | FileCheck %s
+
+; The 'y' constraint only applies to SVE vector registers (Z0-Z7)
+; The test below ensures that we get an appropriate error should the
+; constraint be used with a Neon register.
+
+; Function Attrs: nounwind readnone
+; CHECK: error: couldn't allocate input reg for constraint 'y'
+define <4 x i32> @test_neon(<4 x i32> %in1, <4 x i32> %in2) {
+  %1 = tail call <4 x i32> asm "add $0.4s, $1.4s, $2.4s", "=w,w,y"(<4 x i32> %in1, <4 x i32> %in2)
+  ret <4 x i32> %1
+}
Index: llvm/trunk/test/CodeGen/AArch64/aarch64-sve-asm.ll
===================================================================
--- llvm/trunk/test/CodeGen/AArch64/aarch64-sve-asm.ll
+++ llvm/trunk/test/CodeGen/AArch64/aarch64-sve-asm.ll
@@ -0,0 +1,44 @@
+; RUN: llc < %s -mtriple aarch64-none-linux-gnu -mattr=+sve -stop-after=finalize-isel | FileCheck %s --check-prefix=CHECK
+
+target datalayout = "e-m:e-i64:64-i128:128-n32:64-S128"
+target triple = "aarch64-none-linux-gnu"
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_3b = COPY [[ARG1]]
+define <vscale x 16 x i8> @test_svadd_i8(<vscale x 16 x i8> %Zn, <vscale x 16 x i8> %Zm) {
+  %1 = tail call <vscale x 16 x i8> asm "add $0.b, $1.b, $2.b", "=w,w,y"(<vscale x 16 x i8> %Zn, <vscale x 16 x i8> %Zm)
+  ret <vscale x 16 x i8> %1
+}
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_4b = COPY [[ARG1]]
+define <vscale x 2 x i64> @test_svsub_i64(<vscale x 2 x i64> %Zn, <vscale x 2 x i64> %Zm) {
+  %1 = tail call <vscale x 2 x i64> asm "sub $0.d, $1.d, $2.d", "=w,w,x"(<vscale x 2 x i64> %Zn, <vscale x 2 x i64> %Zm)
+  ret <vscale x 2 x i64> %1
+}
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_3b = COPY [[ARG1]]
+define <vscale x 8 x half> @test_svfmul_f16(<vscale x 8 x half> %Zn, <vscale x 8 x half> %Zm) {
+  %1 = tail call <vscale x 8 x half> asm "fmul $0.h, $1.h, $2.h", "=w,w,y"(<vscale x 8 x half> %Zn, <vscale x 8 x half> %Zm)
+  ret <vscale x 8 x half> %1
+}
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_4b = COPY [[ARG1]]
+define <vscale x 4 x float> @test_svfmul_f(<vscale x 4 x float> %Zn, <vscale x 4 x float> %Zm) {
+  %1 = tail call <vscale x 4 x float> asm "fmul $0.s, $1.s, $2.s", "=w,w,x"(<vscale x 4 x float> %Zn, <vscale x 4 x float> %Zm)
+  ret <vscale x 4 x float> %1
+}
Index: llvm/trunk/lib/Target/AArch64/AArch64AsmPrinter.cpp
===================================================================
--- llvm/trunk/lib/Target/AArch64/AArch64AsmPrinter.cpp
+++ llvm/trunk/lib/Target/AArch64/AArch64AsmPrinter.cpp
@@ -150,7 +150,7 @@
   void printOperand(const MachineInstr *MI, unsigned OpNum, raw_ostream &O);
   bool printAsmMRegister(const MachineOperand &MO, char Mode, raw_ostream &O);
   bool printAsmRegInClass(const MachineOperand &MO,
-                          const TargetRegisterClass *RC, bool isVector,
+                          const TargetRegisterClass *RC, unsigned AltName,
                           raw_ostream &O);
[PATCH] D66302: [SVE][Inline-Asm] Support for SVE asm operands
sdesmalen accepted this revision.
sdesmalen added a comment.
This revision is now accepted and ready to land.

Thanks for making these changes @kmclaughlin, LGTM!

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66302/new/

https://reviews.llvm.org/D66302
[PATCH] D66302: [SVE][Inline-Asm] Support for SVE asm operands
kmclaughlin updated this revision to Diff 216655.
kmclaughlin added a comment.

- Removed a confusing comment from AArch64AsmPrinter.cpp

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66302/new/

https://reviews.llvm.org/D66302

Files:
  docs/LangRef.rst
  lib/Target/AArch64/AArch64AsmPrinter.cpp
  lib/Target/AArch64/AArch64ISelLowering.cpp
  lib/Target/AArch64/AArch64InstrInfo.cpp
  lib/Target/AArch64/AArch64SVEInstrInfo.td
  test/CodeGen/AArch64/aarch64-sve-asm-negative.ll
  test/CodeGen/AArch64/aarch64-sve-asm.ll
  test/CodeGen/AArch64/arm64-inline-asm.ll

Index: test/CodeGen/AArch64/arm64-inline-asm.ll
===================================================================
--- test/CodeGen/AArch64/arm64-inline-asm.ll
+++ test/CodeGen/AArch64/arm64-inline-asm.ll
@@ -138,6 +138,8 @@
   %a = alloca [2 x float], align 4
   %arraydecay = getelementptr inbounds [2 x float], [2 x float]* %a, i32 0, i32 0
   %0 = load <2 x float>, <2 x float>* %data, align 8
+  call void asm sideeffect "ldr ${1:z}, [$0]\0A", "r,w"(float* %arraydecay, <2 x float> %0) nounwind
+  ; CHECK: ldr {{z[0-9]+}}, [{{x[0-9]+}}]
   call void asm sideeffect "ldr ${1:q}, [$0]\0A", "r,w"(float* %arraydecay, <2 x float> %0) nounwind
   ; CHECK: ldr {{q[0-9]+}}, [{{x[0-9]+}}]
   call void asm sideeffect "ldr ${1:d}, [$0]\0A", "r,w"(float* %arraydecay, <2 x float> %0) nounwind
Index: test/CodeGen/AArch64/aarch64-sve-asm.ll
===================================================================
--- /dev/null
+++ test/CodeGen/AArch64/aarch64-sve-asm.ll
@@ -0,0 +1,44 @@
+; RUN: llc < %s -mtriple aarch64-none-linux-gnu -mattr=+sve -stop-after=finalize-isel | FileCheck %s --check-prefix=CHECK
+
+target datalayout = "e-m:e-i64:64-i128:128-n32:64-S128"
+target triple = "aarch64-none-linux-gnu"
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_3b = COPY [[ARG1]]
+define <vscale x 16 x i8> @test_svadd_i8(<vscale x 16 x i8> %Zn, <vscale x 16 x i8> %Zm) {
+  %1 = tail call <vscale x 16 x i8> asm "add $0.b, $1.b, $2.b", "=w,w,y"(<vscale x 16 x i8> %Zn, <vscale x 16 x i8> %Zm)
+  ret <vscale x 16 x i8> %1
+}
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_4b = COPY [[ARG1]]
+define <vscale x 2 x i64> @test_svsub_i64(<vscale x 2 x i64> %Zn, <vscale x 2 x i64> %Zm) {
+  %1 = tail call <vscale x 2 x i64> asm "sub $0.d, $1.d, $2.d", "=w,w,x"(<vscale x 2 x i64> %Zn, <vscale x 2 x i64> %Zm)
+  ret <vscale x 2 x i64> %1
+}
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_3b = COPY [[ARG1]]
+define <vscale x 8 x half> @test_svfmul_f16(<vscale x 8 x half> %Zn, <vscale x 8 x half> %Zm) {
+  %1 = tail call <vscale x 8 x half> asm "fmul $0.h, $1.h, $2.h", "=w,w,y"(<vscale x 8 x half> %Zn, <vscale x 8 x half> %Zm)
+  ret <vscale x 8 x half> %1
+}
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_4b = COPY [[ARG1]]
+define <vscale x 4 x float> @test_svfmul_f(<vscale x 4 x float> %Zn, <vscale x 4 x float> %Zm) {
+  %1 = tail call <vscale x 4 x float> asm "fmul $0.s, $1.s, $2.s", "=w,w,x"(<vscale x 4 x float> %Zn, <vscale x 4 x float> %Zm)
+  ret <vscale x 4 x float> %1
+}
Index: test/CodeGen/AArch64/aarch64-sve-asm-negative.ll
===================================================================
--- /dev/null
+++ test/CodeGen/AArch64/aarch64-sve-asm-negative.ll
@@ -0,0 +1,12 @@
+; RUN: not llc -mtriple aarch64-none-linux-gnu -mattr=+neon -o %t.s -filetype=asm %s 2>&1 | FileCheck %s
+
+; The 'y' constraint only applies to SVE vector registers (Z0-Z7)
+; The test below ensures that we get an appropriate error should the
+; constraint be used with a Neon register.
+
+; Function Attrs: nounwind readnone
+; CHECK: error: couldn't allocate input reg for constraint 'y'
+define <4 x i32> @test_neon(<4 x i32> %in1, <4 x i32> %in2) {
+  %1 = tail call <4 x i32> asm "add $0.4s, $1.4s, $2.4s", "=w,w,y"(<4 x i32> %in1, <4 x i32> %in2)
+  ret <4 x i32> %1
+}
Index: lib/Target/AArch64/AArch64SVEInstrInfo.td
===================================================================
--- lib/Target/AArch64/AArch64SVEInstrInfo.td
+++ lib/Target/AArch64/AArch64SVEInstrInfo.td
@@ -1020,6 +1020,56 @@
                   (FCMGT_PPzZZ_S PPR32:$Zd, PPR3bAny:$Pg, ZPR32:$Zn, ZPR32:$Zm), 0>;
   def : InstAlias<"fcmlt $Zd, $Pg/z, $Zm, $Zn",
                   (FCMGT_PPzZZ_D PPR64:$Zd, PPR3bAny:$Pg, ZPR64:$Zn, ZPR64:$Zm), 0>;
+
+  def : Pat<(nxv16i8 (bitconvert (nxv8i16 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv4i32 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv2i64 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv8f16 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv4f32 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv2f64 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+
+  def : Pat<(nxv8i16 (bitconvert (nxv16i8 ZPR:$src))), (nxv8i16 ZPR:$src)>;
+  def : Pat<(nxv8i16 (bitconvert (nxv4i32 ZPR:$src))), (nxv8i16
[PATCH] D66302: [SVE][Inline-Asm] Support for SVE asm operands
kmclaughlin updated this revision to Diff 216574.
kmclaughlin added a comment.

- Changed printAsmRegInClass in AArch64AsmPrinter.cpp to accept //unsigned AltName// instead of //bool isVector//
- Added a comment to explain the test in aarch64-sve-asm-negative.ll

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66302/new/

https://reviews.llvm.org/D66302

Files:
  docs/LangRef.rst
  lib/Target/AArch64/AArch64AsmPrinter.cpp
  lib/Target/AArch64/AArch64ISelLowering.cpp
  lib/Target/AArch64/AArch64InstrInfo.cpp
  lib/Target/AArch64/AArch64SVEInstrInfo.td
  test/CodeGen/AArch64/aarch64-sve-asm-negative.ll
  test/CodeGen/AArch64/aarch64-sve-asm.ll
  test/CodeGen/AArch64/arm64-inline-asm.ll

Index: test/CodeGen/AArch64/arm64-inline-asm.ll
===================================================================
--- test/CodeGen/AArch64/arm64-inline-asm.ll
+++ test/CodeGen/AArch64/arm64-inline-asm.ll
@@ -138,6 +138,8 @@
   %a = alloca [2 x float], align 4
   %arraydecay = getelementptr inbounds [2 x float], [2 x float]* %a, i32 0, i32 0
   %0 = load <2 x float>, <2 x float>* %data, align 8
+  call void asm sideeffect "ldr ${1:z}, [$0]\0A", "r,w"(float* %arraydecay, <2 x float> %0) nounwind
+  ; CHECK: ldr {{z[0-9]+}}, [{{x[0-9]+}}]
   call void asm sideeffect "ldr ${1:q}, [$0]\0A", "r,w"(float* %arraydecay, <2 x float> %0) nounwind
   ; CHECK: ldr {{q[0-9]+}}, [{{x[0-9]+}}]
   call void asm sideeffect "ldr ${1:d}, [$0]\0A", "r,w"(float* %arraydecay, <2 x float> %0) nounwind
Index: test/CodeGen/AArch64/aarch64-sve-asm.ll
===================================================================
--- /dev/null
+++ test/CodeGen/AArch64/aarch64-sve-asm.ll
@@ -0,0 +1,44 @@
+; RUN: llc < %s -mtriple aarch64-none-linux-gnu -mattr=+sve -stop-after=finalize-isel | FileCheck %s --check-prefix=CHECK
+
+target datalayout = "e-m:e-i64:64-i128:128-n32:64-S128"
+target triple = "aarch64-none-linux-gnu"
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_3b = COPY [[ARG1]]
+define <vscale x 16 x i8> @test_svadd_i8(<vscale x 16 x i8> %Zn, <vscale x 16 x i8> %Zm) {
+  %1 = tail call <vscale x 16 x i8> asm "add $0.b, $1.b, $2.b", "=w,w,y"(<vscale x 16 x i8> %Zn, <vscale x 16 x i8> %Zm)
+  ret <vscale x 16 x i8> %1
+}
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_4b = COPY [[ARG1]]
+define <vscale x 2 x i64> @test_svsub_i64(<vscale x 2 x i64> %Zn, <vscale x 2 x i64> %Zm) {
+  %1 = tail call <vscale x 2 x i64> asm "sub $0.d, $1.d, $2.d", "=w,w,x"(<vscale x 2 x i64> %Zn, <vscale x 2 x i64> %Zm)
+  ret <vscale x 2 x i64> %1
+}
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_3b = COPY [[ARG1]]
+define <vscale x 8 x half> @test_svfmul_f16(<vscale x 8 x half> %Zn, <vscale x 8 x half> %Zm) {
+  %1 = tail call <vscale x 8 x half> asm "fmul $0.h, $1.h, $2.h", "=w,w,y"(<vscale x 8 x half> %Zn, <vscale x 8 x half> %Zm)
+  ret <vscale x 8 x half> %1
+}
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_4b = COPY [[ARG1]]
+define <vscale x 4 x float> @test_svfmul_f(<vscale x 4 x float> %Zn, <vscale x 4 x float> %Zm) {
+  %1 = tail call <vscale x 4 x float> asm "fmul $0.s, $1.s, $2.s", "=w,w,x"(<vscale x 4 x float> %Zn, <vscale x 4 x float> %Zm)
+  ret <vscale x 4 x float> %1
+}
Index: test/CodeGen/AArch64/aarch64-sve-asm-negative.ll
===================================================================
--- /dev/null
+++ test/CodeGen/AArch64/aarch64-sve-asm-negative.ll
@@ -0,0 +1,12 @@
+; RUN: not llc -mtriple aarch64-none-linux-gnu -mattr=+neon -o %t.s -filetype=asm %s 2>&1 | FileCheck %s
+
+; The 'y' constraint only applies to SVE vector registers (Z0-Z7)
+; The test below ensures that we get an appropriate error should the
+; constraint be used with a Neon register.
+
+; Function Attrs: nounwind readnone
+; CHECK: error: couldn't allocate input reg for constraint 'y'
+define <4 x i32> @test_neon(<4 x i32> %in1, <4 x i32> %in2) {
+  %1 = tail call <4 x i32> asm "add $0.4s, $1.4s, $2.4s", "=w,w,y"(<4 x i32> %in1, <4 x i32> %in2)
+  ret <4 x i32> %1
+}
Index: lib/Target/AArch64/AArch64SVEInstrInfo.td
===================================================================
--- lib/Target/AArch64/AArch64SVEInstrInfo.td
+++ lib/Target/AArch64/AArch64SVEInstrInfo.td
@@ -1020,6 +1020,56 @@
                   (FCMGT_PPzZZ_S PPR32:$Zd, PPR3bAny:$Pg, ZPR32:$Zn, ZPR32:$Zm), 0>;
   def : InstAlias<"fcmlt $Zd, $Pg/z, $Zm, $Zn",
                   (FCMGT_PPzZZ_D PPR64:$Zd, PPR3bAny:$Pg, ZPR64:$Zn, ZPR64:$Zm), 0>;
+
+  def : Pat<(nxv16i8 (bitconvert (nxv8i16 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv4i32 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv2i64 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv8f16 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv4f32 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv2f64 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+
+  def : Pat<(nxv8i16
[PATCH] D66302: [SVE][Inline-Asm] Support for SVE asm operands
sdesmalen added inline comments.

Comment at: lib/Target/AArch64/AArch64AsmPrinter.cpp:618
+bool hasAltName;
+const TargetRegisterClass *RegClass;

The use of `hasAltName` is confusing me here. When I look at the declaration and definition of `printAsmRegInClass`, the parameter is called `bool isVector`, and a SVE vector is still a vector, so passing `false` is odd. I think it makes more sense to change `printAsmRegInClass` to accept an `unsigned AltName`, and just pass `AArch64::vreg` for Neon Vectors directly (or `AArch64::NoRegAltName` otherwise).

Comment at: test/CodeGen/AArch64/aarch64-sve-asm-negative.ll:2
+; RUN: not llc -mtriple aarch64-none-linux-gnu -mattr=+neon -o %t.s -filetype=asm %s 2>&1 | FileCheck %s
+
+; Function Attrs: nounwind readnone

Can you add a comment explaining what this is testing (and why the inline asm below is not valid)?

Comment at: test/CodeGen/AArch64/aarch64-sve-asm.ll:46
+
+!0 = !{i32 188, i32 210}

nit: what is this line doing here?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66302/new/

https://reviews.llvm.org/D66302
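The first suggestion in this message can be illustrated with a small standalone sketch. Everything below is an assumption for illustration: `RegAltName`, `VReg`, and `regNameFor` are hypothetical stand-ins loosely modelled on the `AArch64::NoRegAltName` and `AArch64::vreg` values named in the comment, and the real names come from LLVM's generated register tables rather than string concatenation. The point is only that an explicit alternate-name argument says what is being selected, whereas `isVector = false` for an SVE vector reads as a contradiction.

```cpp
#include <cassert>
#include <string>

// Hypothetical alternate-name selector; not the LLVM-generated enum.
enum RegAltName : unsigned { NoRegAltName = 0, VReg = 1 };

// Hypothetical helper: builds a register name from a class prefix and an index.
// When the Neon alternate name is requested, the register prints as "vN";
// otherwise it prints under its own class prefix (for example "zN" or "qN").
std::string regNameFor(char ClassPrefix, unsigned Index, unsigned AltName) {
  const char Prefix = (AltName == VReg) ? 'v' : ClassPrefix;
  return std::string(1, Prefix) + std::to_string(Index);
}
```

Under these assumptions, a caller printing an SVE operand passes `NoRegAltName` and gets `z3`, while a caller printing a Neon vector operand passes `VReg` and gets `v3`. Neither call site needs a boolean whose meaning depends on reading the callee.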
[PATCH] D66302: [SVE][Inline-Asm] Support for SVE asm operands
kmclaughlin updated this revision to Diff 216178.
kmclaughlin added a comment.

- Added a new test file, aarch64-sve-asm-negative.ll
- Updated description of the 'y' constraint in LangRef.rst

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66302/new/

https://reviews.llvm.org/D66302

Files:
  docs/LangRef.rst
  lib/Target/AArch64/AArch64AsmPrinter.cpp
  lib/Target/AArch64/AArch64ISelLowering.cpp
  lib/Target/AArch64/AArch64InstrInfo.cpp
  lib/Target/AArch64/AArch64SVEInstrInfo.td
  test/CodeGen/AArch64/aarch64-sve-asm-negative.ll
  test/CodeGen/AArch64/aarch64-sve-asm.ll
  test/CodeGen/AArch64/arm64-inline-asm.ll

Index: test/CodeGen/AArch64/arm64-inline-asm.ll
===================================================================
--- test/CodeGen/AArch64/arm64-inline-asm.ll
+++ test/CodeGen/AArch64/arm64-inline-asm.ll
@@ -138,6 +138,8 @@
   %a = alloca [2 x float], align 4
   %arraydecay = getelementptr inbounds [2 x float], [2 x float]* %a, i32 0, i32 0
   %0 = load <2 x float>, <2 x float>* %data, align 8
+  call void asm sideeffect "ldr ${1:z}, [$0]\0A", "r,w"(float* %arraydecay, <2 x float> %0) nounwind
+  ; CHECK: ldr {{z[0-9]+}}, [{{x[0-9]+}}]
   call void asm sideeffect "ldr ${1:q}, [$0]\0A", "r,w"(float* %arraydecay, <2 x float> %0) nounwind
   ; CHECK: ldr {{q[0-9]+}}, [{{x[0-9]+}}]
   call void asm sideeffect "ldr ${1:d}, [$0]\0A", "r,w"(float* %arraydecay, <2 x float> %0) nounwind
Index: test/CodeGen/AArch64/aarch64-sve-asm.ll
===================================================================
--- /dev/null
+++ test/CodeGen/AArch64/aarch64-sve-asm.ll
@@ -0,0 +1,46 @@
+; RUN: llc < %s -mtriple aarch64-none-linux-gnu -mattr=+sve -stop-after=finalize-isel | FileCheck %s --check-prefix=CHECK
+
+target datalayout = "e-m:e-i64:64-i128:128-n32:64-S128"
+target triple = "aarch64-none-linux-gnu"
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_3b = COPY [[ARG1]]
+define <vscale x 16 x i8> @test_svadd_i8(<vscale x 16 x i8> %Zn, <vscale x 16 x i8> %Zm) {
+  %1 = tail call <vscale x 16 x i8> asm "add $0.b, $1.b, $2.b", "=w,w,y"(<vscale x 16 x i8> %Zn, <vscale x 16 x i8> %Zm)
+  ret <vscale x 16 x i8> %1
+}
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_4b = COPY [[ARG1]]
+define <vscale x 2 x i64> @test_svsub_i64(<vscale x 2 x i64> %Zn, <vscale x 2 x i64> %Zm) {
+  %1 = tail call <vscale x 2 x i64> asm "sub $0.d, $1.d, $2.d", "=w,w,x"(<vscale x 2 x i64> %Zn, <vscale x 2 x i64> %Zm)
+  ret <vscale x 2 x i64> %1
+}
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_3b = COPY [[ARG1]]
+define <vscale x 8 x half> @test_svfmul_f16(<vscale x 8 x half> %Zn, <vscale x 8 x half> %Zm) {
+  %1 = tail call <vscale x 8 x half> asm "fmul $0.h, $1.h, $2.h", "=w,w,y"(<vscale x 8 x half> %Zn, <vscale x 8 x half> %Zm)
+  ret <vscale x 8 x half> %1
+}
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_4b = COPY [[ARG1]]
+define <vscale x 4 x float> @test_svfmul_f(<vscale x 4 x float> %Zn, <vscale x 4 x float> %Zm) {
+  %1 = tail call <vscale x 4 x float> asm "fmul $0.s, $1.s, $2.s", "=w,w,x"(<vscale x 4 x float> %Zn, <vscale x 4 x float> %Zm)
+  ret <vscale x 4 x float> %1
+}
+
+!0 = !{i32 188, i32 210}
Index: test/CodeGen/AArch64/aarch64-sve-asm-negative.ll
===================================================================
--- /dev/null
+++ test/CodeGen/AArch64/aarch64-sve-asm-negative.ll
@@ -0,0 +1,8 @@
+; RUN: not llc -mtriple aarch64-none-linux-gnu -mattr=+neon -o %t.s -filetype=asm %s 2>&1 | FileCheck %s
+
+; Function Attrs: nounwind readnone
+; CHECK: error: couldn't allocate input reg for constraint 'y'
+define <4 x i32> @test_neon(<4 x i32> %in1, <4 x i32> %in2) {
+  %1 = tail call <4 x i32> asm "add $0.4s, $1.4s, $2.4s", "=w,w,y"(<4 x i32> %in1, <4 x i32> %in2)
+  ret <4 x i32> %1
+}
Index: lib/Target/AArch64/AArch64SVEInstrInfo.td
===================================================================
--- lib/Target/AArch64/AArch64SVEInstrInfo.td
+++ lib/Target/AArch64/AArch64SVEInstrInfo.td
@@ -1020,6 +1020,56 @@
                   (FCMGT_PPzZZ_S PPR32:$Zd, PPR3bAny:$Pg, ZPR32:$Zn, ZPR32:$Zm), 0>;
   def : InstAlias<"fcmlt $Zd, $Pg/z, $Zm, $Zn",
                   (FCMGT_PPzZZ_D PPR64:$Zd, PPR3bAny:$Pg, ZPR64:$Zn, ZPR64:$Zm), 0>;
+
+  def : Pat<(nxv16i8 (bitconvert (nxv8i16 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv4i32 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv2i64 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv8f16 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv4f32 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv2f64 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+
+  def : Pat<(nxv8i16 (bitconvert (nxv16i8 ZPR:$src))), (nxv8i16 ZPR:$src)>;
+  def : Pat<(nxv8i16 (bitconvert (nxv4i32 ZPR:$src))), (nxv8i16 ZPR:$src)>;
+  def : Pat<(nxv8i16 (bitconvert (nxv2i64 ZPR:$src))), (nxv8i16 ZPR:$src)>;
+  def :
[PATCH] D66302: [SVE][Inline-Asm] Support for SVE asm operands
sdesmalen added a comment.

Thanks for this change @kmclaughlin.

Comment at: docs/LangRef.rst:3816
+- ``x``: Like w, but restricted to registers 0 to 15 inclusive.
+- ``y``: Like w, but restricted to registers 0 to 7 inclusive.

I noticed this comment does not match the code below, since `y` only seems to work for scalable vectors, which probably shows this case is missing a test.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66302/new/

https://reviews.llvm.org/D66302
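The mismatch raised in this comment can be sketched as a small standalone model of the constraint lookup. This is not the code in AArch64ISelLowering.cpp: `OperandType`, `RegClass`, and `regClassFor` are hypothetical names, the `FPR128` result for `w` is an assumption, and the `hasFPARMv8()` subtarget check is left out. The ranges in the comments come from the patch description (Z0 to Z31, Z0 to Z15, Z0 to Z7) and the class names from the `zpr`, `zpr_4b`, and `zpr_3b` CHECK lines in the tests.

```cpp
#include <cassert>

// Hypothetical description of an operand: only the two facts the sketch needs.
struct OperandType {
  bool Scalable;  // true for an SVE scalable vector
  unsigned Bits;  // fixed size in bits; ignored when Scalable is true
};

// Hypothetical register-class identifiers; None means "cannot allocate".
enum class RegClass { None, ZPR, ZPR_4b, ZPR_3b, FPR128, FPR128_lo };

// Maps a constraint letter and operand type to a register class.
RegClass regClassFor(char Constraint, OperandType Ty) {
  switch (Constraint) {
  case 'w':
    if (Ty.Scalable)
      return RegClass::ZPR;        // Z0 to Z31
    if (Ty.Bits == 128)
      return RegClass::FPR128;     // assumption: full Neon range
    break;
  case 'x':
    if (Ty.Scalable)
      return RegClass::ZPR_4b;     // Z0 to Z15
    if (Ty.Bits == 128)
      return RegClass::FPR128_lo;  // restricted Neon range
    break;
  case 'y':
    if (Ty.Scalable)
      return RegClass::ZPR_3b;     // Z0 to Z7
    break;                         // no fixed-width class for 'y'
  default:
    break;
  }
  return RegClass::None;
}
```

In this model a 128-bit Neon operand with `y` yields `RegClass::None`, which corresponds to the "couldn't allocate input reg for constraint 'y'" error that the later negative test checks for, while the same operand with `x` still gets a class.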
[PATCH] D66302: [SVE][Inline-Asm] Support for SVE asm operands
kmclaughlin created this revision.
kmclaughlin added reviewers: t.p.northover, sdesmalen, rovka, momchil.velikov.
Herald added subscribers: psnobl, rkruppe, tschuett, javed.absar.
Herald added a reviewer: rengolin.
Herald added a project: LLVM.

Adds the following inline asm constraints for SVE:

- w: SVE vector register with full range, Z0 to Z31
- x: Restricted to registers Z0 to Z15 inclusive.
- y: Restricted to registers Z0 to Z7 inclusive.

This change also adds the "z" modifier to interpret a register as an SVE register.

Not all of the bitconvert patterns added by this patch are used, but they have been included here for completeness.

Repository:
  rL LLVM

https://reviews.llvm.org/D66302

Files:
  docs/LangRef.rst
  lib/Target/AArch64/AArch64AsmPrinter.cpp
  lib/Target/AArch64/AArch64ISelLowering.cpp
  lib/Target/AArch64/AArch64InstrInfo.cpp
  lib/Target/AArch64/AArch64SVEInstrInfo.td
  test/CodeGen/AArch64/aarch64-sve-asm.ll
  test/CodeGen/AArch64/arm64-inline-asm.ll

Index: test/CodeGen/AArch64/arm64-inline-asm.ll
===================================================================
--- test/CodeGen/AArch64/arm64-inline-asm.ll
+++ test/CodeGen/AArch64/arm64-inline-asm.ll
@@ -138,6 +138,8 @@
   %a = alloca [2 x float], align 4
   %arraydecay = getelementptr inbounds [2 x float], [2 x float]* %a, i32 0, i32 0
   %0 = load <2 x float>, <2 x float>* %data, align 8
+  call void asm sideeffect "ldr ${1:z}, [$0]\0A", "r,w"(float* %arraydecay, <2 x float> %0) nounwind
+  ; CHECK: ldr {{z[0-9]+}}, [{{x[0-9]+}}]
   call void asm sideeffect "ldr ${1:q}, [$0]\0A", "r,w"(float* %arraydecay, <2 x float> %0) nounwind
   ; CHECK: ldr {{q[0-9]+}}, [{{x[0-9]+}}]
   call void asm sideeffect "ldr ${1:d}, [$0]\0A", "r,w"(float* %arraydecay, <2 x float> %0) nounwind
Index: test/CodeGen/AArch64/aarch64-sve-asm.ll
===================================================================
--- /dev/null
+++ test/CodeGen/AArch64/aarch64-sve-asm.ll
@@ -0,0 +1,46 @@
+; RUN: llc < %s -mtriple aarch64-none-linux-gnu -mattr=+sve -stop-after=finalize-isel | FileCheck %s --check-prefix=CHECK
+
+target datalayout = "e-m:e-i64:64-i128:128-n32:64-S128"
+target triple = "aarch64-none-linux-gnu"
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_3b = COPY [[ARG1]]
+define <vscale x 16 x i8> @test_svadd_i8(<vscale x 16 x i8> %Zn, <vscale x 16 x i8> %Zm) {
+  %1 = tail call <vscale x 16 x i8> asm "add $0.b, $1.b, $2.b", "=w,w,y"(<vscale x 16 x i8> %Zn, <vscale x 16 x i8> %Zm)
+  ret <vscale x 16 x i8> %1
+}
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_4b = COPY [[ARG1]]
+define <vscale x 2 x i64> @test_svsub_i64(<vscale x 2 x i64> %Zn, <vscale x 2 x i64> %Zm) {
+  %1 = tail call <vscale x 2 x i64> asm "sub $0.d, $1.d, $2.d", "=w,w,x"(<vscale x 2 x i64> %Zn, <vscale x 2 x i64> %Zm)
+  ret <vscale x 2 x i64> %1
+}
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_3b = COPY [[ARG1]]
+define <vscale x 8 x half> @test_svfmul_f16(<vscale x 8 x half> %Zn, <vscale x 8 x half> %Zm) {
+  %1 = tail call <vscale x 8 x half> asm "fmul $0.h, $1.h, $2.h", "=w,w,y"(<vscale x 8 x half> %Zn, <vscale x 8 x half> %Zm)
+  ret <vscale x 8 x half> %1
+}
+
+; Function Attrs: nounwind readnone
+; CHECK: [[ARG1:%[0-9]+]]:zpr = COPY $z1
+; CHECK: [[ARG2:%[0-9]+]]:zpr = COPY $z0
+; CHECK: [[ARG3:%[0-9]+]]:zpr = COPY [[ARG2]]
+; CHECK: [[ARG4:%[0-9]+]]:zpr_4b = COPY [[ARG1]]
+define <vscale x 4 x float> @test_svfmul_f(<vscale x 4 x float> %Zn, <vscale x 4 x float> %Zm) {
+  %1 = tail call <vscale x 4 x float> asm "fmul $0.s, $1.s, $2.s", "=w,w,x"(<vscale x 4 x float> %Zn, <vscale x 4 x float> %Zm)
+  ret <vscale x 4 x float> %1
+}
+
+!0 = !{i32 188, i32 210}
Index: lib/Target/AArch64/AArch64SVEInstrInfo.td
===================================================================
--- lib/Target/AArch64/AArch64SVEInstrInfo.td
+++ lib/Target/AArch64/AArch64SVEInstrInfo.td
@@ -1020,6 +1020,56 @@
                   (FCMGT_PPzZZ_S PPR32:$Zd, PPR3bAny:$Pg, ZPR32:$Zn, ZPR32:$Zm), 0>;
   def : InstAlias<"fcmlt $Zd, $Pg/z, $Zm, $Zn",
                   (FCMGT_PPzZZ_D PPR64:$Zd, PPR3bAny:$Pg, ZPR64:$Zn, ZPR64:$Zm), 0>;
+
+  def : Pat<(nxv16i8 (bitconvert (nxv8i16 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv4i32 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv2i64 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv8f16 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv4f32 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+  def : Pat<(nxv16i8 (bitconvert (nxv2f64 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+
+  def : Pat<(nxv8i16 (bitconvert (nxv16i8 ZPR:$src))), (nxv8i16 ZPR:$src)>;
+  def : Pat<(nxv8i16 (bitconvert (nxv4i32 ZPR:$src))), (nxv8i16 ZPR:$src)>;
+  def : Pat<(nxv8i16 (bitconvert (nxv2i64 ZPR:$src))), (nxv8i16 ZPR:$src)>;
+  def : Pat<(nxv8i16 (bitconvert (nxv8f16 ZPR:$src))), (nxv8i16 ZPR:$src)>;
+  def : Pat<(nxv8i16 (bitconvert (nxv4f32 ZPR:$src))), (nxv8i16 ZPR:$src)>;
+  def : Pat<(nxv8i16 (bitconvert (nxv2f64 ZPR:$src))), (nxv8i16 ZPR:$src)>;
+
+  def : Pat<(nxv4i32