[PATCH] D47121: [NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part)
This revision was automatically updated to reflect the committed changes. Closed by commit rL333819: [NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part) (authored by kosarev, committed by ). Herald added a subscriber: llvm-commits. Changed prior to commit: https://reviews.llvm.org/D47121?vs=147718=149612#toc Repository: rL LLVM https://reviews.llvm.org/D47121 Files: llvm/trunk/include/llvm/IR/IntrinsicsARM.td llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp llvm/trunk/lib/Target/ARM/ARMExpandPseudoInsts.cpp llvm/trunk/lib/Target/ARM/ARMISelDAGToDAG.cpp llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp llvm/trunk/lib/Target/ARM/ARMInstrNEON.td llvm/trunk/test/CodeGen/ARM/arm-vld1.ll Index: llvm/trunk/lib/Target/ARM/ARMISelDAGToDAG.cpp === --- llvm/trunk/lib/Target/ARM/ARMISelDAGToDAG.cpp +++ llvm/trunk/lib/Target/ARM/ARMISelDAGToDAG.cpp @@ -1761,9 +1761,7 @@ case MVT::v4f32: case MVT::v4i32: OpcodeIndex = 2; break; case MVT::v2f64: - case MVT::v2i64: OpcodeIndex = 3; -assert(NumVecs == 1 && "v2i64 type only supported for VLD1"); -break; + case MVT::v2i64: OpcodeIndex = 3; break; } EVT ResTy; @@ -3441,6 +3439,51 @@ return; } +case Intrinsic::arm_neon_vld1x2: { + static const uint16_t DOpcodes[] = { ARM::VLD1q8, ARM::VLD1q16, + ARM::VLD1q32, ARM::VLD1q64 }; + static const uint16_t QOpcodes[] = { ARM::VLD1d8QPseudo, + ARM::VLD1d16QPseudo, + ARM::VLD1d32QPseudo, + ARM::VLD1d64QPseudo }; + SelectVLD(N, false, 2, DOpcodes, QOpcodes, nullptr); + return; +} + +case Intrinsic::arm_neon_vld1x3: { + static const uint16_t DOpcodes[] = { ARM::VLD1d8TPseudo, + ARM::VLD1d16TPseudo, + ARM::VLD1d32TPseudo, + ARM::VLD1d64TPseudo }; + static const uint16_t QOpcodes0[] = { ARM::VLD1q8LowTPseudo_UPD, +ARM::VLD1q16LowTPseudo_UPD, +ARM::VLD1q32LowTPseudo_UPD, +ARM::VLD1q64LowTPseudo_UPD }; + static const uint16_t QOpcodes1[] = { ARM::VLD1q8HighTPseudo, +ARM::VLD1q16HighTPseudo, +ARM::VLD1q32HighTPseudo, +ARM::VLD1q64HighTPseudo }; + SelectVLD(N, false, 3, DOpcodes, QOpcodes0, QOpcodes1); + return; +} + +case Intrinsic::arm_neon_vld1x4: { + static const uint16_t DOpcodes[] = { ARM::VLD1d8QPseudo, + ARM::VLD1d16QPseudo, + ARM::VLD1d32QPseudo, + ARM::VLD1d64QPseudo }; + static const uint16_t QOpcodes0[] = { ARM::VLD1q8LowQPseudo_UPD, +ARM::VLD1q16LowQPseudo_UPD, +ARM::VLD1q32LowQPseudo_UPD, +ARM::VLD1q64LowQPseudo_UPD }; + static const uint16_t QOpcodes1[] = { ARM::VLD1q8HighQPseudo, +ARM::VLD1q16HighQPseudo, +ARM::VLD1q32HighQPseudo, +ARM::VLD1q64HighQPseudo }; + SelectVLD(N, false, 4, DOpcodes, QOpcodes0, QOpcodes1); + return; +} + case Intrinsic::arm_neon_vld2: { static const uint16_t DOpcodes[] = { ARM::VLD2d8, ARM::VLD2d16, ARM::VLD2d32, ARM::VLD1q64 }; Index: llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp === --- llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp +++ llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp @@ -12763,6 +12763,9 @@ case ISD::INTRINSIC_W_CHAIN: switch (cast(N->getOperand(1))->getZExtValue()) { case Intrinsic::arm_neon_vld1: +case Intrinsic::arm_neon_vld1x2: +case Intrinsic::arm_neon_vld1x3: +case Intrinsic::arm_neon_vld1x4: case Intrinsic::arm_neon_vld2: case Intrinsic::arm_neon_vld3: case Intrinsic::arm_neon_vld4: @@ -14074,6 +14077,21 @@ Info.flags = MachineMemOperand::MOLoad; return true; } + case Intrinsic::arm_neon_vld1x2: + case Intrinsic::arm_neon_vld1x3: + case Intrinsic::arm_neon_vld1x4: { +Info.opc = ISD::INTRINSIC_W_CHAIN; +// Conservatively set memVT to the entire set of vectors loaded. +auto = I.getCalledFunction()->getParent()->getDataLayout(); +uint64_t NumElts = DL.getTypeSizeInBits(I.getType()) / 64; +Info.memVT =
[PATCH] D47121: [NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part)
SjoerdMeijer accepted this revision. SjoerdMeijer added a comment. This revision is now accepted and ready to land. I agree: these intrinsics are available in v7/A32/A64. Comment at: lib/CodeGen/CGBuiltin.cpp:7865 } // FIXME: Sharing loads & stores with 32-bit is complicated by the absence // of an Align parameter here. kosarev wrote: > SjoerdMeijer wrote: > > How about this FIXME? Is it still relevant? And does it need to be moved > > up? Or perhaps better: move the code back here to minimise the diff? > Yes, it's still true for the vst builtins handled below. None of the vld/vst > patches removes this comment, but it should go away with whatever is the one > to be committed last. > > Umm, it seems leaving the vld code here wouldn't make the diff smaller? ah yes, nevermind, got confused here. https://reviews.llvm.org/D47121 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D47121: [NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part)
kosarev added a comment. Thanks for reviewing. Comment at: lib/CodeGen/CGBuiltin.cpp:7865 } // FIXME: Sharing loads & stores with 32-bit is complicated by the absence // of an Align parameter here. SjoerdMeijer wrote: > How about this FIXME? Is it still relevant? And does it need to be moved up? > Or perhaps better: move the code back here to minimise the diff? Yes, it's still true for the vst builtins handled below. None of the vld/vst patches removes this comment, but it should go away with whatever is the one to be committed last. Umm, it seems leaving the vld code here wouldn't make the diff smaller? Comment at: test/CodeGen/arm-neon-vld.c:4 +// RUN: FileCheck -check-prefixes=CHECK,CHECK-A64 %s +// RUN: %clang_cc1 -triple armv8-none-linux-gnueabi -target-feature +neon \ +// RUN: -S -disable-O0-optnone -emit-llvm -o - %s | opt -S -mem2reg | \ SjoerdMeijer wrote: > Should this be armv7? There are more ARMv8 vld intrinsics that we currently support only in A64 so I was going to add tests for them here. I'm not sure if we want to test availability of NEON intrinsics for various architectures with codegen tests like this one or have some separate tests in sema. https://reviews.llvm.org/D47121 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D47121: [NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part)
SjoerdMeijer added a comment. Had only a first brief look; see some first drive by comments inline. Comment at: lib/CodeGen/CGBuiltin.cpp:7865 } // FIXME: Sharing loads & stores with 32-bit is complicated by the absence // of an Align parameter here. How about this FIXME? Is it still relevant? And does it need to be moved up? Or perhaps better: move the code back here to minimise the diff? Comment at: test/CodeGen/arm-neon-vld.c:4 +// RUN: FileCheck -check-prefixes=CHECK,CHECK-A64 %s +// RUN: %clang_cc1 -triple armv8-none-linux-gnueabi -target-feature +neon \ +// RUN: -S -disable-O0-optnone -emit-llvm -o - %s | opt -S -mem2reg | \ Should this be armv7? https://reviews.llvm.org/D47121 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D47121: [NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part)
kosarev added a comment. Ping. https://reviews.llvm.org/D47121 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits