[PATCH] D47121: [NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part)

2018-06-02 Thread Ivan Kosarev via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL333819: [NEON] Support VLD1xN intrinsics in AArch32 mode 
(Clang part) (authored by kosarev, committed by ).
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D47121?vs=147718=149612#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D47121

Files:
  llvm/trunk/include/llvm/IR/IntrinsicsARM.td
  llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp
  llvm/trunk/lib/Target/ARM/ARMExpandPseudoInsts.cpp
  llvm/trunk/lib/Target/ARM/ARMISelDAGToDAG.cpp
  llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp
  llvm/trunk/lib/Target/ARM/ARMInstrNEON.td
  llvm/trunk/test/CodeGen/ARM/arm-vld1.ll

Index: llvm/trunk/lib/Target/ARM/ARMISelDAGToDAG.cpp
===
--- llvm/trunk/lib/Target/ARM/ARMISelDAGToDAG.cpp
+++ llvm/trunk/lib/Target/ARM/ARMISelDAGToDAG.cpp
@@ -1761,9 +1761,7 @@
   case MVT::v4f32:
   case MVT::v4i32: OpcodeIndex = 2; break;
   case MVT::v2f64:
-  case MVT::v2i64: OpcodeIndex = 3;
-assert(NumVecs == 1 && "v2i64 type only supported for VLD1");
-break;
+  case MVT::v2i64: OpcodeIndex = 3; break;
   }
 
   EVT ResTy;
@@ -3441,6 +3439,51 @@
   return;
 }
 
+case Intrinsic::arm_neon_vld1x2: {
+  static const uint16_t DOpcodes[] = { ARM::VLD1q8, ARM::VLD1q16,
+   ARM::VLD1q32, ARM::VLD1q64 };
+  static const uint16_t QOpcodes[] = { ARM::VLD1d8QPseudo,
+   ARM::VLD1d16QPseudo,
+   ARM::VLD1d32QPseudo,
+   ARM::VLD1d64QPseudo };
+  SelectVLD(N, false, 2, DOpcodes, QOpcodes, nullptr);
+  return;
+}
+
+case Intrinsic::arm_neon_vld1x3: {
+  static const uint16_t DOpcodes[] = { ARM::VLD1d8TPseudo,
+   ARM::VLD1d16TPseudo,
+   ARM::VLD1d32TPseudo,
+   ARM::VLD1d64TPseudo };
+  static const uint16_t QOpcodes0[] = { ARM::VLD1q8LowTPseudo_UPD,
+ARM::VLD1q16LowTPseudo_UPD,
+ARM::VLD1q32LowTPseudo_UPD,
+ARM::VLD1q64LowTPseudo_UPD };
+  static const uint16_t QOpcodes1[] = { ARM::VLD1q8HighTPseudo,
+ARM::VLD1q16HighTPseudo,
+ARM::VLD1q32HighTPseudo,
+ARM::VLD1q64HighTPseudo };
+  SelectVLD(N, false, 3, DOpcodes, QOpcodes0, QOpcodes1);
+  return;
+}
+
+case Intrinsic::arm_neon_vld1x4: {
+  static const uint16_t DOpcodes[] = { ARM::VLD1d8QPseudo,
+   ARM::VLD1d16QPseudo,
+   ARM::VLD1d32QPseudo,
+   ARM::VLD1d64QPseudo };
+  static const uint16_t QOpcodes0[] = { ARM::VLD1q8LowQPseudo_UPD,
+ARM::VLD1q16LowQPseudo_UPD,
+ARM::VLD1q32LowQPseudo_UPD,
+ARM::VLD1q64LowQPseudo_UPD };
+  static const uint16_t QOpcodes1[] = { ARM::VLD1q8HighQPseudo,
+ARM::VLD1q16HighQPseudo,
+ARM::VLD1q32HighQPseudo,
+ARM::VLD1q64HighQPseudo };
+  SelectVLD(N, false, 4, DOpcodes, QOpcodes0, QOpcodes1);
+  return;
+}
+
 case Intrinsic::arm_neon_vld2: {
   static const uint16_t DOpcodes[] = { ARM::VLD2d8, ARM::VLD2d16,
ARM::VLD2d32, ARM::VLD1q64 };
Index: llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp
===
--- llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp
+++ llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp
@@ -12763,6 +12763,9 @@
   case ISD::INTRINSIC_W_CHAIN:
 switch (cast(N->getOperand(1))->getZExtValue()) {
 case Intrinsic::arm_neon_vld1:
+case Intrinsic::arm_neon_vld1x2:
+case Intrinsic::arm_neon_vld1x3:
+case Intrinsic::arm_neon_vld1x4:
 case Intrinsic::arm_neon_vld2:
 case Intrinsic::arm_neon_vld3:
 case Intrinsic::arm_neon_vld4:
@@ -14074,6 +14077,21 @@
 Info.flags = MachineMemOperand::MOLoad;
 return true;
   }
+  case Intrinsic::arm_neon_vld1x2:
+  case Intrinsic::arm_neon_vld1x3:
+  case Intrinsic::arm_neon_vld1x4: {
+Info.opc = ISD::INTRINSIC_W_CHAIN;
+// Conservatively set memVT to the entire set of vectors loaded.
+auto  = I.getCalledFunction()->getParent()->getDataLayout();
+uint64_t NumElts = DL.getTypeSizeInBits(I.getType()) / 64;
+Info.memVT = 

[PATCH] D47121: [NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part)

2018-05-31 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer accepted this revision.
SjoerdMeijer added a comment.
This revision is now accepted and ready to land.

I agree: these intrinsics are available in v7/A32/A64.




Comment at: lib/CodeGen/CGBuiltin.cpp:7865
   }
 // FIXME: Sharing loads & stores with 32-bit is complicated by the absence
 // of an Align parameter here.

kosarev wrote:
> SjoerdMeijer wrote:
> > How about this FIXME? Is it still relevant? And does it need to be moved 
> > up? Or perhaps better: move the code back here to minimise the diff?
> Yes, it's still true for the vst builtins handled below. None of the vld/vst 
> patches removes this comment, but it should go away with whatever is the one 
> to be committed last.
> 
> Umm, it seems leaving the vld code here wouldn't make the diff smaller?
ah yes, nevermind, got confused here.


https://reviews.llvm.org/D47121



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D47121: [NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part)

2018-05-30 Thread Ivan Kosarev via Phabricator via cfe-commits
kosarev added a comment.

Thanks for reviewing.




Comment at: lib/CodeGen/CGBuiltin.cpp:7865
   }
 // FIXME: Sharing loads & stores with 32-bit is complicated by the absence
 // of an Align parameter here.

SjoerdMeijer wrote:
> How about this FIXME? Is it still relevant? And does it need to be moved up? 
> Or perhaps better: move the code back here to minimise the diff?
Yes, it's still true for the vst builtins handled below. None of the vld/vst 
patches removes this comment, but it should go away with whatever is the one to 
be committed last.

Umm, it seems leaving the vld code here wouldn't make the diff smaller?



Comment at: test/CodeGen/arm-neon-vld.c:4
+// RUN: FileCheck -check-prefixes=CHECK,CHECK-A64 %s
+// RUN: %clang_cc1 -triple armv8-none-linux-gnueabi -target-feature +neon \
+// RUN: -S -disable-O0-optnone -emit-llvm -o - %s | opt -S -mem2reg | \

SjoerdMeijer wrote:
> Should this be armv7?
There are more ARMv8 vld intrinsics that we currently support only in A64 so I 
was going to add tests for them here. I'm not sure if we want to test 
availability of NEON intrinsics for various architectures with codegen tests 
like this one or have some separate tests in sema.


https://reviews.llvm.org/D47121



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D47121: [NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part)

2018-05-30 Thread Sjoerd Meijer via Phabricator via cfe-commits
SjoerdMeijer added a comment.

Had only a first brief look; see some first drive by comments inline.




Comment at: lib/CodeGen/CGBuiltin.cpp:7865
   }
 // FIXME: Sharing loads & stores with 32-bit is complicated by the absence
 // of an Align parameter here.

How about this FIXME? Is it still relevant? And does it need to be moved up? Or 
perhaps better: move the code back here to minimise the diff?



Comment at: test/CodeGen/arm-neon-vld.c:4
+// RUN: FileCheck -check-prefixes=CHECK,CHECK-A64 %s
+// RUN: %clang_cc1 -triple armv8-none-linux-gnueabi -target-feature +neon \
+// RUN: -S -disable-O0-optnone -emit-llvm -o - %s | opt -S -mem2reg | \

Should this be armv7?


https://reviews.llvm.org/D47121



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D47121: [NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part)

2018-05-30 Thread Ivan Kosarev via Phabricator via cfe-commits
kosarev added a comment.

Ping.


https://reviews.llvm.org/D47121



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits