[PATCH] D78569: [SVE][CodeGen] Lower SDIV & UDIV to SVE intrinsics
kmclaughlin added a comment.

Thank you both for your comments on this patch, @efriedma & @sdesmalen!

Repository: rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION https://reviews.llvm.org/D78569/new/ https://reviews.llvm.org/D78569

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D78569: [SVE][CodeGen] Lower SDIV & UDIV to SVE intrinsics
This revision was automatically updated to reflect the committed changes. kmclaughlin marked an inline comment as done.

Closed by commit rG53dd72a87aeb: [SVE][CodeGen] Lower SDIV & UDIV to SVE intrinsics (authored by kmclaughlin).

Changed prior to commit: https://reviews.llvm.org/D78569?vs=259610&id=259852#toc

Repository: rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION https://reviews.llvm.org/D78569/new/ https://reviews.llvm.org/D78569

Files:
  llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
  llvm/lib/Target/AArch64/AArch64ISelLowering.h
  llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
  llvm/test/CodeGen/AArch64/llvm-ir-to-intrinsic.ll

Index: llvm/test/CodeGen/AArch64/llvm-ir-to-intrinsic.ll
===================================================================
--- /dev/null
+++ llvm/test/CodeGen/AArch64/llvm-ir-to-intrinsic.ll
@@ -0,0 +1,45 @@
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve < %s | FileCheck %s
+
+;
+; SDIV
+;
+
+define <vscale x 4 x i32> @sdiv_i32(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b) {
+; CHECK-LABEL: @sdiv_i32
+; CHECK-DAG: ptrue p0.s
+; CHECK-DAG: sdiv z0.s, p0/m, z0.s, z1.s
+; CHECK-NEXT: ret
+  %div = sdiv <vscale x 4 x i32> %a, %b
+  ret <vscale x 4 x i32> %div
+}
+
+define <vscale x 2 x i64> @sdiv_i64(<vscale x 2 x i64> %a, <vscale x 2 x i64> %b) {
+; CHECK-LABEL: @sdiv_i64
+; CHECK-DAG: ptrue p0.d
+; CHECK-DAG: sdiv z0.d, p0/m, z0.d, z1.d
+; CHECK-NEXT: ret
+  %div = sdiv <vscale x 2 x i64> %a, %b
+  ret <vscale x 2 x i64> %div
+}
+
+;
+; UDIV
+;
+
+define <vscale x 4 x i32> @udiv_i32(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b) {
+; CHECK-LABEL: @udiv_i32
+; CHECK-DAG: ptrue p0.s
+; CHECK-DAG: udiv z0.s, p0/m, z0.s, z1.s
+; CHECK-NEXT: ret
+  %div = udiv <vscale x 4 x i32> %a, %b
+  ret <vscale x 4 x i32> %div
+}
+
+define <vscale x 2 x i64> @udiv_i64(<vscale x 2 x i64> %a, <vscale x 2 x i64> %b) {
+; CHECK-LABEL: @udiv_i64
+; CHECK-DAG: ptrue p0.d
+; CHECK-DAG: udiv z0.d, p0/m, z0.d, z1.d
+; CHECK-NEXT: ret
+  %div = udiv <vscale x 2 x i64> %a, %b
+  ret <vscale x 2 x i64> %div
+}

Index: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
===================================================================
--- llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
+++ llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
@@ -145,6 +145,14 @@
 def AArch64lasta : SDNode<"AArch64ISD::LASTA", SDT_AArch64Reduce>;
 def AArch64lastb : SDNode<"AArch64ISD::LASTB", SDT_AArch64Reduce>;
 
+def SDT_AArch64DIV : SDTypeProfile<1, 3, [
+  SDTCisVec<0>, SDTCisVec<1>, SDTCisVec<2>, SDTCisVec<3>,
+  SDTCVecEltisVT<1,i1>, SDTCisSameAs<2,3>
+]>;
+
+def AArch64sdiv_pred : SDNode<"AArch64ISD::SDIV_PRED", SDT_AArch64DIV>;
+def AArch64udiv_pred : SDNode<"AArch64ISD::UDIV_PRED", SDT_AArch64DIV>;
+
 def SDT_AArch64ReduceWithInit : SDTypeProfile<1, 3, [SDTCisVec<1>, SDTCisVec<3>]>;
 def AArch64clasta_n : SDNode<"AArch64ISD::CLASTA_N", SDT_AArch64ReduceWithInit>;
 def AArch64clastb_n : SDNode<"AArch64ISD::CLASTB_N", SDT_AArch64ReduceWithInit>;
@@ -239,8 +247,8 @@
   def : Pat<(mul nxv2i64:$Op1, nxv2i64:$Op2),
             (MUL_ZPmZ_D (PTRUE_D 31), $Op1, $Op2)>;
 
-  defm SDIV_ZPmZ : sve_int_bin_pred_arit_2_div<0b100, "sdiv", int_aarch64_sve_sdiv>;
-  defm UDIV_ZPmZ : sve_int_bin_pred_arit_2_div<0b101, "udiv", int_aarch64_sve_udiv>;
+  defm SDIV_ZPmZ : sve_int_bin_pred_arit_2_div<0b100, "sdiv", AArch64sdiv_pred>;
+  defm UDIV_ZPmZ : sve_int_bin_pred_arit_2_div<0b101, "udiv", AArch64udiv_pred>;
   defm SDIVR_ZPmZ : sve_int_bin_pred_arit_2_div<0b110, "sdivr", int_aarch64_sve_sdivr>;
   defm UDIVR_ZPmZ : sve_int_bin_pred_arit_2_div<0b111, "udivr", int_aarch64_sve_udivr>;

Index: llvm/lib/Target/AArch64/AArch64ISelLowering.h
===================================================================
--- llvm/lib/Target/AArch64/AArch64ISelLowering.h
+++ llvm/lib/Target/AArch64/AArch64ISelLowering.h
@@ -52,6 +52,10 @@
   ADC,
   SBC, // adc, sbc instructions
 
+  // Arithmetic instructions
+  SDIV_PRED,
+  UDIV_PRED,
+
   // Arithmetic instructions which write flags.
   ADDS,
   SUBS,
@@ -781,6 +785,8 @@
   SDValue LowerVECTOR_SHUFFLE(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerSPLAT_VECTOR(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerDUPQLane(SDValue Op, SelectionDAG &DAG) const;
+  SDValue LowerDIV(SDValue Op, SelectionDAG &DAG,
+                   unsigned NewOp) const;
   SDValue LowerEXTRACT_SUBVECTOR(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerVectorSRA_SRL_SHL(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerShiftLeftParts(SDValue Op, SelectionDAG &DAG) const;

Index: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
===================================================================
--- llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -883,8 +883,11 @@
   // splat of 0 or undef) once vector selects supported in SVE codegen. See
   // D68877 for more details.
   for (MVT VT : MVT::integer_scalable_vector_valuetypes()) {
-    if (isTypeLegal(VT))
+    if (isTypeLegal(VT)) {
       setOperationAction(ISD::SPLAT_VECTOR, VT, Custom);
+      setOperationAction(ISD::SDIV, VT, Custom);
+      setOperationAction(ISD::UDIV, VT, Custom);
+    }
   }
 
   setOperationAction(ISD::INTRINSIC_WO_CHAIN, MVT::i8, Custom
[PATCH] D78569: [SVE][CodeGen] Lower SDIV & UDIV to SVE intrinsics
efriedma accepted this revision.
efriedma added a comment.
This revision is now accepted and ready to land.

LGTM

Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:11388
       return LowerSVEIntrinsicEXT(N, DAG);
+    case Intrinsic::aarch64_sve_sdiv:
+      return DAG.getNode(AArch64ISD::SDIV_PRED, SDLoc(N), N->getValueType(0),

Whitespace.

CHANGES SINCE LAST ACTION https://reviews.llvm.org/D78569/new/ https://reviews.llvm.org/D78569
[PATCH] D78569: [SVE][CodeGen] Lower SDIV & UDIV to SVE intrinsics
kmclaughlin updated this revision to Diff 259610. kmclaughlin added a comment.

- Removed changes to handle legalisation from this patch (this will be included in a follow-up)
- Added AArch64ISD nodes for SDIV_PRED & UDIV_PRED
- Changed LowerDIV to use the new ISD nodes rather than lowering to SVE intrinsics
- Updated tests to use CHECK-DAG

CHANGES SINCE LAST ACTION https://reviews.llvm.org/D78569/new/ https://reviews.llvm.org/D78569

Files:
  llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
  llvm/lib/Target/AArch64/AArch64ISelLowering.h
  llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
  llvm/test/CodeGen/AArch64/llvm-ir-to-intrinsic.ll

Index: llvm/test/CodeGen/AArch64/llvm-ir-to-intrinsic.ll
===================================================================
--- /dev/null
+++ llvm/test/CodeGen/AArch64/llvm-ir-to-intrinsic.ll
@@ -0,0 +1,45 @@
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve < %s | FileCheck %s
+
+;
+; SDIV
+;
+
+define <vscale x 4 x i32> @sdiv_i32(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b) {
+; CHECK-LABEL: @sdiv_i32
+; CHECK-DAG: ptrue p0.s
+; CHECK-DAG: sdiv z0.s, p0/m, z0.s, z1.s
+; CHECK-NEXT: ret
+  %div = sdiv <vscale x 4 x i32> %a, %b
+  ret <vscale x 4 x i32> %div
+}
+
+define <vscale x 2 x i64> @sdiv_i64(<vscale x 2 x i64> %a, <vscale x 2 x i64> %b) {
+; CHECK-LABEL: @sdiv_i64
+; CHECK-DAG: ptrue p0.d
+; CHECK-DAG: sdiv z0.d, p0/m, z0.d, z1.d
+; CHECK-NEXT: ret
+  %div = sdiv <vscale x 2 x i64> %a, %b
+  ret <vscale x 2 x i64> %div
+}
+
+;
+; UDIV
+;
+
+define <vscale x 4 x i32> @udiv_i32(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b) {
+; CHECK-LABEL: @udiv_i32
+; CHECK-DAG: ptrue p0.s
+; CHECK-DAG: udiv z0.s, p0/m, z0.s, z1.s
+; CHECK-NEXT: ret
+  %div = udiv <vscale x 4 x i32> %a, %b
+  ret <vscale x 4 x i32> %div
+}
+
+define <vscale x 2 x i64> @udiv_i64(<vscale x 2 x i64> %a, <vscale x 2 x i64> %b) {
+; CHECK-LABEL: @udiv_i64
+; CHECK-DAG: ptrue p0.d
+; CHECK-DAG: udiv z0.d, p0/m, z0.d, z1.d
+; CHECK-NEXT: ret
+  %div = udiv <vscale x 2 x i64> %a, %b
+  ret <vscale x 2 x i64> %div
+}

Index: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
===================================================================
--- llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
+++ llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
@@ -145,6 +145,14 @@
 def AArch64lasta : SDNode<"AArch64ISD::LASTA", SDT_AArch64Reduce>;
 def AArch64lastb : SDNode<"AArch64ISD::LASTB", SDT_AArch64Reduce>;
 
+def SDT_AArch64DIV : SDTypeProfile<1, 3, [
+  SDTCisVec<0>, SDTCisVec<1>, SDTCisVec<2>, SDTCisVec<3>,
+  SDTCVecEltisVT<1,i1>, SDTCisSameAs<2,3>
+]>;
+
+def AArch64sdiv_pred : SDNode<"AArch64ISD::SDIV_PRED", SDT_AArch64DIV>;
+def AArch64udiv_pred : SDNode<"AArch64ISD::UDIV_PRED", SDT_AArch64DIV>;
+
 def SDT_AArch64ReduceWithInit : SDTypeProfile<1, 3, [SDTCisVec<1>, SDTCisVec<3>]>;
 def AArch64clasta_n : SDNode<"AArch64ISD::CLASTA_N", SDT_AArch64ReduceWithInit>;
 def AArch64clastb_n : SDNode<"AArch64ISD::CLASTB_N", SDT_AArch64ReduceWithInit>;
@@ -239,8 +247,8 @@
   def : Pat<(mul nxv2i64:$Op1, nxv2i64:$Op2),
             (MUL_ZPmZ_D (PTRUE_D 31), $Op1, $Op2)>;
 
-  defm SDIV_ZPmZ : sve_int_bin_pred_arit_2_div<0b100, "sdiv", int_aarch64_sve_sdiv>;
-  defm UDIV_ZPmZ : sve_int_bin_pred_arit_2_div<0b101, "udiv", int_aarch64_sve_udiv>;
+  defm SDIV_ZPmZ : sve_int_bin_pred_arit_2_div<0b100, "sdiv", AArch64sdiv_pred>;
+  defm UDIV_ZPmZ : sve_int_bin_pred_arit_2_div<0b101, "udiv", AArch64udiv_pred>;
   defm SDIVR_ZPmZ : sve_int_bin_pred_arit_2_div<0b110, "sdivr", int_aarch64_sve_sdivr>;
   defm UDIVR_ZPmZ : sve_int_bin_pred_arit_2_div<0b111, "udivr", int_aarch64_sve_udivr>;

Index: llvm/lib/Target/AArch64/AArch64ISelLowering.h
===================================================================
--- llvm/lib/Target/AArch64/AArch64ISelLowering.h
+++ llvm/lib/Target/AArch64/AArch64ISelLowering.h
@@ -52,6 +52,10 @@
   ADC,
   SBC, // adc, sbc instructions
 
+  // Arithmetic instructions
+  SDIV_PRED,
+  UDIV_PRED,
+
   // Arithmetic instructions which write flags.
   ADDS,
   SUBS,
@@ -781,6 +785,8 @@
   SDValue LowerVECTOR_SHUFFLE(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerSPLAT_VECTOR(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerDUPQLane(SDValue Op, SelectionDAG &DAG) const;
+  SDValue LowerDIV(SDValue Op, SelectionDAG &DAG,
+                   unsigned NewOp) const;
   SDValue LowerEXTRACT_SUBVECTOR(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerVectorSRA_SRL_SHL(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerShiftLeftParts(SDValue Op, SelectionDAG &DAG) const;

Index: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
===================================================================
--- llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -883,8 +883,11 @@
   // splat of 0 or undef) once vector selects supported in SVE codegen. See
   // D68877 for more details.
   for (MVT VT : MVT::integer_scalable_vector_valuetypes()) {
-    if (isTypeLegal(VT))
+    if (isTypeLegal(VT)) {
       setOperationAction(ISD::SPLAT_VECTOR, VT, Custom);
+      setOperationAction(ISD::SDIV, VT, Custom);
+      setOperationAction(ISD::UDIV, VT, Custom);
+    }
   }
 
   setOperationAction(ISD::INTRINSIC_WO_CHAIN, MVT::i8, Custom);
   se
[PATCH] D78569: [SVE][CodeGen] Lower SDIV & UDIV to SVE intrinsics
sdesmalen added inline comments.

Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:7670
+                         Mask, Op.getOperand(0), Op.getOperand(1));
+}
+

efriedma wrote:
> sdesmalen wrote:
> > efriedma wrote:
> > > If we're going to support these operations, we might as well just add
> > > isel patterns; that's what we've been doing for other arithmetic
> > > operations.
> > Just to provide a bit of context to this approach:
> >
> > For unpredicated ISD nodes for which there is no predicated instruction,
> > the predicate needs to be generated. For scalable vectors this will be a
> > `ptrue all`, but for fixed-width vectors it may take some other predicate,
> > such as VL8 for a fixed `8` elements.
> >
> > Rather than creating new predicated AArch64 ISD nodes for each operation,
> > such as `AArch64ISD::UDIV_PRED`, the idea is to reuse the intrinsic layer
> > we already added to support the ACLE - which is predicated and for which
> > we already have the patterns - and map directly onto those.
> >
> > By doing the expansion in ISelLowering, the patterns stay simple, and we
> > can generalise the `getPtrue` method so that it generates the right
> > predicate for any scalable/fixed vector size, as done in D71760, avoiding
> > the need to write multiple patterns for different vector lengths.
> >
> > This patch was meant as the proof of concept of that idea (as discussed
> > in the sync-up call of Apr 2).
> Using INTRINSIC_WO_CHAIN is a little annoying; it's hard to read in DAG
> dumps, and it gives weird error messages if we fail in selection. But there
> aren't really any other immediate downsides I can think of, vs. doing it
> the other way (converting the intrinsic to AArch64ISD::UDIV_PRED).
>
> Long-term, we're going to have a target-independent ISD::UDIV_PRED. We
> probably want to start using those nodes at some point, to get
> target-independent optimizations. Not sure if that impacts what we want to
> do right now.

I agree that using INTRINSIC_WO_CHAIN will be a bit annoying for more complicated patterns. The reuse of the intrinsics was merely for convenience, because we already have the patterns, and was not a critical part of the design. It shouldn't be a big effort to create AArch64ISD nodes and use these for the intrinsics as well. If we use AArch64-specific nodes, we can implement what's needed now for SVE codegen, and I expect we can easily migrate to target-independent nodes when they get added.

Repository: rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION https://reviews.llvm.org/D78569/new/ https://reviews.llvm.org/D78569
[PATCH] D78569: [SVE][CodeGen] Lower SDIV & UDIV to SVE intrinsics
efriedma added inline comments.

Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:7670
+                         Mask, Op.getOperand(0), Op.getOperand(1));
+}
+

sdesmalen wrote:
> efriedma wrote:
> > If we're going to support these operations, we might as well just add
> > isel patterns; that's what we've been doing for other arithmetic
> > operations.
> Just to provide a bit of context to this approach:
>
> For unpredicated ISD nodes for which there is no predicated instruction,
> the predicate needs to be generated. For scalable vectors this will be a
> `ptrue all`, but for fixed-width vectors it may take some other predicate,
> such as VL8 for a fixed `8` elements.
>
> Rather than creating new predicated AArch64 ISD nodes for each operation,
> such as `AArch64ISD::UDIV_PRED`, the idea is to reuse the intrinsic layer
> we already added to support the ACLE - which is predicated and for which
> we already have the patterns - and map directly onto those.
>
> By doing the expansion in ISelLowering, the patterns stay simple, and we
> can generalise the `getPtrue` method so that it generates the right
> predicate for any scalable/fixed vector size, as done in D71760, avoiding
> the need to write multiple patterns for different vector lengths.
>
> This patch was meant as the proof of concept of that idea (as discussed in
> the sync-up call of Apr 2).

Using INTRINSIC_WO_CHAIN is a little annoying; it's hard to read in DAG dumps, and it gives weird error messages if we fail in selection. But there aren't really any other immediate downsides I can think of, vs. doing it the other way (converting the intrinsic to AArch64ISD::UDIV_PRED).

Long-term, we're going to have a target-independent ISD::UDIV_PRED. We probably want to start using those nodes at some point, to get target-independent optimizations. Not sure if that impacts what we want to do right now.

Repository: rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION https://reviews.llvm.org/D78569/new/ https://reviews.llvm.org/D78569
[PATCH] D78569: [SVE][CodeGen] Lower SDIV & UDIV to SVE intrinsics
sdesmalen added inline comments.

Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:7670
+                         Mask, Op.getOperand(0), Op.getOperand(1));
+}
+

efriedma wrote:
> If we're going to support these operations, we might as well just add isel
> patterns; that's what we've been doing for other arithmetic operations.

Just to provide a bit of context to this approach:

For unpredicated ISD nodes for which there is no predicated instruction, the predicate needs to be generated. For scalable vectors this will be a `ptrue all`, but for fixed-width vectors it may take some other predicate, such as VL8 for a fixed `8` elements.

Rather than creating new predicated AArch64 ISD nodes for each operation, such as `AArch64ISD::UDIV_PRED`, the idea is to reuse the intrinsic layer we already added to support the ACLE - which is predicated and for which we already have the patterns - and map directly onto those.

By doing the expansion in ISelLowering, the patterns stay simple, and we can generalise the `getPtrue` method so that it generates the right predicate for any scalable/fixed vector size, as done in D71760, avoiding the need to write multiple patterns for different vector lengths.

This patch was meant as the proof of concept of that idea (as discussed in the sync-up call of Apr 2).

Comment at: llvm/test/CodeGen/AArch64/llvm-ir-to-intrinsic.ll:28
+; CHECK: ptrue p0.s
+; CHECK-NEXT: sdiv z0.s, p0/m, z0.s, z2.s
+; CHECK-NEXT: sdiv z1.s, p0/m, z1.s, z3.s

This test should use CHECK-DAG instead of CHECK-NEXT, as the sdiv instructions are independent. (Same for some of the other tests.)

Repository: rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION https://reviews.llvm.org/D78569/new/ https://reviews.llvm.org/D78569
[PATCH] D78569: [SVE][CodeGen] Lower SDIV & UDIV to SVE intrinsics
efriedma added a comment.

I'd prefer to handle legalization in a separate patch from handling legal sdiv/udiv operations, so we actually have some context to discuss the legalization strategy.

Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:7670
+                         Mask, Op.getOperand(0), Op.getOperand(1));
+}
+

If we're going to support these operations, we might as well just add isel patterns; that's what we've been doing for other arithmetic operations.

Repository: rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION https://reviews.llvm.org/D78569/new/ https://reviews.llvm.org/D78569
[PATCH] D78569: [SVE][CodeGen] Lower SDIV & UDIV to SVE intrinsics
kmclaughlin created this revision.
kmclaughlin added reviewers: sdesmalen, c-rhodes, efriedma, cameron.mcinally.
Herald added subscribers: psnobl, rkruppe, hiraditya, kristof.beyls, tschuett.
Herald added a reviewer: rengolin.
Herald added a project: LLVM.

This patch maps IR operations for sdiv & udiv to the @llvm.aarch64.sve.[s|u]div intrinsics. A ptrue must be created during lowering, as the div instructions have only a predicated form.

Patch contains changes by Andrzej Warzynski.

Repository: rG LLVM Github Monorepo

https://reviews.llvm.org/D78569

Files:
  llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
  llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
  llvm/lib/CodeGen/TargetLoweringBase.cpp
  llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
  llvm/lib/Target/AArch64/AArch64ISelLowering.h
  llvm/test/CodeGen/AArch64/llvm-ir-to-intrinsic.ll

Index: llvm/test/CodeGen/AArch64/llvm-ir-to-intrinsic.ll
===================================================================
--- /dev/null
+++ llvm/test/CodeGen/AArch64/llvm-ir-to-intrinsic.ll
@@ -0,0 +1,87 @@
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve < %s | FileCheck %s
+
+;
+; SDIV
+;
+
+define <vscale x 4 x i32> @sdiv_i32(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b) {
+; CHECK-LABEL: @sdiv_i32
+; CHECK: ptrue p0.s
+; CHECK-NEXT: sdiv z0.s, p0/m, z0.s, z1.s
+; CHECK-NEXT: ret
+  %div = sdiv <vscale x 4 x i32> %a, %b
+  ret <vscale x 4 x i32> %div
+}
+
+define <vscale x 2 x i64> @sdiv_i64(<vscale x 2 x i64> %a, <vscale x 2 x i64> %b) {
+; CHECK-LABEL: @sdiv_i64
+; CHECK: ptrue p0.d
+; CHECK-NEXT: sdiv z0.d, p0/m, z0.d, z1.d
+; CHECK-NEXT: ret
+  %div = sdiv <vscale x 2 x i64> %a, %b
+  ret <vscale x 2 x i64> %div
+}
+
+define <vscale x 8 x i32> @sdiv_narrow(<vscale x 8 x i32> %a, <vscale x 8 x i32> %b) {
+; CHECK-LABEL: @sdiv_narrow
+; CHECK: ptrue p0.s
+; CHECK-NEXT: sdiv z0.s, p0/m, z0.s, z2.s
+; CHECK-NEXT: sdiv z1.s, p0/m, z1.s, z3.s
+; CHECK-NEXT: ret
+  %div = sdiv <vscale x 8 x i32> %a, %b
+  ret <vscale x 8 x i32> %div
+}
+
+define <vscale x 2 x i32> @sdiv_widen(<vscale x 2 x i32> %a, <vscale x 2 x i32> %b) {
+; CHECK-LABEL: @sdiv_widen
+; CHECK: ptrue p0.d
+; CHECK-NEXT: sxtw z1.d, p0/m, z1.d
+; CHECK-NEXT: sxtw z0.d, p0/m, z0.d
+; CHECK-NEXT: sdiv z0.d, p0/m, z0.d, z1.d
+; CHECK-NEXT: ret
+  %div = sdiv <vscale x 2 x i32> %a, %b
+  ret <vscale x 2 x i32> %div
+}
+
+;
+; UDIV
+;
+
+define <vscale x 4 x i32> @udiv_i32(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b) {
+; CHECK-LABEL: @udiv_i32
+; CHECK: ptrue p0.s
+; CHECK-NEXT: udiv z0.s, p0/m, z0.s, z1.s
+; CHECK-NEXT: ret
+  %div = udiv <vscale x 4 x i32> %a, %b
+  ret <vscale x 4 x i32> %div
+}
+
+define <vscale x 2 x i64> @udiv_i64(<vscale x 2 x i64> %a, <vscale x 2 x i64> %b) {
+; CHECK-LABEL: @udiv_i64
+; CHECK: ptrue p0.d
+; CHECK-NEXT: udiv z0.d, p0/m, z0.d, z1.d
+; CHECK-NEXT: ret
+  %div = udiv <vscale x 2 x i64> %a, %b
+  ret <vscale x 2 x i64> %div
+}
+
+define <vscale x 8 x i32> @udiv_narrow(<vscale x 8 x i32> %a, <vscale x 8 x i32> %b) {
+; CHECK-LABEL: @udiv_narrow
+; CHECK: ptrue p0.s
+; CHECK-NEXT: udiv z0.s, p0/m, z0.s, z2.s
+; CHECK-NEXT: udiv z1.s, p0/m, z1.s, z3.s
+; CHECK-NEXT: ret
+  %div = udiv <vscale x 8 x i32> %a, %b
+  ret <vscale x 8 x i32> %div
+}
+
+define <vscale x 2 x i32> @udiv_widen(<vscale x 2 x i32> %a, <vscale x 2 x i32> %b) {
+; CHECK-LABEL: @udiv_widen
+; CHECK: ptrue p0.d
+; CHECK-NEXT: and z1.d, z1.d, #0xffffffff
+; CHECK-NEXT: and z0.d, z0.d, #0xffffffff
+; CHECK-NEXT: udiv z0.d, p0/m, z0.d, z1.d
+; CHECK-NEXT: ret
+  %div = udiv <vscale x 2 x i32> %a, %b
+  ret <vscale x 2 x i32> %div
+}

Index: llvm/lib/Target/AArch64/AArch64ISelLowering.h
===================================================================
--- llvm/lib/Target/AArch64/AArch64ISelLowering.h
+++ llvm/lib/Target/AArch64/AArch64ISelLowering.h
@@ -776,6 +776,8 @@
   SDValue LowerVECTOR_SHUFFLE(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerSPLAT_VECTOR(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerDUPQLane(SDValue Op, SelectionDAG &DAG) const;
+  SDValue LowerDIV(SDValue Op, SelectionDAG &DAG,
+                   unsigned IntrID) const;
   SDValue LowerEXTRACT_SUBVECTOR(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerVectorSRA_SRL_SHL(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerShiftLeftParts(SDValue Op, SelectionDAG &DAG) const;

Index: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
===================================================================
--- llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -883,8 +883,11 @@
   // splat of 0 or undef) once vector selects supported in SVE codegen. See
   // D68877 for more details.
   for (MVT VT : MVT::integer_scalable_vector_valuetypes()) {
-    if (isTypeLegal(VT))
+    if (isTypeLegal(VT)) {
       setOperationAction(ISD::SPLAT_VECTOR, VT, Custom);
+      setOperationAction(ISD::SDIV, VT, Custom);
+      setOperationAction(ISD::UDIV, VT, Custom);
+    }
   }
 
   setOperationAction(ISD::INTRINSIC_WO_CHAIN, MVT::i8, Custom);
   setOperationAction(ISD::INTRINSIC_WO_CHAIN, MVT::i16, Custom);
@@ -3337,6 +3340,10 @@
     return LowerSPLAT_VECTOR(Op, DAG);
   case ISD::EXTRACT_SUBVECTOR:
     return LowerEXTRACT_SUBVECTOR(Op, DAG);
+  case ISD::SDIV:
+    return LowerDIV(Op, DAG, Intrinsic::aarch64_sve_sdiv);
+  case ISD::UDIV:
+    return LowerDIV(Op, DAG, Intrinsic::aarch64_sve_udiv);
   case ISD::SRA:
   case ISD::SRL:
   case ISD::SHL:
@@ -7643,6 +7650,25 @@
   return DAG.getNode(ISD::BITCAST, DL, VT, TBL);
 }
 
+SDValue AArch64TargetLowering::LowerDIV(SDValue Op,
+                                        SelectionDAG &DAG,
+