[PATCH] D78569: [SVE][CodeGen] Lower SDIV & UDIV to SVE intrinsics
kmclaughlin added a comment.

Thank you both for your comments on this patch, @efriedma & @sdesmalen!

Repository: rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION https://reviews.llvm.org/D78569/new/ https://reviews.llvm.org/D78569

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D78569: [SVE][CodeGen] Lower SDIV & UDIV to SVE intrinsics
This revision was automatically updated to reflect the committed changes. kmclaughlin marked an inline comment as done.

Closed by commit rG53dd72a87aeb: [SVE][CodeGen] Lower SDIV & UDIV to SVE intrinsics (authored by kmclaughlin).

Changed prior to commit: https://reviews.llvm.org/D78569?vs=259610&id=259852#toc

Repository: rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION https://reviews.llvm.org/D78569/new/ https://reviews.llvm.org/D78569

Files:
  llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
  llvm/lib/Target/AArch64/AArch64ISelLowering.h
  llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
  llvm/test/CodeGen/AArch64/llvm-ir-to-intrinsic.ll

Index: llvm/test/CodeGen/AArch64/llvm-ir-to-intrinsic.ll
===================================================================
--- /dev/null
+++ llvm/test/CodeGen/AArch64/llvm-ir-to-intrinsic.ll
@@ -0,0 +1,45 @@
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve < %s | FileCheck %s
+
+;
+; SDIV
+;
+
+define <vscale x 4 x i32> @sdiv_i32(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b) {
+; CHECK-LABEL: @sdiv_i32
+; CHECK-DAG: ptrue p0.s
+; CHECK-DAG: sdiv z0.s, p0/m, z0.s, z1.s
+; CHECK-NEXT: ret
+  %div = sdiv <vscale x 4 x i32> %a, %b
+  ret <vscale x 4 x i32> %div
+}
+
+define <vscale x 2 x i64> @sdiv_i64(<vscale x 2 x i64> %a, <vscale x 2 x i64> %b) {
+; CHECK-LABEL: @sdiv_i64
+; CHECK-DAG: ptrue p0.d
+; CHECK-DAG: sdiv z0.d, p0/m, z0.d, z1.d
+; CHECK-NEXT: ret
+  %div = sdiv <vscale x 2 x i64> %a, %b
+  ret <vscale x 2 x i64> %div
+}
+
+;
+; UDIV
+;
+
+define <vscale x 4 x i32> @udiv_i32(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b) {
+; CHECK-LABEL: @udiv_i32
+; CHECK-DAG: ptrue p0.s
+; CHECK-DAG: udiv z0.s, p0/m, z0.s, z1.s
+; CHECK-NEXT: ret
+  %div = udiv <vscale x 4 x i32> %a, %b
+  ret <vscale x 4 x i32> %div
+}
+
+define <vscale x 2 x i64> @udiv_i64(<vscale x 2 x i64> %a, <vscale x 2 x i64> %b) {
+; CHECK-LABEL: @udiv_i64
+; CHECK-DAG: ptrue p0.d
+; CHECK-DAG: udiv z0.d, p0/m, z0.d, z1.d
+; CHECK-NEXT: ret
+  %div = udiv <vscale x 2 x i64> %a, %b
+  ret <vscale x 2 x i64> %div
+}

Index: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
===================================================================
--- llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
+++ llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
@@ -145,6 +145,14 @@
 def AArch64lasta : SDNode<"AArch64ISD::LASTA", SDT_AArch64Reduce>;
 def AArch64lastb : SDNode<"AArch64ISD::LASTB", SDT_AArch64Reduce>;
 
+def SDT_AArch64DIV : SDTypeProfile<1, 3, [
+  SDTCisVec<0>, SDTCisVec<1>, SDTCisVec<2>, SDTCisVec<3>,
+  SDTCVecEltisVT<1,i1>, SDTCisSameAs<2,3>
+]>;
+
+def AArch64sdiv_pred : SDNode<"AArch64ISD::SDIV_PRED", SDT_AArch64DIV>;
+def AArch64udiv_pred : SDNode<"AArch64ISD::UDIV_PRED", SDT_AArch64DIV>;
+
 def SDT_AArch64ReduceWithInit : SDTypeProfile<1, 3, [SDTCisVec<1>, SDTCisVec<3>]>;
 def AArch64clasta_n : SDNode<"AArch64ISD::CLASTA_N", SDT_AArch64ReduceWithInit>;
 def AArch64clastb_n : SDNode<"AArch64ISD::CLASTB_N", SDT_AArch64ReduceWithInit>;
@@ -239,8 +247,8 @@
   def : Pat<(mul nxv2i64:$Op1, nxv2i64:$Op2),
             (MUL_ZPmZ_D (PTRUE_D 31), $Op1, $Op2)>;
 
-  defm SDIV_ZPmZ : sve_int_bin_pred_arit_2_div<0b100, "sdiv", int_aarch64_sve_sdiv>;
-  defm UDIV_ZPmZ : sve_int_bin_pred_arit_2_div<0b101, "udiv", int_aarch64_sve_udiv>;
+  defm SDIV_ZPmZ : sve_int_bin_pred_arit_2_div<0b100, "sdiv", AArch64sdiv_pred>;
+  defm UDIV_ZPmZ : sve_int_bin_pred_arit_2_div<0b101, "udiv", AArch64udiv_pred>;
   defm SDIVR_ZPmZ : sve_int_bin_pred_arit_2_div<0b110, "sdivr", int_aarch64_sve_sdivr>;
   defm UDIVR_ZPmZ : sve_int_bin_pred_arit_2_div<0b111, "udivr", int_aarch64_sve_udivr>;

Index: llvm/lib/Target/AArch64/AArch64ISelLowering.h
===================================================================
--- llvm/lib/Target/AArch64/AArch64ISelLowering.h
+++ llvm/lib/Target/AArch64/AArch64ISelLowering.h
@@ -52,6 +52,10 @@
   ADC,
   SBC, // adc, sbc instructions
 
+  // Arithmetic instructions
+  SDIV_PRED,
+  UDIV_PRED,
+
   // Arithmetic instructions which write flags.
   ADDS,
   SUBS,
@@ -781,6 +785,8 @@
   SDValue LowerVECTOR_SHUFFLE(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerSPLAT_VECTOR(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerDUPQLane(SDValue Op, SelectionDAG &DAG) const;
+  SDValue LowerDIV(SDValue Op, SelectionDAG &DAG,
+                   unsigned NewOp) const;
   SDValue LowerEXTRACT_SUBVECTOR(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerVectorSRA_SRL_SHL(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerShiftLeftParts(SDValue Op, SelectionDAG &DAG) const;

Index: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
===================================================================
--- llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -883,8 +883,11 @@
   // splat of 0 or undef) once vector selects supported in SVE codegen. See
   // D68877 for more details.
   for (MVT VT : MVT::integer_scalable_vector_valuetypes()) {
-    if (isTypeLegal(VT))
+    if (isTypeLegal(VT)) {
       setOperationAction(ISD::SPLAT_VECTOR, VT, Custom);
+      setOperationAction(ISD::SDIV, VT, Custom);
+      setOperationAction(ISD::UDIV, VT, Custom);
+    }
   }
 
   setOperationAction(ISD::INTRINSIC_WO_CHAIN, MVT::i8, Custom
[PATCH] D78569: [SVE][CodeGen] Lower SDIV & UDIV to SVE intrinsics
efriedma accepted this revision.
efriedma added a comment.
This revision is now accepted and ready to land.

LGTM

Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:11388
       return LowerSVEIntrinsicEXT(N, DAG);
+    case Intrinsic::aarch64_sve_sdiv:
+      return DAG.getNode(AArch64ISD::SDIV_PRED, SDLoc(N), N->getValueType(0),

Whitespace.

CHANGES SINCE LAST ACTION https://reviews.llvm.org/D78569/new/ https://reviews.llvm.org/D78569
[PATCH] D78569: [SVE][CodeGen] Lower SDIV & UDIV to SVE intrinsics
kmclaughlin updated this revision to Diff 259610. kmclaughlin added a comment.

- Removed changes to handle legalisation from this patch (this will be included in a follow-up)
- Added AArch64ISD nodes for SDIV_PRED & UDIV_PRED
- Changed LowerDIV to use the new ISD nodes rather than lowering to SVE intrinsics
- Updated tests to use CHECK-DAG

CHANGES SINCE LAST ACTION https://reviews.llvm.org/D78569/new/ https://reviews.llvm.org/D78569

Files:
  llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
  llvm/lib/Target/AArch64/AArch64ISelLowering.h
  llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
  llvm/test/CodeGen/AArch64/llvm-ir-to-intrinsic.ll

Index: llvm/test/CodeGen/AArch64/llvm-ir-to-intrinsic.ll
===================================================================
--- /dev/null
+++ llvm/test/CodeGen/AArch64/llvm-ir-to-intrinsic.ll
@@ -0,0 +1,45 @@
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve < %s | FileCheck %s
+
+;
+; SDIV
+;
+
+define <vscale x 4 x i32> @sdiv_i32(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b) {
+; CHECK-LABEL: @sdiv_i32
+; CHECK-DAG: ptrue p0.s
+; CHECK-DAG: sdiv z0.s, p0/m, z0.s, z1.s
+; CHECK-NEXT: ret
+  %div = sdiv <vscale x 4 x i32> %a, %b
+  ret <vscale x 4 x i32> %div
+}
+
+define <vscale x 2 x i64> @sdiv_i64(<vscale x 2 x i64> %a, <vscale x 2 x i64> %b) {
+; CHECK-LABEL: @sdiv_i64
+; CHECK-DAG: ptrue p0.d
+; CHECK-DAG: sdiv z0.d, p0/m, z0.d, z1.d
+; CHECK-NEXT: ret
+  %div = sdiv <vscale x 2 x i64> %a, %b
+  ret <vscale x 2 x i64> %div
+}
+
+;
+; UDIV
+;
+
+define <vscale x 4 x i32> @udiv_i32(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b) {
+; CHECK-LABEL: @udiv_i32
+; CHECK-DAG: ptrue p0.s
+; CHECK-DAG: udiv z0.s, p0/m, z0.s, z1.s
+; CHECK-NEXT: ret
+  %div = udiv <vscale x 4 x i32> %a, %b
+  ret <vscale x 4 x i32> %div
+}
+
+define <vscale x 2 x i64> @udiv_i64(<vscale x 2 x i64> %a, <vscale x 2 x i64> %b) {
+; CHECK-LABEL: @udiv_i64
+; CHECK-DAG: ptrue p0.d
+; CHECK-DAG: udiv z0.d, p0/m, z0.d, z1.d
+; CHECK-NEXT: ret
+  %div = udiv <vscale x 2 x i64> %a, %b
+  ret <vscale x 2 x i64> %div
+}

Index: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
===================================================================
--- llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
+++ llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
@@ -145,6 +145,14 @@
 def AArch64lasta : SDNode<"AArch64ISD::LASTA", SDT_AArch64Reduce>;
 def AArch64lastb : SDNode<"AArch64ISD::LASTB", SDT_AArch64Reduce>;
 
+def SDT_AArch64DIV : SDTypeProfile<1, 3, [
+  SDTCisVec<0>, SDTCisVec<1>, SDTCisVec<2>, SDTCisVec<3>,
+  SDTCVecEltisVT<1,i1>, SDTCisSameAs<2,3>
+]>;
+
+def AArch64sdiv_pred : SDNode<"AArch64ISD::SDIV_PRED", SDT_AArch64DIV>;
+def AArch64udiv_pred : SDNode<"AArch64ISD::UDIV_PRED", SDT_AArch64DIV>;
+
 def SDT_AArch64ReduceWithInit : SDTypeProfile<1, 3, [SDTCisVec<1>, SDTCisVec<3>]>;
 def AArch64clasta_n : SDNode<"AArch64ISD::CLASTA_N", SDT_AArch64ReduceWithInit>;
 def AArch64clastb_n : SDNode<"AArch64ISD::CLASTB_N", SDT_AArch64ReduceWithInit>;
@@ -239,8 +247,8 @@
   def : Pat<(mul nxv2i64:$Op1, nxv2i64:$Op2),
             (MUL_ZPmZ_D (PTRUE_D 31), $Op1, $Op2)>;
 
-  defm SDIV_ZPmZ : sve_int_bin_pred_arit_2_div<0b100, "sdiv", int_aarch64_sve_sdiv>;
-  defm UDIV_ZPmZ : sve_int_bin_pred_arit_2_div<0b101, "udiv", int_aarch64_sve_udiv>;
+  defm SDIV_ZPmZ : sve_int_bin_pred_arit_2_div<0b100, "sdiv", AArch64sdiv_pred>;
+  defm UDIV_ZPmZ : sve_int_bin_pred_arit_2_div<0b101, "udiv", AArch64udiv_pred>;
   defm SDIVR_ZPmZ : sve_int_bin_pred_arit_2_div<0b110, "sdivr", int_aarch64_sve_sdivr>;
   defm UDIVR_ZPmZ : sve_int_bin_pred_arit_2_div<0b111, "udivr", int_aarch64_sve_udivr>;

Index: llvm/lib/Target/AArch64/AArch64ISelLowering.h
===================================================================
--- llvm/lib/Target/AArch64/AArch64ISelLowering.h
+++ llvm/lib/Target/AArch64/AArch64ISelLowering.h
@@ -52,6 +52,10 @@
   ADC,
   SBC, // adc, sbc instructions
 
+  // Arithmetic instructions
+  SDIV_PRED,
+  UDIV_PRED,
+
   // Arithmetic instructions which write flags.
   ADDS,
   SUBS,
@@ -781,6 +785,8 @@
   SDValue LowerVECTOR_SHUFFLE(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerSPLAT_VECTOR(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerDUPQLane(SDValue Op, SelectionDAG &DAG) const;
+  SDValue LowerDIV(SDValue Op, SelectionDAG &DAG,
+                   unsigned NewOp) const;
   SDValue LowerEXTRACT_SUBVECTOR(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerVectorSRA_SRL_SHL(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerShiftLeftParts(SDValue Op, SelectionDAG &DAG) const;

Index: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
===================================================================
--- llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -883,8 +883,11 @@
   // splat of 0 or undef) once vector selects supported in SVE codegen. See
   // D68877 for more details.
   for (MVT VT : MVT::integer_scalable_vector_valuetypes()) {
-    if (isTypeLegal(VT))
+    if (isTypeLegal(VT)) {
       setOperationAction(ISD::SPLAT_VECTOR, VT, Custom);
+      setOperationAction(ISD::SDIV, VT, Custom);
+      setOperationAction(ISD::UDIV, VT, Custom);
+    }
   }
 
   setOperationAction(ISD::INTRINSIC_WO_CHAIN, MVT::i8, Custom);
   se
[PATCH] D78569: [SVE][CodeGen] Lower SDIV & UDIV to SVE intrinsics
sdesmalen added inline comments.

Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:7670
+                         Mask, Op.getOperand(0), Op.getOperand(1));
+}
+

efriedma wrote:
> sdesmalen wrote:
> > efriedma wrote:
> > > If we're going to support these operations, we might as well just add
> > > isel patterns; that's what we've been doing for other arithmetic
> > > operations.
> > Just to provide a bit of context to this approach:
> >
> > For unpredicated ISD nodes for which there is no predicated instruction,
> > the predicate needs to be generated. For scalable vectors this will be a
> > `ptrue all`, but for fixed-width vectors it may take some other predicate,
> > such as VL8 for a fixed `8` elements.
> >
> > Rather than creating new predicated AArch64 ISD nodes for each operation,
> > such as `AArch64ISD::UDIV_PRED`, the idea is to reuse the intrinsic layer
> > we already added to support the ACLE - which is predicated and for which
> > we already have the patterns - and map directly onto those.
> >
> > By doing the expansion in ISelLowering, the patterns stay simple, and we
> > can generalise the `getPtrue` method so that it generates the right
> > predicate for any scalable/fixed vector size, as done in D71760, avoiding
> > the need to write multiple patterns for different vector lengths.
> >
> > This patch was meant as the proof of concept of that idea (as discussed
> > in the sync-up call of Apr 2).
> Using INTRINSIC_WO_CHAIN is a little annoying; it's hard to read in DAG
> dumps, and it gives weird error messages if we fail in selection. But there
> aren't really any other immediate downsides I can think of, vs. doing it
> the other way (converting the intrinsic to AArch64ISD::UDIV_PRED).
>
> Long-term, we're going to have a target-independent ISD::UDIV_PRED. We
> probably want to start using those nodes at some point, to get
> target-independent optimizations. Not sure if that impacts what we want to
> do right now.

I agree that using INTRINSIC_WO_CHAIN will be a bit annoying for more complicated patterns. The reuse of the intrinsics was merely for convenience, because we already have the patterns, and was not a critical part of the design. It shouldn't be a big effort to create AArch64ISD nodes and use these for the intrinsics as well. If we use AArch64-specific nodes, we can implement what's needed now for SVE codegen, and I expect we can easily migrate to target-independent nodes when they get added.

Repository: rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION https://reviews.llvm.org/D78569/new/ https://reviews.llvm.org/D78569
[PATCH] D78569: [SVE][CodeGen] Lower SDIV & UDIV to SVE intrinsics
efriedma added inline comments.

Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:7670
+                         Mask, Op.getOperand(0), Op.getOperand(1));
+}
+

sdesmalen wrote:
> efriedma wrote:
> > If we're going to support these operations, we might as well just add
> > isel patterns; that's what we've been doing for other arithmetic
> > operations.
> Just to provide a bit of context to this approach:
>
> For unpredicated ISD nodes for which there is no predicated instruction,
> the predicate needs to be generated. For scalable vectors this will be a
> `ptrue all`, but for fixed-width vectors it may take some other predicate,
> such as VL8 for a fixed `8` elements.
>
> Rather than creating new predicated AArch64 ISD nodes for each operation,
> such as `AArch64ISD::UDIV_PRED`, the idea is to reuse the intrinsic layer
> we already added to support the ACLE - which is predicated and for which
> we already have the patterns - and map directly onto those.
>
> By doing the expansion in ISelLowering, the patterns stay simple, and we
> can generalise the `getPtrue` method so that it generates the right
> predicate for any scalable/fixed vector size, as done in D71760, avoiding
> the need to write multiple patterns for different vector lengths.
>
> This patch was meant as the proof of concept of that idea (as discussed in
> the sync-up call of Apr 2).

Using INTRINSIC_WO_CHAIN is a little annoying; it's hard to read in DAG dumps, and it gives weird error messages if we fail in selection. But there aren't really any other immediate downsides I can think of, vs. doing it the other way (converting the intrinsic to AArch64ISD::UDIV_PRED).

Long-term, we're going to have a target-independent ISD::UDIV_PRED. We probably want to start using those nodes at some point, to get target-independent optimizations. Not sure if that impacts what we want to do right now.

Repository: rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION https://reviews.llvm.org/D78569/new/ https://reviews.llvm.org/D78569
[PATCH] D78569: [SVE][CodeGen] Lower SDIV & UDIV to SVE intrinsics
sdesmalen added inline comments.

Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:7670
+                         Mask, Op.getOperand(0), Op.getOperand(1));
+}
+

efriedma wrote:
> If we're going to support these operations, we might as well just add isel
> patterns; that's what we've been doing for other arithmetic operations.

Just to provide a bit of context to this approach:

For unpredicated ISD nodes for which there is no predicated instruction, the predicate needs to be generated. For scalable vectors this will be a `ptrue all`, but for fixed-width vectors it may take some other predicate, such as VL8 for a fixed `8` elements.

Rather than creating new predicated AArch64 ISD nodes for each operation, such as `AArch64ISD::UDIV_PRED`, the idea is to reuse the intrinsic layer we already added to support the ACLE - which is predicated and for which we already have the patterns - and map directly onto those.

By doing the expansion in ISelLowering, the patterns stay simple, and we can generalise the `getPtrue` method so that it generates the right predicate for any scalable/fixed vector size, as done in D71760, avoiding the need to write multiple patterns for different vector lengths.

This patch was meant as the proof of concept of that idea (as discussed in the sync-up call of Apr 2).

Comment at: llvm/test/CodeGen/AArch64/llvm-ir-to-intrinsic.ll:28
+; CHECK: ptrue p0.s
+; CHECK-NEXT: sdiv z0.s, p0/m, z0.s, z2.s
+; CHECK-NEXT: sdiv z1.s, p0/m, z1.s, z3.s

This test should use CHECK-DAG instead of CHECK-NEXT, as the sdiv instructions are independent. (Same for some of the other tests.)

Repository: rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION https://reviews.llvm.org/D78569/new/ https://reviews.llvm.org/D78569
[PATCH] D78569: [SVE][CodeGen] Lower SDIV & UDIV to SVE intrinsics
efriedma added a comment.

I'd prefer to handle legalization in a separate patch from handling legal sdiv/udiv operations, so we actually have some context to discuss the legalization strategy.

Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:7670
+                         Mask, Op.getOperand(0), Op.getOperand(1));
+}
+

If we're going to support these operations, we might as well just add isel patterns; that's what we've been doing for other arithmetic operations.

Repository: rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION https://reviews.llvm.org/D78569/new/ https://reviews.llvm.org/D78569
[PATCH] D78569: [SVE][CodeGen] Lower SDIV & UDIV to SVE intrinsics
kmclaughlin created this revision.
kmclaughlin added reviewers: sdesmalen, c-rhodes, efriedma, cameron.mcinally.
Herald added subscribers: psnobl, rkruppe, hiraditya, kristof.beyls, tschuett.
Herald added a reviewer: rengolin.
Herald added a project: LLVM.

This patch maps IR operations for sdiv & udiv to the @llvm.aarch64.sve.[s|u]div intrinsics. A ptrue must be created during lowering, as the div instructions have only a predicated form.

Patch contains changes by Andrzej Warzynski.

Repository: rG LLVM Github Monorepo

https://reviews.llvm.org/D78569

Files:
  llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
  llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
  llvm/lib/CodeGen/TargetLoweringBase.cpp
  llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
  llvm/lib/Target/AArch64/AArch64ISelLowering.h
  llvm/test/CodeGen/AArch64/llvm-ir-to-intrinsic.ll

Index: llvm/test/CodeGen/AArch64/llvm-ir-to-intrinsic.ll
===================================================================
--- /dev/null
+++ llvm/test/CodeGen/AArch64/llvm-ir-to-intrinsic.ll
@@ -0,0 +1,87 @@
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve < %s | FileCheck %s
+
+;
+; SDIV
+;
+
+define <vscale x 4 x i32> @sdiv_i32(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b) {
+; CHECK-LABEL: @sdiv_i32
+; CHECK: ptrue p0.s
+; CHECK-NEXT: sdiv z0.s, p0/m, z0.s, z1.s
+; CHECK-NEXT: ret
+  %div = sdiv <vscale x 4 x i32> %a, %b
+  ret <vscale x 4 x i32> %div
+}
+
+define <vscale x 2 x i64> @sdiv_i64(<vscale x 2 x i64> %a, <vscale x 2 x i64> %b) {
+; CHECK-LABEL: @sdiv_i64
+; CHECK: ptrue p0.d
+; CHECK-NEXT: sdiv z0.d, p0/m, z0.d, z1.d
+; CHECK-NEXT: ret
+  %div = sdiv <vscale x 2 x i64> %a, %b
+  ret <vscale x 2 x i64> %div
+}
+
+define <vscale x 8 x i32> @sdiv_narrow(<vscale x 8 x i32> %a, <vscale x 8 x i32> %b) {
+; CHECK-LABEL: @sdiv_narrow
+; CHECK: ptrue p0.s
+; CHECK-NEXT: sdiv z0.s, p0/m, z0.s, z2.s
+; CHECK-NEXT: sdiv z1.s, p0/m, z1.s, z3.s
+; CHECK-NEXT: ret
+  %div = sdiv <vscale x 8 x i32> %a, %b
+  ret <vscale x 8 x i32> %div
+}
+
+define <vscale x 2 x i32> @sdiv_widen(<vscale x 2 x i32> %a, <vscale x 2 x i32> %b) {
+; CHECK-LABEL: @sdiv_widen
+; CHECK: ptrue p0.d
+; CHECK-NEXT: sxtw z1.d, p0/m, z1.d
+; CHECK-NEXT: sxtw z0.d, p0/m, z0.d
+; CHECK-NEXT: sdiv z0.d, p0/m, z0.d, z1.d
+; CHECK-NEXT: ret
+  %div = sdiv <vscale x 2 x i32> %a, %b
+  ret <vscale x 2 x i32> %div
+}
+
+;
+; UDIV
+;
+
+define <vscale x 4 x i32> @udiv_i32(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b) {
+; CHECK-LABEL: @udiv_i32
+; CHECK: ptrue p0.s
+; CHECK-NEXT: udiv z0.s, p0/m, z0.s, z1.s
+; CHECK-NEXT: ret
+  %div = udiv <vscale x 4 x i32> %a, %b
+  ret <vscale x 4 x i32> %div
+}
+
+define <vscale x 2 x i64> @udiv_i64(<vscale x 2 x i64> %a, <vscale x 2 x i64> %b) {
+; CHECK-LABEL: @udiv_i64
+; CHECK: ptrue p0.d
+; CHECK-NEXT: udiv z0.d, p0/m, z0.d, z1.d
+; CHECK-NEXT: ret
+  %div = udiv <vscale x 2 x i64> %a, %b
+  ret <vscale x 2 x i64> %div
+}
+
+define <vscale x 8 x i32> @udiv_narrow(<vscale x 8 x i32> %a, <vscale x 8 x i32> %b) {
+; CHECK-LABEL: @udiv_narrow
+; CHECK: ptrue p0.s
+; CHECK-NEXT: udiv z0.s, p0/m, z0.s, z2.s
+; CHECK-NEXT: udiv z1.s, p0/m, z1.s, z3.s
+; CHECK-NEXT: ret
+  %div = udiv <vscale x 8 x i32> %a, %b
+  ret <vscale x 8 x i32> %div
+}
+
+define <vscale x 2 x i32> @udiv_widen(<vscale x 2 x i32> %a, <vscale x 2 x i32> %b) {
+; CHECK-LABEL: @udiv_widen
+; CHECK: ptrue p0.d
+; CHECK-NEXT: and z1.d, z1.d, #0xffffffff
+; CHECK-NEXT: and z0.d, z0.d, #0xffffffff
+; CHECK-NEXT: udiv z0.d, p0/m, z0.d, z1.d
+; CHECK-NEXT: ret
+  %div = udiv <vscale x 2 x i32> %a, %b
+  ret <vscale x 2 x i32> %div
+}

Index: llvm/lib/Target/AArch64/AArch64ISelLowering.h
===================================================================
--- llvm/lib/Target/AArch64/AArch64ISelLowering.h
+++ llvm/lib/Target/AArch64/AArch64ISelLowering.h
@@ -776,6 +776,8 @@
   SDValue LowerVECTOR_SHUFFLE(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerSPLAT_VECTOR(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerDUPQLane(SDValue Op, SelectionDAG &DAG) const;
+  SDValue LowerDIV(SDValue Op, SelectionDAG &DAG,
+                   unsigned IntrID) const;
   SDValue LowerEXTRACT_SUBVECTOR(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerVectorSRA_SRL_SHL(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerShiftLeftParts(SDValue Op, SelectionDAG &DAG) const;

Index: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
===================================================================
--- llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -883,8 +883,11 @@
   // splat of 0 or undef) once vector selects supported in SVE codegen. See
   // D68877 for more details.
   for (MVT VT : MVT::integer_scalable_vector_valuetypes()) {
-    if (isTypeLegal(VT))
+    if (isTypeLegal(VT)) {
       setOperationAction(ISD::SPLAT_VECTOR, VT, Custom);
+      setOperationAction(ISD::SDIV, VT, Custom);
+      setOperationAction(ISD::UDIV, VT, Custom);
+    }
   }
 
   setOperationAction(ISD::INTRINSIC_WO_CHAIN, MVT::i8, Custom);
   setOperationAction(ISD::INTRINSIC_WO_CHAIN, MVT::i16, Custom);
@@ -3337,6 +3340,10 @@
     return LowerSPLAT_VECTOR(Op, DAG);
   case ISD::EXTRACT_SUBVECTOR:
     return LowerEXTRACT_SUBVECTOR(Op, DAG);
+  case ISD::SDIV:
+    return LowerDIV(Op, DAG, Intrinsic::aarch64_sve_sdiv);
+  case ISD::UDIV:
+    return LowerDIV(Op, DAG, Intrinsic::aarch64_sve_udiv);
   case ISD::SRA:
   case ISD::SRL:
   case ISD::SHL:
@@ -7643,6 +7650,25 @@
   return DAG.getNode(ISD::BITCAST, DL, VT, TBL);
 }
 
+SDValue AArch64TargetLowering::LowerDIV(SDValue Op,
+                                        SelectionDAG &DAG,
+