On Mon, 2022-02-28 at 22:21 -0500, Michael Meissner wrote: > Optimize signed DImode -> TImode on power10, PR target/104698. >
Hi, Logic seems OK to me, a few suggestions on the comments intermixed below. As always, i defer if there are counter arguments. :-) > On power10, GCC tries to optimize the signed conversion from DImode to > TImode by using the vextsd2q instruction. However to generate this > instruction, it would have to generate 3 direct moves (1 from the GPR > registers to the altivec registers, and 2 from the altivec registers to > the GPR register). > > This patch adds code back in to use the shift right immediate instruction > to do the conversion if the target/source is GPR registers. Perhaps drop "back in". If it's necessary to call out a previous commit that removed the code for whatever reason, certainly do so. It's not clear from context if that was the case. > > 2022-02-28 Michael Meissner <meiss...@linux.ibm.com> > > gcc/ > PR target/104698 > * config/rs6000/vsx.md (mtvsrdd_diti_w1): Delete. > (extendditi2): Replace with code to deal with both GPR registers > and with altivec registers. Perhaps enhance with (extendditi2): Convert from define_expand to define_insn_and_split. Replace with code ... > --- > gcc/config/rs6000/vsx.md | 73 ++++++++++++++++++++++++++++------------ > 1 file changed, 52 insertions(+), 21 deletions(-) > > diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md > index b53de103872..62464f67f4d 100644 > --- a/gcc/config/rs6000/vsx.md > +++ b/gcc/config/rs6000/vsx.md > @@ -5023,15 +5023,58 @@ (define_expand "vsignextend_si_v2di" > DONE; > }) > > -;; ISA 3.1 vector sign extend > -;; Move DI value from GPR to TI mode in VSX register, word 1. > -(define_insn "mtvsrdd_diti_w1" > - [(set (match_operand:TI 0 "register_operand" "=wa") > - (unspec:TI [(match_operand:DI 1 "register_operand" "r")] > - UNSPEC_MTVSRD_DITI_W1))] > - "TARGET_POWERPC64 && TARGET_DIRECT_MOVE" > - "mtvsrdd %x0,0,%1" > - [(set_attr "type" "vecmove")]) > +;; Sign extend DI to TI. We provide both GPR targets and Altivec targets. > If > +;; the register allocator prefers the GPRs, we won't have to move the value > to > +;; the altivec registers, do the vextsd2q instruction and move it back. If > we > +;; aren't compiling for 64-bit power10, don't provide the service and let the > +;; machine independent code handle the extension. So, the ".. we won't have to ..." applies to the altivec target path here? Describing in a way that indicates what code doesn't do doesn't seem right. If so, and perhaps even if not, i suggest rearranging the comment slightly so it can be read as an either or. If the register allocator prefers the GPRS, ... Otherwise, for altivec registers we dothe vextsd2q ... > +(define_insn_and_split "extendditi2" > + [(set (match_operand:TI 0 "register_operand" "=r,r,v,v,v") > + (sign_extend:TI (match_operand:DI 1 "input_operand" "r,m,r,wa,Z"))) > + (clobber (reg:DI CA_REGNO))] > + "TARGET_POWERPC64 && TARGET_POWER10" > + "#" > + "&& reload_completed" > + [(pc)] > +{ > + rtx dest = operands[0]; > + rtx src = operands[1]; > + int dest_regno = reg_or_subregno (dest); > + > + /* Handle conversion to GPR registers. Load up the low part and then do > + a sign extension to the upper part. */ > + if (INT_REGNO_P (dest_regno)) > + { > + rtx dest_hi = gen_highpart (DImode, dest); > + rtx dest_lo = gen_lowpart (DImode, dest); > + > + emit_move_insn (dest_lo, src); > + emit_insn (gen_ashrdi3 (dest_hi, dest_lo, GEN_INT (63))); > + DONE; > + } ok > + > + /* For conversion to Altivec register, generate either a splat operation or > + a load rightmost double word instruction. Both instructions gets the > + DImode value into the lower 64 bits, and then do the vextsd2q > + instruction. */ consider s/instruction. Both instructions gets/to get/ > + else if (ALTIVEC_REGNO_P (dest_regno)) > + { > + if (MEM_P (src)) > + emit_insn (gen_vsx_lxvrdx (dest, src)); > + else > + { > + rtx dest_v2di = gen_rtx_REG (V2DImode, dest_regno); > + emit_insn (gen_vsx_splat_v2di (dest_v2di, src)); > + } > + > + emit_insn (gen_extendditi2_vector (dest, dest)); > + DONE; > + } ok lgtm, thanks -Will > + > + else > + gcc_unreachable (); > +} > + [(set_attr "length" "8")]) > > ;; Sign extend 64-bit value in TI reg, word 1, to 128-bit value in TI reg > (define_insn "extendditi2_vector" > @@ -5042,18 +5085,6 @@ (define_insn "extendditi2_vector" > "vextsd2q %0,%1" > [(set_attr "type" "vecexts")]) > > -(define_expand "extendditi2" > - [(set (match_operand:TI 0 "gpc_reg_operand") > - (sign_extend:DI (match_operand:DI 1 "gpc_reg_operand")))] > - "TARGET_POWER10" > - { > - /* Move 64-bit src from GPR to vector reg and sign extend to 128-bits. > */ > - rtx temp = gen_reg_rtx (TImode); > - emit_insn (gen_mtvsrdd_diti_w1 (temp, operands[1])); > - emit_insn (gen_extendditi2_vector (operands[0], temp)); > - DONE; > - }) > - > > ;; ISA 3.0 Binary Floating-Point Support > > -- > 2.35.1 > >