Resending for the list, as the last copy got bounced. Thanks, James
----- Forwarded message from James Greenhalgh <james.greenha...@arm.com> -----

Date: Thu, 22 Jun 2017 11:16:38 +0100
From: James Greenhalgh <james.greenha...@arm.com>
To: Michael Collison <michael.colli...@arm.com>, Wilco Dijkstra <wilco.dijks...@arm.com>, Christophe Lyon <christophe.l...@linaro.org>, GCC Patches <gcc-patches@gcc.gnu.org>, nd <n...@arm.com>, richard.sandif...@linaro.org
Subject: Re: [PATCH] [Aarch64] Variable shift count truncation issues
User-Agent: Mutt/1.5.21 (2010-09-15)

On Wed, Jun 21, 2017 at 04:42:00PM +0100, Richard Sandiford wrote:
> Michael Collison <michael.colli...@arm.com> writes:
> > Updated the patch per Richard's suggestions to allow scheduling of
> > instructions before reload.
>
> Thanks, this looks good to me FWIW, but obviously I can't approve it.

Thanks for the review, Richard; that gives me good confidence in this patch.

I have a few comments below, which are closer to nitpicking than
structural issues with the patch. With those fixed, this is OK to commit.

> > 2017-05-22  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
> >             Michael Collison  <michael.colli...@arm.com>

With the work you've done, you can probably place yourself first on the
ChangeLog now ;)

> >
> > 	PR target/70119
> > 	* config/aarch64/aarch64.md (*aarch64_<optab>_reg_<mode>3_mask1):
> > 	New pattern.
> > 	(*aarch64_reg_<mode>3_neg_mask2): New pattern.
> > 	(*aarch64_reg_<mode>3_minus_mask): New pattern.
> > 	(*aarch64_<optab>_reg_di3_mask2): New pattern.
> > 	* config/aarch64/aarch64.c (aarch64_rtx_costs): Account for cost
> > 	of shift when the shift amount is masked with constant equal to
> > 	the size of the mode.
> > 	* config/aarch64/predicates.md (subreg_lowpart_operator): New
> > 	predicate.
> >
> >
> > 2016-05-22  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
> >             Michael Collison  <michael.colli...@arm.com>
> >
> > 	PR target/70119
> > 	* gcc.target/aarch64/var_shift_mask_1.c: New test.
> > diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> > index d89df66..ff5720c 100644
> > --- a/gcc/config/aarch64/aarch64.md
> > +++ b/gcc/config/aarch64/aarch64.md
> > @@ -3942,6 +3942,97 @@
> >  }
> >  )
> >
> > +;; When the LSL, LSR, ASR, ROR instructions operate on all register arguments
> > +;; they truncate the shift/rotate amount by the size of the registers they
> > +;; operate on: 32 for W-regs, 63 for X-regs.  This allows us to optimise away

Is this "63" a typo? Should it be 64?

> > +;; such redundant masking instructions.  GCC can do that automatically when
> > +;; SHIFT_COUNT_TRUNCATED is true, but we can't enable it for TARGET_SIMD
> > +;; because some of the SISD shift alternatives don't perform this truncations.
> > +;; So this pattern exists to catch such cases.
> > +
> > +(define_insn "*aarch64_<optab>_reg_<mode>3_mask1"
> > +  [(set (match_operand:GPI 0 "register_operand" "=r")
> > +	(SHIFT:GPI
> > +	  (match_operand:GPI 1 "register_operand" "r")
> > +	  (match_operator 4 "subreg_lowpart_operator"
> > +	   [(and:GPI (match_operand:GPI 2 "register_operand" "r")
> > +		     (match_operand 3 "const_int_operand" "n"))])))]
> > +  "(~INTVAL (operands[3]) & (GET_MODE_BITSIZE (<MODE>mode)-1)) == 0"

Spaces around "-"

> > +  "<shift>\t%<w>0, %<w>1, %<w>2"
> > +  [(set_attr "type" "shift_reg")]
> > +)
> > +
> > +(define_insn_and_split "*aarch64_reg_<mode>3_neg_mask2"
> > +  [(set (match_operand:GPI 0 "register_operand" "=&r")
> > +	(SHIFT:GPI
> > +	  (match_operand:GPI 1 "register_operand" "r")
> > +	  (match_operator 4 "subreg_lowpart_operator"
> > +	   [(neg:SI (and:SI (match_operand:SI 2 "register_operand" "r")
> > +			    (match_operand 3 "const_int_operand" "n")))])))]
> > +  "((~INTVAL (operands[3]) & (GET_MODE_BITSIZE (<MODE>mode)-1)) == 0)"
> > +  "#"
> > +  "&& 1"

I'd prefer "true" to "1"

> > +  [(const_int 0)]
> > +  {
> > +    rtx tmp = (can_create_pseudo_p () ? gen_reg_rtx (SImode)
> > +				      : operands[0]);
> > +    emit_insn (gen_negsi2 (tmp, operands[2]));
> > +
> > +    rtx and_op = gen_rtx_AND (SImode, tmp, operands[3]);
> > +    rtx subreg_tmp = gen_rtx_SUBREG (GET_MODE (operands[4]), and_op,
> > +				     SUBREG_BYTE (operands[4]));
> > +    emit_insn (gen_<optab><mode>3 (operands[0], operands[1], subreg_tmp));
> > +    DONE;
> > +  }
> > +)
> > +
> > +(define_insn_and_split "*aarch64_reg_<mode>3_minus_mask"
> > +  [(set (match_operand:GPI 0 "register_operand" "=&r")
> > +	(ashift:GPI
> > +	  (match_operand:GPI 1 "register_operand" "r")
> > +	  (minus:QI (match_operand 2 "const_int_operand" "n")
> > +		    (match_operator 5 "subreg_lowpart_operator"
> > +		     [(and:SI (match_operand:SI 3 "register_operand" "r")
> > +			      (match_operand 4 "const_int_operand" "n"))]))))]
> > +  "((~INTVAL (operands[4]) & (GET_MODE_BITSIZE (<MODE>mode)-1)) == 0)
> > +   && INTVAL (operands[2]) == GET_MODE_BITSIZE (<MODE>mode)"
> > +  "#"
> > +  "&& 1"

Likewise.

Thanks,
James

----- End forwarded message -----