Resending for the list, as the last copy got bounced.

Thanks,
James

----- Forwarded message from James Greenhalgh <james.greenha...@arm.com> -----

Date: Thu, 22 Jun 2017 11:16:38 +0100
From: James Greenhalgh <james.greenha...@arm.com>
To: Michael Collison <michael.colli...@arm.com>, Wilco Dijkstra 
<wilco.dijks...@arm.com>, Christophe Lyon <christophe.l...@linaro.org>, GCC 
Patches <gcc-patches@gcc.gnu.org>, nd <n...@arm.com>, 
richard.sandif...@linaro.org
Subject: Re: [PATCH] [Aarch64] Variable shift count truncation issues
User-Agent: Mutt/1.5.21 (2010-09-15)

On Wed, Jun 21, 2017 at 04:42:00PM +0100, Richard Sandiford wrote:
> Michael Collison <michael.colli...@arm.com> writes:
> > Updated the patch per Richard's suggestions to allow scheduling of
> > instructions before reload.
> 
> Thanks, this looks good to me FWIW, but obviously I can't approve it.

Thanks for the review, Richard; that gives me good confidence in this patch.

I have a few comments below, which are closer to nitpicking than structural
issues with the patch.

With those fixed, this is OK to commit.

> > 2017-05-22  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
> >         Michael Collison <michael.colli...@arm.com>

With the work you've done, you can probably place yourself first on the
ChangeLog now ;)

> >
> >     PR target/70119
> >     * config/aarch64/aarch64.md (*aarch64_<optab>_reg_<mode>3_mask1):
> >     New pattern.
> >     (*aarch64_reg_<mode>3_neg_mask2): New pattern.
> >     (*aarch64_reg_<mode>3_minus_mask): New pattern.
> >     (*aarch64_<optab>_reg_di3_mask2): New pattern.
> >     * config/aarch64/aarch64.c (aarch64_rtx_costs): Account for cost
> >     of shift when the shift amount is masked with constant equal to
> >     the size of the mode.
> >     * config/aarch64/predicates.md (subreg_lowpart_operator): New
> >     predicate.
> >
> >
> > 2017-05-22  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
> >         Michael Collison <michael.colli...@arm.com>
> >
> >     PR target/70119
> >     * gcc.target/aarch64/var_shift_mask_1.c: New test.


> > diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> > index d89df66..ff5720c 100644
> > --- a/gcc/config/aarch64/aarch64.md
> > +++ b/gcc/config/aarch64/aarch64.md
> > @@ -3942,6 +3942,97 @@
> >    }
> >  )
> >  
> > +;; When the LSL, LSR, ASR, ROR instructions operate on all register arguments
> > +;; they truncate the shift/rotate amount by the size of the registers they
> > +;; operate on: 32 for W-regs, 63 for X-regs.  This allows us to optimise away

Is this "63" a typo? Should it be 64?

> > +;; such redundant masking instructions.  GCC can do that automatically when
> > +;; SHIFT_COUNT_TRUNCATED is true, but we can't enable it for TARGET_SIMD
> > +;; because some of the SISD shift alternatives don't perform these truncations.
> > +;; So this pattern exists to catch such cases.
> > +
> > +(define_insn "*aarch64_<optab>_reg_<mode>3_mask1"
> > +  [(set (match_operand:GPI 0 "register_operand" "=r")
> > +   (SHIFT:GPI
> > +     (match_operand:GPI 1 "register_operand" "r")
> > +     (match_operator 4 "subreg_lowpart_operator"
> > +      [(and:GPI (match_operand:GPI 2 "register_operand" "r")
> > +                (match_operand 3 "const_int_operand" "n"))])))]
> > +  "(~INTVAL (operands[3]) & (GET_MODE_BITSIZE (<MODE>mode)-1)) == 0"

Spaces around "-".

> > +  "<shift>\t%<w>0, %<w>1, %<w>2"
> > +  [(set_attr "type" "shift_reg")]
> > +)
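
As an aside for anyone following along: the sort of C source this first
pattern lets us clean up looks roughly like the sketch below. The function
name is mine and this is not the committed var_shift_mask_1.c testcase, just
an illustration of the idiom.

unsigned long long
shift_masked (unsigned long long x, int n)
{
  /* The mask equals the register width minus one, so on AArch64 the
     AND is redundant and a single LSL should now be emitted.  */
  return x << (n & 63);
}
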
> > +
> > +(define_insn_and_split "*aarch64_reg_<mode>3_neg_mask2"
> > +  [(set (match_operand:GPI 0 "register_operand" "=&r")
> > +   (SHIFT:GPI
> > +     (match_operand:GPI 1 "register_operand" "r")
> > +     (match_operator 4 "subreg_lowpart_operator"
> > +     [(neg:SI (and:SI (match_operand:SI 2 "register_operand" "r")
> > +                      (match_operand 3 "const_int_operand" "n")))])))]
> > +  "((~INTVAL (operands[3]) & (GET_MODE_BITSIZE (<MODE>mode)-1)) == 0)"
> > +  "#"
> > +  "&& 1"

I'd prefer "true" to "1".

> > +  [(const_int 0)]
> > +  {
> > +    rtx tmp = (can_create_pseudo_p () ? gen_reg_rtx (SImode)
> > +          : operands[0]);
> > +    emit_insn (gen_negsi2 (tmp, operands[2]));
> > +
> > +    rtx and_op = gen_rtx_AND (SImode, tmp, operands[3]);
> > +    rtx subreg_tmp = gen_rtx_SUBREG (GET_MODE (operands[4]), and_op,
> > +                                SUBREG_BYTE (operands[4]));
> > +    emit_insn (gen_<optab><mode>3 (operands[0], operands[1], subreg_tmp));
> > +    DONE;
> > +  }
> > +)
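
Again purely for illustration, the negated-and-masked shift amount this
pattern matches typically falls out of a rotate idiom, something like the
sketch below (my naming; the exact RTL combine presents depends on the
surrounding code):

unsigned long long
rotate_right (unsigned long long x, unsigned int n)
{
  /* The left shift by (-n & 63) is the (neg (and ...)) the pattern
     catches; the split above rewrites it as a NEG feeding the
     truncating variable shift.  */
  return (x >> (n & 63)) | (x << (-n & 63));
}
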
> > +
> > +(define_insn_and_split "*aarch64_reg_<mode>3_minus_mask"
> > +  [(set (match_operand:GPI 0 "register_operand" "=&r")
> > +   (ashift:GPI
> > +     (match_operand:GPI 1 "register_operand" "r")
> > +     (minus:QI (match_operand 2 "const_int_operand" "n")
> > +               (match_operator 5 "subreg_lowpart_operator"
> > +               [(and:SI (match_operand:SI 3 "register_operand" "r")
> > +                        (match_operand 4 "const_int_operand" "n"))]))))]
> > +  "((~INTVAL (operands[4]) & (GET_MODE_BITSIZE (<MODE>mode)-1)) == 0)
> > +   && INTVAL (operands[2]) == GET_MODE_BITSIZE (<MODE>mode)"
> > +  "#"
> > +  "&& 1"

Likewise.
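
For completeness, this last pattern covers the same rotate idiom written
with an explicit subtraction, roughly as below (again a hedged sketch with
my own naming, not the committed testcase):

unsigned long long
rotate_right_sub (unsigned long long x, unsigned int n)
{
  /* The left-shift amount 64 - (n & 63) is the (minus ...) RTL the
     pattern matches.  Note n % 64 == 0 makes the left shift undefined
     in C; the snippet only illustrates the RTL shape.  */
  return (x >> (n & 63)) | (x << (64 - (n & 63)));
}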

Thanks,
James


----- End forwarded message -----
