Hi,

> -----Original Message-----
> From: Richard Sandiford [mailto:richard.sandif...@arm.com]
> Sent: Tuesday, May 26, 2020 11:58 PM
> To: Yangfei (Felix) <felix.y...@huawei.com>
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH PR95254] aarch64: gcc generate inefficient code with
> fixed sve vector length
> 
> Sorry for the slow reply, was off for a few days.
> 
> I think the new code ought to happen earlier in emit_move_insn, before:
> 
>   if (CONSTANT_P (y))
>     {
> 
> That way, all the canonicalisation happens on the mode we actually want the
> move to have.

OK. That’s a good point.

> "Yangfei (Felix)" <felix.y...@huawei.com> writes:
> > diff --git a/gcc/expr.c b/gcc/expr.c
> > index dfbeae71518..4442fb83367 100644
> > --- a/gcc/expr.c
> > +++ b/gcc/expr.c
> > @@ -3852,6 +3852,62 @@ emit_move_insn (rtx x, rtx y)
> >
> >    gcc_assert (mode != BLKmode);
> >
> > +  rtx x_inner = NULL_RTX;
> > +  rtx y_inner = NULL_RTX;
> > +  machine_mode x_inner_mode, y_inner_mode;
> > +
> > +  if (SUBREG_P (x)
> > +      && REG_P (SUBREG_REG (x))
> > +      && known_eq (SUBREG_BYTE (x), 0))
> > +    {
> > +      x_inner = SUBREG_REG (x);
> > +      x_inner_mode = GET_MODE (x_inner);
> > +    }
> > +  if (SUBREG_P (y)
> > +      && REG_P (SUBREG_REG (y))
> > +      && known_eq (SUBREG_BYTE (y), 0))
> > +    {
> > +      y_inner = SUBREG_REG (y);
> > +      y_inner_mode = GET_MODE (y_inner);
> > +    }
> 
> The later code is only interested in SUBREG_REGs that are the same size as
> "mode", so I think it would make sense to check that in the "if"s above
> instead of checking SUBREG_BYTE.  (SUBREG_BYTE is always zero when the
> modes are the same size, but the reverse is not true.)
> 
> It might also be better to avoid [xy]_inner_mode and just use GET_MODE
> where necessary.
> 
> It would be good to have a block comment above the code to explain what
> we're doing.

Good suggestion. Done.

> > +  if (x_inner != NULL_RTX
> > +      && y_inner != NULL_RTX
> > +      && x_inner_mode == y_inner_mode
> > +      && known_eq (GET_MODE_SIZE (x_inner_mode), GET_MODE_SIZE
> (mode))
> > +      && ! targetm.can_change_mode_class (x_inner_mode, mode,
> ALL_REGS))
> > +    {
> > +      x = x_inner;
> > +      y = y_inner;
> > +    }
> > +  else if (x_inner != NULL_RTX && CONSTANT_P (y)
> 
> Formatting nit: one subcondition per line when the condition spans multiple
> lines.

OK.

> > +      && known_eq (GET_MODE_SIZE (x_inner_mode),
> GET_MODE_SIZE (mode))
> > +      && ! targetm.can_change_mode_class (x_inner_mode, mode,
> ALL_REGS)
> > +      && targetm.legitimate_constant_p (x_inner_mode, y))
> 
> This call isn't valid, since the mode has to match the rtx.  ("y" still has 
> mode
> "mode" at this point.)  I think instead we should just do:
> 
>          && (y_inner = simplify_subreg (GET_MODE (x_inner), y, mode, 0))
> 
> to convert the constant, and use it if the result is nonnull.
> The existing CONSTANT_P emit_move_insn code will handle cases in which
> the new constant isn't legitimate.

Good catch. Done.

> > +
> > +    {
> > +      x = x_inner;
> > +    }
> > +  else if (x_inner != NULL_RTX && MEM_P (y)
> > +      && known_eq (GET_MODE_SIZE (x_inner_mode),
> GET_MODE_SIZE (mode))
> > +      && ! targetm.can_change_mode_class (x_inner_mode, mode,
> ALL_REGS)
> > +      && (! targetm.slow_unaligned_access (x_inner_mode,
> MEM_ALIGN (y))
> > +          || MEM_ALIGN (y) >= GET_MODE_ALIGNMENT
> (x_inner_mode)))
> 
> What is the last condition protecting against?  Seems worth a comment.

Comment added.  Here I am intended to avoid generating a slow unaligned memory 
access.
Machine modes like VNx2HImode may have an small alignment than modes like V4HI.
For the given test case, SLP forces the alignment of memory access of mode 
VNx2HImode to be 32 bytes.
In theory, we may have other cases where alignment of innermode is bigger than 
that of the outermode.

Attached please find the v3 patch.  Bootstrapped and tested on 
aarch64-linux-gnu.
Does it look better?

gcc/ChangeLog:
+2020-05-27  Felix Yang  <felix.y...@huawei.com>
+           Richard Sandiford  <richard.sandif...@arm.com>
+
+       PR target/95254
+       * expr.c (emit_move_insn): If we have a copy of which src and/or dest
+       is a subreg, try to remove the subreg when innermode and outermode are
+       equal in size and targetm.can_change_mode_class (outermode, innermode,
+       ALL_REGS) returns false.

testsuite/ChangeLog:
+2020-05-27  Felix Yang  <felix.y...@huawei.com>
+           Richard Sandiford  <richard.sandif...@arm.com>
+
+       PR target/95254
+       * gcc.target/aarch64/pr95254.c: New test.

Thanks,
Felix


Attachment: pr95254-v3.diff
Description: pr95254-v3.diff

Reply via email to