Hi,

> -----Original Message-----
> From: Richard Sandiford [mailto:richard.sandif...@arm.com]
> Sent: Sunday, May 31, 2020 12:01 AM
> To: Yangfei (Felix) <felix.y...@huawei.com>
> Cc: gcc-patches@gcc.gnu.org; Uros Bizjak <ubiz...@gmail.com>; Jakub
> Jelinek <ja...@redhat.com>; Hongtao Liu <crazy...@gmail.com>; H.J. Lu
> <hjl.to...@gmail.com>
> Subject: Re: [PATCH PR95254] aarch64: gcc generate inefficient code with
> fixed sve vector length
> 

Snip...

> >
> > The attached v5 patch addresses this issue.
> >
> > There are two added changes compared with the v4 patch:
> > 1. In candidate_mem_p, mov_optab for innermode should be available.
> >    In this case, mov_optab for SDmode is not there and the subreg is
> > added back by emit_move_insn_1, so we won't get the benefit of the patch.
> 
> I agree we should have this check.  I think the rule applies to all of the
> transforms though, not just the mem one, so we should add the check to the
> register and constant cases too.

OK.  I changed it to make this an extra condition for calculating x_inner and y_inner.
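
Roughly, the extra condition looks like the following (a simplified
sketch rather than the exact hunk; names and surrounding structure are
abbreviated here):

  /* Only look through the subreg when the inner mode has its own move
     pattern; otherwise emit_move_insn_1 just adds the subreg back, as
     in the SDmode case above.  */
  if (SUBREG_P (x)
      && REG_P (SUBREG_REG (x))
      && optab_handler (mov_optab, GET_MODE (SUBREG_REG (x)))
         != CODE_FOR_nothing)
    x_inner = SUBREG_REG (x);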

> > 2. Instead of using adjust_address, I changed it to use adjust_address_nv
> > to avoid emitting the invalid insn 13.
> >     The latter call to validize_mem() in emit_move_insn will take care
> > of the address for us.
> 
> The validation performed by validize_mem is the same as that performed by
> adjust_address, so the only case this should make a difference is for
> push_operands:

True.

>   /* If X or Y are memory references, verify that their addresses are valid
>      for the machine.  */
>   if (MEM_P (x)
>       && (! memory_address_addr_space_p (GET_MODE (x), XEXP (x, 0),
>                                        MEM_ADDR_SPACE (x))
>         && ! push_operand (x, GET_MODE (x))))
>     x = validize_mem (x);
> 
>   if (MEM_P (y)
>       && ! memory_address_addr_space_p (GET_MODE (y), XEXP (y, 0),
>                                       MEM_ADDR_SPACE (y)))
>     y = validize_mem (y);
> 
> So I think the fix is to punt on push_operands instead (and continue to use
> adjust_address rather than adjust_address_nv).

I am not sure I understand it correctly.
Do you mean excluding push_operand in candidate_mem_p, like this:

  auto candidate_mem_p = [&](machine_mode innermode, rtx mem) {
    return !targetm.can_change_mode_class (innermode, GET_MODE (mem), ALL_REGS)
           && !push_operand (mem, GET_MODE (mem))
           /* Not a candidate if innermode requires too much alignment.  */
           && (MEM_ALIGN (mem) >= GET_MODE_ALIGNMENT (innermode)
               || targetm.slow_unaligned_access (GET_MODE (mem),
                                                 MEM_ALIGN (mem))
               || !targetm.slow_unaligned_access (innermode, MEM_ALIGN (mem)));
  };
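
Or did you mean punting where the transform is applied, keeping
adjust_address, along these lines (again only a sketch of my
understanding, not a tested hunk):

  if (y_inner != NULL_RTX
      && MEM_P (x)
      && !push_operand (x, GET_MODE (x))
      && candidate_mem_p (GET_MODE (y_inner), x))
    {
      x = adjust_address (x, GET_MODE (y_inner), 0);
      y = y_inner;
    }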

Thanks,
Felix
