Alex Coplan <alex.cop...@arm.com> writes:
> Hi,
>
> The ldp/stp fusion pass can change the base of an access so that the two
> accesses end up using a common base register.  So far we have been using
> adjust_address_nv to do this, but this means that we don't preserve
> other properties of the mem we're replacing.  It seems better to use
> replace_equiv_address_nv, as this will preserve e.g. the MEM_ALIGN of the
> mem whose address we're changing.
>
> The PR shows that by adjusting the other mem we lose alignment
> information about the original access and therefore end up rejecting an
> otherwise viable pair when --param=aarch64-stp-policy=aligned is passed.
> This patch fixes that by using replace_equiv_address_nv instead.
>
> Notably this is the same approach as taken by
> aarch64_check_consecutive_mems when a change of base is required, so
> this at least makes things more consistent between the ldp fusion pass
> and the peepholes.
>
> Bootstrapped/regtested on aarch64-linux-gnu, OK for trunk when stage 1
> opens for GCC 15?

Yes, thanks.

Richard

>
> Thanks,
> Alex
>
>
> gcc/ChangeLog:
>
>       PR target/114674
>       * config/aarch64/aarch64-ldp-fusion.cc (ldp_bb_info::fuse_pair):
>       Use replace_equiv_address_nv on a change of base instead of
>       adjust_address_nv on the other access.
>
> gcc/testsuite/ChangeLog:
>
>       PR target/114674
>       * gcc.target/aarch64/pr114674.c: New test.
>
> diff --git a/gcc/config/aarch64/aarch64-ldp-fusion.cc 
> b/gcc/config/aarch64/aarch64-ldp-fusion.cc
> index 365dcf48b22..4258a560c48 100644
> --- a/gcc/config/aarch64/aarch64-ldp-fusion.cc
> +++ b/gcc/config/aarch64/aarch64-ldp-fusion.cc
> @@ -1730,11 +1730,11 @@ ldp_bb_info::fuse_pair (bool load_p,
>       adjust_amt *= -1;
>  
>        rtx change_reg = XEXP (change_pat, !load_p);
> -      machine_mode mode_for_mem = GET_MODE (change_mem);
>        rtx effective_base = drop_writeback (base_mem);
> -      rtx new_mem = adjust_address_nv (effective_base,
> -                                    mode_for_mem,
> -                                    adjust_amt);
> +      rtx adjusted_addr = plus_constant (Pmode,
> +                                      XEXP (effective_base, 0),
> +                                      adjust_amt);
> +      rtx new_mem = replace_equiv_address (change_mem, adjusted_addr);
>        rtx new_set = load_p
>       ? gen_rtx_SET (change_reg, new_mem)
>       : gen_rtx_SET (new_mem, change_reg);
> diff --git a/gcc/testsuite/gcc.target/aarch64/pr114674.c 
> b/gcc/testsuite/gcc.target/aarch64/pr114674.c
> new file mode 100644
> index 00000000000..944784fd008
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/pr114674.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 --param=aarch64-stp-policy=aligned" } */
> +typedef struct {
> +     unsigned int f1;
> +     unsigned int f2;
> +} test_struct;
> +
> +static test_struct ts = {
> +     123, 456
> +};
> +
> +void foo(void)
> +{
> +     ts.f2 = 36969 * (ts.f2 & 65535) + (ts.f1 >> 16);
> +     ts.f1 = 18000 * (ts.f2 & 65535) + (ts.f2 >> 16);
> +}
> +/* { dg-final { scan-assembler-times "stp" 1 } } */

Reply via email to