On Wed, Feb 4, 2026 at 11:02 AM Доля Денис <[email protected]> wrote:
>
> Hello,
>
> this patch fixes PR tree-optimization/94071 by improving the store-merging
> pass to recognize adjacent byte loads even when offsets are computed
> through simple SSA expressions.
>
> The change teaches the pass to decompose offset expressions into
> base plus constant, allowing patterns like data[i] and data[i + 1]
> to be merged into halfword loads even when temporaries or helper
> functions are involved.
>
> An AArch64 testsuite case is added to verify the optimization.
>
> Tested on aarch64-linux-gnu:
>
> make check-gcc RUNTESTFLAGS="aarch64.exp=gcc.target/aarch64/adjacent-byte-load-merge.c"
>
> The patch is attached.
>
> Any feedback is welcome.
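
For reference, the kind of source pattern described above might look like the
following. This is only a minimal sketch: the function name load_le16 and the
temporary are hypothetical and are not taken from the patch or its testcase.
It shows two adjacent byte loads, data[i] and data[i + 1], where the second
offset goes through a temporary.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: combine two adjacent bytes into a 16-bit value,
   low byte first.  The second index is computed through a temporary.  */
uint16_t load_le16(const uint8_t *data, size_t i)
{
    size_t next = i + 1;    /* offset held in a temporary */
    return (uint16_t)(data[i] | (data[next] << 8));
}
```

Whether a given compiler turns this into a single halfword load depends on the
target and on the pass recognizing that both offsets share the base i.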

There is already similar support for gathering address parts and splitting off
constant offsets as part of SCEV and dataref analysis, so I believe we should
re-use that instead.  That will also handle multiplication, which you'd see when
the array element size is not 1 byte (not relevant for bswap, but possibly for
word-swap).

So you'd want to look at using create_data_ref here.

An alternative is to use tree-affine.cc, which has
tree_to_aff_combination_expand doing similar gathering.
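
To illustrate what such gathering amounts to, here is a toy sketch in plain C.
It does not use GCC's tree, aff_tree or data_reference types, and every toy_*
name is hypothetical.  It reduces an offset expression over a single variable
to coef * var + offset and then checks whether two offsets are adjacent,
including the multiplication case for element sizes other than 1.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy expression tree; NOT GCC's internal representation.  */
enum toy_kind { TOY_CONST, TOY_VAR, TOY_ADD, TOY_MUL_CONST };

struct toy_expr {
    enum toy_kind kind;
    long value;                  /* constant, variable id, or multiplier */
    const struct toy_expr *lhs;  /* operand of TOY_ADD / TOY_MUL_CONST */
    const struct toy_expr *rhs;  /* second operand of TOY_ADD */
};

/* Affine form coef * var + offset, limited to one variable.  */
struct toy_affine {
    bool has_var;
    long var;
    long coef;
    long offset;
};

/* Split E into a variable part and a constant part.  Returns false
   when E involves more than one distinct variable.  */
static bool toy_decompose(const struct toy_expr *e, struct toy_affine *out)
{
    struct toy_affine a, b;

    switch (e->kind) {
    case TOY_CONST:
        *out = (struct toy_affine){ false, 0, 0, e->value };
        return true;
    case TOY_VAR:
        *out = (struct toy_affine){ true, e->value, 1, 0 };
        return true;
    case TOY_MUL_CONST:
        if (!toy_decompose(e->lhs, &a))
            return false;
        a.coef *= e->value;      /* scale both parts by the multiplier */
        a.offset *= e->value;
        *out = a;
        return true;
    case TOY_ADD:
        if (!toy_decompose(e->lhs, &a) || !toy_decompose(e->rhs, &b))
            return false;
        if (a.has_var && b.has_var && a.var != b.var)
            return false;        /* two different variables: give up */
        if (!a.has_var && b.has_var) {
            a.has_var = true;
            a.var = b.var;
        }
        a.coef += b.coef;        /* coef is 0 for a pure constant */
        a.offset += b.offset;
        *out = a;
        return true;
    }
    return false;
}

/* True when B addresses the element of SIZE bytes directly after A.  */
static bool toy_adjacent(const struct toy_affine *a,
                         const struct toy_affine *b, long size)
{
    if (a->has_var != b->has_var)
        return false;
    if (a->has_var && (a->var != b->var || a->coef != b->coef))
        return false;
    return b->offset - a->offset == size;
}
```

With this model, i and i + 1 compare as adjacent for 1-byte accesses, and
i * 2 and (i + 1) * 2 compare as adjacent for 2-byte accesses.  The existing
infrastructure handles considerably more than this sketch does.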

Richard.

> Best regards,
> Denis Dolya
>
> --
> Denis Dolya (Ferki)
> GCC contributor
> GitHub: https://github.com/Ferki-git-creator
