On Thu, 12 Feb 2026, Pengfei Li wrote:
> > On Mon, 26 Jan 2026, Pengfei Li wrote:
>
> > > > > I don't think this is correct. Instead the use in the expander
> > > > > should be adjusted as I wrote in the bugreport. At least I do not see
> > > > > that there's a guarantee that the VECTOR_CST ends up in type aligned
> > > > > memory if it is not associated with an actual data object (STRING_CST
> > > > > is indeed special here).
> > > >
> > > > That said, get_object_alignment will return the maximum _guaranteed_
> > > > alignment of an access. The RTL expansion case needs to honor that,
> > > > but as 1 byte is always a correct answer here, for STRICT_ALIGNMENT
> > > > targets we probably need to do a MAX (GET_MODE_ALIGNMENT, ...) on it,
> > > > or if BLKmode we might want to do MAX (TYPE_ALIGN, ...) on it. Possibly
> > > > that optional alignment needs to be capped by the max stack alignment
> > > > that can be used.
> > >
> > > Thanks Richi for the suggestions.
> > >
> > > This v2 patch increases the stack slot alignment in the expander and caps
> > > it by MAX_SUPPORTED_STACK_ALIGNMENT, as Richi suggested.
> > >
> > > This has been re-tested on aarch64-linux-gnu and x86_64-linux-gnu.
>
> > LGTM.
>
> The following patch has been in trunk for >2 weeks.
>
> I've tested it on GCC 15. It applies cleanly and passes regression testing.
>
> Ok to backport?
OK.
Richard.
> Thanks,
> Pengfei
>
>
> > > -- >8 --
> > >
> > > PR123447 reports an ICE on AArch64 with "-O2 -mstrict-align" in subreg
> > > lowering while decomposing the following multiword store RTL:
> > >
> > > (insn 12 11 13 2 (set (mem/c:XI (plus:DI (reg/f:DI 64 sfp)
> > > (const_int -96 [0xffffffffffffffa0])) [0 S64 A8])
> > > (reg:XI 103)) "a.c":14:6 4861 {*aarch64_movxi}
> > >
> > > This RTL originates from expanding the following GIMPLE statement:
> > >
> > > _1 = BIT_FIELD_REF <{ 9, -64497, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
> > > }, 256, 0>;
> > >
> > > The operand is a constant _Decimal64 vector with BLKmode, so expand has
> > > to materialize it in memory. Currently, get_object_alignment() returns a
> > > 1-byte guaranteed alignment for this VECTOR_CST, as indicated by A8 in
> > > the RTL dump above. However, with "-mstrict-align" enabled, the later
> > > subreg lowering pass expects at least 64-bit alignment when it constructs
> > > a new RTX to decompose the store into pieces. Because the original
> > > alignment is too small, simplify_gen_subreg() returns NULL_RTX and an
> > > assertion is hit.
> > >
> > > This patch increases the stack slot alignment for STRICT_ALIGNMENT
> > > targets, when the operand is forced into memory. The increased alignment
> > > is capped by MAX_SUPPORTED_STACK_ALIGNMENT so it won't be too large.
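
For illustration only, the alignment selection described above can be
modelled outside GCC as a small standalone C function. This is a sketch
under stated assumptions, not the patch itself: pick_slot_align, max_u,
min_u and STACK_ALIGN_CAP are hypothetical names, and the 128-bit cap is
an arbitrary stand-in for MAX_SUPPORTED_STACK_ALIGNMENT, whose real value
is target-dependent. All alignments are in bits, as in the RTL dump (A8).

```c
/* Hypothetical stand-in for MAX_SUPPORTED_STACK_ALIGNMENT, in bits.
   The real value depends on the target; 128 is only an example.  */
#define STACK_ALIGN_CAP 128u

static unsigned int
max_u (unsigned int a, unsigned int b)
{
  return a > b ? a : b;
}

static unsigned int
min_u (unsigned int a, unsigned int b)
{
  return a < b ? a : b;
}

/* KNOWN_ALIGN models the guaranteed alignment of the operand (what
   get_object_alignment or TYPE_ALIGN would report).  WANTED_ALIGN models
   the mode alignment for a non-BLKmode operand, or the type alignment
   for a BLKmode one.  STRICT is nonzero on a strict-alignment target.  */
unsigned int
pick_slot_align (unsigned int known_align, unsigned int wanted_align,
                 int strict)
{
  unsigned int align = known_align;
  if (strict)
    {
      /* Raise to what later passes expect, then cap the result.  */
      align = max_u (align, wanted_align);
      align = min_u (align, STACK_ALIGN_CAP);
    }
  return align;
}
```

With these illustrative numbers, an operand with 8-bit guaranteed
alignment and a 512-bit type alignment gets a 128-bit slot when strict
alignment is in effect, and keeps its 8-bit alignment otherwise.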
> > >
> > > Bootstrapped and tested on aarch64-linux-gnu and x86_64-linux-gnu.
> > >
> > > gcc/ChangeLog:
> > >
> > > PR middle-end/123447
> > > * expr.cc (expand_expr_real_1): Increase stack slot alignment
> > > for STRICT_ALIGNMENT targets.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > PR middle-end/123447
> > > * gcc.dg/pr123447.c: New test.
> > > ---
> > > gcc/expr.cc | 21 +++++++++++++++++----
> > > gcc/testsuite/gcc.dg/pr123447.c | 19 +++++++++++++++++++
> > > 2 files changed, 36 insertions(+), 4 deletions(-)
> > > create mode 100644 gcc/testsuite/gcc.dg/pr123447.c
> > >
> > > diff --git a/gcc/expr.cc b/gcc/expr.cc
> > > index b6d593d09a2..d3dad6c8041 100644
> > > --- a/gcc/expr.cc
> > > +++ b/gcc/expr.cc
> > > @@ -12336,13 +12336,26 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
> > > and need be, put it there. */
> > > else if (CONSTANT_P (op0) || (!MEM_P (op0) && must_force_mem))
> > > {
> > > + machine_mode tem_mode = TYPE_MODE (TREE_TYPE (tem));
> > > poly_int64 size;
> > > if (!poly_int_tree_p (TYPE_SIZE_UNIT (TREE_TYPE (tem)), &size))
> > > size = max_int_size_in_bytes (TREE_TYPE (tem));
> > > - memloc = assign_stack_local (TYPE_MODE (TREE_TYPE (tem)), size,
> > > - TREE_CODE (tem) == SSA_NAME
> > > - ? TYPE_ALIGN (TREE_TYPE (tem))
> > > - : get_object_alignment (tem));
> > > + unsigned int align = TREE_CODE (tem) == SSA_NAME
> > > + ? TYPE_ALIGN (TREE_TYPE (tem))
> > > + : get_object_alignment (tem);
> > > + if (STRICT_ALIGNMENT)
> > > + {
> > > + /* For STRICT_ALIGNMENT targets, when we force the operand to
> > > + memory, we may need to increase the alignment to meet the
> > > + expectation in later RTL lowering passes. The increased
> > > + alignment is capped by MAX_SUPPORTED_STACK_ALIGNMENT. */
> > > + if (tem_mode != BLKmode)
> > > + align = MAX (align, GET_MODE_ALIGNMENT (tem_mode));
> > > + else
> > > + align = MAX (align, TYPE_ALIGN (TREE_TYPE (tem)));
> > > + align = MIN (align, (unsigned) MAX_SUPPORTED_STACK_ALIGNMENT);
> > > + }
> > > + memloc = assign_stack_local (tem_mode, size, align);
> > > emit_move_insn (memloc, op0);
> > > op0 = memloc;
> > > clear_mem_expr = true;
> > > diff --git a/gcc/testsuite/gcc.dg/pr123447.c b/gcc/testsuite/gcc.dg/pr123447.c
> > > new file mode 100644
> > > index 00000000000..b2ee1473758
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/pr123447.c
> > > @@ -0,0 +1,19 @@
> > > +/* PR middle-end/123447 */
> > > +/* { dg-do compile { target aarch64*-*-* } } */
> > > +/* { dg-options "-O2 -mstrict-align" } */
> > > +
> > > +typedef __attribute__((__vector_size__(32))) _Decimal64 D;
> > > +typedef __attribute__((__vector_size__(64))) int V;
> > > +typedef __attribute__((__vector_size__(64))) _Decimal64 D64;
> > > +
> > > +D d;
> > > +
> > > +void foo1 () {
> > > + D _4;
> > > + D64 _5;
> > > + V _1;
> > > + _1 = (V) { 9, -64497, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
> > > + _5 = (D64) _1;
> > > + _4 = __builtin_shufflevector (_5, _5, 0, 1, 2, 3);
> > > + d = _4;
> > > +}
> > >
>
--
Richard Biener <[email protected]>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Jochen Jaser, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)