RE: [PATCH]middle-end: Fix store_bit_field expansions of vector constructors [PR120718]

Tamar Christina Tue, 24 Jun 2025 18:56:31 -0700

> -----Original Message-----
> From: Richard Biener <rguent...@suse.de>
> Sent: Tuesday, June 24, 2025 9:58 AM
> To: Tamar Christina <tamar.christ...@arm.com>
> Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; Richard Sandiford
> <richard.sandif...@arm.com>
> Subject: Re: [PATCH]middle-end: Fix store_bit_field expansions of vector
> constructors [PR120718]
> 
> On Tue, 24 Jun 2025, Tamar Christina wrote:
> 
> > store_bit_field_1 has an optimization where if a target is not a memory 
> > operand
> > and the entire value is being set from something larger we can just wrap a
> > subreg around the source and emit a move.
> >
> > For vector constructors this is however problematic because the subreg means
> > that the expansion of the constructor won't happen through vec_init anymore.
> 
> But the expansion of the constructor happened already?
>


No, the RTL here is a CONST_VECTOR with one const_int and one const_poly_int.
If expansion had happened they would have been converted into a vec_merge 
because
It's not an expression the target can handle because it's not actually a 
constant.

So expansion definitely did not happen.

> > Complicated constructors which aren't natively supported by targets then 
> > ICE as
> > they wouldn't have been expanded so recog fails.
> >
> > This patch blocks the optimization on non-constant vector constructors. Or 
> > non-
> uniform
> > non-constant vectors.
> 
> +static bool
> +foldable_value_with_subreg (rtx value)
> +{
> +  if (GET_CODE (value) != CONST_VECTOR || const_vec_duplicate_p (value))
> +    return true;
> 
> that seems to allow any non-CONST_VECTOR, thus doesn't block
> non-constant vector constructors?
> 

Non-constant vectors don't reach here normally.  The problem is that
a const_poly_int is not a compile time constant.  But it's allowed inside
a CONST_VECTOR.

> It sounds like the problem happens upthread of store_bit_field_1?
> 

To answer that the question is whether a const_poly_int should be allowed
inside of a CONST_VECTOR.

Tamar

>  I allowed constant vectors because if I read the code right
> > simplify-rtx should be able to perform the simplification of pulling out the
> element
> > or merging the constant values.  There are several testcases in aarch64-sve-
> pcs.exp
> > that test this as well. I allowed uniform non-constant vectors because they
> > would be folded into a vec_select later on.
> >
> > Note that codegen is quite horrible, for what should only be an lsr.  But 
> > I'll
> > address that separately so that this patch is backportable.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu,
> > arm-none-linux-gnueabihf, x86_64-pc-linux-gnu
> > -m32, -m64 and no issues.
> >
> > Ok for master? and GCC 15, 14, 13?
> >
> > Thanks,
> > Tamar
> >
> >
> > gcc/ChangeLog:
> >
> >     PR target/120718
> >     * expmed.cc (store_bit_field_1): Only push subreg over uniform vector
> >     constructors.
> >     (foldable_value_with_subreg): New.
> >
> > gcc/testsuite/ChangeLog:
> >
> >     PR target/120718
> >     * gcc.target/aarch64/sve/pr120718.c: New test.
> >
> > ---
> > diff --git a/gcc/expmed.cc b/gcc/expmed.cc
> > index
> be427dca5d9afeed2013954472dde3a5430169e0..a468aa5c0c3f20bd62a7afc1d
> 245d64e87be5396 100644
> > --- a/gcc/expmed.cc
> > +++ b/gcc/expmed.cc
> > @@ -740,6 +740,28 @@ store_bit_field_using_insv (const extraction_insn
> *insv, rtx op0,
> >    return false;
> >  }
> >
> > +/* For non-constant vectors wrapping a subreg around the RTX will not make
> > +   the expression expand properly through vec_init.  For constant vectors
> > +   we can because simplification can just extract the element out by
> > +   by merging the values.  This can be done by simplify-rtx and so the
> > +   subreg will be eliminated.  However poly constants require vec_init as
> > +   they are a runtime value.  So only allow the subreg for simple integer
> > +   or floating point constants.  */
> > +
> > +static bool
> > +foldable_value_with_subreg (rtx value)
> > +{
> > +  if (GET_CODE (value) != CONST_VECTOR || const_vec_duplicate_p (value))
> > +    return true;
> > +
> > +  for (unsigned i = 0; i < const_vector_encoded_nelts (value); i++)
> > +    if (!CONST_INT_P (const_vector_elt (value, i))
> > +   && !CONST_DOUBLE_P (const_vector_elt (value, i)))
> > +      return false;
> > +
> > +  return true;
> > +}
> > +
> >  /* A subroutine of store_bit_field, with the same arguments.  Return true
> >     if the operation could be implemented.
> >
> > @@ -795,7 +817,8 @@ store_bit_field_1 (rtx str_rtx, poly_uint64 bitsize,
> poly_uint64 bitnum,
> >    /* If the target is a register, overwriting the entire object, or storing
> >       a full-word or multi-word field can be done with just a SUBREG.  */
> >    if (!MEM_P (op0)
> > -      && known_eq (bitsize, GET_MODE_BITSIZE (fieldmode)))
> > +      && known_eq (bitsize, GET_MODE_BITSIZE (fieldmode))
> > +      && foldable_value_with_subreg (value))
> >      {
> >        /* Use the subreg machinery either to narrow OP0 to the required
> >      words or to cope with mode punning between equal-sized modes.
> > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr120718.c
> b/gcc/testsuite/gcc.target/aarch64/sve/pr120718.c
> > new file mode 100644
> > index
> 0000000000000000000000000000000000000000..cb21d94792f0679a48cc20
> c3dcdf78c89c05a5c6
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr120718.c
> > @@ -0,0 +1,13 @@
> > +/* { dg-do compile } */
> > +/* { dg-additional-options "-O3" } */
> > +
> > +#include <arm_sve.h>
> > +typedef int __attribute__((vector_size(8))) v2si;
> > +typedef struct { int x; int y; } A;
> > +void bar(A a);
> > +void foo()
> > +{
> > +    A a;
> > +    *(v2si *)&a = (v2si){0, (int)svcntd_pat(SV_ALL)};
> > +    bar(a);
> > +}
> >
> >
> >
> 
> --
> Richard Biener <rguent...@suse.de>
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

RE: [PATCH]middle-end: Fix store_bit_field expansions of vector constructors [PR120718]

Reply via email to