RE: [PATCH]middle-end: Fix store_bit_field expansions of vector constructors [PR120718]

Richard Biener Tue, 24 Jun 2025 02:15:18 -0700

On Tue, 24 Jun 2025, Tamar Christina wrote:

> > -----Original Message-----
> > From: Richard Biener <rguent...@suse.de>
> > Sent: Tuesday, June 24, 2025 9:58 AM
> > To: Tamar Christina <tamar.christ...@arm.com>
> > Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; Richard Sandiford
> > <richard.sandif...@arm.com>
> > Subject: Re: [PATCH]middle-end: Fix store_bit_field expansions of vector
> > constructors [PR120718]
> > 
> > On Tue, 24 Jun 2025, Tamar Christina wrote:
> > 
> > > store_bit_field_1 has an optimization where if a target is not a memory 
> > > operand
> > > and the entire value is being set from something larger we can just wrap a
> > > subreg around the source and emit a move.
> > >
> > > For vector constructors this is however problematic because the subreg 
> > > means
> > > that the expansion of the constructor won't happen through vec_init 
> > > anymore.
> > 
> > But the expansion of the constructor happened already?
> > 
> 
> No, the RTL here is a CONST_VECTOR with one const_int and one const_poly_int.
> If expansion had happened they would have been converted into a vec_merge 
> because
> It's not an expression the target can handle because it's not actually a 
> constant.
> 
> So expansion definitely did not happen.


Well, it's RTL, not a tree CTOR.  Isn't the issue we are creating
a CONST_VECTOR that isn't a CONST_VECTOR in the first place then?

> > > Complicated constructors which aren't natively supported by targets then 
> > > ICE as
> > > they wouldn't have been expanded so recog fails.
> > >
> > > This patch blocks the optimization on non-constant vector constructors. 
> > > Or non-
> > uniform
> > > non-constant vectors.
> > 
> > +static bool
> > +foldable_value_with_subreg (rtx value)
> > +{
> > +  if (GET_CODE (value) != CONST_VECTOR || const_vec_duplicate_p (value))
> > +    return true;
> > 
> > that seems to allow any non-CONST_VECTOR, thus doesn't block
> > non-constant vector constructors?
> > 
> 
> Non-constant vectors don't reach here normally.  The problem is that
> a const_poly_int is not a compile time constant.  But it's allowed inside
> a CONST_VECTOR.

Why's that then?  Who's "lowering" a CONST_VECTOR with a POLY_INT
to something we can handle?

It must be you'll be playing whack-a-mole here if everybody needs to
know some COSNT_VECTORs are not CONST_VECTORs ...

Richard.

> > It sounds like the problem happens upthread of store_bit_field_1?
> > 
> 
> To answer that the question is whether a const_poly_int should be allowed
> inside of a CONST_VECTOR.
>
> Tamar
> 
> >  I allowed constant vectors because if I read the code right
> > > simplify-rtx should be able to perform the simplification of pulling out 
> > > the
> > element
> > > or merging the constant values.  There are several testcases in 
> > > aarch64-sve-
> > pcs.exp
> > > that test this as well. I allowed uniform non-constant vectors because 
> > > they
> > > would be folded into a vec_select later on.
> > >
> > > Note that codegen is quite horrible, for what should only be an lsr.  But 
> > > I'll
> > > address that separately so that this patch is backportable.
> > >
> > > Bootstrapped Regtested on aarch64-none-linux-gnu,
> > > arm-none-linux-gnueabihf, x86_64-pc-linux-gnu
> > > -m32, -m64 and no issues.
> > >
> > > Ok for master? and GCC 15, 14, 13?
> > >
> > > Thanks,
> > > Tamar
> > >
> > >
> > > gcc/ChangeLog:
> > >
> > >   PR target/120718
> > >   * expmed.cc (store_bit_field_1): Only push subreg over uniform vector
> > >   constructors.
> > >   (foldable_value_with_subreg): New.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >   PR target/120718
> > >   * gcc.target/aarch64/sve/pr120718.c: New test.
> > >
> > > ---
> > > diff --git a/gcc/expmed.cc b/gcc/expmed.cc
> > > index
> > be427dca5d9afeed2013954472dde3a5430169e0..a468aa5c0c3f20bd62a7afc1d
> > 245d64e87be5396 100644
> > > --- a/gcc/expmed.cc
> > > +++ b/gcc/expmed.cc
> > > @@ -740,6 +740,28 @@ store_bit_field_using_insv (const extraction_insn
> > *insv, rtx op0,
> > >    return false;
> > >  }
> > >
> > > +/* For non-constant vectors wrapping a subreg around the RTX will not 
> > > make
> > > +   the expression expand properly through vec_init.  For constant vectors
> > > +   we can because simplification can just extract the element out by
> > > +   by merging the values.  This can be done by simplify-rtx and so the
> > > +   subreg will be eliminated.  However poly constants require vec_init as
> > > +   they are a runtime value.  So only allow the subreg for simple integer
> > > +   or floating point constants.  */
> > > +
> > > +static bool
> > > +foldable_value_with_subreg (rtx value)
> > > +{
> > > +  if (GET_CODE (value) != CONST_VECTOR || const_vec_duplicate_p (value))
> > > +    return true;
> > > +
> > > +  for (unsigned i = 0; i < const_vector_encoded_nelts (value); i++)
> > > +    if (!CONST_INT_P (const_vector_elt (value, i))
> > > + && !CONST_DOUBLE_P (const_vector_elt (value, i)))
> > > +      return false;
> > > +
> > > +  return true;
> > > +}
> > > +
> > >  /* A subroutine of store_bit_field, with the same arguments.  Return true
> > >     if the operation could be implemented.
> > >
> > > @@ -795,7 +817,8 @@ store_bit_field_1 (rtx str_rtx, poly_uint64 bitsize,
> > poly_uint64 bitnum,
> > >    /* If the target is a register, overwriting the entire object, or 
> > > storing
> > >       a full-word or multi-word field can be done with just a SUBREG.  */
> > >    if (!MEM_P (op0)
> > > -      && known_eq (bitsize, GET_MODE_BITSIZE (fieldmode)))
> > > +      && known_eq (bitsize, GET_MODE_BITSIZE (fieldmode))
> > > +      && foldable_value_with_subreg (value))
> > >      {
> > >        /* Use the subreg machinery either to narrow OP0 to the required
> > >    words or to cope with mode punning between equal-sized modes.
> > > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr120718.c
> > b/gcc/testsuite/gcc.target/aarch64/sve/pr120718.c
> > > new file mode 100644
> > > index
> > 0000000000000000000000000000000000000000..cb21d94792f0679a48cc20
> > c3dcdf78c89c05a5c6
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr120718.c
> > > @@ -0,0 +1,13 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-additional-options "-O3" } */
> > > +
> > > +#include <arm_sve.h>
> > > +typedef int __attribute__((vector_size(8))) v2si;
> > > +typedef struct { int x; int y; } A;
> > > +void bar(A a);
> > > +void foo()
> > > +{
> > > +    A a;
> > > +    *(v2si *)&a = (v2si){0, (int)svcntd_pat(SV_ALL)};
> > > +    bar(a);
> > > +}
> > >
> > >
> > >
> > 
> > --
> > Richard Biener <rguent...@suse.de>
> > SUSE Software Solutions Germany GmbH,
> > Frankenstrasse 146, 90461 Nuernberg, Germany;
> > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

RE: [PATCH]middle-end: Fix store_bit_field expansions of vector constructors [PR120718]

Reply via email to