Richard Biener <rguent...@suse.de> writes:
> On Tue, 24 Jun 2025, Richard Sandiford wrote:
>
>> Tamar Christina <tamar.christ...@arm.com> writes:
>> > store_bit_field_1 has an optimization where if a target is not a memory 
>> > operand
>> > and the entire value is being set from something larger we can just wrap a
>> > subreg around the source and emit a move.
>> >
>> > For vector constructors this is however problematic because the subreg 
>> > means
>> > that the expansion of the constructor won't happen through vec_init 
>> > anymore.
>> >
>> > Complicated constructors which aren't natively supported by targets then 
>> > ICE as
>> > they wouldn't have been expanded so recog fails.
>> >
>> > This patch blocks the optimization on non-constant vector constructors. Or 
>> > non-uniform
>> > non-constant vectors. I allowed constant vectors because if I read the 
>> > code right
>> > simplify-rtx should be able to perform the simplification of pulling out 
>> > the element
>> > or merging the constant values.  There are several testcases in 
>> > aarch64-sve-pcs.exp
>> > that test this as well. I allowed uniform non-constant vectors because they
>> > would be folded into a vec_select later on.
>> >
>> > Note that codegen is quite horrible, for what should only be an lsr.  But 
>> > I'll
>> > address that separately so that this patch is backportable.
>> >
>> > Bootstrapped Regtested on aarch64-none-linux-gnu,
>> > arm-none-linux-gnueabihf, x86_64-pc-linux-gnu
>> > -m32, -m64 and no issues.
>> >
>> > Ok for master? and GCC 15, 14, 13?
>> 
>> I was discussing this Alex off-list last week, and the fix we talked
>> about there was:
>> 
>> diff --git a/gcc/explow.cc b/gcc/explow.cc
>> index 7799a98053b..8b138f54f75 100644
>> --- a/gcc/explow.cc
>> +++ b/gcc/explow.cc
>> @@ -753,7 +753,7 @@ force_subreg (machine_mode outermode, rtx op,
>>            machine_mode innermode, poly_uint64 byte)
>>  {
>>    rtx x = simplify_gen_subreg (outermode, op, innermode, byte);
>> -  if (x)
>> +  if (x && (!SUBREG_P (x) || REG_P (SUBREG_REG (x))))
>>      return x;
>>  
>>    auto *start = get_last_insn ();
>> 
>> The justification is that force_subreg is somewhat like a "subreg
>> version of force_operand", and so should try to avoid returning
>> subregs that force_operand would have replaced.  The force_operand
>> code I mean is:
>
> Yeah, in particular CONSTANT_P isn't sth documented as valid as
> subreg operands, only registers (and memory) are.  But isn't this
> then a bug in simplify_gen_subreg itself, that it creates a SUBREG
> of a non-REG/MEM?

I don't think the documentation is correct/up-to-date.  subreg is
de facto used as a general operation, and for example there are
patterns like:

(define_insn ""
  [(set (match_operand:QI 0 "general_operand_dst" "=rm,Za,Zb,Zc,Zd,Ze,Zf,Zh,Zg")
        (subreg:QI (lshiftrt:SI (match_operand:SI 1 "register_operand" 
"r,Z0,Z1,Z2,Z3,Z4,Z5,Z6,Z7")
                                (const_int 16)) 3))
   (clobber (match_scratch:SI 2 "=&r,&r,&r,&r,&r,&r,&r,&r,&r"))
   (clobber (reg:CC CC_REG))]
  ""
  "mov.w\\t%e1,%f2\;mov.b\\t%w2,%R0"
  [(set_attr "length" "10")])

(from h8300).  This is also why simplify_gen_subreg has:

  if (GET_CODE (op) == SUBREG
      || GET_CODE (op) == CONCAT
      || GET_MODE (op) == VOIDmode)
    return NULL_RTX;

  if (MODE_COMPOSITE_P (outermode)
      && (CONST_SCALAR_INT_P (op)
          || CONST_DOUBLE_AS_FLOAT_P (op)
          || CONST_FIXED_P (op)
          || GET_CODE (op) == CONST_VECTOR))
    return NULL_RTX;

rather than the !REG_P (op) && !MEM_P (op) that the documentation
would imply.

To take another example that I happened to be working on today,
LRA eliminates registers recursively, so if we have:

  (subreg:SI (reg:DI R) 0)

for an eliminable register R, we end up with:

  (subreg:SI (plus:DI (reg:DI R') (const_int N)) 0)

I wouldn't object to trying to change this, but

(a) if we do, we should also restrict it to just REGs, not REGs and MEMs

(b) it would be far too invasive to backport

(c) it might be an ongoing whack-a-mole project, a bit like the recent
    tightening of validate_subreg

But I agree it would make things cleaner in some ways.

Thanks,
Richard

Reply via email to