Ping
Richard Sandiford <richard.sandif...@arm.com> writes:
> gcc.target/aarch64/pr108840.c has failed since r15-268-g9dbff9c05520
> (which means that I really ought to have looked at it earlier).
>
> The test wants us to fold an SImode AND into all shifts that use it.
> This is something that late-combine is supposed to do, but:
>
> (1) the pre-RA pass chickened out because of a register pressure check
>
> (2) the post-RA pass can't handle it, because the shift uses are in
>     QImode and the sets are in SImode
>
> Both are things that would be good to fix.  But (1) is particularly
> silly.  The constraints on the shift have "rk" for the destination
> (so allowing the stack pointer) and "r" for the first source.
> Including the stack pointer made the destination seem more permissive
> than the source.
>
> The intention was instead to check whether there are any
> *allocatable* registers in the destination class that aren't
> present in the source.
>
> That's enough for all tests but the last one.  The last one still
> fails because combine merges the final shift with the move into
> the hard return register, giving an arithmetic instruction with
> a hard register destination.  Pre-RA late-combine currently punts
> on those, again due to register pressure concerns.  That too is
> something I'd like to relax, but not for GCC 15.  In the interim,
> the best thing seems to be to disable combine for the test.
>
> Bootstrapped & regression-tested on aarch64-linux-gnu and
> x86_64-linux-gnu.  OK to install?
>
> Richard
>
>
> gcc/
> 	PR rtl-optimization/108840
> 	* late-combine.cc (late_combine::check_register_pressure):
> 	Take only allocatable registers into account when checking
> 	the permissiveness of register classes.
>
> gcc/testsuite/
> 	PR rtl-optimization/108840
> 	* gcc.target/aarch64/pr108840.c: Run at -O2 but disable combine.
> ---
>  gcc/late-combine.cc                         | 10 ++++++++--
>  gcc/testsuite/gcc.target/aarch64/pr108840.c |  2 +-
>  2 files changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/late-combine.cc b/gcc/late-combine.cc
> index 1707ceebd5f..90d7ef09583 100644
> --- a/gcc/late-combine.cc
> +++ b/gcc/late-combine.cc
> @@ -552,8 +552,14 @@ late_combine::check_register_pressure (insn_info *insn, rtx set)
>  	  // Make sure that the source operand's class is at least as
>  	  // permissive as the destination operand's class.
>  	  auto src_class = alternative_class (alt, i);
> -	  if (!reg_class_subset_p (dest_class, src_class))
> -	    return false;
> +	  if (dest_class != src_class)
> +	    {
> +	      auto extra_dest_regs = (reg_class_contents[dest_class]
> +				      & ~reg_class_contents[src_class]
> +				      & ~fixed_reg_set);
> +	      if (!hard_reg_set_empty_p (extra_dest_regs))
> +		return false;
> +	    }
>  
>  	  // Make sure that the source operand occupies no more hard
>  	  // registers than the destination operand.  This mostly matters
> diff --git a/gcc/testsuite/gcc.target/aarch64/pr108840.c b/gcc/testsuite/gcc.target/aarch64/pr108840.c
> index 804c1cd9156..7e1ea6fa4fe 100644
> --- a/gcc/testsuite/gcc.target/aarch64/pr108840.c
> +++ b/gcc/testsuite/gcc.target/aarch64/pr108840.c
> @@ -1,6 +1,6 @@
>  /* PR target/108840.  Check that the explicit &31 is eliminated.  */
>  /* { dg-do compile } */
> -/* { dg-options "-O" } */
> +/* { dg-options "-O2 -fno-tree-vectorize -fdisable-rtl-combine" } */
>  
>  int
>  foo (int x, int y)
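
For anyone skimming the patch, the difference between the old and new
permissiveness checks can be sketched outside GCC with a toy register
model.  Everything below is an assumption made for illustration: the
8-register file, the *_MODEL constants, and the two helper functions
are hypothetical stand-ins, not GCC's HARD_REG_SET, reg_class_contents
or fixed_reg_set.  Bit 7 plays the role of a fixed stack pointer, so
the "rk"-like class differs from the "r"-like class only by a register
that the allocator can never hand out.

```cpp
#include <bitset>
#include <cassert>

// Toy model (hypothetical): 8 hard registers, bit 7 is the stack pointer.
using reg_set = std::bitset<8>;

const reg_set R_MODEL (0x7f);      // stands in for "r": r0..r6
const reg_set RK_MODEL (0xff);     // stands in for "rk": r0..r6 plus sp
const reg_set FIXED_MODEL (0x80);  // sp is fixed, so never allocatable

// Old-style check: every destination register must also be a source
// register, whether or not it is allocatable.
bool
subset_check (reg_set dest, reg_set src)
{
  return (dest & ~src).none ();
}

// New-style check: only allocatable (non-fixed) destination registers
// that are missing from the source class count against the combination.
bool
allocatable_check (reg_set dest, reg_set src, reg_set fixed)
{
  return (dest & ~src & ~fixed).none ();
}

// Small self-check of the model; call it from a test harness or main.
void
check_model ()
{
  // "rk" destination, "r" source: rejected by the old check only.
  assert (!subset_check (RK_MODEL, R_MODEL));
  assert (allocatable_check (RK_MODEL, R_MODEL, FIXED_MODEL));
}
```

In this model the old check rejects the shift purely because of sp,
while the new one accepts it.  A destination class with an extra
allocatable register, for example R_MODEL against a source class of
reg_set (0x0f), is still rejected by both.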