On Sun, Jul 21, 2024 at 11:47 PM Stefan Schulze Frielinghaus <stefa...@gcc.gnu.org> wrote: > > It is not trivial to decide when a write of a register pair terminates > or starts a new chain. For example, prior regrename we have > > (insn 91 38 36 5 (set (reg:FPRX2 16 %f0 [orig:76 x ] [76]) > (const_double:FPRX2 0.0 [0x0.0p+0])) > "float-cast-overflow-7-reduced.c":5:55 discrim 2 1507 {*movfprx2_64} > (expr_list:REG_EQUAL (const_double:FPRX2 0.0 [0x0.0p+0]) > (nil))) > (insn 36 91 37 5 (set (subreg:DF (reg:FPRX2 16 %f0 [orig:76 x ] [76]) 0) > (mem/c:DF (plus:DI (reg/f:DI 15 %r15) > (const_int 160 [0xa0])) [7 %sfp+-32 S8 A64])) > "float-cast-overflow-7-reduced.c":5:55 discrim 2 1512 {*movdf_64dfp} > (nil)) > (insn 37 36 43 5 (set (subreg:DF (reg:FPRX2 16 %f0 [orig:76 x ] [76]) 8) > (mem/c:DF (plus:DI (reg/f:DI 15 %r15) > (const_int 168 [0xa8])) [7 %sfp+-24 S8 A64])) > "float-cast-overflow-7-reduced.c":5:55 discrim 2 1512 {*movdf_64dfp} > (nil)) > > where insn 91 writes both registers of a register pair and it is clear > that an existing chain must be terminated and a new started. Insn 36 > and 37 write only into one register of a corresponding register pair. > For each write on its own it is not obvious when to terminate an > existing chain and to start a new one. In other words, once insn 36 > materializes and 37 didn't we are kind of in a limbo state. Tracking > this correctly is inherently hard and I'm not entirely sure whether > optimizations could even lead to more complicated cases where it is even > less clear when a chain terminates and a new has to be started. > Therefore, skip renaming of register pairs. > > Bootstrapped and regtested on x86_64, aarch64, powerpc64le, and s390. > Ok for mainline? > > This fixes on s390: > FAIL: g++.dg/cpp23/ext-floating14.C -std=gnu++23 execution test > FAIL: g++.dg/cpp23/ext-floating14.C -std=gnu++26 execution test > FAIL: c-c++-common/ubsan/float-cast-overflow-7.c -O2 execution test > FAIL: c-c++-common/ubsan/float-cast-overflow-7.c -O2 -flto > -fno-use-linker-plugin -flto-partition=none execution test > FAIL: c-c++-common/ubsan/float-cast-overflow-7.c -O2 -flto > -fuse-linker-plugin -fno-fat-lto-objects execution test > FAIL: gcc.dg/torture/fp-int-convert-float128-ieee-timode.c -O0 execution > test > FAIL: gcc.dg/torture/fp-int-convert-float128-ieee-timode.c -O1 execution > test > FAIL: gcc.dg/torture/fp-int-convert-float128-ieee-timode.c -O2 execution > test > FAIL: gcc.dg/torture/fp-int-convert-float128-ieee-timode.c -O2 -flto > -fno-use-linker-plugin -flto-partition=none execution test > FAIL: gcc.dg/torture/fp-int-convert-float128-ieee-timode.c -O2 -flto > -fuse-linker-plugin -fno-fat-lto-objects execution test > FAIL: gcc.dg/torture/fp-int-convert-float128-ieee-timode.c -O3 -g > execution test > FAIL: gcc.dg/torture/fp-int-convert-float128-ieee-timode.c -Os execution > test > FAIL: gcc.dg/torture/fp-int-convert-float64x-timode.c -O0 execution test > FAIL: gcc.dg/torture/fp-int-convert-float64x-timode.c -O1 execution test > FAIL: gcc.dg/torture/fp-int-convert-float64x-timode.c -O2 execution test > FAIL: gcc.dg/torture/fp-int-convert-float64x-timode.c -O2 -flto > -fno-use-linker-plugin -flto-partition=none execution test > FAIL: gcc.dg/torture/fp-int-convert-float64x-timode.c -O2 -flto > -fuse-linker-plugin -fno-fat-lto-objects execution test > FAIL: gcc.dg/torture/fp-int-convert-float64x-timode.c -O3 -g execution test > FAIL: gcc.dg/torture/fp-int-convert-float64x-timode.c -Os execution test > FAIL: gcc.dg/torture/fp-int-convert-timode.c -O0 execution test > FAIL: gcc.dg/torture/fp-int-convert-timode.c -O1 execution test > FAIL: gcc.dg/torture/fp-int-convert-timode.c -O2 execution test > FAIL: gcc.dg/torture/fp-int-convert-timode.c -O2 -flto > -fno-use-linker-plugin -flto-partition=none execution test > FAIL: gcc.dg/torture/fp-int-convert-timode.c -O2 -flto -fuse-linker-plugin > -fno-fat-lto-objects execution test > FAIL: gcc.dg/torture/fp-int-convert-timode.c -O3 -g execution test > FAIL: gcc.dg/torture/fp-int-convert-timode.c -Os execution test > FAIL: gfortran.dg/pr96711.f90 -O0 execution test > FAIL: TestSignalForwardingExternal > FAIL: go test misc/cgo/testcarchive > FAIL: libffi.closures/nested_struct5.c -W -Wall -Wno-psabi -O2 output pattern > test > FAIL: libphobos.phobos/std/algorithm/mutation.d execution test > FAIL: libphobos.phobos/std/conv.d execution test > FAIL: libphobos.phobos/std/internal/math/errorfunction.d execution test > FAIL: libphobos.phobos/std/variant.d execution test > FAIL: libphobos.phobos_shared/std/algorithm/mutation.d execution test > FAIL: libphobos.phobos_shared/std/conv.d execution test > FAIL: libphobos.phobos_shared/std/internal/math/errorfunction.d execution test > FAIL: libphobos.phobos_shared/std/variant.d execution test > > gcc/ChangeLog: > > PR rtl-optimiztion/115860 > * regrename.cc (scan_rtx_reg): Do not try to rename register > pairs. > --- > gcc/regrename.cc | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/gcc/regrename.cc b/gcc/regrename.cc > index 054e601740b..6ae5a2309d0 100644 > --- a/gcc/regrename.cc > +++ b/gcc/regrename.cc > @@ -1113,6 +1113,10 @@ scan_rtx_reg (rtx_insn *insn, rtx *loc, enum reg_class > cl, enum scan_actions act > > c = create_new_chain (this_regno, this_nregs, loc, insn, cl); > > + /* Give up early in case of register pairs. */ > + if (this_nregs != 1) > + c->cannot_rename = 1;
I am a bit worried this will make TImode (and DImode for 32bit targets) worse. And it might make aarch64's vector struct types much worse than they are currently. It is interesting how there is a subreg of a hardregister after reload showing up here. Is that on purpose? They come from: ``` (define_insn "*tf_to_fprx2_0" [(set (subreg:DF (match_operand:FPRX2 0 "nonimmediate_operand" "+f") 0) (subreg:DF (match_operand:TF 1 "general_operand" "v") 0))] ... (define_insn "*tf_to_fprx2_1" [(set (subreg:DF (match_operand:FPRX2 0 "nonimmediate_operand" "+f") 8) (subreg:DF (match_operand:TF 1 "general_operand" "v") 8))] ``` I am not sure if that is a valid thing to do. s390 backend is the only one that has insn patterns like this. all that uses "+" use either strict_lowpart of zero_extract for the lhs or just a pure set. Maybe there is a better way of representing this. Maybe using unspec here? Thanks, Andrew Pinski > + > /* We try to tie chains in a move instruction for > a single output. */ > if (recog_data.n_operands == 2 > -- > 2.45.2 >