[Bug target/92303] [10 regression] gcc.target/sparc/ultrasp12.c times out
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92303 --- Comment #11 from Vladimir Makarov --- Jakub, thank you for the analysis. I've been working on this PR too. I hope the patch will be ready on Friday or at the beginning of the next week.
[Bug target/92303] [10 regression] gcc.target/sparc/ultrasp12.c times out
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92303 --- Comment #10 from Jakub Jelinek ---
So, quite early during LRA we get:

(insn 14 13 18 3 (set (reg:DI 9 %o1) (zero_extend:DI (subreg:SI (mem/c:V8QI (plus:DI (reg/f:DI 101 %sfp) (const_int -8 [0xfff8])) [4 %sfp+-8 S8 A64]) 4))) "ultrasp13.c":23:7 179 {*zero_extendsidi2_insn_sp64} (nil))
(insn 18 14 15 3 (set (reg:DI 111 [ ivtmp.11 ]) (plus:DI (reg:DI 111 [ ivtmp.11 ]) (const_int 8 [0x8]))) "ultrasp13.c":17:3 224 {*adddi3_sp64} (nil))
(insn 15 18 16 3 (set (reg:DI 8 %o0) (zero_extend:DI (subreg:SI (mem/c:V8QI (plus:DI (reg/f:DI 101 %sfp) (const_int -8 [0xfff8])) [4 %sfp+-8 S8 A64]) 0))) "ultrasp13.c":23:7 179 {*zero_extendsidi2_insn_sp64} (expr_list:REG_DEAD (reg/v:V8QI 114 [ v ]) (nil)))

It isn't clear why the subreg of the mem isn't folded to a mem/c:SI. Further iteration tries to do:

(insn 44 13 45 3 (set (reg:V8QI 126) (mem/c:V8QI (plus:DI (reg/f:DI 101 %sfp) (const_int -8 [0xfff8])) [4 %sfp+-8 S8 A64])) "ultrasp13.c":23:7 470 {*movv8qi_insn_sp64} (nil))
(insn 45 44 14 3 (set (reg:SI 127) (subreg:SI (reg:V8QI 126) 4)) "ultrasp13.c":23:7 116 {*movsi_insn} (expr_list:REG_DEAD (reg:V8QI 126) (nil)))
(insn 14 45 18 3 (set (reg:DI 9 %o1) (zero_extend:DI (reg:SI 127))) "ultrasp13.c":23:7 179 {*zero_extendsidi2_insn_sp64} (expr_list:REG_DEAD (reg:SI 127) (nil)))
(insn 18 14 42 3 (set (reg:DI 111 [ ivtmp.11 ]) (plus:DI (reg:DI 111 [ ivtmp.11 ]) (const_int 8 [0x8]))) "ultrasp13.c":17:3 224 {*adddi3_sp64} (nil))
(insn 42 18 43 3 (set (reg:V8QI 124) (mem/c:V8QI (plus:DI (reg/f:DI 101 %sfp) (const_int -8 [0xfff8])) [4 %sfp+-8 S8 A64])) "ultrasp13.c":23:7 470 {*movv8qi_insn_sp64} (nil))
(insn 43 42 15 3 (set (reg:SI 125) (subreg:SI (reg:V8QI 124) 0)) "ultrasp13.c":23:7 116 {*movsi_insn} (expr_list:REG_DEAD (reg:V8QI 124) (nil)))
(insn 15 43 16 3 (set (reg:DI 8 %o0) (zero_extend:DI (reg:SI 125))) "ultrasp13.c":23:7 179 {*zero_extendsidi2_insn_sp64} (expr_list:REG_DEAD (reg:SI 125) (nil)))

and insn 43 is then reloaded into:

(insn 46 42 43 3 (set (mem/c:V8QI (plus:DI (reg/f:DI 101 %sfp) (const_int -16 [0xfff0])) [4 %sfp+-16 S8 A64]) (reg:V8QI 124)) "ultrasp13.c":23:7 470 {*movv8qi_insn_sp64} (expr_list:REG_DEAD (reg:V8QI 124) (nil)))
(insn 43 46 15 3 (set (reg:SI 125) (subreg:SI (mem/c:V8QI (plus:DI (reg/f:DI 101 %sfp) (const_int -16 [0xfff0])) [4 %sfp+-16 S8 A64]) 0)) "ultrasp13.c":23:7 116 {*movsi_insn} (expr_list:REG_DEAD (reg:V8QI 128 [124]) (nil)))

and then

(insn 47 46 48 3 (set (reg:V8QI 129) (mem/c:V8QI (plus:DI (reg/f:DI 101 %sfp) (const_int -16 [0xfff0])) [4 %sfp+-16 S8 A64])) "ultrasp13.c":23:7 470 {*movv8qi_insn_sp64} (nil))
(insn 48 47 43 3 (set (reg:SI 130) (subreg:SI (reg:V8QI 129) 0)) "ultrasp13.c":23:7 116 {*movsi_insn} (expr_list:REG_DEAD (reg:V8QI 129) (nil)))
(insn 43 48 15 3 (set (reg:SI 125) (reg:SI 130)) "ultrasp13.c":23:7 116 {*movsi_insn} (expr_list:REG_DEAD (reg:SI 130) (nil)))

and now we are back at two steps before with insn 48 now being what insn 43 used to be, and like that forever.
[Bug target/92303] [10 regression] gcc.target/sparc/ultrasp12.c times out
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92303 --- Comment #9 from Jakub Jelinek ---
The reason that the non-lowpart subreg is allowed here is sparc_regmode_natural_size (machine_mode mode), which returns 4 rather than UNITS_PER_WORD for MODE_VECTOR_INT modes (and for MODE_FLOAT too). IRA's decision for pseudo 114 seems to be mem.
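To make the point about the natural size concrete, here is a minimal sketch in plain C. It only models the behaviour described in the comment above and is not the GCC source: the enum, the macro and both function names are hypothetical stand-ins, and the real hook may apply further conditions. Under that assumption, an 8-byte integer-vector value such as V8QI is seen as two 4-byte pieces, which would be why an SImode subreg at byte 0 or byte 4 is accepted.

```c
#include <assert.h>

/* Hypothetical stand-ins for GCC's mode classes; not the real enum. */
enum model_mode_class
{
  MODEL_MODE_INT,
  MODEL_MODE_FLOAT,
  MODEL_MODE_VECTOR_INT
};

/* Word size in bytes on 64-bit SPARC. */
#define MODEL_UNITS_PER_WORD 8

/* Natural size as described in the comment: 4 bytes for float and
   integer-vector modes, the word size otherwise. */
static unsigned int
model_regmode_natural_size (enum model_mode_class mclass)
{
  if (mclass == MODEL_MODE_FLOAT || mclass == MODEL_MODE_VECTOR_INT)
    return 4;
  return MODEL_UNITS_PER_WORD;
}

/* Number of natural-size pieces in a value that is BYTES bytes wide. */
static unsigned int
model_natural_pieces (enum model_mode_class mclass, unsigned int bytes)
{
  unsigned int natural = model_regmode_natural_size (mclass);
  assert (bytes % natural == 0);
  return bytes / natural;
}
```

With this model, model_natural_pieces (MODEL_MODE_VECTOR_INT, 8) is 2, whereas an 8-byte integer value stays in one piece.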
[Bug target/92303] [10 regression] gcc.target/sparc/ultrasp12.c times out
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92303 --- Comment #8 from Jakub Jelinek ---
I think the difference on the reduced testcase between success (first) and hang (second) is:

(insn 10 16 21 3 (set (reg/v:DI 117 [ hl_ ]) (subreg:DI (reg/v:V8QI 114 [ v ]) 0)) "ultrasp13.c":19:105 125 {*movdi_insn_sp64} (nil))
(insn 21 10 17 3 (set (reg:DI 111 [ ivtmp.11 ]) (plus:DI (reg:DI 111 [ ivtmp.11 ]) (const_int 8 [0x8]))) 224 {*adddi3_sp64} (nil))
(insn 17 21 18 3 (set (reg:DI 9 %o1) (zero_extend:DI (subreg:SI (reg/v:V8QI 114 [ v ]) 4))) "ultrasp13.c":23:7 179 {*zero_extendsidi2_insn_sp64} (expr_list:REG_DEAD (reg/v:V8QI 114 [ v ]) (nil)))
(insn 18 17 19 3 (set (reg:DI 8 %o0) (lshiftrt:DI (reg/v:DI 117 [ hl_ ]) (const_int 32 [0x20]))) "ultrasp13.c":23:7 403 {*lshrdi3_sp64} (expr_list:REG_DEAD (reg/v:DI 117 [ hl_ ]) (nil)))

vs.

(insn 14 13 18 3 (set (reg:DI 9 %o1) (zero_extend:DI (subreg:SI (reg/v:V8QI 114 [ v ]) 4))) "ultrasp13.c":23:7 179 {*zero_extendsidi2_insn_sp64} (nil))
(insn 18 14 15 3 (set (reg:DI 111 [ ivtmp.11 ]) (plus:DI (reg:DI 111 [ ivtmp.11 ]) (const_int 8 [0x8]))) 224 {*adddi3_sp64} (nil))
(insn 15 18 16 3 (set (reg:DI 8 %o0) (zero_extend:DI (subreg:SI (reg/v:V8QI 114 [ v ]) 0))) "ultrasp13.c":23:7 179 {*zero_extendsidi2_insn_sp64} (expr_list:REG_DEAD (reg/v:V8QI 114 [ v ]) (nil)))

SPARC is big-endian, so I'd guess the non-lowpart subreg in the zero extension might be a problem. But in the expand dump it shows up both in the good and bad cases, like:

(insn 11 10 12 4 (set (reg:V4QI 119) (subreg:V4QI (reg/v:DI 117 [ hl_ ]) 4)) "ultrasp13.c":23:7 -1 (nil))
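As a small illustration of the lowpart remark, the sketch below computes where the low-order part of a narrower value sits inside a wider one. It is a simplified model, not GCC code: the function names are hypothetical, and it assumes byte and word endianness agree. On a big-endian target the low 4 bytes of an 8-byte value start at byte 4, so in the dumps above the subreg at byte 4 would be the lowpart and the subreg at byte 0 (insn 15 in the hanging sequence) the non-lowpart one.

```c
#include <assert.h>

/* Byte offset of the low-order OUTER_BYTES within a value of INNER_BYTES.
   Simplified model: assumes byte and word endianness are the same. */
static unsigned int
model_lowpart_offset (unsigned int outer_bytes, unsigned int inner_bytes,
                      int big_endian)
{
  assert (outer_bytes <= inner_bytes);
  return big_endian ? inner_bytes - outer_bytes : 0;
}

/* Nonzero if a subreg at byte OFFSET selects the low-order part. */
static int
model_is_lowpart (unsigned int offset, unsigned int outer_bytes,
                  unsigned int inner_bytes, int big_endian)
{
  return offset == model_lowpart_offset (outer_bytes, inner_bytes, big_endian);
}
```

For example, model_is_lowpart (0, 4, 8, 1) is false on this big-endian model, while the same offset would be the lowpart on a little-endian one.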
[Bug target/92303] [10 regression] gcc.target/sparc/ultrasp12.c times out
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92303 --- Comment #7 from Jakub Jelinek ---
On the testcase, richi's change basically just changed the following, 6 times in the function:

   hl_.v_ = a11_334;
-  accvhi4__777 = hl_.hilo_.hi_;
-  accvlo4__778 = hl_.hilo_.lo_;
+  _612 = BIT_FIELD_REF ;
+  _613 = BIT_FIELD_REF ;

or so (plus different SSA_NAME_VERSION). That is given:

typedef unsigned char rc_vec4_type_ __attribute__((__vector_size__(4)));
typedef union {
  rc_vec_t v_;
  struct {
    rc_vec4_type_ hi_, lo_;
  } hilo_;
} RC_hl_type_;

so I don't really see anything wrong with that: a11_334 has type

typedef unsigned char rc_vec_t __attribute__((__vector_size__(8)));

and this is a valid way to extract the low or high half of a vector.

Simplified testcase that still hangs in LRA:

/* { dg-do compile } */
/* { dg-require-effective-target lp64 } */
/* { dg-options "-O2 -mcpu=ultrasparc -mvis" } */

typedef unsigned char rc_vec_t __attribute__((__vector_size__(8)));
typedef short rc_svec_type_ __attribute__((__vector_size__(8)));
typedef unsigned char rc_vec4_type_ __attribute__((__vector_size__(4)));

void foo (unsigned int, unsigned int);

void
bar (rc_vec_t *pv)
{
  rc_vec_t v = {};
  for (int i = 0; i < 64; i++)
    {
      typedef union {
        rc_vec_t v_;
        struct {
          rc_vec4_type_ hi_, lo_;
        } hilo_;
      } RC_hl_type_;
      RC_hl_type_ hl_ = (RC_hl_type_) v;
      rc_vec4_type_ a = hl_.hilo_.hi_;
      rc_vec4_type_ b = hl_.hilo_.lo_;
      union U { rc_vec4_type_ v; unsigned int u; };
      foo (((union U) { .v = a }).u, ((union U) { .v = b }).u);
      v = pv[i];
    }
}
[Bug target/92303] [10 regression] gcc.target/sparc/ultrasp12.c times out
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92303 --- Comment #6 from Jakub Jelinek ---
That IMHO just made a latent issue no longer latent. I'd say it is either an LRA issue or some backend issue related to RA; on a relatively short function, LRA shouldn't take hours.
[Bug target/92303] [10 regression] gcc.target/sparc/ultrasp12.c times out
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92303 --- Comment #5 from Jeffrey A. Law ---
Umm, the issue was bisected to an SCCVN change, so I'm not sure why it is landing on Vlad. Richi or someone familiar with SCCVN needs to take a look.
[Bug target/92303] [10 regression] gcc.target/sparc/ultrasp12.c times out
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92303

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org,
                   |                            |vmakarov at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek ---
Vlad, do you think you could have a look? Seems the function isn't that big, 23 bbs and < 1200 insns before LRA, 1000 pseudos, but during LRA it turns those into 12+ insns and counting, 10GB *.lra log (didn't wait until it completes). Thanks.
[Bug target/92303] [10 regression] gcc.target/sparc/ultrasp12.c times out
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92303

Jeffrey A. Law changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1
             Status|WAITING                     |NEW
                 CC|                            |law at redhat dot com
[Bug target/92303] [10 regression] gcc.target/sparc/ultrasp12.c times out
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92303 --- Comment #3 from ro at CeBiTec dot Uni-Bielefeld.DE ---
> --- Comment #2 from Richard Biener ---
> There's no RA commits in that range, further bisection is needed.

Done now. I've found r272742 to be the culprit:

2019-06-27  Richard Biener

        * tree-ssa-sccvn.c (vn_reference_lookup_3): Encode valueized RHS.

At r272740,

  cc1 -fpreprocessed ultrasp12.i -mptr64 -mstack-bias -mno-v8plus -quiet -m64 -mcpu=ultrasparc -mvis -O2 -o ultrasp12.s

takes less than a second; at r272742 the compilation never terminates.
[Bug target/92303] [10 regression] gcc.target/sparc/ultrasp12.c times out
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92303

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |needs-bisection
             Status|NEW                         |WAITING

--- Comment #2 from Richard Biener ---
There's no RA commits in that range, further bisection is needed.
[Bug target/92303] [10 regression] gcc.target/sparc/ultrasp12.c times out
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92303

Eric Botcazou changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2019-10-31
     Ever confirmed|0                           |1

--- Comment #1 from Eric Botcazou ---
In my experience, LRA is getting slower and slower since GCC 8.
[Bug target/92303] [10 regression] gcc.target/sparc/ultrasp12.c times out
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92303

Rainer Orth changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |10.0