https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
             Status|NEW                           |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org

--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
It seems that loop invariant motion is responsible for most of the abnormals,
thus -fno-tree-loop-im restores performance.  The loop LIM detects is of style

  <bb 6>: (header)
  # ___fp_3(ab) = PHI <___fp_41(4), ___fp_5(21)>
  # ___r1_7(ab) = PHI <___r1_42(4), ___r1_9(21)>
  # ___r2_11(ab) = PHI <___r2_43(4), ___r3_17(21)>
  # ___r3_19(ab) = PHI <___r3_44(4), ___r3_23(21)>
  # ___r4_25 = PHI <___r4_45(4), ___r4_26(21)>
  # gotovar.17_29 = PHI <_51(4), _69(21)>
  goto gotovar.17_29;

  ...

  <bb 21>: (latch)
  _67 = ___pc_1 + 15;
  _68 = (void * *) _67;
  _69 = *_68;
  PROF_edge_counter_142 = __gcov0.___H_object_2d__3e_u8vector[14];
  PROF_edge_counter_143 = PROF_edge_counter_142 + 1;
  __gcov0.___H_object_2d__3e_u8vector[14] = PROF_edge_counter_143;
  goto <bb 6>;

not sure if we should artificially limit such loops.  LIM doesn't account
for the (compile-time) cost of needing very many PHIs when rewriting the
store-motion vars into SSA form (but it could in theory estimate by taking
into account the CFG structure of the "loop").

Let's see if we can first generate a smaller testcase to illustrate the issue.

Mine for now.
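
For readers unfamiliar with the source pattern behind a dump like the one
above, here is a minimal sketch in GNU C.  It is NOT the testcase from this
PR and is not claimed to reproduce the slowdown; the opcodes, the `counter`
array and the `run` function are hypothetical names made up for illustration.
It only shows the shape: every handler ends in a computed goto, which GCC (as
far as I understand) factors into a single dispatch block early on, so all
handlers become latches of one large "loop" entered over abnormal edges, and
each handler also updates a global counter, loosely like the __gcov0 updates
in the latch.

```c
/* Hypothetical opcodes, for illustration only. */
enum { OP_INC, OP_DEC, OP_HALT };

/* Global per-opcode counters, loosely analogous to profiling counters:
   stores to globals like these are what store motion would look at. */
static long counter[3];

/* Interpret a byte-code program using GNU C "labels as values". */
long run(const unsigned char *pc)
{
    /* Dispatch table of label addresses (GNU extension, needs GCC or Clang). */
    static void *const dispatch[] = { &&op_inc, &&op_dec, &&op_halt };
    long acc = 0;

    goto *dispatch[*pc++];      /* initial dispatch */

op_inc:
    counter[OP_INC]++;
    acc++;
    goto *dispatch[*pc++];      /* each handler jumps back to the dispatcher */

op_dec:
    counter[OP_DEC]++;
    acc--;
    goto *dispatch[*pc++];

op_halt:
    counter[OP_HALT]++;
    return acc;
}
```

Compiling something of this shape with -O2 -fdump-tree-lim and again with
-fno-tree-loop-im added should let one compare what LIM does to the dispatch
loop; real generated code such as the PR's has far more handlers and live
variables than this sketch.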