[Bug tree-optimization/113134] gcc does not version loops with early break conditions that don't have side-effects
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113134 --- Comment #22 from JuzheZhong --- I have done this following experiment. diff --git a/gcc/tree-ssa-loop-ivcanon.cc b/gcc/tree-ssa-loop-ivcanon.cc index bf017137260..8c36cc63d3b 100644 --- a/gcc/tree-ssa-loop-ivcanon.cc +++ b/gcc/tree-ssa-loop-ivcanon.cc @@ -1260,6 +1260,39 @@ canonicalize_loop_induction_variables (class loop *loop, may_be_zero = false; } + if (!exit) + { + auto_vec exits = get_loop_exit_edges (loop); + exit = exits[0]; + class tree_niter_desc desc1; + class tree_niter_desc desc2; + if (number_of_iterations_exit (loop, exits[0], , false) + && number_of_iterations_exit (loop, exits[1], , false)) + { + niter = fold_build2 (MIN_EXPR, unsigned_type_node, desc1.niter, + desc2.niter); + create_canonical_iv (loop, exit, niter); + gcond *cond_stmt; + class nb_iter_bound *elt; + for (elt = loop->bounds; elt; elt = elt->next) + { + if (elt->is_exit + && !wi::ltu_p (loop->nb_iterations_upper_bound, +elt->bound)) + { + cond_stmt = as_a (elt->stmt); + break; + } + } + if (exits[1]->flags & EDGE_TRUE_VALUE) + gimple_cond_make_false (cond_stmt); + else + gimple_cond_make_true (cond_stmt); + update_stmt (cond_stmt); + return false; + } + } + I know the check is wrong just for experiment, Then: [local count: 69202658]: _21 = (unsigned int) N_13(D); _22 = MIN_EXPR <_21, 1001>; > Use MIN_EXPR as the check. _23 = _22 + 1; goto ; [100.00%] [local count: 1014686025]: _1 = (long unsigned int) i_9; _2 = _1 * 4; _3 = a_14(D) + _2; _4 = *_3; _5 = b_15(D) + _2; _6 = *_5; _7 = c_16(D) + _2; _8 = _4 + _6; *_7 = _8; if (0 != 0) goto ; [1.00%] else goto ; [99.00%] [local count: 1004539166]: i_18 = i_9 + 1; [local count: 1073741824]: # i_9 = PHI <0(2), i_18(4)> # ivtmp_19 = PHI <_23(2), ivtmp_20(4)> ivtmp_20 = ivtmp_19 - 1; if (ivtmp_20 != 0) goto ; [94.50%] else goto ; [5.50%] [local count: 69202658]: return; Then it can vectorize. I am not sure whether it is the right place to put the codes.
[Bug tree-optimization/113134] gcc does not version loops with early break conditions that don't have side-effects
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113134 --- Comment #21 from JuzheZhong --- Hi, Richard. I looked into ivcanon. I found that: /* If the loop has more than one exit, try checking all of them for # of iterations determinable through scev. */ if (!exit) niter = find_loop_niter (loop, ); In find_loop_niter, we iterate 2 exit edges: 1. bb 5 -> bb 6 with niter = (unsigned int) N_13(D). 2. bb 3 -> bb 6 with niter = 1001. It just skip niter = (unsigned int) N_13(D) in: if (!integer_zerop (desc.may_be_zero)) continue; find_loop_niter (loop, ) return 1001 with skipping (unsigned int) N_13(D). Should it return MIN (1001, (unsigned int) N_13(D)). I prefer fix it in ivcanon since I believe it would be more elegant than fix it in loop splitter. I am still investigating, any guides will be really appreciated. Thanks.
[Bug tree-optimization/113134] gcc does not version loops with early break conditions that don't have side-effects
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113134 --- Comment #20 from Richard Biener --- I think we want split_loop () handle this case. That means extending it to handle loops with multiple exits. OTOH after loop rotation to if (i_21 == 1001) goto ; [1.00%] else goto ; [99.00%] [local count: 1004539166]: i_18 = i_21 + 1; if (N_13(D) > i_18) goto ; [94.50%] else goto ; [5.50%] it could be also IVCANONs job to rewrite the exit test so the bound is loop invariant and it becomes a single exit. There's another recent PR where an exit condition like i < N && i < M should become i < MIN(N,M).
[Bug tree-optimization/113134] gcc does not version loops with early break conditions that don't have side-effects
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113134 --- Comment #19 from JuzheZhong --- The loop is: bb 3 -> bb 4 -> bb 5 | |__⬆ |__⬆ The condition in bb 3 is if (i_21 == 1001). The condition in bb 4 is if (N_13(D) > i_18). Look into lsplit: This loop doesn't satisfy the check of: if (split_loop (loop) || split_loop_on_cond (loop)) In split_loop_on_cond, it's trying to split the loop that condition is loop invariant. However, no matter bb 3 or bb 4, their conditions are not loop invariant. I wonder whether we should add a new kind of loop splitter like: diff --git a/gcc/tree-ssa-loop-split.cc b/gcc/tree-ssa-loop-split.cc index 04215fe7937..a4081b9b6f5 100644 --- a/gcc/tree-ssa-loop-split.cc +++ b/gcc/tree-ssa-loop-split.cc @@ -1769,7 +1769,8 @@ tree_ssa_split_loops (void) if (optimize_loop_for_size_p (loop)) continue; - if (split_loop (loop) || split_loop_on_cond (loop)) + if (split_loop (loop) || split_loop_on_cond (loop) + || split_loop_for_early_break (loop)) { /* Mark our containing loop as having had some split inner loops. */ loop_outer (loop)->aux = loop;
[Bug tree-optimization/113134] gcc does not version loops with early break conditions that don't have side-effects
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113134 Richard Biener changed: What|Removed |Added CC||hubicka at gcc dot gnu.org --- Comment #18 from Richard Biener --- The loop is rotated by header copying and the "combined" exit test looks like if (i_21 == 1001) goto ; [1.00%] else goto ; [99.00%] [local count: 1004539166]: i_18 = i_21 + 1; if (N_13(D) > i_18) goto ; [94.50%] else goto ; [5.50%] this could be rewritten to use min(1002, N_13(D)) with the knowledge how 'i' evolves. We get i_21 != 1001 || N_13(D) > (i_21 + 1) for the iteration condition which I think we cannot combine in general. The "easiest" way would be to have loop splitting split the loop on the i_21 == 1001 condition I think.
[Bug tree-optimization/113134] gcc does not version loops with early break conditions that don't have side-effects
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113134 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement Component|c |tree-optimization