https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85275
            Bug ID: 85275
           Summary: copyheader peels off almost the entire iteration
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: amonakov at gcc dot gnu.org
  Target Milestone: ---

I expected predcom to eliminate one of the loads in this loop at -O3:

int is_sorted(int *a, int n)
{
  for (int i = 0; i < n - 1; i++)
    if (a[i] > a[i + 1])
      return 0;
  return 1;
}

Unfortunately, predcom bails out since the loads it sees are not
always-executed. Ideally loop header copying would make this a suitable
do-while loop, but in this case it duplicates too much:

;; Loop 1
;;  header 5, latch 4
;;  depth 1, outer 0
;;  nodes: 5 4 3
;; 2 succs { 5 }
;; 3 succs { 6 4 }
;; 4 succs { 5 }
;; 5 succs { 3 6 }
;; 6 succs { 1 }
Analyzing loop 1
Loop 1 is not do-while loop: latch is not empty.
    Will duplicate bb 5
    Will duplicate bb 3
  Not duplicating bb 4: it is single succ.
Duplicating header of the loop 1 up to edge 3->4, 12 insns.
[...]
  <bb 2> [local count: 114863532]:
  _17 = n_12(D) + -1;
  if (_17 > 0)
    goto <bb 3>; [94.50%]
  else
    goto <bb 6>; [5.50%]

  <bb 3> [local count: 108546038]:
  _18 = 0;
  _19 = _18 * 4;
  _20 = a_13(D) + _19;
  _21 = *_20;
  _22 = _18 + 1;
  _23 = _22 * 4;
  _24 = a_13(D) + _23;
  _25 = *_24;
  if (_21 > _25)
    goto <bb 6>; [5.50%]
  else
    goto <bb 5>; [94.50%]

  <bb 4> [local count: 906139986]:
  _1 = (long unsigned int) i_15;
  _2 = _1 * 4;
  _3 = a_13(D) + _2;
  _4 = *_3;
  _5 = _1 + 1;
  _6 = _5 * 4;
  _7 = a_13(D) + _6;
  _8 = *_7;
  if (_4 > _8)
    goto <bb 6>; [5.50%]
  else
    goto <bb 5>; [94.50%]

  <bb 5> [local count: 958878293]:
  # i_26 = PHI <0(3), i_15(4)>
  i_15 = i_26 + 1;
  _9 = n_12(D) + -1;
  if (_9 > i_15)
    goto <bb 4>; [94.50%]
  else
    goto <bb 6>; [5.50%]

(throttling it down with --param max-loop-header-insns=5 gives the expected
optimization)
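
For illustration only, here is a hand-written C sketch of the loop shape the
report appears to be hoping for. It is not compiler output. The names
is_sorted_rotated and is_sorted_carried are hypothetical and do not appear in
the report. The first function copies only the exit test "i < n - 1" in front
of the loop, so the body becomes a do-while in which both loads execute on
every iteration. The second shows the load reuse that predictive commoning
could then plausibly perform, carrying a[i + 1] over as the next iteration's
a[i].

```c
/* Assumed "ideal" header copy: only the exit test is duplicated in
   front of the loop, leaving a do-while whose body always performs
   both loads.  Hypothetical name, written by hand. */
int is_sorted_rotated(int *a, int n)
{
  int i = 0;
  if (i < n - 1)                /* copied loop header (exit test only) */
    {
      do
        {
          if (a[i] > a[i + 1])
            return 0;
          i++;
        }
      while (i < n - 1);
    }
  return 1;
}

/* Sketch of the load reuse hoped for from predcom: the value loaded
   as a[i + 1] is kept in a scalar and reused as a[i] in the next
   iteration, so each iteration performs one load instead of two.
   Hypothetical name, written by hand. */
int is_sorted_carried(int *a, int n)
{
  if (n - 1 <= 0)
    return 1;

  int i = 0;
  int prev = a[0];              /* initial load, done once before the loop */
  do
    {
      int next = a[i + 1];      /* the only load left in the loop body */
      if (prev > next)
        return 0;
      prev = next;              /* carry the value to the next iteration */
      i++;
    }
  while (i < n - 1);
  return 1;
}
```

Both functions are meant to return the same result as is_sorted for any n
and any array with at least n valid elements (no element is read when
n <= 1).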