https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122493
Bug ID: 122493
Summary: is_eye in leela is not split because of empty basic block
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: hubicka at gcc dot gnu.org
Target Milestone: ---
is_eye in leela would benefit from partial inlining that does not happen. We
have:
<bb 2> [local count: 1073741822]:
_1 = (long unsigned int) i_62(D);
# VUSE <.MEM_64(D)>
_4 = MEM <struct array> [(short unsigned int &)this_63(D) + 7076].elems[_1];
_5 = (int) _4;
_6 = (long unsigned int) color_66(D);
# VUSE <.MEM_64(D)>
_8 = MEM <const struct array> [(const int &)&s_eyemask].elems[_6];
ownsurrounded_68 = _5 & _8;
if (ownsurrounded_68 == 0)
goto <bb 3>; [34.00%]
else
goto <bb 5>; [66.00%]
<bb 3> [local count: 365072224]:
// predicted unlikely by early return (on trees) predictor.
<bb 4> [local count: 636386382]:
# .MEM_21 = PHI <.MEM_64(D)(3), .MEM_79(7), .MEM_79(9)>
goto <bb 10>; [100.00%]
<bb 5> [local count: 708669599]:
# .MEM_69 = VDEF <.MEM_64(D)>
colorcount[0] = 0;
# .MEM_70 = VDEF <.MEM_69>
colorcount[1] = 0;
# .MEM_71 = VDEF <.MEM_70>
colorcount[3] = 0;
_9 = i_62(D) + -1;
# VUSE <.MEM_71>
_10 = this_63(D)->m_boardsize;
....
<bb 10> [local count: 1073741824]:
# _60 = PHI <0(4), 1(6), 1(8)>
# .MEM_61 = PHI <.MEM_21(4), .MEM_79(6), .MEM_79(8)>
# .MEM_84 = VDEF <.MEM_61>
colorcount ={v} {CLOBBER(eos)};
# VUSE <.MEM_84>
return _60;
}
and we would like to split at bb 5. However, the tail of the function also contains:
<bb 8> [local count: 354334800]:
_57 = color_66(D) == 0;
_58 = (int) _57;
# VUSE <.MEM_79>
_59 = colorcount[_58];
if (_59 != 0)
goto <bb 9>; [34.00%]
else
goto <bb 10>; [66.00%]
<bb 7> [local count: 150840325]:
// predicted unlikely by early return (on trees) predictor.
goto <bb 4>; [100.00%]
Which means that the function sometimes terminates by jumping to the return
block, but sometimes by jumping to bb 4, which is a forwarder to the return
block. This makes the split point not recognized. Scheduling an extra merge_phi:
diff --git a/gcc/passes.def b/gcc/passes.def
index fac04cd86c7..a8a89b980a5 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -97,6 +97,7 @@ along with GCC; see the file COPYING3. If not see
NEXT_PASS (pass_merge_phi);
NEXT_PASS (pass_dse);
NEXT_PASS (pass_cd_dce, false /* update_address_taken_p */, true /* remove_unused_locals */);
+ NEXT_PASS (pass_merge_phi);
NEXT_PASS (pass_phiopt, true /* early_p */);
/* Cleanup eh is done before tail recusision to remove empty (only clobbers)
finally blocks which would block tail recursion. */
solves the problem. However, I think this can be done by cfgcleanup itself.
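To illustrate the kind of cleanup meant here, below is a rough toy model in C++. It is not GCC's cfgcleanup or merge_phi code, only a sketch of the idea: edges that land on an empty single-successor block are redirected to that block's final destination, so every exit path reaches the return block directly. All names (Block, Cfg, bypass_forwarders, make_example_cfg) and the block numbering are hypothetical, and the model ignores PHI arguments, which a real pass would have to update when redirecting edges.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy basic block: successor indices plus a flag for "contains real statements".
// Hypothetical model; PHI nodes are deliberately not represented.
struct Block {
    std::vector<int> succs;
    bool has_stmts;
};

using Cfg = std::vector<Block>;

// A forwarder is an empty block with exactly one successor other than itself.
static bool is_forwarder(const Cfg &cfg, int b) {
    assert(b >= 0 && static_cast<std::size_t>(b) < cfg.size());
    const Block &blk = cfg[b];
    return !blk.has_stmts && blk.succs.size() == 1 && blk.succs[0] != b;
}

// Follow a chain of forwarders; the step bound guards against empty cycles.
static int final_target(const Cfg &cfg, int b) {
    for (std::size_t steps = 0; steps < cfg.size() && is_forwarder(cfg, b); ++steps)
        b = cfg[b].succs[0];
    return b;
}

// Redirect every edge that points at a forwarder to the forwarder's final
// destination. Returns the number of edges that were redirected.
int bypass_forwarders(Cfg &cfg) {
    int redirected = 0;
    for (Block &blk : cfg) {
        for (int &dest : blk.succs) {
            int target = final_target(cfg, dest);
            if (target != dest) {
                dest = target;
                ++redirected;
            }
        }
    }
    return redirected;
}

// Count the edges entering block b.
int count_preds(const Cfg &cfg, int b) {
    int n = 0;
    for (const Block &blk : cfg)
        for (int dest : blk.succs)
            if (dest == b)
                ++n;
    return n;
}

// Hypothetical CFG loosely shaped like the situation in this report:
// one exit of the body reaches the return block through an empty
// forwarder (block 3), the other reaches it directly.
Cfg make_example_cfg() {
    return Cfg{
        Block{{1, 2}, true},   // 0: entry, conditional
        Block{{3}, false},     // 1: empty early-return block
        Block{{4, 5}, true},   // 2: head of the body
        Block{{6}, false},     // 3: empty forwarder to the return block
        Block{{3}, true},      // 4: body exit via the forwarder
        Block{{6}, true},      // 5: body exit directly to the return block
        Block{{}, true},       // 6: return block
    };
}
```

In this model, after bypass_forwarders runs, block 3 has no predecessors left and both body exits (blocks 4 and 5) jump straight to block 6, which is the uniform shape the split-point detection would need.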