https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123606
--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #3)
> It's likely that having a latch vs. not might trigger some of the many weird
> conditions we have on threading around loops.

No differences in -fopt-info output (trunk vs. trunk with r16-6679 reverted), but threading doesn't emit opt-info messages (which is IMO good).

On a Zen2 machine I can barely measure a runtime difference (161s vs. 158s at best, which is below 2%). With a difference that small, perf produces only useless data for me.

There are very few actual assembler differences, and even fewer meaningful ones. I can see one larger BB ordering difference, so possibly the effect is on profile / bb-reorder, with the reduced churn possibly preserving more correct data that turns out to be "bad".

Having a CFG-only dump might prove useful (possibly indicating stmt count stats).
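
For reference, the "below 2%" figure follows directly from the two best-case runtimes quoted above. A minimal sketch of that arithmetic; the helper name `relative_slowdown` is hypothetical and only the two timings come from the comment:

```python
def relative_slowdown(slow_s: float, fast_s: float) -> float:
    """Return how much longer slow_s is than fast_s, as a fraction of fast_s."""
    return (slow_s - fast_s) / fast_s


# Best-case runtimes quoted in the comment, in seconds.
delta = relative_slowdown(161.0, 158.0)
print(f"{delta:.2%}")  # prints 1.90%
```

Taking the slower run as the baseline instead gives a slightly smaller figure, so the difference stays under 2% either way.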
