[Bug tree-optimization/102178] [12 Regression] SPECFP 2006 470.lbm regressions on AMD Zen CPUs after r12-897-gde56f95afaaa22
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102178

--- Comment #6 from Martin Liška ---
Created attachment 52296
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52296&action=edit
perf annotate before and after the revision
[Bug tree-optimization/102178] [12 Regression] SPECFP 2006 470.lbm regressions on AMD Zen CPUs after r12-897-gde56f95afaaa22
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102178

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org
           Priority|P3                          |P1
           Keywords|                            |missed-optimization
               Host|x86_64-linix                |

--- Comment #5 from Richard Biener ---
Analysis is missing but the regression persists. On Haswell I do not see any effect. I do suspect it's about cmov vs. non-cmov, but without a profile and a look at the affected assembly that's a wild guess.
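For readers unfamiliar with the "cmov vs. non-cmov" distinction, here is a minimal sketch of the kind of source pattern involved. Both functions are hypothetical illustrations and are not taken from 470.lbm. They compute the same result, and a compiler may lower either one to a compare-and-branch sequence or to a conditional move, depending on the optimization passes, flags, and target. Nothing here is claimed to reproduce the regression.

```c
/* Hypothetical example, not from 470.lbm.  */

/* Written with an explicit branch: the assignment happens on one path only.
   An optimizer may if-convert this into a conditional move.  */
int clamp_branch(int x, int limit)
{
    if (x > limit)
        x = limit;
    return x;
}

/* Written as a select: same semantics as clamp_branch.  Whether this ends
   up as a branch or as a cmov in the generated code is the compiler's
   choice, not something the source form guarantees.  */
int clamp_select(int x, int limit)
{
    return x > limit ? limit : x;
}
```

Comparing the assembly produced for such functions by the two compiler revisions (for instance from `gcc -O2 -S`) is one way to check which form was chosen.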
[Bug tree-optimization/102178] [12 Regression] SPECFP 2006 470.lbm regressions on AMD Zen CPUs after r12-897-gde56f95afaaa22
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102178

--- Comment #4 from Martin Jambor ---
(In reply to Martin Jambor from comment #3)
> ...I'll have a very brief look at what is actually happening just so that I
> have more reasons to believe this is not a code placement issue again.

The hot function is at the same address when compiled by both revisions and the newer version looks sufficiently different. I even tried sprinkling it with nops and it did not help. I am not saying we are not bumping against some micro-architectural peculiarity, but it does not seem to be a code placement issue.
[Bug tree-optimization/102178] [12 Regression] SPECFP 2006 470.lbm regressions on AMD Zen CPUs after r12-897-gde56f95afaaa22
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102178

--- Comment #3 from Martin Jambor ---
(In reply to Richard Biener from comment #1)
> Martin, maybe you can try moving late sink to before the last phiopt pass.

If you mean the following then unfortunately that has not helped.

diff --git a/gcc/passes.def b/gcc/passes.def
index d7a1f8c97a6..5eb70cd2cd8 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -347,10 +347,10 @@ along with GCC; see the file COPYING3.  If not see
       /* After late CD DCE we rewrite no longer addressed locals into SSA
         form if possible.  */
       NEXT_PASS (pass_forwprop);
+      NEXT_PASS (pass_sink_code);
       NEXT_PASS (pass_phiopt, false /* early_p */);
       NEXT_PASS (pass_fold_builtins);
       NEXT_PASS (pass_optimize_widening_mul);
-      NEXT_PASS (pass_sink_code);
       NEXT_PASS (pass_store_merging);
       NEXT_PASS (pass_tail_calls);
       /* If DCE is not run before checking for uninitialized uses,

...I'll have a very brief look at what is actually happening just so that I have more reasons to believe this is not a code placement issue again.
[Bug tree-optimization/102178] [12 Regression] SPECFP 2006 470.lbm regressions on AMD Zen CPUs after r12-897-gde56f95afaaa22
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102178

--- Comment #2 from luoxhu at gcc dot gnu.org ---
Verified that 470.lbm doesn't show a regression on Power8 with -Ofast. Runtime is 141 sec for r12-897; without that patch it is 142 sec.
[Bug tree-optimization/102178] [12 Regression] SPECFP 2006 470.lbm regressions on AMD Zen CPUs after r12-897-gde56f95afaaa22
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102178

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|11.0                        |12.0
            Summary|SPECFP 2006 470.lbm         |[12 Regression] SPECFP 2006
                   |regressions on AMD Zen CPUs |470.lbm regressions on AMD
                   |after                       |Zen CPUs after
                   |r12-897-gde56f95afaaa22     |r12-897-gde56f95afaaa22
   Target Milestone|---                         |12.0