[Bug middle-end/70159] missed CSE optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70159 --- Comment #18 from Anton Blanchard --- Urgh too early in the morning for me. PR71866 created, with the correct backtrace.
[Bug middle-end/70159] missed CSE optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70159 --- Comment #17 from Andrew Pinski ---
(In reply to Anton Blanchard from comment #16)
> I'm seeing a lockup in gcc with this patch on ppc64le. Run as:
>
> gcc -O2 -c testcase.i

Can you file a new bug for this? Also, your backtrace is just for the driver.
[Bug middle-end/70159] missed CSE optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70159 --- Comment #16 from Anton Blanchard ---
I'm seeing a lockup in gcc with this patch on ppc64le. Run as:

gcc -O2 -c testcase.i

It gets stuck in:

#0  0x3fffb7e5e3e8 in __waitpid_nocancel () at ../sysdeps/unix/syscall-template.S:84
#1  0x10088c58 in pex_wait (obj=, time=0x0, status=0x1015f080, pid=) at ../../gcc/libiberty/pex-unix.c:134
#2  pex_unix_wait (obj=, pid=, status=0x1015f080, time=0x0, done=, errmsg=0x3fffd750, err=0x3fffd74c) at ../../gcc/libiberty/pex-unix.c:738
#3  0x1008762c in pex_get_status_and_time (obj=, done=, errmsg=, err=) at ../../gcc/libiberty/pex-common.c:534
#4  0x10088630 in pex_get_status (obj=0x1015efc0, count=, vector=0x3fffd7e0) at ../../gcc/libiberty/pex-common.c:554
#5  0x100101dc in execute () at ../../gcc/gcc/gcc.c:3107
#6  0x10010f04 in do_spec_1 (spec=0x1015ec20 "-o %|.s |\n as %(asm_options) %m.s %A", inswitch=0, soft_matched_part=0x0) at ../../gcc/gcc/gcc.c:5145
#7  0x100139e4 in process_brace_body (matched=, starred=0, end_atom=0x1015ed3c ":-o %|.s |\n as %(asm_options) %m.s %A }", atom=0x1015ed3b "S:-o %|.s |\n as %(asm_options) %m.s %A }", p=0x1015ed62 "}") at ../../gcc/gcc/gcc.c:6431
#8  handle_braces (p=) at ../../gcc/gcc/gcc.c:6345
#9  0x10011604 in do_spec_1 (spec=0x1015ecf0 " %{fcompare-debug=*|fdump-final-insns=*:%:compare-debug-dump-opt()} %{!S:-o %|.s |\n as %(asm_options) %m.s %A }", inswitch=0, soft_matched_part=0x0) at ../../gcc/gcc/gcc.c:5802
#10 0x100139e4 in process_brace_body (matched=, starred=1, end_atom=0x100a877f "*: %{fcompare-debug=*|fdump-final-insns=*:%:compare-debug-dump-opt()} %{!S:-o %|.s |\n as %(asm_options) %m.s %A } }", atom=0x100a877b "fwpa*: %{fcompare-debug=*|fdump-final-insns=*:%:compare-debug-dump-opt()} %{!S:-o %|.s |\n as %(asm_options) %m.s %A } }", p=0x100a87f6 "}") at ../../gcc/gcc/gcc.c:6431
#11 handle_braces (p=) at ../../gcc/gcc/gcc.c:6345
#12 0x10011604 in do_spec_1 (spec=0x100a8778 "%{!fwpa*: %{fcompare-debug=*|fdump-final-insns=*:%:compare-debug-dump-opt()} %{!S:-o %|.s |\n as %(asm_options) %m.s %A } }", inswitch=0, soft_matched_part=0x0) at ../../gcc/gcc/gcc.c:5802
#13 0x10011228 in do_spec_1 (spec=0x1015ec90 "%(invoke_as)", inswitch=0, soft_matched_part=0x0) at ../../gcc/gcc/gcc.c:5917
#14 0x100139e4 in process_brace_body (matched=, starred=0, end_atom=0x1015ec04 ":%(invoke_as)}", atom=0x1015ebf8 "fsyntax-only:%(invoke_as)}", p=0x1015ec11 "}") at ../../gcc/gcc/gcc.c:6431
#15 handle_braces (p=) at ../../gcc/gcc/gcc.c:6345
#16 0x10011604 in do_spec_1 (spec=0x1015ebd0 "cc1 -fpreprocessed %i %(cc1_options) %{!fsyntax-only:%(invoke_as)}", inswitch=0, soft_matched_part=0x0) at ../../gcc/gcc/gcc.c:5802
#17 0x100139e4 in process_brace_body (matched=, starred=0, end_atom=0x1015eb74 ":cc1 -fpreprocessed %i %(cc1_options) %{!fsyntax-only:%(invoke_as)}}", atom=0x1015eb73 "E:cc1 -fpreprocessed %i %(cc1_options) %{!fsyntax-only:%(invoke_as)}}", p=0x1015ebb7 "}") at ../../gcc/gcc/gcc.c:6431
#18 handle_braces (p=) at ../../gcc/gcc/gcc.c:6345
#19 0x10011604 in do_spec_1 (spec=0x1015eb70 "%{!E:cc1 -fpreprocessed %i %(cc1_options) %{!fsyntax-only:%(invoke_as)}}", inswitch=0, soft_matched_part=0x0) at ../../gcc/gcc/gcc.c:5802
#20 0x100139e4 in process_brace_body (matched=, starred=0, end_atom=0x1015eb15 ":%{!E:cc1 -fpreprocessed %i %(cc1_options) %{!fsyntax-only:%(invoke_as)}}}", atom=0x1015eb13 "MM:%{!E:cc1 -fpreprocessed %i %(cc1_options) %{!fsyntax-only:%(invoke_as)}}}", p=0x1015eb5e "}") at ../../gcc/gcc/gcc.c:6431
#21 handle_braces (p=) at ../../gcc/gcc/gcc.c:6345
#22 0x10011604 in do_spec_1 (spec=0x1015eb10 "%{!MM:%{!E:cc1 -fpreprocessed %i %(cc1_options) %{!fsyntax-only:%(invoke_as)}}}", inswitch=0, soft_matched_part=0x0) at ../../gcc/gcc/gcc.c:5802
#23 0x100139e4 in process_brace_body (matched=, starred=0, end_atom=0x100a789c ":%{!MM:%{!E:cc1 -fpreprocessed %i %(cc1_options) %{!fsyntax-only:%(invoke_as)", atom=0x100a789b "M:%{!MM:%{!E:cc1 -fpreprocessed %i %(cc1_options) %{!fsyntax-only:%(invoke_as)", p=0x100a78ec "}") at ../../gcc/gcc/gcc.c:6431
#24 handle_braces (p=) at ../../gcc/gcc/gcc.c:6345
#25 0x10011604 in do_spec_1 (spec=0x100a7898 "%{!M:%{!MM:%{!E:cc1 -fpreprocessed %i %(cc1_options) %{!fsyntax-only:%(invoke_as)", inswitch=0, soft_matched_part=0x0) at ../../gcc/gcc/gcc.c:5802
#26 0x10012878 in do_spec_2 (spec=) at ../../gcc/gcc/gcc.c:4841
#27 0x10014514 in do_spec (spec=) at ../../gcc/gcc/gcc.c:4808
#28 0x1001479c in driver::do_spec_on_infiles (this=0x3108) at ../../gcc/gcc/gcc.c:8076
#29 0x10003138 in driver::main
[Bug middle-end/70159] missed CSE optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70159

Anton Blanchard changed:

           What    |Removed                     |Added
                 CC|                            |anton at samba dot org

--- Comment #15 from Anton Blanchard ---
Created attachment 38890
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38890&action=edit
test case
[Bug middle-end/70159] missed CSE optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70159

Richard Biener changed:

           What    |Removed                     |Added
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED
   Target Milestone|---                         |7.0

--- Comment #14 from Richard Biener ---
Fixed.
[Bug middle-end/70159] missed CSE optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70159
Bug 70159 depends on bug 23286, which changed state.

Bug 23286 Summary: Missed code hoisting optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=23286

           What    |Removed                     |Added
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED
[Bug middle-end/70159] missed CSE optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70159 --- Comment #13 from Richard Biener ---
Author: rguenth
Date: Tue Jul 12 13:32:04 2016
New Revision: 238242

URL: https://gcc.gnu.org/viewcvs?rev=238242&root=gcc&view=rev
Log:
2016-07-12  Steven Bosscher
            Richard Biener

        PR tree-optimization/23286
        PR tree-optimization/70159
        * doc/invoke.texi: Document -fcode-hoisting.
        * common.opt (fcode-hoisting): New flag.
        * opts.c (default_options_table): Enable -fcode-hoisting at -O2+.
        * tree-ssa-pre.c (pre_stats): Add hoist_insert.
        (do_regular_insertion): Rename to ...
        (do_pre_regular_insertion): ... this and amend general comments
        on insertion strategy.
        (do_partial_partial_insertion): Rename to ...
        (do_pre_partial_partial_insertion): ... this.
        (do_hoist_insertion): New function.
        (insert_aux): Take flags on whether to do PRE and/or hoist insertion
        and call do_hoist_insertion properly.
        (insert): Adjust.
        (pass_pre::gate): Enable also if -fcode-hoisting is enabled.
        (pass_pre::execute): Register hoist_insert stats.

        * gcc.dg/tree-ssa/ssa-pre-11.c: Disable code hoisting.
        * gcc.dg/tree-ssa/ssa-pre-27.c: Likewise.
        * gcc.dg/tree-ssa/ssa-pre-28.c: Likewise.
        * gcc.dg/tree-ssa/ssa-pre-2.c: Likewise.
        * gcc.dg/tree-ssa/pr35286.c: Likewise.
        * gcc.dg/tree-ssa/pr35287.c: Likewise.
        * gcc.dg/hoist-register-pressure-1.c: Likewise.
        * gcc.dg/hoist-register-pressure-2.c: Likewise.
        * gcc.dg/hoist-register-pressure-3.c: Likewise.
        * gcc.dg/pr51879-12.c: Likewise.
        * gcc.dg/strlenopt-9.c: Likewise.
        * gcc.dg/tree-ssa/pr47392.c: Likewise.
        * gcc.dg/tree-ssa/pr68619-4.c: Likewise.
        * gcc.dg/tree-ssa/split-path-5.c: Likewise.
        * gcc.dg/tree-ssa/slsr-35.c: Likewise.
        * gcc.dg/tree-ssa/slsr-36.c: Likewise.
        * gcc.dg/tree-ssa/loadpre3.c: Adjust so hoisting doesn't apply.
        * gcc.dg/tree-ssa/pr43491.c: Scan optimized dump for desired result.
        * gcc.dg/tree-ssa/ssa-pre-31.c: Adjust expected outcome for hoisting.
        * gcc.dg/tree-ssa/ssa-hoist-1.c: New testcase.
        * gcc.dg/tree-ssa/ssa-hoist-2.c: New testcase.
        * gcc.dg/tree-ssa/ssa-hoist-3.c: New testcase.
        * gcc.dg/tree-ssa/ssa-hoist-4.c: New testcase.
        * gcc.dg/tree-ssa/ssa-hoist-5.c: New testcase.
        * gcc.dg/tree-ssa/ssa-hoist-6.c: New testcase.
        * gfortran.dg/pr43984.f90: Adjust expected outcome.

Added:
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-hoist-1.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-hoist-2.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-hoist-3.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-hoist-4.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-hoist-6.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/common.opt
    trunk/gcc/doc/invoke.texi
    trunk/gcc/opts.c
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.dg/hoist-register-pressure-1.c
    trunk/gcc/testsuite/gcc.dg/hoist-register-pressure-2.c
    trunk/gcc/testsuite/gcc.dg/hoist-register-pressure-3.c
    trunk/gcc/testsuite/gcc.dg/pr51879-12.c
    trunk/gcc/testsuite/gcc.dg/strlenopt-9.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/loadpre3.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/pr35286.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/pr35287.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/pr43491.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/pr47392.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/pr68619-4.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/slsr-35.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/slsr-36.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-pre-11.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-pre-2.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-pre-27.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-pre-28.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-pre-31.c
    trunk/gcc/testsuite/gfortran.dg/pr43984.f90
    trunk/gcc/tree-ssa-pre.c
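For anyone wanting to try the new flag, here is a minimal sketch of a source file on which code hoisting could apply. The file name hoist-sketch.c and the function hoist_candidate are made up for illustration and are not taken from the committed ssa-hoist-*.c tests. -fcode-hoisting is the flag this commit documents; -fdump-tree-pre is the usual option for dumping the PRE pass. What the dump actually contains is not shown here, since that depends on the compiler version.

```c
/* hoist-sketch.c: hypothetical example, not one of the committed tests.
   Assumed invocation (GCC 7 or later):
     gcc -O2 -fcode-hoisting -fdump-tree-pre -c hoist-sketch.c
   Building again with -fno-code-hoisting allows the two PRE dumps
   to be compared.  */

/* Both arms of the branch evaluate a + b, so that expression is a
   candidate for being computed once, before the branch.  */
int hoist_candidate(int a, int b, int c)
{
    int x;

    if (c)
        x = (a + b) * 2;
    else
        x = (a + b) - 3;
    return x;
}
```

The function is kept small on purpose so that the PRE dump stays readable when the two builds are compared.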
[Bug middle-end/70159] missed CSE optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70159 --- Comment #12 from rguenther at suse dot de ---
On Sat, 2 Jul 2016, hiraditya at msn dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70159
>
> AK changed:
>
>            What    |Removed                     |Added
>                  CC|                            |hiraditya at msn dot com
>
> --- Comment #11 from AK ---
> Just as an update, the new gvn-hoist pass in llvm hoists the common
> computations:
>
> @cat test.c
> float foo_p(float d, float min, float max, float a)
> {
>   float tmin;
>   float tmax;
>
>   float inv = 1.0f / d;
>   if (inv >= 0) {
>     tmin = (min - a) * inv;
>     tmax = (max - a) * inv;
>   } else {
>     tmin = (max - a) * inv;
>     tmax = (min - a) * inv;
>   }
>
>   return tmax + tmin;
> }
>
>
> clang -c -Ofast test.c -mllvm -print-after-all
>
>
> *** IR Dump Before Early GVN Hoisting of Expressions ***
> ; Function Attrs: nounwind uwtable
> define float @_Z5foo_p(float %d, float %min, float %max, float %a) #0 {
> entry:
>   %div = fdiv fast float 1.00e+00, %d
>   %cmp = fcmp fast oge float %div, 0.00e+00
>   br i1 %cmp, label %if.then, label %if.else
>
> if.then: ; preds = %entry
>   %sub = fsub fast float %min, %a
>   %mul = fmul fast float %sub, %div
>   %sub1 = fsub fast float %max, %a
>   %mul2 = fmul fast float %sub1, %div
>   br label %if.end
>
> if.else: ; preds = %entry
>   %sub3 = fsub fast float %max, %a
>   %mul4 = fmul fast float %sub3, %div
>   %sub5 = fsub fast float %min, %a
>   %mul6 = fmul fast float %sub5, %div
>   br label %if.end
>
> if.end: ; preds = %if.else, %if.then
>   %tmax.0 = phi float [ %mul2, %if.then ], [ %mul6, %if.else ]
>   %tmin.0 = phi float [ %mul, %if.then ], [ %mul4, %if.else ]
>   %add = fadd fast float %tmax.0, %tmin.0
>   ret float %add
> }
>
>
> *** IR Dump After Early GVN Hoisting of Expressions ***
> ; Function Attrs: nounwind uwtable
> define float @_Z5foo_p(float %d, float %min, float %max, float %a) #0 {
> entry:
>   %div = fdiv fast float 1.00e+00, %d
>   %cmp = fcmp fast oge float %div, 0.00e+00
>   %sub = fsub fast float %min, %a
>   %mul = fmul fast float %sub, %div
>   %sub1 = fsub fast float %max, %a
>   %mul2 = fmul fast float %sub1, %div
>   br i1 %cmp, label %if.then, label %if.else
>
> if.then: ; preds = %entry
>   br label %if.end
>
> if.else: ; preds = %entry
>   br label %if.end
>
> if.end: ; preds = %if.else, %if.then
>   %tmax.0 = phi float [ %mul2, %if.then ], [ %mul, %if.else ]
>   %tmin.0 = phi float [ %mul, %if.then ], [ %mul2, %if.else ]
>   %add = fadd fast float %tmax.0, %tmin.0
>   ret float %add
> }

Same if you forward-port the patch from PR23286 (just did that).

> cat t.c.126t.pre
...
foo_p (float d, float min, float max, float a)
{
  float inv;
  float tmax;
  float tmin;
  float _16;
  float _18;
  float _19;
  float _20;
  float _21;

  :
  inv_8 = 1.0e+0 / d_7(D);
  _18 = min_9(D) - a_10(D);
  _19 = inv_8 * _18;
  _20 = max_12(D) - a_10(D);
  _21 = inv_8 * _20;
  if (inv_8 >= 0.0)
    goto ;
  else
    goto ;

  :

  :
  # tmin_5 = PHI <_19(2), _21(3)>
  # tmax_6 = PHI <_21(2), _19(3)>
  _16 = tmin_5 + tmax_6;
  return _16;
[Bug middle-end/70159] missed CSE optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70159

AK changed:

           What    |Removed                     |Added
                 CC|                            |hiraditya at msn dot com

--- Comment #11 from AK ---
Just as an update, the new gvn-hoist pass in llvm hoists the common
computations:

@cat test.c
float foo_p(float d, float min, float max, float a)
{
  float tmin;
  float tmax;

  float inv = 1.0f / d;
  if (inv >= 0) {
    tmin = (min - a) * inv;
    tmax = (max - a) * inv;
  } else {
    tmin = (max - a) * inv;
    tmax = (min - a) * inv;
  }

  return tmax + tmin;
}


clang -c -Ofast test.c -mllvm -print-after-all


*** IR Dump Before Early GVN Hoisting of Expressions ***
; Function Attrs: nounwind uwtable
define float @_Z5foo_p(float %d, float %min, float %max, float %a) #0 {
entry:
  %div = fdiv fast float 1.00e+00, %d
  %cmp = fcmp fast oge float %div, 0.00e+00
  br i1 %cmp, label %if.then, label %if.else

if.then: ; preds = %entry
  %sub = fsub fast float %min, %a
  %mul = fmul fast float %sub, %div
  %sub1 = fsub fast float %max, %a
  %mul2 = fmul fast float %sub1, %div
  br label %if.end

if.else: ; preds = %entry
  %sub3 = fsub fast float %max, %a
  %mul4 = fmul fast float %sub3, %div
  %sub5 = fsub fast float %min, %a
  %mul6 = fmul fast float %sub5, %div
  br label %if.end

if.end: ; preds = %if.else, %if.then
  %tmax.0 = phi float [ %mul2, %if.then ], [ %mul6, %if.else ]
  %tmin.0 = phi float [ %mul, %if.then ], [ %mul4, %if.else ]
  %add = fadd fast float %tmax.0, %tmin.0
  ret float %add
}


*** IR Dump After Early GVN Hoisting of Expressions ***
; Function Attrs: nounwind uwtable
define float @_Z5foo_p(float %d, float %min, float %max, float %a) #0 {
entry:
  %div = fdiv fast float 1.00e+00, %d
  %cmp = fcmp fast oge float %div, 0.00e+00
  %sub = fsub fast float %min, %a
  %mul = fmul fast float %sub, %div
  %sub1 = fsub fast float %max, %a
  %mul2 = fmul fast float %sub1, %div
  br i1 %cmp, label %if.then, label %if.else

if.then: ; preds = %entry
  br label %if.end

if.else: ; preds = %entry
  br label %if.end

if.end: ; preds = %if.else, %if.then
  %tmax.0 = phi float [ %mul2, %if.then ], [ %mul, %if.else ]
  %tmin.0 = phi float [ %mul, %if.then ], [ %mul2, %if.else ]
  %add = fadd fast float %tmax.0, %tmin.0
  ret float %add
}
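As a reading aid, the "after hoisting" IR can be written back as C. This is only a hand-written sketch of the shape of the transformed code, not compiler output; the name foo_p_hoisted and the locals mul and mul2 (borrowed from the IR value names) are chosen here for illustration.

```c
/* Original shape from the report: each arm computes both products.  */
float foo_p(float d, float min, float max, float a)
{
    float tmin, tmax;
    float inv = 1.0f / d;

    if (inv >= 0) {
        tmin = (min - a) * inv;
        tmax = (max - a) * inv;
    } else {
        tmin = (max - a) * inv;
        tmax = (min - a) * inv;
    }
    return tmax + tmin;
}

/* Hand-written equivalent of the hoisted IR: the two products are
   computed once before the branch, and the arms only choose which
   product plays the role of tmin and which of tmax.  */
float foo_p_hoisted(float d, float min, float max, float a)
{
    float tmin, tmax;
    float inv = 1.0f / d;
    float mul = (min - a) * inv;   /* %mul in the IR  */
    float mul2 = (max - a) * inv;  /* %mul2 in the IR */

    if (inv >= 0) {
        tmin = mul;
        tmax = mul2;
    } else {
        tmin = mul2;
        tmax = mul;
    }
    return tmax + tmin;
}
```

Both functions perform the same floating-point operations on every path, so they should agree for any input; the hoisted form simply contains each subtraction and multiplication once instead of twice.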
[Bug middle-end/70159] missed CSE optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70159 --- Comment #10 from rguenther at suse dot de ---
On Thu, 10 Mar 2016, spop at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70159
>
> --- Comment #9 from Sebastian Pop ---
> Created attachment 37927
>   --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37927&action=edit
> patch for hoisting expressions
>
> Updated the patch from PR23286 to hoist the redundant expressions:
>
>   :
>   inv_4 = 1.0e+0 / d_3(D);
>   _18 = min_5(D) - a_6(D);
>   _19 = _18 / inv_4;
>   _20 = max_9(D) - a_6(D);
>   _21 = _20 / inv_4;
>   if (inv_4 >= 0.0)
>     goto ;
>   else
>     goto ;
>
>   :
>
>   :
>   # tmin_1 = PHI <_19(2), _21(3)>
>   # tmax_2 = PHI <_21(2), _19(3)>
>   _16 = tmin_1 + tmax_2;
>   return _16;
>
> The attached patch does not pass make check and causes some infinite
> recursion.

Yes, the patch has some issues still. As for every stage1 I plan to try
picking it up again ...

I notice from the above output that it doesn't see the CSE opportunity
but inserts both expressions - it shouldn't do that, so there's
something "new wrong" with the patch (or your update to it).
[Bug middle-end/70159] missed CSE optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70159 --- Comment #9 from Sebastian Pop ---
Created attachment 37927
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37927&action=edit
patch for hoisting expressions

Updated the patch from PR23286 to hoist the redundant expressions:

  :
  inv_4 = 1.0e+0 / d_3(D);
  _18 = min_5(D) - a_6(D);
  _19 = _18 / inv_4;
  _20 = max_9(D) - a_6(D);
  _21 = _20 / inv_4;
  if (inv_4 >= 0.0)
    goto ;
  else
    goto ;

  :

  :
  # tmin_1 = PHI <_19(2), _21(3)>
  # tmax_2 = PHI <_21(2), _19(3)>
  _16 = tmin_1 + tmax_2;
  return _16;

The attached patch does not pass make check and causes some infinite
recursion.
[Bug middle-end/70159] missed CSE optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70159

Richard Biener changed:

           What    |Removed                     |Added
           Keywords|                            |missed-optimization
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2016-03-10
                 CC|                            |rguenth at gcc dot gnu.org
            Version|unknown                     |6.0
     Ever confirmed|0                           |1

--- Comment #8 from Richard Biener ---
We already compute nearly everything that is necessary:

Value numbering tmin_13 stmt = tmin_13 = inv_4 * _12;
Match-and-simplified inv_4 * _12 to tmax_11
RHS inv_4 * _12 simplified to tmax_11
Setting value number of tmin_13 to tmax_11 (changed)
...
Value numbering tmax_15 stmt = tmax_15 = inv_4 * _14;
Match-and-simplified inv_4 * _14 to tmin_8
RHS inv_4 * _14 simplified to tmin_8
Setting value number of tmax_15 to tmin_8 (changed)

so we know the equivalency but can't do something sensible with it yet.

  :
  # tmin_1 = PHI
  # tmax_2 = PHI
  _16 = tmin_1 + tmax_2;

so value-wise this is

  :
  # tmin_1 = PHI
  # tmax_2 = PHI
  _16 = tmin_1 + tmax_2;

which we could transform by pattern matching this case, changing the
if (inv >= 0) to always take the path which otherwise has no side-effects.

I'm not sure it really fits into the FRE/PRE elimination phase, but at least it
may be easier to do something like phi-opt on the more complex cases in the VN
framework (to avoid the need to compute equivalencies this complex).

Hoisting would surely help as well, but it still would need something to detect
the commutatively redundant swap. That is,

float foo_p(float d, float min, float max, float a)
{
  float tmin;
  float tmax;

  float inv = 1.0f / d;
  if (inv >= 0) {
    tmin = min;
    tmax = max;
  } else {
    tmin = max;
    tmax = min;
  }

  return tmax + tmin;
}

is not optimized either and we retain

  if (inv_4 >= 0.0)
    goto ;
  else
    goto ;

  :

  :
  # tmin_1 = PHI
  # tmax_2 = PHI
  _7 = tmin_1 + tmax_2;

until .optimized. So that part of this PR is independent of the hoisting
issue we have other PRs for.

(simplify
 (plus (cond @0 @1 @2) (cond @0 @2 @1))
 (plus @1 @2))

would fix it if we'd match PHIs for conds as well.

If we "help" PRE with

float foo_p(float d, float min, float max, float a)
{
  float tmin;
  float tmax;

  float inv = 1.0f / d;
  float bar = min + max;
  if (inv >= 0) {
    tmin = min;
    tmax = max;
  } else {
    tmin = max;
    tmax = min;
  }

  return tmax + tmin + bar;
}

it figures that tmax + tmin is equal to bar and optimizes this. Something that
hoisting (as implemented in PRE) should then do in one step hopefully.

Catching this somewhat earlier would be nice though.
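The effect of the proposed simplification, (plus (cond @0 @1 @2) (cond @0 @2 @1)) to (plus @1 @2), can be sketched in C. This is an illustration of what the pattern expresses, not output from GCC; the function names swap_add and swap_add_folded are made up here.

```c
/* The reduced test case: the branch only swaps which input is
   called tmin and which is called tmax.  */
float swap_add(float d, float min, float max)
{
    float tmin, tmax;
    float inv = 1.0f / d;

    if (inv >= 0) {
        tmin = min;
        tmax = max;
    } else {
        tmin = max;
        tmax = min;
    }
    return tmax + tmin;
}

/* What the pattern would leave behind.  With @0 = (inv >= 0),
   @1 = min and @2 = max, the sum of the two conditionals is
   min + max on either path because addition is commutative, so
   the condition (and with it the division feeding it) is no
   longer needed.  */
float swap_add_folded(float min, float max)
{
    return min + max;
}
```

The same reasoning would not carry over to a non-commutative operator such as subtraction or a comparison, where the two paths really do produce different results.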
[Bug middle-end/70159] missed CSE optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70159 --- Comment #7 from Sebastian Pop ---
(In reply to Andrew Pinski from comment #6)
> Note this is both a hoisting and a sinking issue.
> Hoisting should happen before sinking.
> LLVM looks like it only implements sinking.

You are right: LLVM does sinking very early as part of instcombine: it
transforms the phi nodes after the if into selects over the operands and sinks
the sub and mul after the select. By the time other redundancy elimination
passes are executed the shape of the code is more difficult to optimize.
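To make the sinking description concrete, here is a hand-written C sketch of the shape being described: selects over the operands first, with the sub and mul placed after them. It is an approximation for illustration only, not actual instcombine output, and the names foo_p_sunk, lo and hi are invented here.

```c
/* Sketch of the "sunk" form: the branch is replaced by two selects
   on the operands, and the subtraction and multiplication happen
   once each after the selects.  */
float foo_p_sunk(float d, float min, float max, float a)
{
    float inv = 1.0f / d;
    int nonneg = inv >= 0;

    float lo = nonneg ? min : max;  /* operand that feeds tmin */
    float hi = nonneg ? max : min;  /* operand that feeds tmax */

    float tmin = (lo - a) * inv;
    float tmax = (hi - a) * inv;
    return tmax + tmin;
}
```

In this form the fact that (min - a) * inv and (max - a) * inv are both always computed is hidden behind the selects, which is one way to read the remark that the code becomes harder for later redundancy elimination to handle.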
[Bug middle-end/70159] missed CSE optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70159 --- Comment #6 from Andrew Pinski --- Note this is both a hoisting and a sinking issue. Hoisting should happen before sinking. LLVM looks like it only implements sinking.
[Bug middle-end/70159] missed CSE optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70159 --- Comment #3 from Andrew Pinski ---
Related to 5738. Try with -Os and you will see the RTL non PRE based GCSE does
most of the job:

foo_p:
        fmov    s4, 1.0e+0
        fdiv    s0, s4, s0
        fsub    s1, s1, s3
        fsub    s2, s2, s3
        fcmpe   s0, #0.0
        blt     .L6
        fmul    s4, s1, s0
        fmul    s0, s2, s0
.L4:
        fadd    s0, s4, s0
        ret
.L6:
        fmul    s4, s2, s0
        fmul    s0, s1, s0
        b       .L4
[Bug middle-end/70159] missed CSE optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70159 --- Comment #5 from Andrew Pinski --- Oh and bug 23286.
[Bug middle-end/70159] missed CSE optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70159 --- Comment #4 from Andrew Pinski --- "Related to bug 5738."
[Bug middle-end/70159] missed CSE optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70159 --- Comment #2 from Sebastian Pop ---
Right, with -Ofast it should be able to optimize away the branch or selects.
The original benchmark had something more complex than fadd to use the tmin and
tmax results. Here is one more test using the results in a non-commutative
operation:

bool foo_p(float d, float min, float max, float a)
{
  float tmin;
  float tmax;

  float inv = 1.0f / d;
  if (inv >= 0) {
    tmin = (min - a) * inv;
    tmax = (max - a) * inv;
  } else {
    tmin = (max - a) * inv;
    tmax = (min - a) * inv;
  }

  return tmax > tmin;
}
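For comparison, a hand-written sketch of what this test could look like once the common expressions are hoisted. Because the comparison is not commutative, the branch cannot disappear, but the two products still only need to be computed once. The names foo_p_gt, foo_p_gt_hoisted, lo and hi are invented for this sketch.

```c
#include <stdbool.h>

/* The test case above, renamed so both versions can coexist.  */
bool foo_p_gt(float d, float min, float max, float a)
{
    float tmin, tmax;
    float inv = 1.0f / d;

    if (inv >= 0) {
        tmin = (min - a) * inv;
        tmax = (max - a) * inv;
    } else {
        tmin = (max - a) * inv;
        tmax = (min - a) * inv;
    }
    return tmax > tmin;
}

/* Hoisted sketch: both products are computed before the test on inv,
   and only the direction of the comparison depends on the branch.  */
bool foo_p_gt_hoisted(float d, float min, float max, float a)
{
    float inv = 1.0f / d;
    float lo = (min - a) * inv;
    float hi = (max - a) * inv;

    return inv >= 0 ? hi > lo : lo > hi;
}
```

Unlike the fadd variant, the two arms here give genuinely different answers when the operands are swapped, so a select or branch on inv >= 0 has to remain.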
[Bug middle-end/70159] missed CSE optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70159

Jakub Jelinek changed:

           What    |Removed                     |Added
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek ---
Well, as tmax + tmin is commutative, this one actually should be optimized into
just (min - a) * inv + (max - a) * inv or with -Ofast perhaps to
(min - a + max - a) * inv.
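The two target forms mentioned here can be written out as C for reference. This is a sketch of the suggested end results, with function names (foo_p_branchless, foo_p_reassoc) made up for illustration; the second form reassociates floating-point operations, so it is only an acceptable rewrite under options such as -Ofast and may differ from the first in the last bits for general inputs.

```c
/* Branch-free form: valid without fast-math, since it only relies on
   floating-point addition being commutative.  */
float foo_p_branchless(float d, float min, float max, float a)
{
    float inv = 1.0f / d;

    return (min - a) * inv + (max - a) * inv;
}

/* Reassociated form suggested for -Ofast: one multiplication instead
   of two.  Not bit-identical to the form above in general, because
   the rounding steps happen in a different order.  */
float foo_p_reassoc(float d, float min, float max, float a)
{
    float inv = 1.0f / d;

    return (min - a + max - a) * inv;
}
```

For inputs where every intermediate value is exactly representable (for example d = 2, min = 1, max = 5, a = 1) both forms give the same result; they can diverge slightly once rounding is involved.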