https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79390

--- Comment #23 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #22)
> (In reply to rguent...@suse.de from comment #21)
> > On April 7, 2017 6:57:13 PM GMT+02:00, "jakub at gcc dot gnu.org"
> > <gcc-bugzi...@gcc.gnu.org> wrote:
> > >https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79390
> > >
> > >--- Comment #20 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
> > >So, Richard, any thoughts on what can be done in split paths to avoid
> > >this?
> > 
> > Invent some new heuristic that avoids splitting this case...
> 
> Index: gcc/gimple-ssa-split-paths.c
> ===================================================================
> --- gcc/gimple-ssa-split-paths.c        (revision 246803)
> +++ gcc/gimple-ssa-split-paths.c        (working copy)
> @@ -249,13 +249,17 @@ is_feasible_trace (basic_block bb)
>                   imm_use_iterator iter2;
>                   FOR_EACH_IMM_USE_FAST (use2_p, iter2, gimple_phi_result (stmt))
>                     {
> -                     if (is_gimple_debug (USE_STMT (use2_p)))
> +                     gimple *use_stmt = USE_STMT (use2_p);
> +                     if (is_gimple_debug (use_stmt))
>                         continue;
> -                     basic_block use_bb = gimple_bb (USE_STMT (use2_p));
> +                     basic_block use_bb = gimple_bb (use_stmt);
>                       if (use_bb != bb
>                           && dominated_by_p (CDI_DOMINATORS, bb, use_bb))
>                         {
> -                         found_useful_phi = true;
> +                         if (gcond *cond = dyn_cast <gcond *> (use_stmt))
> +                           if (gimple_cond_code (cond) == EQ_EXPR
> +                               || gimple_cond_code (cond) == NE_EXPR)
> +                             found_useful_phi = true;
>                           break;
>                         }
>                     }
> 
> avoids the splitting and at least passes tree-ssa.exp testing.  Throwing it
> on full testing (there are some path splitting testcases randomly placed
> IIRC).
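
For readers who do not have is_feasible_trace in front of them, here is a
rough standalone model of what the patched loop decides.  It is only a
sketch: Use, CondCode and found_useful_phi_in_uses are made-up names, not
GCC types or functions, and the dominance query from the patch is reduced to
a precomputed flag on each use.

```cpp
#include <vector>

// Toy stand-ins for illustration only; none of these are GCC types.
enum class CondCode { None, Eq, Ne, Lt };

struct Use {
  bool is_debug;        // models is_gimple_debug (use_stmt)
  int bb;               // models gimple_bb (use_stmt), as a block id
  bool dominance_holds; // models dominated_by_p (CDI_DOMINATORS, bb, use_bb)
  bool is_cond;         // models dyn_cast <gcond *> (use_stmt) succeeding
  CondCode code;        // models gimple_cond_code (cond)
};

// Models the patched loop over the uses of one PHI result.  The first
// non-debug use that lives in another block and passes the dominance
// test ends the walk; the PHI counts as useful only if that use is an
// equality or inequality condition.
bool found_useful_phi_in_uses(const std::vector<Use> &uses, int bb) {
  for (const Use &u : uses) {
    if (u.is_debug)
      continue;
    if (u.bb != bb && u.dominance_holds)
      return u.is_cond
             && (u.code == CondCode::Eq || u.code == CondCode::Ne);
  }
  return false;
}
```

Before the patch the same loop would have reported the PHI as useful for any
such use, whatever kind of statement it was; the sketch only covers this one
loop, not the rest of is_feasible_trace.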

Bootstrap / regtest went ok.  With this patch and -O3 -march=native (on a
Broadwell CPU) I get

gcc6 -O3 -march=native: 5469.25 Mflops
gcc7 -O3 -march=native: 5439.39 Mflops

but note that with -Ofast -march=native the situation is still bad
(-fno-split-paths doesn't help but -ftree-loop-if-convert does):

gcc6 -Ofast -march=native: 5500.51 Mflops
gcc7 -Ofast -march=native: 4765.56 Mflops
gcc7 -Ofast -march=native -ftree-loop-if-convert: 5335.49 Mflops

Shall I go for the split-path fix for the moment?
