[Bug tree-optimization/102383] Missing optimization for PRE after enable O2 vectorization

2023-11-02 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102383

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #6 from Richard Biener  ---
Indeed, this was fixed by r12-1275-g4db34072d5336d, which enables predcom
when vectorization is enabled.

[Bug tree-optimization/102383] Missing optimization for PRE after enable O2 vectorization

2023-10-31 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102383

--- Comment #5 from Hongtao.liu  ---
It's fixed in GCC 12.1.

[Bug tree-optimization/102383] Missing optimization for PRE after enable O2 vectorization

2021-09-17 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102383

Kewen Lin  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |linkw at gcc dot gnu.org

--- Comment #4 from Kewen Lin  ---
(In reply to Richard Biener from comment #2)
> The issue is that we tame PRE because it tends to inhibit vectorization.
> 
>   /* Inhibit the use of an inserted PHI on a loop header when
>      the address of the memory reference is a simple induction
>      variable.  In other cases the vectorizer won't do anything
>      anyway (either it's loop invariant or a complicated
>      expression).  */
>   if (sprime
>       && TREE_CODE (sprime) == SSA_NAME
>       && do_pre
>       && (flag_tree_loop_vectorize || flag_tree_parallelize_loops > 1)
>       && loop_outer (b->loop_father)
>       && has_zero_uses (sprime)
>       && bitmap_bit_p (inserted_exprs, SSA_NAME_VERSION (sprime))
>       && gimple_assign_load_p (stmt))
> 
> the heuristic would either need to become much more elaborate (do more
> checks whether vectorization is likely) or we could make the behavior
> depend on the cost model as well, for example exclude very-cheap here.
> That might have an influence on the performance benefit seen from
> -O2 default vectorization though.
> 
> IIRC we suggested to enable predictive commoning at -O2 but avoid
> unroll factors > 1 when it was not explicitly enabled.
> 

Yeah, it's PR100794.  I also collected some data for different approaches at
that time.  Recently I opened another issue, PR102054, which is also related
to PRE being restricted because of loop vectorization.

[Bug tree-optimization/102383] Missing optimization for PRE after enable O2 vectorization

2021-09-17 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102383

--- Comment #3 from Hongtao.liu  ---
The issue also exists at -O3.

[Bug tree-optimization/102383] Missing optimization for PRE after enable O2 vectorization

2021-09-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102383

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
                 CC|                            |rguenth at gcc dot gnu.org
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2021-09-17

--- Comment #2 from Richard Biener  ---
The issue is that we tame PRE because it tends to inhibit vectorization.

  /* Inhibit the use of an inserted PHI on a loop header when
     the address of the memory reference is a simple induction
     variable.  In other cases the vectorizer won't do anything
     anyway (either it's loop invariant or a complicated
     expression).  */
  if (sprime
      && TREE_CODE (sprime) == SSA_NAME
      && do_pre
      && (flag_tree_loop_vectorize || flag_tree_parallelize_loops > 1)
      && loop_outer (b->loop_father)
      && has_zero_uses (sprime)
      && bitmap_bit_p (inserted_exprs, SSA_NAME_VERSION (sprime))
      && gimple_assign_load_p (stmt))

the heuristic would either need to become much more elaborate (do more
checks whether vectorization is likely) or we could make the behavior
depend on the cost model as well, for example exclude very-cheap here.
That might have an influence on the performance benefit seen from
-O2 default vectorization though.

IIRC we suggested to enable predictive commoning at -O2 but avoid
unroll factors > 1 when it was not explicitly enabled.

Note that the issue for this testcase is that w/o PRE the predcom
behaves differently (but the testcase comment suggests that we'd
have to undo PRE).
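
To make the trade-off above concrete, here is a minimal C sketch.  It is
a hypothetical illustration, not the testcase from this report, and the
names sum_adjacent, sum_adjacent_carried and check_equivalent are made
up for it.  The first function reloads a value that the previous
iteration already loaded; the second is roughly the scalar form that
PRE or predictive commoning would aim for, with the reused value carried
across iterations (a PHI on the loop header in GIMPLE terms).  That
carried value is the kind of inserted PHI the check above avoids when
the vectorizer is enabled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: each iteration loads a[i] and a[i + 1], and
   the a[i + 1] load is the a[i] load of the next iteration.  The
   caller must provide n + 1 readable elements in a.  */
void
sum_adjacent (int *restrict out, const int *restrict a, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = a[i] + a[i + 1];
}

/* Hand-written equivalent with the reused load carried in a scalar,
   so each iteration performs a single load from a.  */
void
sum_adjacent_carried (int *restrict out, const int *restrict a, size_t n)
{
  if (n == 0)
    return;
  assert (out != NULL && a != NULL);
  int prev = a[0];
  for (size_t i = 0; i < n; i++)
    {
      int next = a[i + 1];
      out[i] = prev + next;
      prev = next;	/* Value carried into the next iteration.  */
    }
}

/* Returns 1 when both versions agree on a small input, 0 otherwise.  */
int
check_equivalent (void)
{
  enum { N = 8 };
  int a[N + 1], x[N], y[N];
  for (int i = 0; i <= N; i++)
    a[i] = i * i;
  sum_adjacent (x, a, N);
  sum_adjacent_carried (y, a, N);
  for (int i = 0; i < N; i++)
    if (x[i] != y[i])
      return 0;
  return 1;
}
```

Whether GCC produces the second form for a given loop depends on the
release and on the options discussed in this thread, so compare dumps
(for example with -fdump-tree-pre and -fdump-tree-pcom) rather than
assuming either result.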

[Bug tree-optimization/102383] Missing optimization for PRE after enable O2 vectorization

2021-09-16 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102383

--- Comment #1 from Hongtao.liu  ---
Possibly a similar issue for gfortran.dg/pr77498.f? (Not quite sure.)