https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70359

--- Comment #34 from rguenther at suse dot de <rguenther at suse dot de> ---
On Thu, 8 Mar 2018, aldyh at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70359
> 
> --- Comment #32 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
> As mentioned in the previous comment, the proposed patch brings down the count
> from 116 to 108 on ARM, but is shy of the desired 96.
> 
> The missing bytes can be attributed to forwprop folding this (IL expanded for
> illustration):
> 
>   if (ui_7 / 10 != 0)
> 
> into:
> 
>   if (ui_7 > 9)
> 
> More specifically, changing this:
> 
>   # ui_7 = PHI <ui_13(2), ui_21(3)>
>   ...
>   ui_21 = ui_7 / 10;
>   if (ui_21 != 0)
> 
> into:
> 
>   # ui_7 = PHI <ui_13(2), ui_21(3)>
>   ...
>   ui_21 = ui_7 / 10;
>   if (ui_7 > 9)
> 
> Inhibiting this optimization brings down the byte count to 92 which is even
> lower than our 96 boogie man, so perhaps worth pursuing.  (Assumes my proposed
> patch is also applied.)  I'm no expert, but isn't a EQ/NE with 0 preferable
> than a <> with a non-zero?
> 
> If so, should we restrict the folding somewhat, or clean this up after the
> fact?
> 
> For reference, the folding (in forwprop) is due to this match.pd pattern:
> 
> /* X / C1 op C2 into a simple range test.  */
> 
> ...though eliminating it causes another pattern to pick up the slack and do 
> the
> same:
> 
> /* Transform:
>  * (X / Y) == 0 -> X < Y if X, Y are unsigned.
>  * (X / Y) != 0 -> X >= Y, if X, Y are unsigned.
>  */
> 
> Eliminating both patterns "fixes" the problem.
> 
> Suggestions welcome :).

In other places where this happened we add single_use () checks to
the patterns.  The :s isn't really effective for patterns that
simplify into "simple" expressions.  A patch doing that to the
two patterns in question would be OK I think.

Sth like

Index: gcc/match.pd
===================================================================
--- gcc/match.pd        (revision 258359)
+++ gcc/match.pd        (working copy)
@@ -1290,11 +1290,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 /* X / C1 op C2 into a simple range test.  */
 (for cmp (simple_comparison)
  (simplify
-  (cmp (trunc_div:s @0 INTEGER_CST@1) INTEGER_CST@2)
+  (cmp (trunc_div:s@3 @0 INTEGER_CST@1) INTEGER_CST@2)
   (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
        && integer_nonzerop (@1)
        && !TREE_OVERFLOW (@1)
-       && !TREE_OVERFLOW (@2))
+       && !TREE_OVERFLOW (@2)
+       && single_use (@3))
    (with { tree lo, hi; bool neg_overflow;
           enum tree_code code = fold_div_compare (cmp, @1, @2, &lo, &hi,
                                                   &neg_overflow); }

tough there is at least one constant simplification result where
we shouldn't apply this restriction so moving the single_use check
to multiple places below or factoring things a bit would be
appropriate.

Similar for the other pattern then.

Reply via email to