Hi all,

This patch sets the default to 16 for the parameter
max_unrolled_average_calls (stored as 160000 in
max-unrolled-average-calls-x10000), which can be used to restrict
unrolling of loops that contain calls.  The default of 16 is large
enough to keep the current behavior in almost all cases.
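
As a rough illustration of the scaling only (not the patch itself), the
comparison against the x10000 parameter can be sketched as below.  The
name exceeds_call_limit and the use of double instead of sreal are
assumptions made for this sketch, and the truncating conversion is an
assumption about rounding rather than a statement about sreal::to_int.

```cpp
#include <cassert>

// Scale factor implied by the "_x10000" suffix of the parameter name.
constexpr int kScale = 10000;

// Hypothetical stand-in for the check added to loop-unroll.c.
// The patch works on sreal; plain double is used here for illustration.
// Returns true when the scaled average exceeds the scaled limit.
bool exceeds_call_limit (double average_calls, int max_calls_x10000)
{
  assert (average_calls >= 0.0);
  // Truncate toward zero after scaling (assumed rounding behavior).
  int scaled = static_cast<int> (average_calls * kScale);
  return scaled > max_calls_x10000;
}
```

With the default of 160000, an average of exactly 16 calls does not
exceed the limit, while 16.5 does.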

Bootstrap and regtest pass on powerpc64le.  Is this ok for trunk?

Thanks for comments!

Jiufu Guo

gcc/ChangeLog
2020-09-16  Jiufu Guo   <guoji...@cn.ibm.com>

        * params.opt (param_max_unrolled_average_calls_x10000): New param.
        * cfgloop.h (average_num_loop_calls): New declaration.
        * cfgloopanal.c (average_num_loop_calls): New function.
        * loop-unroll.c (decide_unroll_constant_iterations,
        decide_unroll_runtime_iterations,
        decide_unroll_stupid): Check average_num_loop_calls and
        param_max_unrolled_average_calls_x10000.
---
 gcc/cfgloop.h     |  2 ++
 gcc/cfgloopanal.c | 25 +++++++++++++++++++++++++
 gcc/loop-unroll.c | 10 ++++++++++
 gcc/params.opt    |  4 ++++
 4 files changed, 41 insertions(+)

diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index 18b404e292f..dab933da150 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -21,6 +21,7 @@ along with GCC; see the file COPYING3.  If not see
 #define GCC_CFGLOOP_H
 
 #include "cfgloopmanip.h"
+#include "sreal.h"
 
 /* Structure to hold decision about unrolling/peeling.  */
 enum lpt_dec
@@ -387,6 +388,7 @@ extern vec<edge> get_loop_exit_edges (const class loop *, basic_block * = NULL);
 extern edge single_exit (const class loop *);
 extern edge single_likely_exit (class loop *loop, vec<edge>);
 extern unsigned num_loop_branches (const class loop *);
+extern sreal average_num_loop_calls (const class loop *);
 
 extern edge loop_preheader_edge (const class loop *);
 extern edge loop_latch_edge (const class loop *);
diff --git a/gcc/cfgloopanal.c b/gcc/cfgloopanal.c
index 0b33e8272a7..a314db4e0c0 100644
--- a/gcc/cfgloopanal.c
+++ b/gcc/cfgloopanal.c
@@ -233,6 +233,31 @@ average_num_loop_insns (const class loop *loop)
   return ret;
 }
 
+/* Return the profile-weighted average number of call insns in LOOP.  */
+sreal
+average_num_loop_calls (const class loop *loop)
+{
+  basic_block *bbs;
+  rtx_insn *insn;
+  unsigned int i, bncalls;
+  sreal ncalls = 0;
+
+  bbs = get_loop_body (loop);
+  for (i = 0; i < loop->num_nodes; i++)
+    {
+      bncalls = 0;
+      FOR_BB_INSNS (bbs[i], insn)
+       if (CALL_P (insn))
+         bncalls++;
+
+      ncalls += (sreal) bncalls
+       * bbs[i]->count.to_sreal_scale (loop->header->count);
+    }
+  free (bbs);
+
+  return ncalls;
+}
+
 /* Returns expected number of iterations of LOOP, according to
    measured or guessed profile.
 
diff --git a/gcc/loop-unroll.c b/gcc/loop-unroll.c
index 693c7768868..56b8fb37d2a 100644
--- a/gcc/loop-unroll.c
+++ b/gcc/loop-unroll.c
@@ -370,6 +370,10 @@ decide_unroll_constant_iterations (class loop *loop, int flags)
     nunroll = nunroll_by_av;
   if (nunroll > (unsigned) param_max_unroll_times)
     nunroll = param_max_unroll_times;
+  if (!loop->unroll
+      && (average_num_loop_calls (loop) * (sreal) 10000).to_int ()
+          > (unsigned) param_max_unrolled_average_calls_x10000)
+    nunroll = 0;
 
   if (targetm.loop_unroll_adjust)
     nunroll = targetm.loop_unroll_adjust (nunroll, loop);
@@ -689,6 +693,9 @@ decide_unroll_runtime_iterations (class loop *loop, int flags)
     nunroll = nunroll_by_av;
   if (nunroll > (unsigned) param_max_unroll_times)
     nunroll = param_max_unroll_times;
+  if ((average_num_loop_calls (loop) * (sreal) 10000).to_int ()
+      > (unsigned) param_max_unrolled_average_calls_x10000)
+    nunroll = 0;
 
   if (targetm.loop_unroll_adjust)
     nunroll = targetm.loop_unroll_adjust (nunroll, loop);
@@ -1173,6 +1180,9 @@ decide_unroll_stupid (class loop *loop, int flags)
     nunroll = nunroll_by_av;
   if (nunroll > (unsigned) param_max_unroll_times)
     nunroll = param_max_unroll_times;
+  if ((average_num_loop_calls (loop) * (sreal) 10000).to_int ()
+      > (unsigned) param_max_unrolled_average_calls_x10000)
+    nunroll = 0;
 
   if (targetm.loop_unroll_adjust)
     nunroll = targetm.loop_unroll_adjust (nunroll, loop);
diff --git a/gcc/params.opt b/gcc/params.opt
index f39e5d1a012..80605861223 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -634,6 +634,10 @@ The maximum number of unrollings of a single loop.
 Common Joined UInteger Var(param_max_unrolled_insns) Init(200) Param Optimization
 The maximum number of instructions to consider to unroll in a loop.
 
+-param=max-unrolled-average-calls-x10000=
+Common Joined UInteger Var(param_max_unrolled_average_calls_x10000) Init(160000) Param Optimization
+The maximum average number of calls in a loop to consider for unrolling, multiplied by 10000.
+
 -param=max-unswitch-insns=
 Common Joined UInteger Var(param_max_unswitch_insns) Init(50) Param Optimization
 The maximum number of insns of an unswitched loop.
-- 
2.25.1



Jan Hubicka <hubi...@ucw.cz> writes:

>> On Thu, Aug 20, 2020 at 6:35 AM guojiufu via Gcc-patches
>> <gcc-patches@gcc.gnu.org> wrote:
>> >
>> > Hi,
>> >
>> > When unrolling loops, calls inside the loop may have a negative
>> > impact.  This patch adds a param, param_max_unrolled_calls, and
>> > prevents a loop from being unrolled if the number of calls inside
>> > it is bigger than this param.
>> >
>> > The patch checks the _average_ number of calls, which is the sum
>> > of the call counts, each multiplied by the probability that the
>> > call is executed.  The _average_ number can be a fraction, so to
>> > keep precision the param is the threshold multiplied by 10000.
>> >
>> > Bootstrap and regtest pass on powerpc64le.  Is this ok for trunk?
>> 
>> Can you try mimicking what try_unroll_loop_completely on GIMPLE does
>> instead?  IIRC the main motivation to not unroll calls is the spilling code
>> around it which we cannot estimate very well.  And that spilling happens
>> irrespective of whether the call is in a hot or cold path so I'm not sure
>> it makes sense to use the "average" number of calls here.
>
> As far as I remember, we excluded calls simply because a call is/was
> an expensive instruction, so it was an indication that the loop
> overhead is small compared to the overhead of the loop body.
>
> Honza
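
To make the "average" under discussion concrete, here is a minimal
sketch of a frequency-weighted call count.  The Block struct, the
function name average_calls and the use of double are assumptions for
illustration; the patch itself walks basic blocks and uses sreal with
profile counts.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified model of a basic block inside a loop.
struct Block
{
  unsigned calls;  // number of call instructions in the block
  double count;    // profile count (execution frequency) of the block
};

// Sum each block's call count, weighted by how often the block runs
// relative to the loop header.
double average_calls (const std::vector<Block> &body, double header_count)
{
  assert (header_count > 0.0);
  double total = 0.0;
  for (const Block &bb : body)
    total += bb.calls * (bb.count / header_count);
  return total;
}
```

For a header count of 100, a hot block with one call (count 100) plus a
cold block with four calls (count 1) averages to roughly 1.04, so calls
on a cold path contribute very little to this metric.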
