https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59515

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
-fkeep-inline-functions results for

extern void foobar();
struct Foo {
    void baz () {
        foobar();
        foobar();
    }
    void bar () {
        baz();
        baz();
    }
};
int main()
{
  Foo foo;
  foo.bar();
  return 0;
}

in

_ZN3Foo3bazEv:
.LFB0:
        .cfi_startproc
        subq    $8, %rsp
        .cfi_def_cfa_offset 16
        call    _Z6foobarv
        call    _Z6foobarv
        addq    $8, %rsp
        .cfi_def_cfa_offset 8
        ret

_ZN3Foo3barEv:
.LFB1:
        .cfi_startproc
        subq    $8, %rsp
        .cfi_def_cfa_offset 16
        call    _Z6foobarv
        call    _Z6foobarv
        call    _Z6foobarv
        call    _Z6foobarv
        addq    $8, %rsp
        .cfi_def_cfa_offset 8
        ret

main:
.LFB2:
        .cfi_startproc
        subq    $8, %rsp
        .cfi_def_cfa_offset 16
        call    _Z6foobarv
        call    _Z6foobarv
        call    _Z6foobarv
        call    _Z6foobarv
        movl    $0, %eax
        addq    $8, %rsp
        .cfi_def_cfa_offset 8
        ret


That's undesirable and for some testcases can lead to an exponential increase
in compile time and binary size.  So what we want, for functions that are
kept only because of -fkeep-inline-functions, is to restrict inlining
even more (_really_ only optimize if the size shrinks), or maybe even to
disable inlining completely.
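
To make the exponential case concrete, here is a hypothetical testcase in the
style of the one above, plus a small model of the resulting call counts.  The
Chain struct and both helper functions are made up for illustration; the model
assumes every level is fully inlined into the next one and is also kept as an
offline copy.

```cpp
extern void foobar();

// Hypothetical chain: each level calls the previous level twice.
struct Chain {
    void f0() { foobar(); foobar(); }
    void f1() { f0(); f0(); }
    void f2() { f1(); f1(); }
    void f3() { f2(); f2(); }
};

// Number of foobar() calls in the body of level `level` once everything
// below it has been inlined: 2 for f0, doubling with each level.
constexpr unsigned long calls_after_full_inlining(unsigned level) {
    return level == 0 ? 2ul : 2ul * calls_after_full_inlining(level - 1);
}

// Total number of call instructions emitted when every level up to
// `top_level` is kept as its own offline copy (assumed, as with
// -fkeep-inline-functions).
constexpr unsigned long total_kept_calls(unsigned top_level) {
    return top_level == 0
        ? calls_after_full_inlining(0)
        : calls_after_full_inlining(top_level) + total_kept_calls(top_level - 1);
}

// Level 1 corresponds to Foo::bar above: four calls to foobar().
static_assert(calls_after_full_inlining(1) == 4, "bar-like level has 4 calls");
```

Under these assumptions each extra level doubles the body of the top-level
function, and the kept copies roughly double the total again.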

As said, on the implementation side this would require keeping offline copy
clones, due to the way early inlining works.

Another option would be to reduce --param early-inlining-insns for -Og
and live with the theoretical exponential explosion issue.
With that --param set to zero we do not inline into main but we do inline
into bar () (weird).  The threshold to inline into main is 10.
Something is wrong with size accounting it seems?

Analyzing function body size: void Foo::baz()
                Accounting size:2.00, time:0.00 on new predicate:(not inlined)

 BB 2 predicate:(true)
  foobar ();
                freq:1.00 size:  1 time: 10
  foobar ();
                freq:1.00 size:  1 time: 10
  return;
                freq:1.00 size:  1 time:  2
                Will be eliminated by inlining
                Accounting size:1.00, time:2.00 on predicate:(not inlined)

Inline summary for void Foo::baz()/0 inlinable
  self time:       22
  global time:     0
  self size:       5
  global size:     0
  min size:       0
  self stack:      0
  global stack:    0
    size:0.000000, time:0.000000, predicate:(true)
    size:3.000000, time:2.000000, predicate:(not inlined)

  calls:
    void Foo::baz()/0 function not considered for inlining
      loop depth: 0 freq:1000 size: 2 time: 11 callee size: 2 stack: 0
    void Foo::baz()/0 function not considered for inlining
      loop depth: 0 freq:1000 size: 2 time: 11 callee size: 2 stack: 0

So it seems the call with the 'this' parameter has size 2, and the inlined
function body, with two calls w/o parameters, has size 2 as well.  Makes
sense in a simplistic way...
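
The arithmetic can be sketched as a tiny model.  This is not GCC source: the
function names are made up, and the rule "growth * (calls in callee + 1) must
not exceed the param" is an assumption modeled loosely on the early inliner.
The sizes for baz come from the dump above; the sizes for inlining bar into
main (body 4, call 2) are extrapolated, not taken from a dump.

```cpp
// Growth of the caller when a call statement is replaced by the callee body.
constexpr int edge_growth(int inlined_body_size, int call_stmt_size) {
    return inlined_body_size - call_stmt_size;
}

// Simplified early-inlining decision (assumed rule, see lead-in):
// always inline when the caller does not grow, otherwise require both
// the growth and growth * (calls + 1) to stay within the param.
constexpr bool early_inline_ok(int growth, int calls_in_callee,
                               int early_inlining_insns) {
    return growth <= 0
        || (growth <= early_inlining_insns
            && (calls_in_callee == 0
                || growth * (calls_in_callee + 1) <= early_inlining_insns));
}

// baz into bar: inlined body 2 (two calls of size 1), call statement 2.
static_assert(edge_growth(2, 2) == 0, "no growth");
static_assert(early_inline_ok(0, 2, 0), "inlined even with the param at 0");

// bar into main (assumed sizes): body 4 after inlining baz, call statement 2.
static_assert(edge_growth(4, 2) == 2, "growth of 2");
static_assert(early_inline_ok(2, 4, 10), "allowed at 10");
static_assert(!early_inline_ok(2, 4, 9), "rejected at 9");
```

If the model is right, zero growth explains why baz is still inlined into bar
with the param at zero, and 2 * (4 + 1) = 10 would be consistent with the
observed threshold of 10 for main.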

That said, a band-aid fix would be the following.  Some benchmarking
(compile-time / runtime / code-size) to see whether adjusting the
early inlining param is necessary would be nice.  Documentation should
be adjusted to reflect the -fkeep-inline-functions default for -Og.

Index: gcc/opts.c
===================================================================
--- gcc/opts.c  (revision 254074)
+++ gcc/opts.c  (working copy)
@@ -673,11 +673,17 @@ default_options_optimization (struct gcc
                           default_param_value (PARAM_MIN_CROSSJUMP_INSNS),
                           opts->x_param_values, opts_set->x_param_values);

-  /* Restrict the amount of work combine does at -Og while retaining
-     most of its useful transforms.  */
   if (opts->x_optimize_debug)
-    maybe_set_param_value (PARAM_MAX_COMBINE_INSNS, 2,
-                          opts->x_param_values, opts_set->x_param_values);
+    {
+      /* Restrict the amount of work combine does at -Og while retaining
+        most of its useful transforms.  */
+      maybe_set_param_value (PARAM_MAX_COMBINE_INSNS, 2,
+                            opts->x_param_values, opts_set->x_param_values);
+      /* Restrict early inlining to avoid -fkeep-inline-functions kept
+         functions to grow too large.  */
+      maybe_set_param_value (PARAM_EARLY_INLINING_INSNS, 4,
+                            opts->x_param_values, opts_set->x_param_values);
+    }

   /* Allow default optimizations to be specified on a per-machine basis.  */
   maybe_default_options (opts, opts_set,
@@ -948,6 +954,11 @@ finish_options (struct gcc_options *opts
     maybe_set_param_value (PARAM_MAX_STORES_TO_SINK, 0,
                            opts->x_param_values, opts_set->x_param_values);

+  /* When using -Og enable -fkeep-inline-functions.  */
+  if (opts->x_optimize_debug
+      && !opts_set->x_flag_keep_inline_functions)
+    opts->x_flag_keep_inline_functions = 1;
+
   /* The -gsplit-dwarf option requires -ggnu-pubnames.  */
   if (opts->x_dwarf_split_debug_info)
     opts->x_debug_generate_pub_sections = 2;
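
The intent of the two hunks can be modeled in isolation as below.  This is a
sketch only: Options, ExplicitlySet and apply_og_defaults are hypothetical
names, the two settings are merged into one function although the patch
touches two different ones, and the value 14 used in the checks is a
placeholder rather than GCC's actual default.

```cpp
// Stand-ins for the fields the patch touches (hypothetical names).
struct Options {
    bool optimize_debug;         // models x_optimize_debug (-Og)
    bool keep_inline_functions;  // models x_flag_keep_inline_functions
    int  early_inlining_insns;   // models PARAM_EARLY_INLINING_INSNS
};

// Records which settings the user gave explicitly (models opts_set).
struct ExplicitlySet {
    bool keep_inline_functions;
    bool early_inlining_insns;
};

// At -Og, apply the new defaults, but never override an explicit user choice.
constexpr Options apply_og_defaults(Options o, ExplicitlySet s) {
    return !o.optimize_debug
        ? o
        : Options{ true,
                   s.keep_inline_functions ? o.keep_inline_functions : true,
                   s.early_inlining_insns ? o.early_inlining_insns : 4 };
}

// -Og with nothing set explicitly: both defaults kick in.
static_assert(apply_og_defaults(Options{true, false, 14},
                                ExplicitlySet{false, false})
                  .keep_inline_functions,
              "keep-inline-functions enabled by default at -Og");
```

An explicit -fno-keep-inline-functions or --param early-inlining-insns=N
would be left alone in this model, which is the behaviour the opts_set
checks in the patch aim for.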
