On March 14, 2020 10:55:09 AM GMT+01:00, "FRÉDÉRIC RECOULES"
<[email protected]> wrote:
>Hello the GCC community,
>I just want to share some thoughts on inlining a function even if
>it is called through a function pointer.
>My starting point is version 9.2 (as used at https://godbolt.org/),
>so I am sorry if something similar has already been discussed since.
>
>
>For context, I got very excited when I discovered Link Time
>Optimization (not so new, but not yet widely used) and I started to
>play with it to put its inlining capabilities to the test.
>I will assume, however, that LTO is just an enabler, so the examples
>can be simplified by writing everything in the same file and
>activating whole-program optimization.
>
>
>To make my remarks concrete, I will rely on the following (dumb but
>inspired by real software) example compiled with -O3 -fwhole-program:
>
>int (*f) (int, int);
>
>static int f_add (int x, int y)
>{
> return x + y;
>}
>
>static int f_sub (int x, int y)
>{
> return x - y;
>}
>
>enum f_e { ADD, SUB };
>void f_init(enum f_e op) {
> switch (op) {
> case ADD:
> f = &f_add;
> break;
> case SUB:
> f = &f_sub;
> }
>}
>
>STEP 1: statically known at function call site
>
>#include <stdlib.h>
>#include <stdio.h>
>
>int main (int argc, char *argv[])
>{
> int x, y, z;
> f_init(ADD);
> if (argc < 3) return -1;
> x = atoi(argv[1]);
> y = atoi(argv[2]);
> z = f(x, y);
> printf("%d\n", z);
> return 0;
>}
>
>I was pretty disappointed to see that even though the compiler knows
>we are calling f_add, it doesn't inline the call (it ends up with
>"call f_add").
It's probably because we know it's only called once and thus not
performance relevant. Try putting it into a loop.
Richard.
>I can only suppose it is because its address is taken, and from a
>black-box user's perspective, it doesn't sound too difficult to
>inline it completely.
>
>STEP 2: statically known to be one of a pool of at most N functions
>        (N arbitrarily fixed to 2)
>
>#include <stdlib.h>
>#include <stdio.h>
>#include <string.h>
>
>int main (int argc, char *argv[])
>{
> int x, y, z;
> enum f_e e;
> if (argc < 4) return -1;
> if (strcmp(argv[1], "add") == 0)
> e = ADD;
> else if (strcmp(argv[1], "sub") == 0)
> e = SUB;
> else return -1;
> f_init(e);
> x = atoi(argv[2]);
> y = atoi(argv[3]);
> z = f(x, y);
> printf("%d\n", z);
> return 0;
>}
>
>Here the compiler can't know at compile time which function will be
>called, but I suppose it knows that it will be either f_add or f_sub.
>A simple workaround would be for the compiler to test the value of f
>at the call site and inline the call thereafter:
>
> if (f == &f_add)
> z = f_add(x, y);
> else if (f == &f_sub)
> z = f_sub(x, y);
> else __builtin_unreachable(); /* or z = f(x, y) to be conservative */
>
>Once again, this transformation doesn't sound too complicated to
>implement.
>Still, it is easy to say so without diving into the compiler's code.
>
>
>I hope it will assist you in your reflections,
>Have a nice day,
>Frédéric Recoules