On Wed, Feb 05, 2014 at 02:20:05PM +0100, Richard Biener wrote: > > Using !!optimize to determine if we should switch local ABI to regparm > > convention isn't compatible with optimize attribute, as !!optimize is > > whether the current function is being optimized, but for the ABI decisions > > we actually need the caller and callee to agree on the calling convention. > > > > Fixed by looking at callee's optimize settings all the time. > > > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > Seeing a 2nd variant of such code I'd rather have an abstraction > for this ... optimize_fn ()? opt_for_fn (fn, OPT_...)?
Here is an updated version of the patch, unfortunately this one regressed in quite a lot of tests. The problem is that by using opt_for_fn there it also sets the failure reason for all functions at -O0 that don't have the optimize attribute at all. Now copy_forbidden is used by the inliner, where we have to handle always_inline functions, other -O0 functions we likely don't want to inline, be it because the cmd line options are -O0 or because the function we are considering for inlining has optimize (0), and for versioning, where I'd say we want to never version the -O0 functions. So, where do we want to do that instead? E.g. should it be e.g. in tree_versionable_function_p directly and let the inliner (if it doesn't do already) also treat optimize(0) functions that aren't always_inline as noinline? I guess even without this patch my PR60026 fix broke inlining of __attribute__((always_inline, optimize (0))) functions. 2014-02-05 Jakub Jelinek <ja...@redhat.com> PR target/60062 * tree.h (opts_for_fn): New inline function. (opt_for_fn): Define. * config/i386/i386.c (ix86_function_regparm): Use opt_for_fn (decl, optimize) instead of optimize. * tree-inline.c (copy_forbidden): Use opt_for_fn. * gcc.c-torture/execute/pr60062.c: New test. * gcc.c-torture/execute/pr60072.c: New test. --- gcc/tree.h.jj 2014-01-20 19:16:29.000000000 +0100 +++ gcc/tree.h 2014-02-05 15:54:04.681904492 +0100 @@ -4470,6 +4470,20 @@ may_be_aliased (const_tree var) || TREE_ADDRESSABLE (var))); } +/* Return pointer to optimization flags of FNDECL. */ +static inline struct cl_optimization * +opts_for_fn (const_tree fndecl) +{ + tree fn_opts = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl); + if (fn_opts == NULL_TREE) + fn_opts = optimization_default_node; + return TREE_OPTIMIZATION (fn_opts); +} + +/* opt flag for function FNDECL, e.g. opts_for_fn (fndecl, optimize) is + the optimization level of function fndecl. */ +#define opt_for_fn(fndecl, opt) (opts_for_fn (fndecl)->x_##opt) + /* For anonymous aggregate types, we need some sort of name to hold on to. In practice, this should not appear, but it should not be harmful if it does. */ --- gcc/config/i386/i386.c.jj 2014-02-05 10:37:36.166167116 +0100 +++ gcc/config/i386/i386.c 2014-02-05 15:56:36.779146394 +0100 @@ -5608,7 +5608,12 @@ ix86_function_regparm (const_tree type, /* Use register calling convention for local functions when possible. */ if (decl && TREE_CODE (decl) == FUNCTION_DECL - && optimize + /* Caller and callee must agree on the calling convention, so + checking here just optimize means that with + __attribute__((optimize (...))) caller could use regparm convention + and callee not, or vice versa. Instead look at whether the callee + is optimized or not. */ + && opt_for_fn (decl, optimize) && !(profile_flag && !flag_fentry)) { /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified. */ --- gcc/tree-inline.c.jj 2014-02-04 14:03:31.000000000 +0100 +++ gcc/tree-inline.c 2014-02-05 15:55:34.747448968 +0100 @@ -3315,16 +3315,10 @@ copy_forbidden (struct function *fun, tr goto fail; } - tree fs_opts; - fs_opts = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fun->decl); - if (fs_opts) + if (!opt_for_fn (fun->decl, optimize)) { - struct cl_optimization *os = TREE_OPTIMIZATION (fs_opts); - if (!os->x_optimize) - { - reason = G_("function %q+F compiled without optimizations"); - goto fail; - } + reason = G_("function %q+F compiled without optimizations"); + goto fail; } fail: --- gcc/testsuite/gcc.c-torture/execute/pr60062.c.jj 2014-02-05 15:55:48.639388697 +0100 +++ gcc/testsuite/gcc.c-torture/execute/pr60062.c 2014-02-05 15:55:48.639388697 +0100 @@ -0,0 +1,25 @@ +/* PR target/60062 */ + +int a; + +static void +foo (const char *p1, int p2) +{ + if (__builtin_strcmp (p1, "hello") != 0) + __builtin_abort (); +} + +static void +bar (const char *p1) +{ + if (__builtin_strcmp (p1, "hello") != 0) + __builtin_abort (); +} + +__attribute__((optimize (0))) int +main () +{ + foo ("hello", a); + bar ("hello"); + return 0; +} --- gcc/testsuite/gcc.c-torture/execute/pr60072.c.jj 2014-02-05 15:55:48.640388641 +0100 +++ gcc/testsuite/gcc.c-torture/execute/pr60072.c 2014-02-05 15:55:48.640388641 +0100 @@ -0,0 +1,16 @@ +/* PR target/60072 */ + +int c = 1; + +__attribute__ ((optimize (1))) +static int *foo (int *p) +{ + return p; +} + +int +main () +{ + *foo (&c) = 2; + return c - 2; +} Jakub