Re: [PATCH] Fix ix86_function_regparm with optimize attribute (PR target/60062, take 2)
On Wed, 5 Feb 2014, Jakub Jelinek wrote: On Wed, Feb 05, 2014 at 02:20:05PM +0100, Richard Biener wrote: Using !!optimize to determine if we should switch local ABI to regparm convention isn't compatible with optimize attribute, as !!optimize is whether the current function is being optimized, but for the ABI decisions we actually need the caller and callee to agree on the calling convention. Fixed by looking at callee's optimize settings all the time. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Seeing a 2nd variant of such code I'd rather have an abstraction for this ... optimize_fn ()? opt_for_fn (fn, OPT_...)? Here is an updated version of the patch, unfortunately this one regressed in quite a lot of tests. The problem is that by using opt_for_fn there it also sets the failure reason for all functions at -O0 that don't have the optimize attribute at all. Now copy_forbidden is used by the inliner, where we have to handle always_inline functions, other -O0 functions we likely don't want to inline, be it because the cmd line options are -O0 or because the function we are considering for inlining has optimize (0), and for versioning, where I'd say we want to never version the -O0 functions. So, where do we want to do that instead? E.g. should it be e.g. in tree_versionable_function_p directly and let the inliner (if it doesn't do already) also treat optimize(0) functions that aren't always_inline as noinline? I guess even without this patch my PR60026 fix broke inlining of __attribute__((always_inline, optimize (0))) functions. 2014-02-05 Jakub Jelinek ja...@redhat.com PR target/60062 * tree.h (opts_for_fn): New inline function. (opt_for_fn): Define. * config/i386/i386.c (ix86_function_regparm): Use opt_for_fn (decl, optimize) instead of optimize. * tree-inline.c (copy_forbidden): Use opt_for_fn. * gcc.c-torture/execute/pr60062.c: New test. * gcc.c-torture/execute/pr60072.c: New test. --- gcc/tree.h.jj 2014-01-20 19:16:29.0 +0100 +++ gcc/tree.h2014-02-05 15:54:04.681904492 +0100 @@ -4470,6 +4470,20 @@ may_be_aliased (const_tree var) || TREE_ADDRESSABLE (var))); } +/* Return pointer to optimization flags of FNDECL. */ +static inline struct cl_optimization * +opts_for_fn (const_tree fndecl) +{ + tree fn_opts = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl); + if (fn_opts == NULL_TREE) +fn_opts = optimization_default_node; + return TREE_OPTIMIZATION (fn_opts); +} + +/* opt flag for function FNDECL, e.g. opts_for_fn (fndecl, optimize) is + the optimization level of function fndecl. */ +#define opt_for_fn(fndecl, opt) (opts_for_fn (fndecl)-x_##opt) + /* For anonymous aggregate types, we need some sort of name to hold on to. In practice, this should not appear, but it should not be harmful if it does. */ --- gcc/config/i386/i386.c.jj 2014-02-05 10:37:36.166167116 +0100 +++ gcc/config/i386/i386.c2014-02-05 15:56:36.779146394 +0100 @@ -5608,7 +5608,12 @@ ix86_function_regparm (const_tree type, /* Use register calling convention for local functions when possible. */ if (decl TREE_CODE (decl) == FUNCTION_DECL - optimize + /* Caller and callee must agree on the calling convention, so + checking here just optimize means that with + __attribute__((optimize (...))) caller could use regparm convention + and callee not, or vice versa. Instead look at whether the callee + is optimized or not. */ + opt_for_fn (decl, optimize) !(profile_flag !flag_fentry)) { /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified. */ --- gcc/tree-inline.c.jj 2014-02-04 14:03:31.0 +0100 +++ gcc/tree-inline.c 2014-02-05 15:55:34.747448968 +0100 @@ -3315,16 +3315,10 @@ copy_forbidden (struct function *fun, tr goto fail; } - tree fs_opts; - fs_opts = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fun-decl); - if (fs_opts) + if (!opt_for_fn (fun-decl, optimize)) Maybe if (!opt_for_fn (fun-decl, optimize) optimization_default_node-x_optimize != 0) ? But yes, copy_forbidden is a bit big hammer considering always_inline and similar stuff. I think we need to handle -O0 at all consumers selectively. { - struct cl_optimization *os = TREE_OPTIMIZATION (fs_opts); - if (!os-x_optimize) - { - reason = G_(function %q+F compiled without optimizations); - goto fail; - } + reason = G_(function %q+F compiled without optimizations); + goto fail; } fail: --- gcc/testsuite/gcc.c-torture/execute/pr60062.c.jj 2014-02-05 15:55:48.639388697 +0100 +++ gcc/testsuite/gcc.c-torture/execute/pr60062.c 2014-02-05 15:55:48.639388697 +0100 @@ -0,0 +1,25 @@ +/* PR target/60062 */ + +int a; + +static void +foo (const char
[PATCH] Fix ix86_function_regparm with optimize attribute (PR target/60062, take 2)
On Wed, Feb 05, 2014 at 02:20:05PM +0100, Richard Biener wrote: Using !!optimize to determine if we should switch local ABI to regparm convention isn't compatible with optimize attribute, as !!optimize is whether the current function is being optimized, but for the ABI decisions we actually need the caller and callee to agree on the calling convention. Fixed by looking at callee's optimize settings all the time. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Seeing a 2nd variant of such code I'd rather have an abstraction for this ... optimize_fn ()? opt_for_fn (fn, OPT_...)? Here is an updated version of the patch, unfortunately this one regressed in quite a lot of tests. The problem is that by using opt_for_fn there it also sets the failure reason for all functions at -O0 that don't have the optimize attribute at all. Now copy_forbidden is used by the inliner, where we have to handle always_inline functions, other -O0 functions we likely don't want to inline, be it because the cmd line options are -O0 or because the function we are considering for inlining has optimize (0), and for versioning, where I'd say we want to never version the -O0 functions. So, where do we want to do that instead? E.g. should it be e.g. in tree_versionable_function_p directly and let the inliner (if it doesn't do already) also treat optimize(0) functions that aren't always_inline as noinline? I guess even without this patch my PR60026 fix broke inlining of __attribute__((always_inline, optimize (0))) functions. 2014-02-05 Jakub Jelinek ja...@redhat.com PR target/60062 * tree.h (opts_for_fn): New inline function. (opt_for_fn): Define. * config/i386/i386.c (ix86_function_regparm): Use opt_for_fn (decl, optimize) instead of optimize. * tree-inline.c (copy_forbidden): Use opt_for_fn. * gcc.c-torture/execute/pr60062.c: New test. * gcc.c-torture/execute/pr60072.c: New test. --- gcc/tree.h.jj 2014-01-20 19:16:29.0 +0100 +++ gcc/tree.h 2014-02-05 15:54:04.681904492 +0100 @@ -4470,6 +4470,20 @@ may_be_aliased (const_tree var) || TREE_ADDRESSABLE (var))); } +/* Return pointer to optimization flags of FNDECL. */ +static inline struct cl_optimization * +opts_for_fn (const_tree fndecl) +{ + tree fn_opts = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl); + if (fn_opts == NULL_TREE) +fn_opts = optimization_default_node; + return TREE_OPTIMIZATION (fn_opts); +} + +/* opt flag for function FNDECL, e.g. opts_for_fn (fndecl, optimize) is + the optimization level of function fndecl. */ +#define opt_for_fn(fndecl, opt) (opts_for_fn (fndecl)-x_##opt) + /* For anonymous aggregate types, we need some sort of name to hold on to. In practice, this should not appear, but it should not be harmful if it does. */ --- gcc/config/i386/i386.c.jj 2014-02-05 10:37:36.166167116 +0100 +++ gcc/config/i386/i386.c 2014-02-05 15:56:36.779146394 +0100 @@ -5608,7 +5608,12 @@ ix86_function_regparm (const_tree type, /* Use register calling convention for local functions when possible. */ if (decl TREE_CODE (decl) == FUNCTION_DECL - optimize + /* Caller and callee must agree on the calling convention, so +checking here just optimize means that with +__attribute__((optimize (...))) caller could use regparm convention +and callee not, or vice versa. Instead look at whether the callee +is optimized or not. */ + opt_for_fn (decl, optimize) !(profile_flag !flag_fentry)) { /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified. */ --- gcc/tree-inline.c.jj2014-02-04 14:03:31.0 +0100 +++ gcc/tree-inline.c 2014-02-05 15:55:34.747448968 +0100 @@ -3315,16 +3315,10 @@ copy_forbidden (struct function *fun, tr goto fail; } - tree fs_opts; - fs_opts = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fun-decl); - if (fs_opts) + if (!opt_for_fn (fun-decl, optimize)) { - struct cl_optimization *os = TREE_OPTIMIZATION (fs_opts); - if (!os-x_optimize) - { - reason = G_(function %q+F compiled without optimizations); - goto fail; - } + reason = G_(function %q+F compiled without optimizations); + goto fail; } fail: --- gcc/testsuite/gcc.c-torture/execute/pr60062.c.jj2014-02-05 15:55:48.639388697 +0100 +++ gcc/testsuite/gcc.c-torture/execute/pr60062.c 2014-02-05 15:55:48.639388697 +0100 @@ -0,0 +1,25 @@ +/* PR target/60062 */ + +int a; + +static void +foo (const char *p1, int p2) +{ + if (__builtin_strcmp (p1, hello) != 0) +__builtin_abort (); +} + +static void +bar (const char *p1) +{ + if (__builtin_strcmp (p1, hello) != 0) +__builtin_abort (); +} + +__attribute__((optimize (0))) int +main () +{ + foo (hello, a); + bar (hello); + return 0; +} --- gcc/testsuite/gcc.c-torture/execute/pr60072.c.jj2014-02-05