Re: [PATCH] Fix ix86_function_regparm with optimize attribute (PR target/60062, take 3)
On Thu, Feb 6, 2014 at 10:45 AM, Richard Biener wrote: > On Thu, 6 Feb 2014, Jakub Jelinek wrote: > >> On Wed, Feb 05, 2014 at 08:42:27PM +0100, Jakub Jelinek wrote: >> > So, where do we want to do that instead? E.g. should it be e.g. in >> > tree_versionable_function_p directly and let the inliner (if it doesn't do >> > already) also treat optimize(0) functions that aren't always_inline as >> > noinline? >> >> So, another attempt to put the && opt_for_fn (fndecl, optimize) into >> tree_versionable_function_p also failed, because e.g. for TM (but also for >> SIMD clones) we need to clone -O0 functions. So, can I fix PR600{6,7}2 >> without touching tree-inline.c and fix PR60026 again separately somehow else >> as follow-up? Bootstrapped/regtested on x86_64-linux and i686-linux. > > That works for me. Also OK for x86. Thanks, Uros.
Re: [PATCH] Fix ix86_function_regparm with optimize attribute (PR target/60062, take 3)
On Thu, 6 Feb 2014, Jakub Jelinek wrote: > On Wed, Feb 05, 2014 at 08:42:27PM +0100, Jakub Jelinek wrote: > > So, where do we want to do that instead? E.g. should it be e.g. in > > tree_versionable_function_p directly and let the inliner (if it doesn't do > > already) also treat optimize(0) functions that aren't always_inline as > > noinline? > > So, another attempt to put the && opt_for_fn (fndecl, optimize) into > tree_versionable_function_p also failed, because e.g. for TM (but also for > SIMD clones) we need to clone -O0 functions. So, can I fix PR600{6,7}2 > without touching tree-inline.c and fix PR60026 again separately somehow else > as follow-up? Bootstrapped/regtested on x86_64-linux and i686-linux. That works for me. Thanks, Richard. > 2014-02-06 Jakub Jelinek > > PR target/60062 > * tree.h (opts_for_fn): New inline function. > (opt_for_fn): Define. > * config/i386/i386.c (ix86_function_regparm): Use > opt_for_fn (decl, optimize) instead of optimize. > > * gcc.c-torture/execute/pr60062.c: New test. > * gcc.c-torture/execute/pr60072.c: New test. > > --- gcc/tree.h.jj 2014-01-20 19:16:29.0 +0100 > +++ gcc/tree.h2014-02-05 15:54:04.681904492 +0100 > @@ -4470,6 +4470,20 @@ may_be_aliased (const_tree var) > || TREE_ADDRESSABLE (var))); > } > > +/* Return pointer to optimization flags of FNDECL. */ > +static inline struct cl_optimization * > +opts_for_fn (const_tree fndecl) > +{ > + tree fn_opts = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl); > + if (fn_opts == NULL_TREE) > +fn_opts = optimization_default_node; > + return TREE_OPTIMIZATION (fn_opts); > +} > + > +/* opt flag for function FNDECL, e.g. opts_for_fn (fndecl, optimize) is > + the optimization level of function fndecl. */ > +#define opt_for_fn(fndecl, opt) (opts_for_fn (fndecl)->x_##opt) > + > /* For anonymous aggregate types, we need some sort of name to > hold on to. In practice, this should not appear, but it should > not be harmful if it does. */ > --- gcc/config/i386/i386.c.jj 2014-02-05 10:37:36.166167116 +0100 > +++ gcc/config/i386/i386.c2014-02-05 15:56:36.779146394 +0100 > @@ -5608,7 +5608,12 @@ ix86_function_regparm (const_tree type, >/* Use register calling convention for local functions when possible. */ >if (decl >&& TREE_CODE (decl) == FUNCTION_DECL > - && optimize > + /* Caller and callee must agree on the calling convention, so > + checking here just optimize means that with > + __attribute__((optimize (...))) caller could use regparm convention > + and callee not, or vice versa. Instead look at whether the callee > + is optimized or not. */ > + && opt_for_fn (decl, optimize) >&& !(profile_flag && !flag_fentry)) > { >/* FIXME: remove this CONST_CAST when cgraph.[ch] is constified. */ > --- gcc/testsuite/gcc.c-torture/execute/pr60062.c.jj 2014-02-05 > 15:55:48.639388697 +0100 > +++ gcc/testsuite/gcc.c-torture/execute/pr60062.c 2014-02-05 > 15:55:48.639388697 +0100 > @@ -0,0 +1,25 @@ > +/* PR target/60062 */ > + > +int a; > + > +static void > +foo (const char *p1, int p2) > +{ > + if (__builtin_strcmp (p1, "hello") != 0) > +__builtin_abort (); > +} > + > +static void > +bar (const char *p1) > +{ > + if (__builtin_strcmp (p1, "hello") != 0) > +__builtin_abort (); > +} > + > +__attribute__((optimize (0))) int > +main () > +{ > + foo ("hello", a); > + bar ("hello"); > + return 0; > +} > --- gcc/testsuite/gcc.c-torture/execute/pr60072.c.jj 2014-02-05 > 15:55:48.640388641 +0100 > +++ gcc/testsuite/gcc.c-torture/execute/pr60072.c 2014-02-05 > 15:55:48.640388641 +0100 > @@ -0,0 +1,16 @@ > +/* PR target/60072 */ > + > +int c = 1; > + > +__attribute__ ((optimize (1))) > +static int *foo (int *p) > +{ > + return p; > +} > + > +int > +main () > +{ > + *foo (&c) = 2; > + return c - 2; > +} > > > Jakub > > -- Richard Biener SUSE / SUSE Labs SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer
[PATCH] Fix ix86_function_regparm with optimize attribute (PR target/60062, take 3)
On Wed, Feb 05, 2014 at 08:42:27PM +0100, Jakub Jelinek wrote: > So, where do we want to do that instead? E.g. should it be e.g. in > tree_versionable_function_p directly and let the inliner (if it doesn't do > already) also treat optimize(0) functions that aren't always_inline as > noinline? So, another attempt to put the && opt_for_fn (fndecl, optimize) into tree_versionable_function_p also failed, because e.g. for TM (but also for SIMD clones) we need to clone -O0 functions. So, can I fix PR600{6,7}2 without touching tree-inline.c and fix PR60026 again separately somehow else as follow-up? Bootstrapped/regtested on x86_64-linux and i686-linux. 2014-02-06 Jakub Jelinek PR target/60062 * tree.h (opts_for_fn): New inline function. (opt_for_fn): Define. * config/i386/i386.c (ix86_function_regparm): Use opt_for_fn (decl, optimize) instead of optimize. * gcc.c-torture/execute/pr60062.c: New test. * gcc.c-torture/execute/pr60072.c: New test. --- gcc/tree.h.jj 2014-01-20 19:16:29.0 +0100 +++ gcc/tree.h 2014-02-05 15:54:04.681904492 +0100 @@ -4470,6 +4470,20 @@ may_be_aliased (const_tree var) || TREE_ADDRESSABLE (var))); } +/* Return pointer to optimization flags of FNDECL. */ +static inline struct cl_optimization * +opts_for_fn (const_tree fndecl) +{ + tree fn_opts = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl); + if (fn_opts == NULL_TREE) +fn_opts = optimization_default_node; + return TREE_OPTIMIZATION (fn_opts); +} + +/* opt flag for function FNDECL, e.g. opts_for_fn (fndecl, optimize) is + the optimization level of function fndecl. */ +#define opt_for_fn(fndecl, opt) (opts_for_fn (fndecl)->x_##opt) + /* For anonymous aggregate types, we need some sort of name to hold on to. In practice, this should not appear, but it should not be harmful if it does. */ --- gcc/config/i386/i386.c.jj 2014-02-05 10:37:36.166167116 +0100 +++ gcc/config/i386/i386.c 2014-02-05 15:56:36.779146394 +0100 @@ -5608,7 +5608,12 @@ ix86_function_regparm (const_tree type, /* Use register calling convention for local functions when possible. */ if (decl && TREE_CODE (decl) == FUNCTION_DECL - && optimize + /* Caller and callee must agree on the calling convention, so +checking here just optimize means that with +__attribute__((optimize (...))) caller could use regparm convention +and callee not, or vice versa. Instead look at whether the callee +is optimized or not. */ + && opt_for_fn (decl, optimize) && !(profile_flag && !flag_fentry)) { /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified. */ --- gcc/testsuite/gcc.c-torture/execute/pr60062.c.jj2014-02-05 15:55:48.639388697 +0100 +++ gcc/testsuite/gcc.c-torture/execute/pr60062.c 2014-02-05 15:55:48.639388697 +0100 @@ -0,0 +1,25 @@ +/* PR target/60062 */ + +int a; + +static void +foo (const char *p1, int p2) +{ + if (__builtin_strcmp (p1, "hello") != 0) +__builtin_abort (); +} + +static void +bar (const char *p1) +{ + if (__builtin_strcmp (p1, "hello") != 0) +__builtin_abort (); +} + +__attribute__((optimize (0))) int +main () +{ + foo ("hello", a); + bar ("hello"); + return 0; +} --- gcc/testsuite/gcc.c-torture/execute/pr60072.c.jj2014-02-05 15:55:48.640388641 +0100 +++ gcc/testsuite/gcc.c-torture/execute/pr60072.c 2014-02-05 15:55:48.640388641 +0100 @@ -0,0 +1,16 @@ +/* PR target/60072 */ + +int c = 1; + +__attribute__ ((optimize (1))) +static int *foo (int *p) +{ + return p; +} + +int +main () +{ + *foo (&c) = 2; + return c - 2; +} Jakub