Re: [PATCH] Fix ix86_function_regparm with optimize attribute (PR target/60062, take 2)

2014-02-06 Thread Richard Biener
On Wed, 5 Feb 2014, Jakub Jelinek wrote:

 On Wed, Feb 05, 2014 at 02:20:05PM +0100, Richard Biener wrote:
   Using !!optimize to determine if we should switch local ABI to regparm
   convention isn't compatible with optimize attribute, as !!optimize is
   whether the current function is being optimized, but for the ABI decisions
   we actually need the caller and callee to agree on the calling convention.
   
   Fixed by looking at callee's optimize settings all the time.
   
   Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
  
  Seeing a 2nd variant of such code I'd rather have an abstraction
  for this ... optimize_fn ()?  opt_for_fn (fn, OPT_...)?
 
 Here is an updated version of the patch, unfortunately this one regressed in
 quite a lot of tests.  The problem is that by using opt_for_fn there
 it also sets the failure reason for all functions at -O0 that don't have
 the optimize attribute at all.  Now copy_forbidden is used by the inliner,
 where we have to handle always_inline functions, other -O0 functions we likely
 don't want to inline, be it because the cmd line options are -O0 or because
 the function we are considering for inlining has optimize (0), and
 for versioning, where I'd say we want to never version the -O0 functions.
 
 So, where do we want to do that instead?  E.g. should it be e.g. in
 tree_versionable_function_p directly and let the inliner (if it doesn't do
 already) also treat optimize(0) functions that aren't always_inline as
 noinline?
 
 I guess even without this patch my PR60026 fix broke inlining of
 __attribute__((always_inline, optimize (0))) functions.
 
 2014-02-05  Jakub Jelinek  ja...@redhat.com
 
   PR target/60062
   * tree.h (opts_for_fn): New inline function.
   (opt_for_fn): Define.
   * config/i386/i386.c (ix86_function_regparm): Use
   opt_for_fn (decl, optimize) instead of optimize.
   * tree-inline.c (copy_forbidden): Use opt_for_fn.
 
   * gcc.c-torture/execute/pr60062.c: New test.
   * gcc.c-torture/execute/pr60072.c: New test.
 
 --- gcc/tree.h.jj 2014-01-20 19:16:29.0 +0100
 +++ gcc/tree.h2014-02-05 15:54:04.681904492 +0100
 @@ -4470,6 +4470,20 @@ may_be_aliased (const_tree var)
 || TREE_ADDRESSABLE (var)));
  }
  
 +/* Return pointer to optimization flags of FNDECL.  */
 +static inline struct cl_optimization *
 +opts_for_fn (const_tree fndecl)
 +{
 +  tree fn_opts = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl);
 +  if (fn_opts == NULL_TREE)
 +fn_opts = optimization_default_node;
 +  return TREE_OPTIMIZATION (fn_opts);
 +}
 +
 +/* opt flag for function FNDECL, e.g. opts_for_fn (fndecl, optimize) is
 +   the optimization level of function fndecl.  */
 +#define opt_for_fn(fndecl, opt) (opts_for_fn (fndecl)-x_##opt)
 +
  /* For anonymous aggregate types, we need some sort of name to
 hold on to.  In practice, this should not appear, but it should
 not be harmful if it does.  */
 --- gcc/config/i386/i386.c.jj 2014-02-05 10:37:36.166167116 +0100
 +++ gcc/config/i386/i386.c2014-02-05 15:56:36.779146394 +0100
 @@ -5608,7 +5608,12 @@ ix86_function_regparm (const_tree type,
/* Use register calling convention for local functions when possible.  */
if (decl
 TREE_CODE (decl) == FUNCTION_DECL
 -   optimize
 +  /* Caller and callee must agree on the calling convention, so
 +  checking here just optimize means that with
 +  __attribute__((optimize (...))) caller could use regparm convention
 +  and callee not, or vice versa.  Instead look at whether the callee
 +  is optimized or not.  */
 +   opt_for_fn (decl, optimize)
 !(profile_flag  !flag_fentry))
  {
/* FIXME: remove this CONST_CAST when cgraph.[ch] is constified.  */
 --- gcc/tree-inline.c.jj  2014-02-04 14:03:31.0 +0100
 +++ gcc/tree-inline.c 2014-02-05 15:55:34.747448968 +0100
 @@ -3315,16 +3315,10 @@ copy_forbidden (struct function *fun, tr
   goto fail;
}
  
 -  tree fs_opts;
 -  fs_opts = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fun-decl);
 -  if (fs_opts)
 +  if (!opt_for_fn (fun-decl, optimize))

Maybe

  if (!opt_for_fn (fun-decl, optimize)
   optimization_default_node-x_optimize != 0)

?

But yes, copy_forbidden is a bit big hammer considering always_inline
and similar stuff.  I think we need to handle -O0 at all consumers
selectively.

  {
 -  struct cl_optimization *os = TREE_OPTIMIZATION (fs_opts);
 -  if (!os-x_optimize)
 - {
 -   reason = G_(function %q+F compiled without optimizations);
 -   goto fail;
 - }
 +  reason = G_(function %q+F compiled without optimizations);
 +  goto fail;
  }
  
   fail:
 --- gcc/testsuite/gcc.c-torture/execute/pr60062.c.jj  2014-02-05 
 15:55:48.639388697 +0100
 +++ gcc/testsuite/gcc.c-torture/execute/pr60062.c 2014-02-05 
 15:55:48.639388697 +0100
 @@ -0,0 +1,25 @@
 +/* PR target/60062 */
 +
 +int a;
 +
 +static void
 +foo (const char 

[PATCH] Fix ix86_function_regparm with optimize attribute (PR target/60062, take 2)

2014-02-05 Thread Jakub Jelinek
On Wed, Feb 05, 2014 at 02:20:05PM +0100, Richard Biener wrote:
  Using !!optimize to determine if we should switch local ABI to regparm
  convention isn't compatible with optimize attribute, as !!optimize is
  whether the current function is being optimized, but for the ABI decisions
  we actually need the caller and callee to agree on the calling convention.
  
  Fixed by looking at callee's optimize settings all the time.
  
  Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
 
 Seeing a 2nd variant of such code I'd rather have an abstraction
 for this ... optimize_fn ()?  opt_for_fn (fn, OPT_...)?

Here is an updated version of the patch, unfortunately this one regressed in
quite a lot of tests.  The problem is that by using opt_for_fn there
it also sets the failure reason for all functions at -O0 that don't have
the optimize attribute at all.  Now copy_forbidden is used by the inliner,
where we have to handle always_inline functions, other -O0 functions we likely
don't want to inline, be it because the cmd line options are -O0 or because
the function we are considering for inlining has optimize (0), and
for versioning, where I'd say we want to never version the -O0 functions.

So, where do we want to do that instead?  E.g. should it be e.g. in
tree_versionable_function_p directly and let the inliner (if it doesn't do
already) also treat optimize(0) functions that aren't always_inline as
noinline?

I guess even without this patch my PR60026 fix broke inlining of
__attribute__((always_inline, optimize (0))) functions.

2014-02-05  Jakub Jelinek  ja...@redhat.com

PR target/60062
* tree.h (opts_for_fn): New inline function.
(opt_for_fn): Define.
* config/i386/i386.c (ix86_function_regparm): Use
opt_for_fn (decl, optimize) instead of optimize.
* tree-inline.c (copy_forbidden): Use opt_for_fn.

* gcc.c-torture/execute/pr60062.c: New test.
* gcc.c-torture/execute/pr60072.c: New test.

--- gcc/tree.h.jj   2014-01-20 19:16:29.0 +0100
+++ gcc/tree.h  2014-02-05 15:54:04.681904492 +0100
@@ -4470,6 +4470,20 @@ may_be_aliased (const_tree var)
  || TREE_ADDRESSABLE (var)));
 }
 
+/* Return pointer to optimization flags of FNDECL.  */
+static inline struct cl_optimization *
+opts_for_fn (const_tree fndecl)
+{
+  tree fn_opts = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl);
+  if (fn_opts == NULL_TREE)
+fn_opts = optimization_default_node;
+  return TREE_OPTIMIZATION (fn_opts);
+}
+
+/* opt flag for function FNDECL, e.g. opts_for_fn (fndecl, optimize) is
+   the optimization level of function fndecl.  */
+#define opt_for_fn(fndecl, opt) (opts_for_fn (fndecl)-x_##opt)
+
 /* For anonymous aggregate types, we need some sort of name to
hold on to.  In practice, this should not appear, but it should
not be harmful if it does.  */
--- gcc/config/i386/i386.c.jj   2014-02-05 10:37:36.166167116 +0100
+++ gcc/config/i386/i386.c  2014-02-05 15:56:36.779146394 +0100
@@ -5608,7 +5608,12 @@ ix86_function_regparm (const_tree type,
   /* Use register calling convention for local functions when possible.  */
   if (decl
TREE_CODE (decl) == FUNCTION_DECL
-   optimize
+  /* Caller and callee must agree on the calling convention, so
+checking here just optimize means that with
+__attribute__((optimize (...))) caller could use regparm convention
+and callee not, or vice versa.  Instead look at whether the callee
+is optimized or not.  */
+   opt_for_fn (decl, optimize)
!(profile_flag  !flag_fentry))
 {
   /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified.  */
--- gcc/tree-inline.c.jj2014-02-04 14:03:31.0 +0100
+++ gcc/tree-inline.c   2014-02-05 15:55:34.747448968 +0100
@@ -3315,16 +3315,10 @@ copy_forbidden (struct function *fun, tr
goto fail;
   }
 
-  tree fs_opts;
-  fs_opts = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fun-decl);
-  if (fs_opts)
+  if (!opt_for_fn (fun-decl, optimize))
 {
-  struct cl_optimization *os = TREE_OPTIMIZATION (fs_opts);
-  if (!os-x_optimize)
-   {
- reason = G_(function %q+F compiled without optimizations);
- goto fail;
-   }
+  reason = G_(function %q+F compiled without optimizations);
+  goto fail;
 }
 
  fail:
--- gcc/testsuite/gcc.c-torture/execute/pr60062.c.jj2014-02-05 
15:55:48.639388697 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr60062.c   2014-02-05 
15:55:48.639388697 +0100
@@ -0,0 +1,25 @@
+/* PR target/60062 */
+
+int a;
+
+static void
+foo (const char *p1, int p2)
+{
+  if (__builtin_strcmp (p1, hello) != 0)
+__builtin_abort ();
+}
+
+static void
+bar (const char *p1)
+{
+  if (__builtin_strcmp (p1, hello) != 0)
+__builtin_abort ();
+}
+
+__attribute__((optimize (0))) int
+main ()
+{
+  foo (hello, a);
+  bar (hello);
+  return 0;
+}
--- gcc/testsuite/gcc.c-torture/execute/pr60072.c.jj2014-02-05