Re: [PATCH][PR c++/80038][6/7 Regression] Destroy temps for _Cilk_spawn calling in the child

2017-03-23 Thread Xi Ruoyao
On 2017-03-24 05:26 +0800, Xi Ruoyao wrote:

> The patch has 500+ lines so I attach it.  ChangeLog is pasted here:

Damn it... I attached the draft of patch where some useless functions had
not been removed.

This time the attachment is correct.

> 2017-03-24  Xi Ruoyao  
> 
>   PR c++/80038
>   * c-family/c-common.h (cilk_gimplify_call_params_in_spawned_fn,
>     cilk_install_body_pedigree_operations): Remove prototypes.
>   * c-family/c-gimplify.c (c_gimplify_expr): Remove the calls to
>     the function cilk_gimplify_call_params_in_spawned_fn.
>   * c-family/cilk.c: (cilk_set_spawn_marker): Mark the function
>     calls which should be detached.
>     (cilk_gimplify_call_params_in_spawned_fn,
>      cilk_install_body_pedigree_operations): Remove function.
>     (gimplify_cilk_spawn): Add EXPR_STMT and CLEANUP_POINT_EXPR
>     unwrapping.
>   * c/c-typeck.c (cilk_install_body_with_frame_cleanup):
>     Don't add pedigree operation and detach call here.
>   * cp/cp-cilkplus.c (cilk_install_body_with_frame_cleanup): Ditto.
>   * cp/cp-gimplify.c (cilk_cp_gimplify_call_params_in_spawned_fn):
>     Remove function.
>     (cp_gimplify_expr): Remove the calls to the function
>     cilk_cp_gimplify_call_params_in_spawned_fn.
>   * cp/semantics.c: Preserve the flag of function calls should
>     be detached.
>   * cilk_common.c (expand_builtin_cilk_detach): Move pedigree
>     operations here.
>   * gimplify.c (gimplify_cilk_detach): New static function.
>     (gimplify_call_expr, gimplify_modify_expr): Call it for the
>     function calls should be detached.
>   * lto/lto-lang.c (lto_init): Set in_lto_p earlier.
>   * tree-core.h: Document new macro EXPR_CILK_SPAWN.
>   * tree.h: Add new macro EXPR_CILK_SPAWN.
>   * testsuite/g++.dg/cilk-plus/CK/pr80038.cc: New test.
-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University
From dfbce2c4aafd4a555ef751af2d7422d2ba79189b Mon Sep 17 00:00:00 2001
From: Xi Ruoyao 
Date: Fri, 24 Mar 2017 04:35:23 +0800
Subject: [PATCH] Destroy temps for _Cilk_spawn calling in the child (PR
 c++/80038)

Revert r227423, and re-fix PR60586 without breaking cilk specs.

2017-03-24  Xi Ruoyao  

	PR c++/80038
	* c-family/c-common.h (cilk_gimplify_call_params_in_spawned_fn,
	  cilk_install_body_pedigree_operations): Remove prototypes.
	* c-family/c-gimplify.c (c_gimplify_expr): Remove the calls to
	  the function cilk_gimplify_call_params_in_spawned_fn.
	* c-family/cilk.c: (cilk_set_spawn_marker): Mark the function
	  calls which should be detached.
	  (cilk_gimplify_call_params_in_spawned_fn,
	   cilk_install_body_pedigree_operations): Remove function.
	  (gimplify_cilk_spawn): Add EXPR_STMT and CLEANUP_POINT_EXPR
	  unwrapping.
	* c/c-typeck.c (cilk_install_body_with_frame_cleanup):
	  Don't add pedigree operation and detach call here.
	* cp/cp-cilkplus.c (cilk_install_body_with_frame_cleanup): Ditto.
	* cp/cp-gimplify.c (cilk_cp_gimplify_call_params_in_spawned_fn):
	  Remove function.
	  (cp_gimplify_expr): Remove the calls to the function
	  cilk_cp_gimplify_call_params_in_spawned_fn.
	* cp/semantics.c: Preserve the flag of function calls should
	  be detached.
	* cilk_common.c (expand_builtin_cilk_detach): Move pedigree
	  operations here.
	* gimplify.c (gimplify_cilk_detach): New static function.
	  (gimplify_call_expr, gimplify_modify_expr): Call it for the
	  function calls should be detached.
	* lto/lto-lang.c (lto_init): Set in_lto_p earlier.
	* tree-core.h: Document new macro EXPR_CILK_SPAWN.
	* tree.h: Add new macro EXPR_CILK_SPAWN.
	* testsuite/g++.dg/cilk-plus/CK/pr80038.cc: New test.
---
 gcc/c-family/c-common.h  |   2 -
 gcc/c-family/c-gimplify.c|  10 +--
 gcc/c-family/cilk.c  | 102 +++
 gcc/c/c-typeck.c |   8 +--
 gcc/cilk-common.c|  49 +
 gcc/cp/cp-cilkplus.c |   6 +-
 gcc/cp/cp-gimplify.c |  40 ++-
 gcc/cp/semantics.c   |   2 +
 gcc/gimplify.c   |  21 ++
 gcc/lto/lto-lang.c   |   6 +-
 gcc/testsuite/g++.dg/cilk-plus/CK/pr80038.cc |  47 
 gcc/tree-core.h  |   4 ++
 gcc/tree.h   |   6 ++
 13 files changed, 150 insertions(+), 153 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cilk-plus/CK/pr80038.cc

diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index ac86712..40b9845 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1463,7 +1463,6 @@ extern bool is_cilkplus_vector_p (tree);
 extern tree insert_cilk_frame (tree);
 extern void 

[PATCH][PR c++/80038][6/7 Regression] Destroy temps for _Cilk_spawn calling in the child

2017-03-23 Thread Xi Ruoyao
Hi,

After r227423, GCC begun to destroy temps for cilkplus spawned functions in the 
parent, instead
of in the child.  This violated Intel Cilk specs

and broke some valid code such as NumericMonoid
.

To fix this, revert r227423 and re-fix PR60586 without moving out the 
destruction of temps by
inserting __cilkrts_detach call in the exact location of GIMPLE, instead of 
tree (where we can
not separate argument evaluation and copying with function call). 

The pedigree operations are merged into built-in __cilkrts_detach (just like 
libcilkrts cilk_fake.h).

Bootstrapped and regtested on x86_64-linux-gnu.

The patch has 500+ lines so I attach it.  ChangeLog is pasted here:

2017-03-24  Xi Ruoyao  

PR c++/80038
* c-family/c-common.h (cilk_gimplify_call_params_in_spawned_fn,
  cilk_install_body_pedigree_operations): Remove prototypes.
* c-family/c-gimplify.c (c_gimplify_expr): Remove the calls to
  the function cilk_gimplify_call_params_in_spawned_fn.
* c-family/cilk.c: (cilk_set_spawn_marker): Mark the function
  calls which should be detached.
  (cilk_gimplify_call_params_in_spawned_fn,
   cilk_install_body_pedigree_operations): Remove function.
  (gimplify_cilk_spawn): Add EXPR_STMT and CLEANUP_POINT_EXPR
  unwrapping.
* c/c-typeck.c (cilk_install_body_with_frame_cleanup):
  Don't add pedigree operation and detach call here.
* cp/cp-cilkplus.c (cilk_install_body_with_frame_cleanup): Ditto.
* cp/cp-gimplify.c (cilk_cp_gimplify_call_params_in_spawned_fn):
  Remove function.
  (cp_gimplify_expr): Remove the calls to the function
  cilk_cp_gimplify_call_params_in_spawned_fn.
* cp/semantics.c: Preserve the flag of function calls should
  be detached.
* cilk_common.c (expand_builtin_cilk_detach): Move pedigree
  operations here.
* gimplify.c (gimplify_cilk_detach): New static function.
  (gimplify_call_expr, gimplify_modify_expr): Call it for the
  function calls should be detached.
* lto/lto-lang.c (lto_init): Set in_lto_p earlier.
* tree-core.h: Document new macro EXPR_CILK_SPAWN.
* tree.h: Add new macro EXPR_CILK_SPAWN.
* testsuite/g++.dg/cilk-plus/CK/pr80038.cc: New test.
-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University
diff --git a/gcc/c-family/c-gimplify.c b/gcc/c-family/c-gimplify.c
index 57edb41..1ae75d2 100644
--- a/gcc/c-family/c-gimplify.c
+++ b/gcc/c-family/c-gimplify.c
@@ -280,10 +280,7 @@ c_gimplify_expr (tree *expr_p, gimple_seq *pre_p ATTRIBUTE_UNUSED,
 		 && cilk_detect_spawn_and_unwrap (expr_p));
 
   if (!seen_error ())
-	{
-	  cilk_gimplify_call_params_in_spawned_fn (expr_p, pre_p);
-	  return (enum gimplify_status) gimplify_cilk_spawn (expr_p);
-	}
+return (enum gimplify_status) gimplify_cilk_spawn (expr_p);
   return GS_ERROR;
 
 case MODIFY_EXPR:
@@ -295,10 +292,7 @@ c_gimplify_expr (tree *expr_p, gimple_seq *pre_p ATTRIBUTE_UNUSED,
 	 original expression (MODIFY/INIT/CALL_EXPR) is processes as
 	 it is supposed to be.  */
 	  && !seen_error ())
-	{
-	  cilk_gimplify_call_params_in_spawned_fn (expr_p, pre_p);
-	  return (enum gimplify_status) gimplify_cilk_spawn (expr_p);
-	}
+return (enum gimplify_status) gimplify_cilk_spawn (expr_p);
 
 default:;
 }
diff --git a/gcc/c-family/cilk.c b/gcc/c-family/cilk.c
index 43478ff..b58fb99 100644
--- a/gcc/c-family/cilk.c
+++ b/gcc/c-family/cilk.c
@@ -109,6 +109,10 @@ cilk_set_spawn_marker (location_t loc, tree fcall)
   else
 {
   cfun->calls_cilk_spawn = true;
+  if (TREE_CODE (fcall) == CALL_EXPR)
+EXPR_CILK_SPAWN (fcall) = 1;
+  else /* TREE_CODE (fcall) == TARGET_EXPR */
+EXPR_CILK_SPAWN (TREE_OPERAND (fcall, 1)) = 1;
   return true;
 }
 }
@@ -775,37 +779,6 @@ create_cilk_wrapper (tree exp, tree *args_out)
   return fndecl;
 }
 
-/* Gimplify all the parameters for the Spawned function.  *EXPR_P can be a
-   CALL_EXPR, INIT_EXPR, MODIFY_EXPR or TARGET_EXPR.  *PRE_P and *POST_P are
-   gimple sequences from the caller of gimplify_cilk_spawn.  */
-
-void
-cilk_gimplify_call_params_in_spawned_fn (tree *expr_p, gimple_seq *pre_p)
-{
-  int ii = 0;
-  tree *fix_parm_expr = expr_p;
-
-  /* Remove CLEANUP_POINT_EXPR and EXPR_STMT from *spawn_p.  */
-  while (TREE_CODE (*fix_parm_expr) == CLEANUP_POINT_EXPR
-	 || TREE_CODE (*fix_parm_expr) == EXPR_STMT)
-*fix_parm_expr = TREE_OPERAND (*fix_parm_expr, 0);
-
-  if ((TREE_CODE (*expr_p) == INIT_EXPR)
-  || (TREE_CODE (*expr_p) == TARGET_EXPR)
-  || (TREE_CODE (*expr_p) == MODIFY_EXPR))
-fix_parm_expr = _OPERAND (*expr_p, 1);
-
-  if (TREE_CODE 

Re: [PATCH] Fix profiledbootstrap ada checking failure (PR debug/79255)

2017-03-23 Thread Jason Merrill
On Thu, Mar 23, 2017 at 4:44 PM, Jakub Jelinek  wrote:
> The following C testcase shows how profiledbootstrap fails with checking
> compiler.  We have a (nested) FUNCTION_DECL inside of BLOCK_VARS of an
> inline function, when it gets inlined, it is moved into
> BLOCK_NONLOCALIZED_VARS.  And, decls_for_scope calls process_scope_var
> with NULL decl and non-NULL origin for all BLOCK_NONLOCALIZED_VARS.
> That is fine for variables, but for FUNCTION_DECLs it can actually
> try to dwarf2out_abstract_function that FUNCTION_DECL, which should be
> really done only when it is inlined (i.e. BLOCK_ABSTRACT_ORIGIN of
> some BLOCK).

And when it's cloned.

But does it make sense for gen_decl_die to call
dwarf2out_abstract_function when decl is null?  That seems wrong.

Jason


Re: C++ PATCH to fix ICE in replace_placeholders_r (PR c++/79937)

2017-03-23 Thread Jason Merrill
On Thu, Mar 23, 2017 at 4:34 PM, Marek Polacek  wrote:
> On Tue, Mar 14, 2017 at 02:34:30PM -0400, Jason Merrill wrote:
>> On Tue, Mar 14, 2017 at 2:33 PM, Jason Merrill  wrote:
>> > On Tue, Mar 7, 2017 at 12:10 PM, Marek Polacek  wrote:
>> >> In this testcase we have
>> >> C c = bar (X{1});
>> >> which store_init_value sees as
>> >> c = TARGET_EXPR > >> .n=(&)->i}>)>
>> >> i.e. we're initializing "c" with a TARGET_EXPR.  We call 
>> >> replace_placeholders
>> >> that walks the whole tree to substitute the placeholders.  Eventually we 
>> >> find
>> >> the nested  but that's for another object, so 
>> >> we
>> >> crash.  Seems that we shouldn't have stepped into the second TARGET_EXPR 
>> >> at
>> >> all; it has nothing to with "c", it's bar's argument.
>> >>
>> >> It occurred to me that we shouldn't step into CALL_EXPRs and leave the
>> >> placeholders in function arguments to cp_gimplify_init_expr which calls
>> >> replace_placeholders for constructors.  Not sure if it's enough to handle
>> >> CALL_EXPRs like this, anything else?
>> >
>> > Hmm, we might have a DMI containing a call with an argument referring
>> > to *this, i.e.
>> >
>> > struct A
>> > {
>> >   int i;
>> >   int j = frob (this->i);
>> > };
>> >
>> > The TARGET_EXPR seems like a more likely barrier, but even there we
>> > could have something like
>> >
>> > struct A { int i; };
>> > struct B
>> > {
>> >   int i;
>> >   A a = A{this->i};
>> > };
>> >
>> > I think we need replace_placeholders to keep a stack of objects, so
>> > that when we see a TARGET_EXPR we add it to the stack and therefore
>> > can properly replace a PLACEHOLDER_EXPR of its type.
>>
>> Or actually, avoid replacing such a PLACEHOLDER_EXPR, but rather leave
>> it for later when we lower the TARGET_EXPR.
>
> Sorry, I don't really follow.  I have a patch that puts TARGET_EXPRs on
> a stack, but I don't know how that helps.  E.g. with nsdmi-aggr3.C
> we have
> B b = TARGET_EXPR >}>
> so when we get to that PLACEHOLDER_EXPR, on the stack there's
> TARGET_EXPR with type struct A
> TARGET_EXPR with type struct B
> so the type of the PLACEHOLDER_EXPR doesn't match the type of the current
> TARGET_EXPR, but we still want to replace it in this case.
>
> So -- could you expand a bit on what you had in mind, please?

So then when we see a placeholder, we walk the stack to find the
object of the matching type.

But if the object we find was collected from walking through a
TARGET_EXPR, we should leave the PLACEHOLDER_EXPR alone, so that it
can be replaced later with the actual target of the initialization.

Jason


[PATCH] Fix profiledbootstrap ada checking failure (PR debug/79255)

2017-03-23 Thread Jakub Jelinek
Hi!

The following C testcase shows how profiledbootstrap fails with checking
compiler.  We have a (nested) FUNCTION_DECL inside of BLOCK_VARS of an
inline function, when it gets inlined, it is moved into
BLOCK_NONLOCALIZED_VARS.  And, decls_for_scope calls process_scope_var
with NULL decl and non-NULL origin for all BLOCK_NONLOCALIZED_VARS.
That is fine for variables, but for FUNCTION_DECLs it can actually
try to dwarf2out_abstract_function that FUNCTION_DECL, which should be
really done only when it is inlined (i.e. BLOCK_ABSTRACT_ORIGIN of
some BLOCK).  The effect is that we actually add DW_AT_inline attribute
to that DW_TAG_subroutine, and then later when processing it again
we add DW_AT_low_pc etc. and ICE, because those attributes should not
appear on DW_AT_inline functions.

Fixed by handling FUNCTION_DECLs always the same, whether in BLOCK_VARS
or BLOCK_NONLOCALIZED_VARS.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-03-23  Jakub Jelinek  

PR debug/79255
* dwarf2out.c (decls_for_scope): If BLOCK_NONLOCALIZED_VAR is
a FUNCTION_DECL, pass it as decl instead of origin to
process_scope_var.

* gcc.dg/pr79255.c: New test.

--- gcc/dwarf2out.c.jj  2017-03-22 19:31:41.525055795 +0100
+++ gcc/dwarf2out.c 2017-03-23 17:57:09.419362739 +0100
@@ -24861,8 +24861,13 @@ decls_for_scope (tree stmt, dw_die_ref c
 if we've done it once already.  */
   if (! early_dwarf)
for (i = 0; i < BLOCK_NUM_NONLOCALIZED_VARS (stmt); i++)
- process_scope_var (stmt, NULL, BLOCK_NONLOCALIZED_VAR (stmt, i),
-context_die);
+ {
+   decl = BLOCK_NONLOCALIZED_VAR (stmt, i);
+   if (TREE_CODE (decl) == FUNCTION_DECL)
+ process_scope_var (stmt, decl, NULL_TREE, context_die);
+   else
+ process_scope_var (stmt, NULL_TREE, decl, context_die);
+ }
 }
 
   /* Even if we're at -g1, we need to process the subblocks in order to get
--- gcc/testsuite/gcc.dg/pr79255.c.jj   2017-03-23 17:57:44.711911298 +0100
+++ gcc/testsuite/gcc.dg/pr79255.c  2017-03-23 17:56:24.0 +0100
@@ -0,0 +1,21 @@
+/* PR bootstrap/79255 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -g -fno-toplevel-reorder -Wno-attributes" } */
+
+static inline __attribute__((always_inline)) int foo (int x);
+
+int
+baz (void)
+{
+  return foo (3) + foo (6) + foo (9);
+}
+
+static inline __attribute__((always_inline)) int
+foo (int x)
+{
+  auto inline int __attribute__((noinline)) bar (int x)
+  {
+return x + 3;
+  }
+  return bar (x) + bar (x + 2);
+}

Jakub


[PATCH] avoid cselib rtx_equal_for_cselib_1 infinite recursion (PR debug/80025)

2017-03-23 Thread Jakub Jelinek
Hi!

On Thu, Mar 23, 2017 at 03:00:04PM +0100, Jakub Jelinek wrote:
> On Tue, Mar 21, 2017 at 07:43:51PM -0300, Alexandre Oliva wrote:
> > When two VALUEs are recorded in the cselib equivalence table such that
> > they are equivalent to each other XORed with the same expression, if
> > we started a cselib equivalence test between say the odd one and the
> > even one, we'd end up recursing to compare the even one with the odd
> > one, and then again, indefinitely.
> > 
> > This patch cuts off this infinite recursion by recognizing the XOR
> > special case (it's the only binary commutative operation in which
> > applying one of the operands of an operation to the result of that
> > operation brings you back to the other operand) and determining
> > whether we have an equivalence without recursing down the infinite
> > path.
> 
> Is XOR the only special case though?  E.g. PLUS or MINUS with
> the most negative constant act the same, and I don't see why if unlucky
> enough with reverse ops etc. you couldn't get something similar through
> combination thereof, perhaps indirectly through multiple VALUEs.
> 
> So I think it is safer to just put a cap on the rtx_equal_for_cselib_1
> recursion depth (should be enough to bump the depth only when walking
> locs of a VALUE).  I'll test such a patch.

Here is that patch, bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk?  Or shall I turn that 128 into a --param?

2017-03-23  Jakub Jelinek  

PR debug/80025
* cselib.h (rtx_equal_for_cselib_1): Add depth argument.
(rtx_equal_for_cselib_p): Pass 0 to it.
* cselib.c (cselib_hasher::equal): Likewise.
(rtx_equal_for_cselib_1): Add depth argument.  If depth
is 128, don't look up VALUE locs and punt.  Increment
depth in recursive calls when walking VALUE locs.

* gcc.dg/torture/pr80025.c: New test.

--- gcc/cselib.h.jj 2017-01-01 12:45:37.0 +0100
+++ gcc/cselib.h2017-03-23 14:06:35.348504435 +0100
@@ -82,7 +82,7 @@ extern void cselib_finish (void);
 extern void cselib_process_insn (rtx_insn *);
 extern bool fp_setter_insn (rtx_insn *);
 extern machine_mode cselib_reg_set_mode (const_rtx);
-extern int rtx_equal_for_cselib_1 (rtx, rtx, machine_mode);
+extern int rtx_equal_for_cselib_1 (rtx, rtx, machine_mode, int);
 extern int references_value_p (const_rtx, int);
 extern rtx cselib_expand_value_rtx (rtx, bitmap, int);
 typedef rtx (*cselib_expand_callback)(rtx, bitmap, int, void *);
@@ -134,7 +134,7 @@ rtx_equal_for_cselib_p (rtx x, rtx y)
   if (x == y)
 return 1;
 
-  return rtx_equal_for_cselib_1 (x, y, VOIDmode);
+  return rtx_equal_for_cselib_1 (x, y, VOIDmode, 0);
 }
 
 #endif /* GCC_CSELIB_H */
--- gcc/cselib.c.jj 2017-01-01 12:45:39.0 +0100
+++ gcc/cselib.c2017-03-23 14:38:50.238487353 +0100
@@ -125,7 +125,7 @@ cselib_hasher::equal (const cselib_val *
   /* We don't guarantee that distinct rtx's have different hash values,
  so we need to do a comparison.  */
   for (l = v->locs; l; l = l->next)
-if (rtx_equal_for_cselib_1 (l->loc, x, memmode))
+if (rtx_equal_for_cselib_1 (l->loc, x, memmode, 0))
   {
promote_debug_loc (l);
return true;
@@ -834,7 +834,7 @@ autoinc_split (rtx x, rtx *off, machine_
addresses, MEMMODE should be VOIDmode.  */
 
 int
-rtx_equal_for_cselib_1 (rtx x, rtx y, machine_mode memmode)
+rtx_equal_for_cselib_1 (rtx x, rtx y, machine_mode memmode, int depth)
 {
   enum rtx_code code;
   const char *fmt;
@@ -867,6 +867,9 @@ rtx_equal_for_cselib_1 (rtx x, rtx y, ma
   if (GET_CODE (y) == VALUE)
return e == canonical_cselib_val (CSELIB_VAL_PTR (y));
 
+  if (depth == 128)
+   return 0;
+
   for (l = e->locs; l; l = l->next)
{
  rtx t = l->loc;
@@ -876,7 +879,7 @@ rtx_equal_for_cselib_1 (rtx x, rtx y, ma
 list.  */
  if (REG_P (t) || MEM_P (t) || GET_CODE (t) == VALUE)
continue;
- else if (rtx_equal_for_cselib_1 (t, y, memmode))
+ else if (rtx_equal_for_cselib_1 (t, y, memmode, depth + 1))
return 1;
}
 
@@ -887,13 +890,16 @@ rtx_equal_for_cselib_1 (rtx x, rtx y, ma
   cselib_val *e = canonical_cselib_val (CSELIB_VAL_PTR (y));
   struct elt_loc_list *l;
 
+  if (depth == 128)
+   return 0;
+
   for (l = e->locs; l; l = l->next)
{
  rtx t = l->loc;
 
  if (REG_P (t) || MEM_P (t) || GET_CODE (t) == VALUE)
continue;
- else if (rtx_equal_for_cselib_1 (x, t, memmode))
+ else if (rtx_equal_for_cselib_1 (x, t, memmode, depth + 1))
return 1;
}
 
@@ -914,12 +920,12 @@ rtx_equal_for_cselib_1 (rtx x, rtx y, ma
   if (!xoff != !yoff)
return 0;
 
-  if (xoff && !rtx_equal_for_cselib_1 (xoff, yoff, memmode))
+  if (xoff && !rtx_equal_for_cselib_1 (xoff, yoff, memmode, depth))
return 0;
 
   /* Don't recurse if 

[C++ PATCH] Fix -fsanitize={null,alignment} of references (PR c++/79572)

2017-03-23 Thread Jakub Jelinek
Hi!

Since late C++ folding has been committed, we don't sanitize some reference
bindings to NULL.  Earlier we had always NOP_EXPR to REFERENCE_TYPE say from
INTEGER_CST or whatever else, but cp_fold can now turn that right into
INTEGER_CST with REFERENCE_TYPE.  The following patch sanitizes even those.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-03-23  Jakub Jelinek  

PR c++/79572
* c-ubsan.h (ubsan_maybe_instrument_reference): Change argument to
tree *.
* c-ubsan.c (ubsan_maybe_instrument_reference): Likewise.  Handle
not just NOP_EXPR to REFERENCE_TYPE, but also INTEGER_CST with
REFERENCE_TYPE.

* cp-gimplify.c (cp_genericize_r): Sanitize INTEGER_CSTs with
REFERENCE_TYPE.  Adjust ubsan_maybe_instrument_reference caller
for NOP_EXPR to REFERENCE_TYPE.

* g++.dg/ubsan/null-8.C: New test.

--- gcc/c-family/c-ubsan.h.jj   2017-01-01 12:45:46.0 +0100
+++ gcc/c-family/c-ubsan.h  2017-03-23 09:13:16.287888726 +0100
@@ -28,7 +28,7 @@ extern tree ubsan_instrument_return (loc
 extern tree ubsan_instrument_bounds (location_t, tree, tree *, bool);
 extern bool ubsan_array_ref_instrumented_p (const_tree);
 extern void ubsan_maybe_instrument_array_ref (tree *, bool);
-extern void ubsan_maybe_instrument_reference (tree);
+extern void ubsan_maybe_instrument_reference (tree *);
 extern void ubsan_maybe_instrument_member_call (tree, bool);
 
 /* Declare this here as well as in ubsan.h. */
--- gcc/c-family/c-ubsan.c.jj   2017-01-01 12:45:46.0 +0100
+++ gcc/c-family/c-ubsan.c  2017-03-23 09:18:51.775486699 +0100
@@ -458,17 +458,26 @@ ubsan_maybe_instrument_reference_or_call
   return fold_build2 (COMPOUND_EXPR, TREE_TYPE (op), call, op);
 }
 
-/* Instrument a NOP_EXPR to REFERENCE_TYPE if needed.  */
+/* Instrument a NOP_EXPR to REFERENCE_TYPE or INTEGER_CST with REFERENCE_TYPE
+   type if needed.  */
 
 void
-ubsan_maybe_instrument_reference (tree stmt)
+ubsan_maybe_instrument_reference (tree *stmt_p)
 {
-  tree op = TREE_OPERAND (stmt, 0);
+  tree stmt = *stmt_p;
+  tree op = stmt;
+  if (TREE_CODE (stmt) == NOP_EXPR)
+op = TREE_OPERAND (stmt, 0);
   op = ubsan_maybe_instrument_reference_or_call (EXPR_LOCATION (stmt), op,
 TREE_TYPE (stmt),
 UBSAN_REF_BINDING);
   if (op)
-TREE_OPERAND (stmt, 0) = op;
+{
+  if (TREE_CODE (stmt) == NOP_EXPR) 
+   TREE_OPERAND (stmt, 0) = op;
+  else
+   *stmt_p = op;
+}
 }
 
 /* Instrument a CALL_EXPR to a method if needed.  */
--- gcc/cp/cp-gimplify.c.jj 2017-03-03 13:23:58.0 +0100
+++ gcc/cp/cp-gimplify.c2017-03-23 09:21:26.693460888 +0100
@@ -1130,6 +1130,19 @@ cp_genericize_r (tree *stmt_p, int *walk
}
 }
 
+  if (TREE_CODE (stmt) == INTEGER_CST
+  && TREE_CODE (TREE_TYPE (stmt)) == REFERENCE_TYPE
+  && (flag_sanitize & (SANITIZE_NULL | SANITIZE_ALIGNMENT))
+  && !wtd->no_sanitize_p)
+{
+  ubsan_maybe_instrument_reference (stmt_p);
+  if (*stmt_p != stmt)
+   {
+ *walk_subtrees = 0;
+ return NULL_TREE;
+   }
+}
+
   /* Other than invisiref parms, don't walk the same tree twice.  */
   if (p_set->contains (stmt))
 {
@@ -1477,7 +1490,7 @@ cp_genericize_r (tree *stmt_p, int *walk
   if ((flag_sanitize & (SANITIZE_NULL | SANITIZE_ALIGNMENT))
  && TREE_CODE (stmt) == NOP_EXPR
  && TREE_CODE (TREE_TYPE (stmt)) == REFERENCE_TYPE)
-   ubsan_maybe_instrument_reference (stmt);
+   ubsan_maybe_instrument_reference (stmt_p);
   else if (TREE_CODE (stmt) == CALL_EXPR)
{
  tree fn = CALL_EXPR_FN (stmt);
--- gcc/testsuite/g++.dg/ubsan/null-8.C.jj  2017-03-23 09:42:31.664696676 
+0100
+++ gcc/testsuite/g++.dg/ubsan/null-8.C 2017-03-23 09:43:31.501908802 +0100
@@ -0,0 +1,19 @@
+// PR c++/79572
+// { dg-do run }
+// { dg-options "-fsanitize=null -std=c++14" }
+// { dg-output "reference binding to null pointer of type 'const int'" }
+
+void
+foo (const int )
+{
+  if ()
+__builtin_printf ("iref %d\n", iref);
+  else
+__builtin_printf ("iref is NULL\n");
+}
+
+int
+main ()
+{
+  foo (*((int*) __null));
+}

Jakub


Re: C++ PATCH to fix ICE in replace_placeholders_r (PR c++/79937)

2017-03-23 Thread Marek Polacek
On Tue, Mar 14, 2017 at 02:34:30PM -0400, Jason Merrill wrote:
> On Tue, Mar 14, 2017 at 2:33 PM, Jason Merrill  wrote:
> > On Tue, Mar 7, 2017 at 12:10 PM, Marek Polacek  wrote:
> >> In this testcase we have
> >> C c = bar (X{1});
> >> which store_init_value sees as
> >> c = TARGET_EXPR  >> .n=(&)->i}>)>
> >> i.e. we're initializing "c" with a TARGET_EXPR.  We call 
> >> replace_placeholders
> >> that walks the whole tree to substitute the placeholders.  Eventually we 
> >> find
> >> the nested  but that's for another object, so we
> >> crash.  Seems that we shouldn't have stepped into the second TARGET_EXPR at
> >> all; it has nothing to with "c", it's bar's argument.
> >>
> >> It occurred to me that we shouldn't step into CALL_EXPRs and leave the
> >> placeholders in function arguments to cp_gimplify_init_expr which calls
> >> replace_placeholders for constructors.  Not sure if it's enough to handle
> >> CALL_EXPRs like this, anything else?
> >
> > Hmm, we might have a DMI containing a call with an argument referring
> > to *this, i.e.
> >
> > struct A
> > {
> >   int i;
> >   int j = frob (this->i);
> > };
> >
> > The TARGET_EXPR seems like a more likely barrier, but even there we
> > could have something like
> >
> > struct A { int i; };
> > struct B
> > {
> >   int i;
> >   A a = A{this->i};
> > };
> >
> > I think we need replace_placeholders to keep a stack of objects, so
> > that when we see a TARGET_EXPR we add it to the stack and therefore
> > can properly replace a PLACEHOLDER_EXPR of its type.
> 
> Or actually, avoid replacing such a PLACEHOLDER_EXPR, but rather leave
> it for later when we lower the TARGET_EXPR.

Sorry, I don't really follow.  I have a patch that puts TARGET_EXPRs on
a stack, but I don't know how that helps.  E.g. with nsdmi-aggr3.C
we have
B b = TARGET_EXPR }>
so when we get to that PLACEHOLDER_EXPR, on the stack there's
TARGET_EXPR with type struct A
TARGET_EXPR with type struct B
so the type of the PLACEHOLDER_EXPR doesn't match the type of the current
TARGET_EXPR, but we still want to replace it in this case.

So -- could you expand a bit on what you had in mind, please?

Thanks,

Marek


Re: [CHKP] Fix for PR79990

2017-03-23 Thread Ilya Enkovich
2017-03-23 17:18 GMT+03:00 Alexander Ivchenko :
> Hi,
>
> The patch below attempts to fix the PR. I checked that it did not
> break any of mpx.exp tests, but I did not run the full testing yet.
> Would like to know whether this approach is generally correct or not.
>
> The issue is that we have the hard reg vector variable:
>
> typedef int U __attribute__ ((vector_size (16)));
> register U u asm("xmm0");
>
> and chkp tries to instrument access to it:
>
> return u[i];
>
> by doing that:
>
> __bound_tmp.0_4 = __builtin_ia32_bndmk (, 16);
>
> However, you cannot take an address of a register variable (in fact if
> you do that, the compiler will give you "address of register variable
> ‘u’ requested" error), so expand, sensibly, gives an ICE on on 
> here. I believe that if there is no pointers, pointer bounds checker
> shouldn't get involved into that business. What do you think?

Hi!

I think with this patch I can call foo with any index and thus access
some random stack slot. The first thing we should answer is 'do we
want to catch array index overflows in such cases'? If we want to (and
this looks reasonable thing to do because it prevents invalid memory
accesses) then this patch doesn't resolve the problem.

I'm not sure it can affect the patch, but please consider more complex
cases. E.g.:

typedef int v8 __attribute__ ((vector_size(8)));

struct U {
  v8 a;
  v8 b;
};

int
foo (int i)
{
  register struct U u asm ("xmm0");
  return u.b[i];
}

One way to catch overflow in such cases might be to use some fake
pointer value (e.g. 0) for such not addressible variable. This fake value
would be used as base for memory access and as lower bound. I don't
see other cases except array_ref overflow check where such value
might be used. So this fake value will not be passed somewhere and
will not be stored to Bounds Table.

Thanks,
Ilya

>
>
>
>
> diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c
> index 75caf83..e39ec9a 100644
> --- a/gcc/tree-chkp.c
> +++ b/gcc/tree-chkp.c
> @@ -3383,6 +3383,7 @@ chkp_parse_array_and_component_ref (tree node, tree 
> *ptr,
>tree comp_to_narrow = NULL_TREE;
>tree last_comp = NULL_TREE;
>bool array_ref_found = false;
> +  bool is_register_var = false;
>tree *nodes;
>tree var;
>int len;
> @@ -3440,6 +3441,9 @@ chkp_parse_array_and_component_ref (tree node, tree 
> *ptr,
>   || TREE_CODE (var) == STRING_CST
>   || TREE_CODE (var) == SSA_NAME);
>
> +  if (VAR_P (var) && DECL_HARD_REGISTER (var))
> +   is_register_var = true;
> +
>*ptr = chkp_build_addr_expr (var);
>  }
>
> @@ -3455,7 +3459,11 @@ chkp_parse_array_and_component_ref (tree node, tree 
> *ptr,
>
>if (TREE_CODE (var) == ARRAY_REF)
> {
> - *safe = false;
> + // Mark it as unsafe, unless the array being accessed
> + // has been explicitly placed on a register: in this
> + // case we cannot take a pointer of this variable,
> + // so we don't instrument the access.
> + *safe = is_register_var;
>   array_ref_found = true;
>   if (flag_chkp_narrow_bounds
>   && !flag_chkp_narrow_to_innermost_arrray
> @@ -4001,6 +4009,19 @@ chkp_process_stmt (gimple_stmt_iterator *iter, tree 
> node,
> bool bitfield;
> tree elt;
>
> +   {
> + // We don't instrument accesses to arrays that
> + // are explicitely assigned to hard registers.
> + HOST_WIDE_INT bitsize, bitpos;
> + tree base, offset;
> + machine_mode mode;
> + int unsignedp, reversep, volatilep = 0;
> + base = get_inner_reference (node, , , , ,
> + , , );
> + if (VAR_P (base) && DECL_HARD_REGISTER (base))
> +   safe = true;
> +   }
> +
> if (safe)
>   {
> /* We are not going to generate any checks, so do not
>
> diff --git a/gcc/testsuite/gcc.target/i386/mpx/pr79990.c
> b/gcc/testsuite/gcc.target/i386/mpx/pr79990.c
> new file mode 100644
> index 000..a27734d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/mpx/pr79990.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-options "-fcheck-pointer-bounds -mmpx" } */
> +
> +typedef int U __attribute__ ((vector_size (16)));
> +
> +int
> +foo (int i)
> +{
> +#if __SSE2__
> +  register
> +#endif
> +U u
> +#if __SSE2__
> +  asm ("xmm0")
> +#endif
> +  ;
> +  return u[i];
> +}
>
> regards,
> Alexander


[C++ Patch/RFC] PR 80145

2017-03-23 Thread Paolo Carlini

Hi,

this ICE on invalid code isn't a regression, thus a patch probably 
doesn't qualify for Stage 4, but IMHO I made good progress on it and I'm 
sending what I have now anyway... The ICE happens during error recovery 
after a sensible diagnostic for the first declaration in:


auto* foo() { return 0; }
auto* foo();

After the error, finish_function does:

apply_deduced_return_type (fndecl, void_type_node);
fntype = TREE_TYPE (fndecl);

which then is inconsistent with the auto* return type of the second 
declaration and leads to an ICE in merge_types (which duplicate_decls 
thought was safe to call because types_match is true (evidently: 
decls_match uses fndecl_declared_return_type)). Thus, in terms of error 
recovery, I think that in cases like the one at issue it makes sense not 
to replace auto* after the error and leave the return type untouched: 
certainly the below passes testing and avoids ICEing on the testcase at 
issue and a variant of it.


Thanks,
Paolo.

//

Index: cp/decl.c
===
--- cp/decl.c   (revision 246414)
+++ cp/decl.c   (working copy)
@@ -15573,16 +15573,19 @@ finish_function (int flags)
   if (!processing_template_decl && FNDECL_USED_AUTO (fndecl)
   && TREE_TYPE (fntype) == current_function_auto_return_pattern)
 {
-  if (!is_auto (current_function_auto_return_pattern)
- && !current_function_returns_value && !current_function_returns_null)
+  if (is_auto (current_function_auto_return_pattern))
{
+ apply_deduced_return_type (fndecl, void_type_node);
+ fntype = TREE_TYPE (fndecl);
+   }
+  else if (!current_function_returns_value
+  && !current_function_returns_null)
+   {
  error ("no return statements in function returning %qT",
 current_function_auto_return_pattern);
  inform (input_location, "only plain % return type can be "
  "deduced to %");
}
-  apply_deduced_return_type (fndecl, void_type_node);
-  fntype = TREE_TYPE (fndecl);
 }
 
   // If this is a concept, check that the definition is reasonable.
Index: testsuite/g++.dg/cpp1y/auto-fn37.C
===
--- testsuite/g++.dg/cpp1y/auto-fn37.C  (revision 0)
+++ testsuite/g++.dg/cpp1y/auto-fn37.C  (working copy)
@@ -0,0 +1,5 @@
+// PR c++/80145
+// { dg-do compile { target c++14 } }
+
+auto* foo() { return 0; }  // { dg-error "unable to deduce" }
+auto* foo();
Index: testsuite/g++.dg/cpp1y/auto-fn38.C
===
--- testsuite/g++.dg/cpp1y/auto-fn38.C  (revision 0)
+++ testsuite/g++.dg/cpp1y/auto-fn38.C  (working copy)
@@ -0,0 +1,5 @@
+// PR c++/80145
+// { dg-do compile { target c++14 } }
+
+auto* foo() { }  // { dg-error "no return statements" }
+auto* foo();


[PATCH] Fix PR80158

2017-03-23 Thread Bill Schmidt
Hi,

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80158 reports an ICE in
SLSR while building 416.gamess on x86_64.  This is a latent (but
previously harmless) bug that was exposed by the fix for PR80054.
When replacing any statement with a strength reduction, SLSR updates
the candidate statement S associated with the associated candidate
record in the candidate table.  However, S may actually be associated
with two candidate records when an expression may be treated as
either a CAND_ADD or a CAND_MULT.  In this case, the second one can
be reached via the next_interp field.

In the excerpt from 416.gamess, the first candidate record was
updated when its statement was replaced, but the next-interp record
was not.  Later, that record was used as a basis for another tree
of strength-reduction candidates, but since it now pointed to an
orphaned statement without a basic block, the ICE occurred when
checking dominance against the statement's block.

The fix is simply to ensure that the next_interp record is always
updated when present.

I've bootstrapped and tested this on powerpc64le-unknown-linux-gnu
with no regressions.  I tested that the standalone test works on
x86_64 with this fix, but I have been unable to perform an x86_64
regstrap because the machine I have access to has insufficient
disk space. :/  I would appreciate it if you could do this for me.

Assuming no problems with x86_64 regstrap, is this ok for trunk?

Note: I will be unreachable from now until next Tuesday (i.e.,
returning the 28th).  If needed, please feel free to commit the
patch on my behalf.  Otherwise I will attend to it when I return.

Thanks!
Bill


[gcc]

2017-03-23  Bill Schmidt  

PR tree-optimization/80158
* gimple-ssa-strength-reduction.c (replace_mult_candidate): When
replacing a candidate statement, also replace it for the
candidate's alternate interpretation.
(replace_rhs_if_not_dup): Likewise.
(replace_one_candidate): Likewise.

[gcc/testsuite]

2017-03-23  Bill Schmidt  

PR tree-optimization/80158
* gfortran.fortran-torture/compile/pr80158.f: New file.


Index: gcc/gimple-ssa-strength-reduction.c
===
--- gcc/gimple-ssa-strength-reduction.c (revision 246420)
+++ gcc/gimple-ssa-strength-reduction.c (working copy)
@@ -2089,6 +2089,8 @@ replace_mult_candidate (slsr_cand_t c, tree basis_
  gimple_set_location (copy_stmt, gimple_location (c->cand_stmt));
  gsi_replace (, copy_stmt, false);
  c->cand_stmt = copy_stmt;
+ if (c->next_interp)
+   lookup_cand (c->next_interp)->cand_stmt = copy_stmt;
  if (dump_file && (dump_flags & TDF_DETAILS))
stmt_to_print = copy_stmt;
}
@@ -2118,6 +2120,8 @@ replace_mult_candidate (slsr_cand_t c, tree basis_
  basis_name, bump_tree);
  update_stmt (gsi_stmt (gsi));
   c->cand_stmt = gsi_stmt (gsi);
+ if (c->next_interp)
+   lookup_cand (c->next_interp)->cand_stmt = gsi_stmt (gsi);
  if (dump_file && (dump_flags & TDF_DETAILS))
stmt_to_print = gsi_stmt (gsi);
}
@@ -3405,6 +3409,8 @@ replace_rhs_if_not_dup (enum tree_code new_code, t
   gimple_assign_set_rhs_with_ops (, new_code, new_rhs1, new_rhs2);
   update_stmt (gsi_stmt (gsi));
   c->cand_stmt = gsi_stmt (gsi);
+  if (c->next_interp)
+   lookup_cand (c->next_interp)->cand_stmt = gsi_stmt (gsi);
 
   if (dump_file && (dump_flags & TDF_DETAILS))
return gsi_stmt (gsi);
@@ -3511,6 +3517,8 @@ replace_one_candidate (slsr_cand_t c, unsigned i,
  gimple_assign_set_rhs_with_ops (, MINUS_EXPR, basis_name, rhs2);
  update_stmt (gsi_stmt (gsi));
   c->cand_stmt = gsi_stmt (gsi);
+ if (c->next_interp)
+   lookup_cand (c->next_interp)->cand_stmt = gsi_stmt (gsi);
 
  if (dump_file && (dump_flags & TDF_DETAILS))
stmt_to_print = gsi_stmt (gsi);
@@ -3532,6 +3540,8 @@ replace_one_candidate (slsr_cand_t c, unsigned i,
  gimple_set_location (copy_stmt, gimple_location (c->cand_stmt));
  gsi_replace (, copy_stmt, false);
  c->cand_stmt = copy_stmt;
+ if (c->next_interp)
+   lookup_cand (c->next_interp)->cand_stmt = copy_stmt;
 
  if (dump_file && (dump_flags & TDF_DETAILS))
stmt_to_print = copy_stmt;
@@ -3543,6 +3553,8 @@ replace_one_candidate (slsr_cand_t c, unsigned i,
  gimple_set_location (cast_stmt, gimple_location (c->cand_stmt));
  gsi_replace (, cast_stmt, false);
  c->cand_stmt = cast_stmt;
+ if (c->next_interp)
+   lookup_cand (c->next_interp)->cand_stmt = cast_stmt;
 
  if (dump_file && (dump_flags & TDF_DETAILS))
stmt_to_print = cast_stmt;
Index: 

C++ PATCH for c++/80150, ICE with overloaded variadic deduction

2017-03-23 Thread Jason Merrill
This bug is related to 69056, where I implemented deducing a pack that
starts with explicitly specified arguments.  In that patch I added a
sanity check to make sure that when we do that we always start from a
pack marked incomplete.  But this test demonstrates a case where that
check is wrong; if we're deducing the same pack from two different
function arguments, it isn't incomplete the second time, we need to
deduce the same thing we got the first time.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit c46a927234d00a1723347ca1b21b2509f6513fc9
Author: Jason Merrill 
Date:   Thu Mar 23 13:29:35 2017 -0400

PR c++/80150 - ICE with overloaded variadic deduction.

* pt.c (try_one_overload): Remove asserts.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index a4bf890..5259dad 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -19694,9 +19694,10 @@ try_one_overload (tree tparms,
 is equivalent to the corresponding explicitly specified argument.
 We may have deduced more arguments than were explicitly specified,
 and that's OK.  */
- gcc_assert (ARGUMENT_PACK_INCOMPLETE_P (oldelt));
- gcc_assert (ARGUMENT_PACK_ARGS (oldelt)
- == ARGUMENT_PACK_EXPLICIT_ARGS (oldelt));
+
+ /* We used to assert ARGUMENT_PACK_INCOMPLETE_P (oldelt) here, but
+that's wrong if we deduce the same argument pack from multiple
+function arguments: it's only incomplete the first time.  */
 
  tree explicit_pack = ARGUMENT_PACK_ARGS (oldelt);
  tree deduced_pack = ARGUMENT_PACK_ARGS (elt);
diff --git a/gcc/testsuite/g++.dg/cpp0x/variadic-unify-3.C 
b/gcc/testsuite/g++.dg/cpp0x/variadic-unify-3.C
new file mode 100644
index 000..45f4d63
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/variadic-unify-3.C
@@ -0,0 +1,20 @@
+// PR c++/80150
+// { dg-do compile { target c++11 } }
+
+template 
+bool compare_functions(R(*funcA)(Args...), R(*funcB)(Args...), Args... args) {
+  return false;
+}
+
+int foo(int x) {
+  return x;
+}
+
+float foo(float x) {
+ return x;
+}
+
+int main() {
+  int a = 10;
+  compare_functions(foo, foo, a);
+}


Re: [Patch] Inline Variables for the Standard Library (p0607r0)

2017-03-23 Thread Jonathan Wakely

On 23/03/17 13:47 -0400, Tim Song wrote:

On Thu, Mar 23, 2017 at 1:45 PM, Jonathan Wakely  wrote:

+ // NOTE: This makes use of the fact that we know how moveable
+ // is implemented on pair, and also vector. If the implementation
+ // changes this test may begin to fail.


Maybe drop this comment?


Ha, yes, I'll do so. Thanks.


[PATCH] Fix Debug Mode test failures

2017-03-23 Thread Jonathan Wakely

This fixes some recent testsuite FAILures with Debug Mode.

* testsuite/23_containers/array/tuple_interface/
tuple_element_debug_neg.cc: Adjust dg-error.
* testsuite/23_containers/list/operations/78389.cc: Fix less-than to
define a valid strict weak ordering.
* testsuite/23_containers/priority_queue/67085.cc: Disable test for
Debug Mode, due to debug checks making extra copies of predicate.
* testsuite/ext/pb_ds/regression/priority_queue_binary_heap-62045.cc:
Likewise.

Tested powerpc64le-linux, committing to trunk.


commit f622cdad191e50e65522d8f36906ccf31ed132a2
Author: Jonathan Wakely 
Date:   Thu Mar 23 17:01:08 2017 +

Fix Debug Mode test failures

* testsuite/23_containers/array/tuple_interface/
tuple_element_debug_neg.cc: Adjust dg-error.
* testsuite/23_containers/list/operations/78389.cc: Fix less-than to
define a valid strict weak ordering.
* testsuite/23_containers/priority_queue/67085.cc: Disable test for
Debug Mode, due to debug checks making extra copies of predicate.
* testsuite/ext/pb_ds/regression/priority_queue_binary_heap-62045.cc:
Likewise.

diff --git 
a/libstdc++-v3/testsuite/23_containers/array/tuple_interface/tuple_element_debug_neg.cc
 
b/libstdc++-v3/testsuite/23_containers/array/tuple_interface/tuple_element_debug_neg.cc
index 894acef..e433c6e 100644
--- 
a/libstdc++-v3/testsuite/23_containers/array/tuple_interface/tuple_element_debug_neg.cc
+++ 
b/libstdc++-v3/testsuite/23_containers/array/tuple_interface/tuple_element_debug_neg.cc
@@ -22,4 +22,4 @@
 
 typedef std::tuple_element<1, std::array>::type type;
 
-// { dg-error "static assertion failed" "" { target *-*-* } 316 }
+// { dg-error "static assertion failed" "" { target *-*-* } 323 }
diff --git a/libstdc++-v3/testsuite/23_containers/list/operations/78389.cc 
b/libstdc++-v3/testsuite/23_containers/list/operations/78389.cc
index e0cc6e6..a35b2c9 100644
--- a/libstdc++-v3/testsuite/23_containers/list/operations/78389.cc
+++ b/libstdc++-v3/testsuite/23_containers/list/operations/78389.cc
@@ -48,7 +48,7 @@ bool operator<(const X&, const X&) {
   if (++count_X >= throw_after_X) {
 throw 666;
   }
-  return true;
+  return false;
 }
 
 
diff --git a/libstdc++-v3/testsuite/23_containers/priority_queue/67085.cc 
b/libstdc++-v3/testsuite/23_containers/priority_queue/67085.cc
index 9f4da58..06c7a24 100644
--- a/libstdc++-v3/testsuite/23_containers/priority_queue/67085.cc
+++ b/libstdc++-v3/testsuite/23_containers/priority_queue/67085.cc
@@ -16,6 +16,7 @@
 // .
 
 // { dg-do run { target c++11 } }
+// { dg-require-normal-mode "" }
 
 #include 
 #include 
diff --git 
a/libstdc++-v3/testsuite/ext/pb_ds/regression/priority_queue_binary_heap-62045.cc
 
b/libstdc++-v3/testsuite/ext/pb_ds/regression/priority_queue_binary_heap-62045.cc
index a61d36f..1cc9285 100644
--- 
a/libstdc++-v3/testsuite/ext/pb_ds/regression/priority_queue_binary_heap-62045.cc
+++ 
b/libstdc++-v3/testsuite/ext/pb_ds/regression/priority_queue_binary_heap-62045.cc
@@ -15,7 +15,7 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
-// { dg-do run }
+// { dg-require-normal-mode "" }
 
 #include 
 #include 


Re: [libstdc++,doc] Strip links to ANSI (web shop)

2017-03-23 Thread Jonathan Wakely

On 23/03/17 15:01 +, Jonathan Wakely wrote:

On 18/03/17 19:44 +0100, Gerald Pfeifer wrote:

On Wed, 15 Feb 2017, Jonathan Wakely wrote:

The C++14 standard is:
http://webstore.ansi.org/RecordDetail.aspx?sku=ISO%2fIEC+14882%3a2014


Thanks, Jonathan!


What do you think?

Should we make the FAQ link to the info in the manual, instead of just
removing it?


Great idea.  Unfortunately I do not know the Docbook magic to do
this for the libstdc++ documentation, so I went ahead with the
simple version below.

Can you perhaps help making this a "real" link?


Exactly how it's one on line 126 in that file. I'll commit this later.





commit 90a904a447d5153f1bca007b0b4a599e4d8a8731
Author: Jonathan Wakely 
Date:   Thu Mar 23 15:00:06 2017 +

   Add link to the right section of the manual
   
   	* doc/xml/faq.xml: Add link.

* doc/html/*: Regenerate.

diff --git a/libstdc++-v3/doc/xml/faq.xml b/libstdc++-v3/doc/xml/faq.xml
index db67626..340ba9d 100644
--- a/libstdc++-v3/doc/xml/faq.xml
+++ b/libstdc++-v3/doc/xml/faq.xml
@@ -1177,7 +1177,8 @@
  
  

-Please refer to the Contributing section in our manual.
+Please refer to the Contributing
+section in our manual.
 
  





The attached patch also fixes some broken links that Gerald pointed
out.

I'm committing this to trunk today.

commit 38850ccfa6dadea710556d5f408cef79c2ffd9ec
Author: Jonathan Wakely 
Date:   Thu Mar 23 15:00:06 2017 +

Fix broken links in manual and remove outdated info

	* doc/xml/faq.xml: Add link.
	* doc/xml/manual/backwards_compatibility.xml: Remove outdated
	information on pre-ISO headers. Replace broken link to C++ FAQ Lite.
	* doc/xml/manual/io.xml: Update broken link.
	* doc/html/*: Regenerate.

diff --git a/libstdc++-v3/doc/xml/faq.xml b/libstdc++-v3/doc/xml/faq.xml
index db67626..340ba9d 100644
--- a/libstdc++-v3/doc/xml/faq.xml
+++ b/libstdc++-v3/doc/xml/faq.xml
@@ -1177,7 +1177,8 @@
   
   
 
-Please refer to the Contributing section in our manual.
+Please refer to the Contributing
+section in our manual.
  
   
 
diff --git a/libstdc++-v3/doc/xml/manual/backwards_compatibility.xml b/libstdc++-v3/doc/xml/manual/backwards_compatibility.xml
index 579c368..dbb3371 100644
--- a/libstdc++-v3/doc/xml/manual/backwards_compatibility.xml
+++ b/libstdc++-v3/doc/xml/manual/backwards_compatibility.xml
@@ -598,86 +598,21 @@ libstdc++-v3.
 
 Portability notes and known implementation limitations are as follows.
 
-Pre-ISO headers moved to backwards or removed
+Pre-ISO headers removed
 
 
  The pre-ISO C++ headers
   (iostream.h,
   defalloc.h etc.) are
-  available, unlike previous libstdc++ versions, but inclusion
-  generates a warning that you are using deprecated headers.
+  not supported.
 
 
-This compatibility layer is constructed by including the
-standard C++ headers, and injecting any items in
-std:: into the global namespace.
-   
-   For those of you new to ISO C++ (welcome, time travelers!), no,
-  that isn't a typo. Yes, the headers really have new names.
-  Marshall Cline's C++ FAQ Lite has a good explanation in http://www.w3.org/1999/xlink; xlink:href="http://www.parashift.com/c++-faq-lite/std-headers.html;>What's
+   For those of you new to ISO C++ (welcome, time travelers!), the
+  ancient pre-ISO headers have new names.
+  The C++ FAQ has a good explanation in http://www.w3.org/1999/xlink; xlink:href="https://isocpp.org/wiki/faq/coding-standards#std-headers;>What's
   the difference between xxx and xxx.h headers?.

 
- Some include adjustment may be required. What follows is an
-autoconf test that defines PRE_STDCXX_HEADERS when they
-exist.
-
-
-# AC_HEADER_PRE_STDCXX
-AC_DEFUN([AC_HEADER_PRE_STDCXX], [
-  AC_CACHE_CHECK(for pre-ISO C++ include files,
-  ac_cv_cxx_pre_stdcxx,
-  [AC_LANG_SAVE
-  AC_LANG_CPLUSPLUS
-  ac_save_CXXFLAGS="$CXXFLAGS"
-  CXXFLAGS="$CXXFLAGS -Wno-deprecated"
-
-  # Omit defalloc.h, as compilation with newer compilers is problematic.
-  AC_TRY_COMPILE([
-  #include new.h
-  #include iterator.h
-  #include alloc.h
-  #include set.h
-  #include hashtable.h
-  #include hash_set.h
-  #include fstream.h
-  #include tempbuf.h
-  #include istream.h
-  #include bvector.h
-  #include stack.h
-  #include rope.h
-  #include complex.h
-  #include ostream.h
-  #include heap.h
-  #include iostream.h
-  #include function.h
-  #include multimap.h
-  #include pair.h
-  #include stream.h
-  #include iomanip.h
-  #include slist.h
-  #include tree.h
-  #include vector.h
-  #include deque.h
-  #include multiset.h
-  #include list.h
-  #include map.h
-  #include algobase.h
-  #include hash_map.h
-  #include algo.h
-  #include queue.h
-  #include streambuf.h
-  ],,
-  ac_cv_cxx_pre_stdcxx=yes, ac_cv_cxx_pre_stdcxx=no)
-  CXXFLAGS="$ac_save_CXXFLAGS"
-  AC_LANG_RESTORE
-  ])
-  if test "$ac_cv_cxx_pre_stdcxx" = yes; then
-

Re: [PATCH] Implement LWG 2686, hash

2017-03-23 Thread Jonathan Wakely

On 12/03/17 13:16 +0100, Daniel Krügler wrote:

The following is an *untested* patch suggestion, please verify.

Notes: My interpretation is that hash should be
defined outside of the _GLIBCXX_COMPATIBILITY_CXX0X block, please
double-check that course of action.


That's right.


I noticed that the preexisting hash did directly refer to
the private members of error_code albeit those have public access
functions. For consistency I mimicked that existing style when
implementing hash.


I see no reason for that, so I've removed the friend declaration and
used the public member functions.

Although this is a DR, I'm treating it as a new C++17 feature, so I've
adjusted the patch to only add the new specialization for C++17 mode.
We're too close to the GCC 7 release to be adding new things to the
default mode, even minor things like this. After GCC 7 is released we
can revisit it and decide if we want to enable it for all modes.

Here's what I've tested and will be committing.


commit 90ca0fd91f5c65af370beb20af06bdca257aaf63
Author: Jonathan Wakely 
Date:   Thu Mar 23 11:47:39 2017 +

Implement LWG 2686, std::hash, for C++17

2017-03-23  Daniel Kruegler  

	Implement LWG 2686, Why is std::hash specialized for error_code,
	but not error_condition?
	* include/std/system_error (hash): Define for C++17.
	* testsuite/20_util/hash/operators/size_t.cc (hash):
	Instantiate test for error_condition.
	* testsuite/20_util/hash/requirements/explicit_instantiation.cc
	(hash): Instantiate hash.

diff --git a/libstdc++-v3/include/std/system_error b/libstdc++-v3/include/std/system_error
index 6775a6e..ec7d25f 100644
--- a/libstdc++-v3/include/std/system_error
+++ b/libstdc++-v3/include/std/system_error
@@ -373,14 +373,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
 
-#ifndef _GLIBCXX_COMPATIBILITY_CXX0X
-
 #include 
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
+#ifndef _GLIBCXX_COMPATIBILITY_CXX0X
   // DR 1182.
   /// std::hash specialization for error_code.
   template<>
@@ -394,12 +393,27 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	return std::_Hash_impl::__hash_combine(__e._M_cat, __tmp);
   }
 };
+#endif // _GLIBCXX_COMPATIBILITY_CXX0X
+
+#if __cplusplus > 201402L
+  // DR 2686.
+  /// std::hash specialization for error_condition.
+  template<>
+struct hash
+: public __hash_base
+{
+  size_t
+  operator()(const error_condition& __e) const noexcept
+  {
+	const size_t __tmp = std::_Hash_impl::hash(__e.value());
+	return std::_Hash_impl::__hash_combine(__e.category(), __tmp);
+  }
+};
+#endif
 
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
 
-#endif // _GLIBCXX_COMPATIBILITY_CXX0X
-
 #endif // C++11
 
 #endif // _GLIBCXX_SYSTEM_ERROR
diff --git a/libstdc++-v3/testsuite/20_util/hash/operators/size_t.cc b/libstdc++-v3/testsuite/20_util/hash/operators/size_t.cc
index ad02843..cc191d6 100644
--- a/libstdc++-v3/testsuite/20_util/hash/operators/size_t.cc
+++ b/libstdc++-v3/testsuite/20_util/hash/operators/size_t.cc
@@ -43,6 +43,9 @@ template
 void test01()
 {
   do_test();
+#if __cplusplus > 201402L
+  do_test();
+#endif
 }
 
 int main()
diff --git a/libstdc++-v3/testsuite/20_util/hash/requirements/explicit_instantiation.cc b/libstdc++-v3/testsuite/20_util/hash/requirements/explicit_instantiation.cc
index e9e5ea1..d01829a 100644
--- a/libstdc++-v3/testsuite/20_util/hash/requirements/explicit_instantiation.cc
+++ b/libstdc++-v3/testsuite/20_util/hash/requirements/explicit_instantiation.cc
@@ -40,6 +40,9 @@ template class std::hash;
 template class std::hash;
 template class std::hash;
 template class std::hash;
+#if __cplusplus > 201402L
+template class std::hash;
+#endif
 
 #ifdef _GLIBCXX_USE_WCHAR_T
 template class std::hash;


Re: [Patch] Inline Variables for the Standard Library (p0607r0)

2017-03-23 Thread Tim Song
On Thu, Mar 23, 2017 at 1:45 PM, Jonathan Wakely  wrote:
> + // NOTE: This makes use of the fact that we know how moveable
> + // is implemented on pair, and also vector. If the implementation
> + // changes this test may begin to fail.

Maybe drop this comment?


Re: [Patch] Inline Variables for the Standard Library (p0607r0)

2017-03-23 Thread Jonathan Wakely

On 12/03/17 01:04 +0100, Daniel Krügler wrote:

2017-03-11 23:14 GMT+01:00 Daniel Krügler :

2017-03-11 23:09 GMT+01:00 Tim Song :

On Sat, Mar 11, 2017 at 3:37 PM, Daniel Krügler
 wrote:

2017-03-11 21:23 GMT+01:00 Tim Song :

On Sat, Mar 11, 2017 at 1:32 PM, Daniel Krügler
 wrote:

This patch applies inline to all namespace scope const variables
according to options A and B2 voted in during the Kona meeting, see:

http://wiki.edg.com/pub/Wg21kona2017/StrawPolls/p0607r0.html

During that work it has been found that std::ignore was not declared
constexpr (http://cplusplus.github.io/LWG/lwg-defects.html#2773),
which was fixed as well.


Just adding constexpr to std::ignore does ~nothing. The assignment
operator needs to be made constexpr too; there is really no other
reason to use std::ignore.


There is nothing in the resolution of the issue (Nor in the discussion
that argues for the change) that says so. Yes, I considered to make
the assignment function template constexpr, but decided against it,
because the wording of the issue doesn't have this requirement). I'm
also aware of a test-case which would show the difference, but again:
This was not what the issue said. In fact I'm in the process to open a
new library issue that should impose that additional requirement.

- Daniel


Well, technically speaking, the issue resolution doesn't even
guarantee that the use case mentioned in the issue would compile since
the standard says nothing about whether the copy constructor of
decltype(std::ignore) is constexpr, or even whether it can be copied
at all. Yes, std::ignore is underspecified, but I'm not sure I see the
point in waiting; it's a completely conforming change and IMHO the
issue's intent is clearly to make std::ignore fully usable in
constexpr functions.

Also, the only specification we have for std::ignore's semantics is
"when an argument in t is ignore, assigning any value to the
corresponding tuple element has no effect". I think one can argue that
"causes an expression to be non-constant" is an "effect".

Re "new library issue", we already have issue 2933.


Good point, it already exists ;-)


Revised patch attached.


The patch had a couple of typos, "onstexpr" in one place, and using
_GLIBCXX17_CONSTEXPR instead of _GLIBCXX17_INLINE in another file.

Here's what I've tested and am going to commit.


commit e17662621cde44bc6d367fafdba4b4f05f74df05
Author: Jonathan Wakely 
Date:   Thu Mar 23 11:22:48 2017 +

Implement P0607R0 "Inline Variables for Standard Library" for C++17

2017-03-23  Daniel Kruegler  

	* include/bits/c++config (_GLIBCXX17_INLINE): Define.
	* include/bits/regex_constants.h (All std::regex_constants constants):
	Add _GLIBCXX17_INLINE as per P0607R0.
	* include/bits/std_mutex.h (defer_lock, try_to_lock, adopt_lock):
	Likewise.
	* include/bits/stl_pair.h (piecewise_construct): Likewise.
	* include/bits/uses_allocator.h (allocator_arg, uses_allocator_v)
	(__is_uses_allocator_constructible_v)
	(__is_nothrow_uses_allocator_constructible_v): Likewise.
	* include/std/chrono (treat_as_floating_point_v): Likewise.
	* include/std/functional (is_bind_expression_v, is_placeholder_v):
	Likewise.
	* include/std/optional (nullopt): Likewise.
	* include/std/ratio (ratio_equal_v, ratio_not_equal_v, ratio_less_v)
	ratio_less_equal_v, ratio_greater_v, ratio_greater_equal_v): Likewise.
	* include/std/system_error (is_error_code_enum_v)
	(is_error_condition_enum_v): Likewise.
	* include/std/tuple (tuple_size_v, ignore): Likewise.
	(ignore): Declare ignore constexpr as per LWG 2773, declare assignment
	constexpr as per LWG 2933.
	* include/std/type_traits (All variable templates): Add
	_GLIBCXX17_INLINE as per P0607R0.
	* include/std/variant (variant_size_v, variant_npos, __index_of_v)
	(__tuple_count_v, __exactly_once): Likewise.
	* testsuite/18_support/headers/new/synopsis.cc
	(hardware_destructive_interference_size)
	(hardware_constructive_interference_size): Likewise for commented-out
	variables.
	* testsuite/20_util/tuple/creation_functions/constexpr.cc: Add new
	test function for constexpr std::ignore (LWG 2773).
	* testsuite/20_util/tuple/creation_functions/constexpr_cpp14.cc: New
	test for LWG 2933.

diff --git a/libstdc++-v3/include/bits/c++config b/libstdc++-v3/include/bits/c++config
index 3b694e0..8ca6b03 100644
--- a/libstdc++-v3/include/bits/c++config
+++ b/libstdc++-v3/include/bits/c++config
@@ -122,6 +122,14 @@
 # endif
 #endif
 
+#ifndef _GLIBCXX17_INLINE
+# if __cplusplus > 201402L
+#  define _GLIBCXX17_INLINE inline
+# else
+#  define _GLIBCXX17_INLINE
+# endif
+#endif
+
 // Macro for noexcept, to support in mixed 03/0x mode.
 #ifndef _GLIBCXX_NOEXCEPT
 # if 

Re: [testsuite] Add missing dg-require-effective-target alloca to gcc testsuite

2017-03-23 Thread Mike Stump
On Mar 23, 2017, at 8:46 AM, Tom de Vries  wrote:
> 
> I've run the gcc testsuite for target nvptx-none and ran into "test for 
> excess errors" FAILs due to:
> ...
> sorry, unimplemented: target cannot support alloca.

We'd encourage ports to support alloca.  :-)

> OK for trunk for stage1?

Ok.  Ok for release branches and trunk as well, if you want.  I'd recommend 
trunk, if your port is meant to work and test out nicely in gcc 7.

[PATCH,testsuite] Skip pic-3,4.c and pie-3,4.c for mips*-*-linux-*.

2017-03-23 Thread Toma Tabacu
Hi,

The pic-3,4.c and pie-3,4.c tests are failing for some configurations of
mips*-*-linux-*.

This is because PIC is always on for MIPS Linux by default, except when the
compiler is built with --with-mips-plt, in which case PIC is on by default only
for the n64 ABI, because in this case -mplt "has no effect without '-msym32'",
to quote the documentation, so -mplt is not passed by default.

Richard Sandiford also talked about this in the summary of a patch which was
skipping a test for mips*-*-linux-* because of PIC:
https://gcc.gnu.org/ml/gcc-patches/2009-01/msg00801.html

Therefore, I think the most reasonable solution would be to just skip these
tests for mips*-*-linux-*. The patch below does so.

Tested with mips-mti-linux-gnu.

Catherine, Matthew what do you think ?

Regards,
Toma

gcc/testsuite/

* gcc.dg/pic-3.c: Skip for mips*-*-linux-*.
* gcc.dg/pic-4.c: Likewise.
* gcc.dg/pie-3.c: Likewise.
* gcc.dg/pie-4.c: Likewise.

diff --git a/gcc/testsuite/gcc.dg/pic-3.c b/gcc/testsuite/gcc.dg/pic-3.c
index 23ce999..c56f06f 100644
--- a/gcc/testsuite/gcc.dg/pic-3.c
+++ b/gcc/testsuite/gcc.dg/pic-3.c
@@ -1,4 +1,4 @@
-/* { dg-do compile { target { ! { *-*-darwin* hppa*64*-*-* } } } } */
+/* { dg-do compile { target { ! { *-*-darwin* hppa*64*-*-* mips*-*-linux-* } } 
} } */
 /* { dg-options "-fno-pic" } */
 
 #ifdef __PIC__
diff --git a/gcc/testsuite/gcc.dg/pic-4.c b/gcc/testsuite/gcc.dg/pic-4.c
index 8e14714..2afdd99 100644
--- a/gcc/testsuite/gcc.dg/pic-4.c
+++ b/gcc/testsuite/gcc.dg/pic-4.c
@@ -1,4 +1,4 @@
-/* { dg-do compile { target { ! { *-*-darwin* hppa*64*-*-* } } } } */
+/* { dg-do compile { target { ! { *-*-darwin* hppa*64*-*-* mips*-*-linux-* } } 
} } */
 /* { dg-options "-fno-PIC" } */
 
 #ifdef __PIC__
diff --git a/gcc/testsuite/gcc.dg/pie-3.c b/gcc/testsuite/gcc.dg/pie-3.c
index a7201c0..5577437 100644
--- a/gcc/testsuite/gcc.dg/pie-3.c
+++ b/gcc/testsuite/gcc.dg/pie-3.c
@@ -1,4 +1,4 @@
-/* { dg-do compile { target { ! { *-*-darwin* hppa*64*-*-* } } } } */
+/* { dg-do compile { target { ! { *-*-darwin* hppa*64*-*-* mips*-*-linux-* } } 
} } */
 /* { dg-options "-fno-pie" } */
 
 #ifdef __PIC__
diff --git a/gcc/testsuite/gcc.dg/pie-4.c b/gcc/testsuite/gcc.dg/pie-4.c
index b24eb8c..4134676 100644
--- a/gcc/testsuite/gcc.dg/pie-4.c
+++ b/gcc/testsuite/gcc.dg/pie-4.c
@@ -1,4 +1,4 @@
-/* { dg-do compile { target { ! { *-*-darwin* hppa*64*-*-* } } } } */
+/* { dg-do compile { target { ! { *-*-darwin* hppa*64*-*-* mips*-*-linux-* } } 
} } */
 /* { dg-options "-fno-PIE" } */
 
 #ifdef __PIC__



Re: [PATCH, GCC/testsuite/ARM, stage4] Compile atomic_loaddi_11 for Cortex-R5

2017-03-23 Thread Thomas Preudhomme

My apologize, this works for both -march of -mcpu not cortex-r4 in RUNTESTFLAGS.

ChangeLog entry is unchanged:

*** gcc/testsuite/ChangeLog ***

2017-03-22  Thomas Preud'homme  

Re: [PATCH 4/5] tree-inline: implement SIMT privatization, part 3

2017-03-23 Thread Jakub Jelinek
On Thu, Mar 23, 2017 at 08:00:11PM +0300, Alexander Monakov wrote:
> On Thu, 23 Mar 2017, Jakub Jelinek wrote:
> > And then clear it.  That doesn't look like the right thing.
> > 
> > So either you need some bool variable whether you've actually allocated
> > the vector in the current expand_call_inline and use that instead of
> > if (id->dst_simt_vars), or maybe you should clear id->dst_simt_vars
> > otherwise and save/restore it around unconditionally.
> 
> Yes, thanks for catching this.  I went for the latter approach in the 
> following
> patch.

Ok for trunk, thanks.

For the nvptx bits, I think you need to ask Bernd to review it.

Jakub


Re: [PATCH 4/5] tree-inline: implement SIMT privatization, part 3

2017-03-23 Thread Alexander Monakov
On Thu, 23 Mar 2017, Jakub Jelinek wrote:
> And then clear it.  That doesn't look like the right thing.
> 
> So either you need some bool variable whether you've actually allocated
> the vector in the current expand_call_inline and use that instead of
> if (id->dst_simt_vars), or maybe you should clear id->dst_simt_vars
> otherwise and save/restore it around unconditionally.

Yes, thanks for catching this.  I went for the latter approach in the following
patch.

---
 gcc/tree-inline.c | 61 ---
 gcc/tree-inline.h |  4 
 2 files changed, 58 insertions(+), 7 deletions(-)

diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index 6b6d489..b3bb3d6 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -4385,6 +4385,11 @@ expand_call_inline (basic_block bb, gimple *stmt, 
copy_body_data *id)
   gcall *call_stmt;
   unsigned int i;
   unsigned int prop_mask, src_properties;
+  struct function *dst_cfun;
+  tree simduid;
+  use_operand_p use;
+  gimple *simtenter_stmt = NULL;
+  vec *simtvars_save;
 
   /* The gimplifier uses input_location in too many places, such as
  internal_get_tmp_var ().  */
@@ -4588,15 +4593,26 @@ expand_call_inline (basic_block bb, gimple *stmt, 
copy_body_data *id)
   id->src_cfun = DECL_STRUCT_FUNCTION (fn);
   id->call_stmt = call_stmt;
 
+  /* When inlining into an OpenMP SIMD-on-SIMT loop, arrange for new automatic
+ variables to be added to IFN_GOMP_SIMT_ENTER argument list.  */
+  dst_cfun = DECL_STRUCT_FUNCTION (id->dst_fn);
+  simtvars_save = id->dst_simt_vars;
+  if (!(dst_cfun->curr_properties & PROP_gimple_lomp_dev)
+  && (simduid = bb->loop_father->simduid) != NULL_TREE
+  && (simduid = ssa_default_def (dst_cfun, simduid)) != NULL_TREE
+  && single_imm_use (simduid, , _stmt)
+  && is_gimple_call (simtenter_stmt)
+  && gimple_call_internal_p (simtenter_stmt, IFN_GOMP_SIMT_ENTER))
+vec_alloc (id->dst_simt_vars, 0);
+  else
+id->dst_simt_vars = NULL;
+
   /* If the src function contains an IFN_VA_ARG, then so will the dst
  function after inlining.  Likewise for IFN_GOMP_USE_SIMT.  */
   prop_mask = PROP_gimple_lva | PROP_gimple_lomp_dev;
   src_properties = id->src_cfun->curr_properties & prop_mask;
   if (src_properties != prop_mask)
-{
-  struct function *dst_cfun = DECL_STRUCT_FUNCTION (id->dst_fn);
-  dst_cfun->curr_properties &= src_properties | ~prop_mask;
-}
+dst_cfun->curr_properties &= src_properties | ~prop_mask;
 
   gcc_assert (!id->src_cfun->after_inlining);
 
@@ -4730,6 +4746,27 @@ expand_call_inline (basic_block bb, gimple *stmt, 
copy_body_data *id)
   if (cfun->gimple_df)
 pt_solution_reset (>gimple_df->escaped);
 
+  /* Add new automatic variables to IFN_GOMP_SIMT_ENTER arguments.  */
+  if (id->dst_simt_vars && id->dst_simt_vars->length () > 0)
+{
+  size_t nargs = gimple_call_num_args (simtenter_stmt);
+  vec *vars = id->dst_simt_vars;
+  auto_vec newargs (nargs + vars->length ());
+  for (size_t i = 0; i < nargs; i++)
+   newargs.quick_push (gimple_call_arg (simtenter_stmt, i));
+  for (tree *pvar = vars->begin (); pvar != vars->end (); pvar++)
+   {
+ tree ptrtype = build_pointer_type (TREE_TYPE (*pvar));
+ newargs.quick_push (build1 (ADDR_EXPR, ptrtype, *pvar));
+   }
+  gcall *g = gimple_build_call_internal_vec (IFN_GOMP_SIMT_ENTER, newargs);
+  gimple_call_set_lhs (g, gimple_call_lhs (simtenter_stmt));
+  gimple_stmt_iterator gsi = gsi_for_stmt (simtenter_stmt);
+  gsi_replace (, g, false);
+}
+  vec_free (id->dst_simt_vars);
+  id->dst_simt_vars = simtvars_save;
+
   /* Clean up.  */
   if (id->debug_map)
 {
@@ -5453,9 +5490,19 @@ copy_decl_for_dup_finish (copy_body_data *id, tree decl, 
tree copy)
function.  */
 ;
   else
-/* Ordinary automatic local variables are now in the scope of the
-   new function.  */
-DECL_CONTEXT (copy) = id->dst_fn;
+{
+  /* Ordinary automatic local variables are now in the scope of the
+new function.  */
+  DECL_CONTEXT (copy) = id->dst_fn;
+  if (VAR_P (copy) && id->dst_simt_vars && !is_gimple_reg (copy))
+   {
+ if (!lookup_attribute ("omp simt private", DECL_ATTRIBUTES (copy)))
+   DECL_ATTRIBUTES (copy)
+ = tree_cons (get_identifier ("omp simt private"), NULL,
+  DECL_ATTRIBUTES (copy));
+ id->dst_simt_vars->safe_push (copy);
+   }
+}
 
   return copy;
 }
diff --git a/gcc/tree-inline.h b/gcc/tree-inline.h
index 88b3286..ffb8333 100644
--- a/gcc/tree-inline.h
+++ b/gcc/tree-inline.h
@@ -145,6 +145,10 @@ struct copy_body_data
  equivalents in the function into which it is being inlined.  */
   hash_map *dependence_map;
 
+  /* A list of addressable local variables remapped into the caller
+ when inlining a call within an OpenMP SIMD-on-SIMT loop.  */
+  vec 

Re: [PATCH, GCC/testsuite/ARM, stage4] Compile atomic_loaddi_11 for Cortex-R5

2017-03-23 Thread Thomas Preudhomme

Sorry, I forgot about -march. Hold on.

On 23/03/17 16:51, Thomas Preudhomme wrote:

Please find attached an updated patch. ChangeLog entry unchanged:

*** gcc/testsuite/ChangeLog ***

2017-03-22  Thomas Preud'homme  

Re: [PATCH, GCC/testsuite/ARM, stage4] Compile atomic_loaddi_11 for Cortex-R5

2017-03-23 Thread Thomas Preudhomme

Please find attached an updated patch. ChangeLog entry unchanged:

*** gcc/testsuite/ChangeLog ***

2017-03-22  Thomas Preud'homme  

Re: [testsuite] Add missing dg-require-effective-target alloca to gcc testsuite

2017-03-23 Thread Tom de Vries

On 23/03/17 17:24, Thomas Schwinge wrote:

Hi Tom!

On Thu, 23 Mar 2017 16:46:19 +0100, Tom de Vries  wrote:

I've run the gcc testsuite for target nvptx-none and ran into "test for
excess errors" FAILs due to:
...
sorry, unimplemented: target cannot support alloca.
...

This patch marks those testcases as requiring alloca.


Thanks!


OK for trunk for stage1?


Wouldn't this be good "as obvious" for trunk, right now?  (Or are there
any potential worries?)



I suppose you're right we could classify this as obvious.

I'm not sure about "right now" though, given that we're in stage4 and 
this is neither a regression fix nor a documentation fix.


Thanks,
- Tom


Re: [patch v2] Get rid of stack trampolines for nested functions (1/4)

2017-03-23 Thread Andreas Schwab
On Sep 04 2016, Eric Botcazou  wrote:

> Index: calls.c
> ===
> --- calls.c   (revision 239944)
> +++ calls.c   (working copy)
> @@ -183,17 +183,76 @@ static void restore_fixed_argument_area (rtx, rtx,
>  
>  rtx
>  prepare_call_address (tree fndecl_or_type, rtx funexp, rtx 
> static_chain_value,
> -   rtx *call_fusage, int reg_parm_seen, int sibcallp)
> +   rtx *call_fusage, int reg_parm_seen, int flags)
>  {
>/* Make a valid memory address and copy constants through pseudo-regs,
>   but not for a constant address if -fno-function-cse.  */
>if (GET_CODE (funexp) != SYMBOL_REF)
> -/* If we are using registers for parameters, force the
> -   function address into a register now.  */
> -funexp = ((reg_parm_seen
> -&& targetm.small_register_classes_for_mode_p (FUNCTION_MODE))
> -   ? force_not_mem (memory_address (FUNCTION_MODE, funexp))
> -   : memory_address (FUNCTION_MODE, funexp));
> +{
> +  /* If it's an indirect call by descriptor, generate code to perform
> +  runtime identification of the pointer and load the descriptor.  */
> +  if ((flags & ECF_BY_DESCRIPTOR) && !flag_trampolines)
> + {
> +   const int bit_val = targetm.calls.custom_function_descriptors;
> +   rtx call_lab = gen_label_rtx ();
> +
> +   gcc_assert (fndecl_or_type && TYPE_P (fndecl_or_type));
> +   fndecl_or_type
> + = build_decl (UNKNOWN_LOCATION, FUNCTION_DECL, NULL_TREE,
> +   fndecl_or_type);
> +   DECL_STATIC_CHAIN (fndecl_or_type) = 1;
> +   rtx chain = targetm.calls.static_chain (fndecl_or_type, false);
> +
> +   /* Avoid long live ranges around function calls.  */
> +   funexp = copy_to_mode_reg (Pmode, funexp);

That needs to use ptr_mode, not Pmode.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [testsuite] Add missing dg-require-effective-target alloca to gcc testsuite

2017-03-23 Thread Thomas Schwinge
Hi Tom!

On Thu, 23 Mar 2017 16:46:19 +0100, Tom de Vries  wrote:
> I've run the gcc testsuite for target nvptx-none and ran into "test for 
> excess errors" FAILs due to:
> ...
> sorry, unimplemented: target cannot support alloca.
> ...
> 
> This patch marks those testcases as requiring alloca.

Thanks!

> OK for trunk for stage1?

Wouldn't this be good "as obvious" for trunk, right now?  (Or are there
any potential worries?)

> 2017-03-23  Tom de Vries  

"PR testsuite/80092", I suppose?

>   * gcc.dg/Walloca-7.c: Add dg-require-effective-target alloca.
>   * gcc.dg/Walloca-12.c: Same.
>   * gcc.dg/attr-alloc_size-8.c: Same.
>   * gcc.dg/Walloca-4.c: Same.
>   * gcc.dg/Walloca-8.c: Same.
>   * gcc.dg/Walloca-13.c: Same.
>   * gcc.dg/attr-alloc_size-9.c: Same.
>   * gcc.dg/Walloca-1.c: Same.
>   * gcc.dg/tree-ssa/builtin-sprintf-warn-3.c: Same.
>   * gcc.dg/Walloca-5.c: Same.
>   * gcc.dg/Walloca-10.c: Same.
>   * gcc.dg/Walloca-9.c: Same.
>   * gcc.dg/attr-alloc_size-6.c: Same.
>   * gcc.dg/Wvla-larger-than-1.c: Same.
>   * gcc.dg/torture/pr71881.c: Same.
>   * gcc.dg/torture/pr71901.c: Same.
>   * gcc.dg/torture/pr78742.c: Same.
>   * gcc.dg/builtin-alloc-size.c: Same.
>   * gcc.dg/Walloca-2.c: Same.
>   * gcc.dg/Walloca-6.c: Same.
>   * gcc.dg/Walloca-11.c: Same.
>   * gcc.dg/attr-alloc_size-7.c: Same.
>   * gcc.dg/Wvla-larger-than-2.c: Same.
>   * gcc.dg/Walloca-3.c: Same.
>   * c-c++-common/Wimplicit-fallthrough-7.c: Same.
>   * gcc.c-torture/compile/pr79413.c: Same.
>   * gcc.c-torture/compile/pr78439.c: Same.


Grüße
 Thomas


Re: [PATCH 4/5] tree-inline: implement SIMT privatization, part 3

2017-03-23 Thread Jakub Jelinek
On Thu, Mar 23, 2017 at 07:15:52PM +0300, Alexander Monakov wrote:
>   * tree-inline.h (struct copy_body_data): New field dst_simt_vars.
> * tree-inline.c (expand_call_inline): Handle SIMT privatization.
> (copy_decl_for_dup_finish): Ditto.
> ---
>  gcc/tree-inline.c | 65 
> +--
>  gcc/tree-inline.h |  4 
>  2 files changed, 62 insertions(+), 7 deletions(-)
> 
> @@ -4588,15 +4593,26 @@ expand_call_inline (basic_block bb, gimple *stmt, 
> copy_body_data *id)
>id->src_cfun = DECL_STRUCT_FUNCTION (fn);
>id->call_stmt = call_stmt;
>  
> +  /* When inlining into an OpenMP SIMD-on-SIMT loop, arrange for new 
> automatic
> + variables to be added to IFN_GOMP_SIMT_ENTER argument list.  */
> +  dst_cfun = DECL_STRUCT_FUNCTION (id->dst_fn);
> +  if (!(dst_cfun->curr_properties & PROP_gimple_lomp_dev)
> +  && (simduid = bb->loop_father->simduid) != NULL_TREE
> +  && (simduid = ssa_default_def (dst_cfun, simduid)) != NULL_TREE
> +  && single_imm_use (simduid, , _stmt)
> +  && is_gimple_call (simtenter_stmt)
> +  && gimple_call_internal_p (simtenter_stmt, IFN_GOMP_SIMT_ENTER))
> +{
> +  simtvars_st = id->dst_simt_vars;
> +  vec_alloc (id->dst_simt_vars, 0);
> +}

One more thing.  If the above if condition is false, you keep
id->dst_simt_vars what it was (which means simtvars_st is NULL).
If it was non-NULL already, then:

> @@ -4730,6 +4746,31 @@ expand_call_inline (basic_block bb, gimple *stmt, 
> copy_body_data *id)
>if (cfun->gimple_df)
>  pt_solution_reset (>gimple_df->escaped);
>  
> +  /* Add new automatic variables to IFN_GOMP_SIMT_ENTER arguments.  */
> +  if (id->dst_simt_vars)
> +{

This will be true.

> +  if (id->dst_simt_vars->length () > 0)
> + {
> +   size_t nargs = gimple_call_num_args (simtenter_stmt);
> +   vec *vars = id->dst_simt_vars;
> +   auto_vec newargs (nargs + vars->length ());
> +   for (size_t i = 0; i < nargs; i++)
> + newargs.quick_push (gimple_call_arg (simtenter_stmt, i));
> +   for (tree *pvar = vars->begin (); pvar != vars->end (); pvar++)
> + {
> +   tree ptrtype = build_pointer_type (TREE_TYPE (*pvar));
> +   newargs.quick_push (build1 (ADDR_EXPR, ptrtype, *pvar));
> + }
> +   gcall *g
> + = gimple_build_call_internal_vec (IFN_GOMP_SIMT_ENTER, newargs);
> +   gimple_call_set_lhs (g, gimple_call_lhs (simtenter_stmt));
> +   gimple_stmt_iterator gsi = gsi_for_stmt (simtenter_stmt);
> +   gsi_replace (, g, false);
> + }

And you handle dst_simt_vars from some other invocation.

> +  vec_free (id->dst_simt_vars);
> +  id->dst_simt_vars = simtvars_st;

And then clear it.  That doesn't look like the right thing.

So either you need some bool variable whether you've actually allocated
the vector in the current expand_call_inline and use that instead of
if (id->dst_simt_vars), or maybe you should clear id->dst_simt_vars
otherwise and save/restore it around unconditionally.

Jakub


Re: [PATCH, GCC/testsuite/ARM, stage4] Compile atomic_loaddi_11 for Cortex-R5

2017-03-23 Thread Thomas Preudhomme

Mmmh I probably need to add a dg-skip-if in there. Will respin the patch.

Best regards,

Thomas

On 23/03/17 16:10, Richard Earnshaw (lists) wrote:

On 23/03/17 16:02, Thomas Preudhomme wrote:

Hi,

gcc.target/arm/atomic_loaddi_11.c testcase contributed in r246365 does
not test the changed code since ARMv7-R does not have division
instructions in ARM state. This patch changes it to target Cortex-R5
processor instead which does have division instructions in ARM state.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-03-22  Thomas Preud'homme  

Re: [PATCH 4/5] tree-inline: implement SIMT privatization, part 3

2017-03-23 Thread Alexander Monakov
On Thu, 23 Mar 2017, Jakub Jelinek wrote:
> > Sorry for missing the IR stability issue.  This code relies on dst_simt_vars
> > being a set and thus having no duplicate entries (so the implicit lookup 
> > when
> > adding an element is needed).
> > 
> > However, I think I was overly cautious: looking again, I think we can't 
> > enter
> > copy_decl_for_dup_finish twice with the same 'copy' VAR_DECL.  Changing it 
> > to a
> > vec should be fine then?
> 
> Yeah, callers of copy_decl* should look the orig var in the decl map first,
> plus if you look at all copy_decl_for_dup_finish callers, all of them first
> create a new tree (usually copy_node) and then pass it as copy to
> copy_decl_for_dup_finish, so you'll never get the same copy in there.
> 
> So, please change it into a vector.

Thanks — here's the updated patch.  I've also noticed there's no need to rebuild
the existing SIMT_ENTER statement if we didn't add any new privatized variables.

* tree-inline.h (struct copy_body_data): New field dst_simt_vars.
* tree-inline.c (expand_call_inline): Handle SIMT privatization.
(copy_decl_for_dup_finish): Ditto.
---
 gcc/tree-inline.c | 65 +--
 gcc/tree-inline.h |  4 
 2 files changed, 62 insertions(+), 7 deletions(-)

diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index 6b6d489..a84e569 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -4385,6 +4385,11 @@ expand_call_inline (basic_block bb, gimple *stmt, 
copy_body_data *id)
   gcall *call_stmt;
   unsigned int i;
   unsigned int prop_mask, src_properties;
+  struct function *dst_cfun;
+  tree simduid;
+  use_operand_p use;
+  gimple *simtenter_stmt = NULL;
+  vec *simtvars_st = NULL;
 
   /* The gimplifier uses input_location in too many places, such as
  internal_get_tmp_var ().  */
@@ -4588,15 +4593,26 @@ expand_call_inline (basic_block bb, gimple *stmt, 
copy_body_data *id)
   id->src_cfun = DECL_STRUCT_FUNCTION (fn);
   id->call_stmt = call_stmt;
 
+  /* When inlining into an OpenMP SIMD-on-SIMT loop, arrange for new automatic
+ variables to be added to IFN_GOMP_SIMT_ENTER argument list.  */
+  dst_cfun = DECL_STRUCT_FUNCTION (id->dst_fn);
+  if (!(dst_cfun->curr_properties & PROP_gimple_lomp_dev)
+  && (simduid = bb->loop_father->simduid) != NULL_TREE
+  && (simduid = ssa_default_def (dst_cfun, simduid)) != NULL_TREE
+  && single_imm_use (simduid, , _stmt)
+  && is_gimple_call (simtenter_stmt)
+  && gimple_call_internal_p (simtenter_stmt, IFN_GOMP_SIMT_ENTER))
+{
+  simtvars_st = id->dst_simt_vars;
+  vec_alloc (id->dst_simt_vars, 0);
+}
+
   /* If the src function contains an IFN_VA_ARG, then so will the dst
  function after inlining.  Likewise for IFN_GOMP_USE_SIMT.  */
   prop_mask = PROP_gimple_lva | PROP_gimple_lomp_dev;
   src_properties = id->src_cfun->curr_properties & prop_mask;
   if (src_properties != prop_mask)
-{
-  struct function *dst_cfun = DECL_STRUCT_FUNCTION (id->dst_fn);
-  dst_cfun->curr_properties &= src_properties | ~prop_mask;
-}
+dst_cfun->curr_properties &= src_properties | ~prop_mask;
 
   gcc_assert (!id->src_cfun->after_inlining);
 
@@ -4730,6 +4746,31 @@ expand_call_inline (basic_block bb, gimple *stmt, 
copy_body_data *id)
   if (cfun->gimple_df)
 pt_solution_reset (>gimple_df->escaped);
 
+  /* Add new automatic variables to IFN_GOMP_SIMT_ENTER arguments.  */
+  if (id->dst_simt_vars)
+{
+  if (id->dst_simt_vars->length () > 0)
+   {
+ size_t nargs = gimple_call_num_args (simtenter_stmt);
+ vec *vars = id->dst_simt_vars;
+ auto_vec newargs (nargs + vars->length ());
+ for (size_t i = 0; i < nargs; i++)
+   newargs.quick_push (gimple_call_arg (simtenter_stmt, i));
+ for (tree *pvar = vars->begin (); pvar != vars->end (); pvar++)
+   {
+ tree ptrtype = build_pointer_type (TREE_TYPE (*pvar));
+ newargs.quick_push (build1 (ADDR_EXPR, ptrtype, *pvar));
+   }
+ gcall *g
+   = gimple_build_call_internal_vec (IFN_GOMP_SIMT_ENTER, newargs);
+ gimple_call_set_lhs (g, gimple_call_lhs (simtenter_stmt));
+ gimple_stmt_iterator gsi = gsi_for_stmt (simtenter_stmt);
+ gsi_replace (, g, false);
+   }
+  vec_free (id->dst_simt_vars);
+  id->dst_simt_vars = simtvars_st;
+}
+
   /* Clean up.  */
   if (id->debug_map)
 {
@@ -5453,9 +5494,19 @@ copy_decl_for_dup_finish (copy_body_data *id, tree decl, 
tree copy)
function.  */
 ;
   else
-/* Ordinary automatic local variables are now in the scope of the
-   new function.  */
-DECL_CONTEXT (copy) = id->dst_fn;
+{
+  /* Ordinary automatic local variables are now in the scope of the
+new function.  */
+  DECL_CONTEXT (copy) = id->dst_fn;
+  if (VAR_P (copy) && id->dst_simt_vars && !is_gimple_reg (copy))
+   {
+ if 

Re: [PATCH, GCC/testsuite/ARM, stage4] Compile atomic_loaddi_11 for Cortex-R5

2017-03-23 Thread Richard Earnshaw (lists)
On 23/03/17 16:02, Thomas Preudhomme wrote:
> Hi,
> 
> gcc.target/arm/atomic_loaddi_11.c testcase contributed in r246365 does
> not test the changed code since ARMv7-R does not have division
> instructions in ARM state. This patch changes it to target Cortex-R5
> processor instead which does have division instructions in ARM state.
> 
> ChangeLog entry is as follows:
> 
> *** gcc/testsuite/ChangeLog ***
> 
> 2017-03-22  Thomas Preud'homme   
> PR target/80082
> * gcc.target/arm/atomic_loaddi_11.c: Target Cortex-R5 instead of
> ARMv7-R.
> 
> Is this ok for stage4?
> 
> Best regards,
> 
> Thomas
> 
> atomic_loaddi_11_cortexr5.patch
> 
> 
> diff --git a/gcc/testsuite/gcc.target/arm/atomic_loaddi_11.c 
> b/gcc/testsuite/gcc.target/arm/atomic_loaddi_11.c
> index 
> 275669bd76356dc7c7b6a5373792d9a5089ede51..4ada2efd5f047154f2ca2fb39e9432c96ee1d42b
>  100644
> --- a/gcc/testsuite/gcc.target/arm/atomic_loaddi_11.c
> +++ b/gcc/testsuite/gcc.target/arm/atomic_loaddi_11.c
> @@ -1,7 +1,6 @@
>  /* { dg-do compile } */
>  /* { dg-require-effective-target arm_arch_v7r_ok } */
> -/* { dg-options "-O2" } */
> -/* { dg-add-options arm_arch_v7r } */
> +/* { dg-options "-O2 -mcpu=cortex-r5" } */
>  
>  #include 
>  
> 

Will that work properly if doing multilib testing with a specific CPU
target?

R.


Re: [PATCH, GCC/LTO, stage4, ping3] Fix PR69866: LTO with def for weak alias in regular object file

2017-03-23 Thread Thomas Preudhomme

Ping?

Best regards,

Thomas

On 16/03/17 14:05, Thomas Preudhomme wrote:

Ping?

Is this ok for stage4?

Best regards,

Thomas

On 09/03/17 09:43, Thomas Preudhomme wrote:

Ping?

Best regards,

Thomas

On 02/03/17 14:51, Thomas Preudhomme wrote:

Hi,

This patch fixes an assert failure when linking one LTOed object file
having a weak alias with a regular object file containing a strong
definition for that same symbol. The patch is twofold:

+ do not add an alias to a partition if it is external
+ do not declare (.globl) an alias if it is external

ChangeLog entries are as follow:

*** gcc/lto/ChangeLog ***

2017-03-01  Thomas Preud'homme  

PR lto/69866
* lto/lto-partition.c (add_symbol_to_partition_1): Do not add external
aliases to partition.

*** gcc/ChangeLog ***

2017-03-01  Thomas Preud'homme  

PR lto/69866
* cgraphunit.c (cgraph_node::assemble_thunks_and_aliases): Do not
declare external aliases.

*** gcc/testsuite/ChangeLog ***

2017-02-28  Thomas Preud'homme  

PR lto/69866
* gcc.dg/lto/pr69866_0.c: New test.
* gcc.dg/lto/pr69866_1.c: Likewise.


Testing: Testsuite shows no regression when targeting Cortex-M3 with an
arm-none-eabi GCC cross-compiler, neither does it show any regression with
native LTO-bootstrapped x86-64_linux-gnu and aarch64-linux-gnu compilers.

Is this ok for stage4?

Best regards,

Thomas
diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index c82a88a599ca61b068dd9783d2a6158163809b37..580500ff922b8546d33119261a2455235edbf16d 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -1972,7 +1972,7 @@ cgraph_node::assemble_thunks_and_aliases (void)
   FOR_EACH_ALIAS (this, ref)
 {
   cgraph_node *alias = dyn_cast  (ref->referring);
-  if (!alias->transparent_alias)
+  if (!alias->transparent_alias && !DECL_EXTERNAL (alias->decl))
 	{
 	  bool saved_written = TREE_ASM_WRITTEN (decl);
 
diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
index e27d0d1690c1fcfb39e2fac03ce0f4154031fc7c..f44fd435ed075a27e373bdfdf0464eb06e1731ef 100644
--- a/gcc/lto/lto-partition.c
+++ b/gcc/lto/lto-partition.c
@@ -178,7 +178,8 @@ add_symbol_to_partition_1 (ltrans_partition part, symtab_node *node)
   /* Add all aliases associated with the symbol.  */
 
   FOR_EACH_ALIAS (node, ref)
-if (!ref->referring->transparent_alias)
+if (!ref->referring->transparent_alias
+	&& ref->referring->get_partitioning_class () != SYMBOL_EXTERNAL)
   add_symbol_to_partition_1 (part, ref->referring);
 else
   {
@@ -189,7 +190,8 @@ add_symbol_to_partition_1 (ltrans_partition part, symtab_node *node)
 	  {
 	/* Nested transparent aliases are not permitted.  */
 	gcc_checking_assert (!ref2->referring->transparent_alias);
-	add_symbol_to_partition_1 (part, ref2->referring);
+	if (ref2->referring->get_partitioning_class () != SYMBOL_EXTERNAL)
+	  add_symbol_to_partition_1 (part, ref2->referring);
 	  }
   }
 
diff --git a/gcc/testsuite/gcc.dg/lto/pr69866_0.c b/gcc/testsuite/gcc.dg/lto/pr69866_0.c
new file mode 100644
index ..f49ef8d4c1da7a21d1bfb5409d647bd18141595b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr69866_0.c
@@ -0,0 +1,13 @@
+/* { dg-lto-do link } */
+
+int _umh(int i)
+{
+  return i+1;
+}
+
+int weaks(int i) __attribute__((weak, alias("_umh")));
+
+int main()
+{
+  return weaks(10);
+}
diff --git a/gcc/testsuite/gcc.dg/lto/pr69866_1.c b/gcc/testsuite/gcc.dg/lto/pr69866_1.c
new file mode 100644
index ..3a14f850eefaffbf659ce4642adef7900330f4ed
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr69866_1.c
@@ -0,0 +1,6 @@
+/* { dg-options { -fno-lto } } */
+
+int weaks(int i)
+{
+  return i+1;
+}


[PATCH, GCC/testsuite/ARM, stage4] Compile atomic_loaddi_11 for Cortex-R5

2017-03-23 Thread Thomas Preudhomme

Hi,

gcc.target/arm/atomic_loaddi_11.c testcase contributed in r246365 does
not test the changed code since ARMv7-R does not have division
instructions in ARM state. This patch changes it to target Cortex-R5
processor instead which does have division instructions in ARM state.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-03-22  Thomas Preud'homme  

[testsuite] Add missing dg-require-effective-target alloca to gcc testsuite

2017-03-23 Thread Tom de Vries

Hi,

I've run the gcc testsuite for target nvptx-none and ran into "test for 
excess errors" FAILs due to:

...
sorry, unimplemented: target cannot support alloca.
...

This patch marks those testcases as requiring alloca.

OK for trunk for stage1?

Thanks,
- Tom
2017-03-23  Tom de Vries  

	* gcc.dg/Walloca-7.c: Add dg-require-effective-target alloca.
	* gcc.dg/Walloca-12.c: Same.
	* gcc.dg/attr-alloc_size-8.c: Same.
	* gcc.dg/Walloca-4.c: Same.
	* gcc.dg/Walloca-8.c: Same.
	* gcc.dg/Walloca-13.c: Same.
	* gcc.dg/attr-alloc_size-9.c: Same.
	* gcc.dg/Walloca-1.c: Same.
	* gcc.dg/tree-ssa/builtin-sprintf-warn-3.c: Same.
	* gcc.dg/Walloca-5.c: Same.
	* gcc.dg/Walloca-10.c: Same.
	* gcc.dg/Walloca-9.c: Same.
	* gcc.dg/attr-alloc_size-6.c: Same.
	* gcc.dg/Wvla-larger-than-1.c: Same.
	* gcc.dg/torture/pr71881.c: Same.
	* gcc.dg/torture/pr71901.c: Same.
	* gcc.dg/torture/pr78742.c: Same.
	* gcc.dg/builtin-alloc-size.c: Same.
	* gcc.dg/Walloca-2.c: Same.
	* gcc.dg/Walloca-6.c: Same.
	* gcc.dg/Walloca-11.c: Same.
	* gcc.dg/attr-alloc_size-7.c: Same.
	* gcc.dg/Wvla-larger-than-2.c: Same.
	* gcc.dg/Walloca-3.c: Same.
	* c-c++-common/Wimplicit-fallthrough-7.c: Same.
	* gcc.c-torture/compile/pr79413.c: Same.
	* gcc.c-torture/compile/pr78439.c: Same.

Index: gcc/testsuite/gcc.dg/Walloca-7.c
===
--- gcc/testsuite/gcc.dg/Walloca-7.c (revision 246278)
+++ gcc/testsuite/gcc.dg/Walloca-7.c (working copy)
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target alloca } */
 /* { dg-options "-Walloca -O0" } */
 
 extern void f(void *);
Index: gcc/testsuite/gcc.dg/Walloca-12.c
===
--- gcc/testsuite/gcc.dg/Walloca-12.c (revision 246278)
+++ gcc/testsuite/gcc.dg/Walloca-12.c (working copy)
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target alloca } */
 /* { dg-options "-Walloca-larger-than=128 -O2" } */
 
 void f (void*);
Index: gcc/testsuite/gcc.dg/attr-alloc_size-8.c
===
--- gcc/testsuite/gcc.dg/attr-alloc_size-8.c (revision 246278)
+++ gcc/testsuite/gcc.dg/attr-alloc_size-8.c (working copy)
@@ -3,6 +3,7 @@
-Wvla-larger-than, and -Walloc-size-larger-than options.  The former
two more specific options override the more general latter option.  */
 /* { dg-do compile } */
+/* { dg-require-effective-target alloca } */
 /* { dg-options "-O2 -Walloc-size-larger-than=123 -Walloca-larger-than=234 -Wvla-larger-than=345" } */
 
 #define SIZE_MAX   __SIZE_MAX__
Index: gcc/testsuite/gcc.dg/Walloca-4.c
===
--- gcc/testsuite/gcc.dg/Walloca-4.c (revision 246278)
+++ gcc/testsuite/gcc.dg/Walloca-4.c (working copy)
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target alloca } */
 /* { dg-options "-Walloca-larger-than=5000 -O2" } */
 
  char *
Index: gcc/testsuite/gcc.dg/Walloca-8.c
===
--- gcc/testsuite/gcc.dg/Walloca-8.c (revision 246278)
+++ gcc/testsuite/gcc.dg/Walloca-8.c (working copy)
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target alloca } */
 /* { dg-options "-Walloca-larger-than=2000 -O2" } */
 
 void *p;
Index: gcc/testsuite/gcc.dg/Walloca-13.c
===
--- gcc/testsuite/gcc.dg/Walloca-13.c (revision 246278)
+++ gcc/testsuite/gcc.dg/Walloca-13.c (working copy)
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target alloca } */
 /* { dg-options "-Walloca-larger-than=100 -O2" } */
 
 void f (void*);
Index: gcc/testsuite/gcc.dg/attr-alloc_size-9.c
===
--- gcc/testsuite/gcc.dg/attr-alloc_size-9.c (revision 246278)
+++ gcc/testsuite/gcc.dg/attr-alloc_size-9.c (working copy)
@@ -3,6 +3,7 @@
with attribute malloc.  This means that the pointer they return
can be assumed not to alias any other valid pointer.  */
 /* { dg-do compile } */
+/* { dg-require-effective-target alloca } */
 /* { dg-options "-O2 -fdump-tree-optimized" } */
 
 void sink (void*);
Index: gcc/testsuite/gcc.dg/Walloca-1.c
===
--- gcc/testsuite/gcc.dg/Walloca-1.c (revision 246278)
+++ gcc/testsuite/gcc.dg/Walloca-1.c (working copy)
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target alloca } */
 /* { dg-options "-Walloca-larger-than=2000 -O2" } */
 
 #define alloca __builtin_alloca
Index: gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-3.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-3.c (revision 246278)
+++ gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-3.c (working copy)
@@ -4,6 +4,7 @@
-O2 (-ftree-vrp) is necessary 

Re: [PATCH] fwprop: Prevent infinite looping (PR79405)

2017-03-23 Thread Segher Boessenkool
On Thu, Mar 23, 2017 at 04:16:56PM +0100, Richard Biener wrote:
> On Thu, Mar 23, 2017 at 3:58 PM, Segher Boessenkool
>  wrote:
> > The algorithm fwprop uses never reconsiders a possible propagation,
> > although it could succeed if the def (in the def-use to propagate)
> > has changed.  This causes fwprop to do infinite propagations, like
> > in the scenario in the PR, where we end up with effectively
> >   B = A
> >   A = B
> >   D = A
> > where only propagations into the last statement are still tried, and
> > that loops (it becomes D = B, then back to D = A, etc.)
> >
> > Fixing this properly isn't easy; this patch instead limits the number
> > of propagations performed to the number of uses we originally had,
> > which is the maximum number of propagations that can be done if there
> > are no such infinite loops.
> >
> > Bootstrapped and regression checked on powerpc64-linux {-m64,-m32};
> > is this okay for trunk?
> 
> https://gcc.gnu.org/ml/gcc-patches/2017-02/msg01485.html
> 
> ?

What about it?  That suggestion would make fwprop do *less* useful work,
while in principle the problem is that it *already* does not enough!
If fwprop actually tried all propagations (and never tried substituting
X with X, which it currently does), there would be no problem.

This patch is a workaround; it makes no difference on pretty much any
code, except this single testcase (which has undefined behaviour,
uninitialised variables).


Segher


Re: [PATCH] fwprop: Prevent infinite looping (PR79405)

2017-03-23 Thread Richard Biener
On Thu, Mar 23, 2017 at 3:58 PM, Segher Boessenkool
 wrote:
> The algorithm fwprop uses never reconsiders a possible propagation,
> although it could succeed if the def (in the def-use to propagate)
> has changed.  This causes fwprop to do infinite propagations, like
> in the scenario in the PR, where we end up with effectively
>   B = A
>   A = B
>   D = A
> where only propagations into the last statement are still tried, and
> that loops (it becomes D = B, then back to D = A, etc.)
>
> Fixing this properly isn't easy; this patch instead limits the number
> of propagations performed to the number of uses we originally had,
> which is the maximum number of propagations that can be done if there
> are no such infinite loops.
>
> Bootstrapped and regression checked on powerpc64-linux {-m64,-m32};
> is this okay for trunk?

https://gcc.gnu.org/ml/gcc-patches/2017-02/msg01485.html

?

>
> Segher
>
>
> 2017-03-23  Segher Boessenkool  
>
> PR rtl-optimization/79405
> * fwprop.c (propagations_left): New variable.
> (forward_propagate_into): Decrement it.
> (fwprop_init): Initialize it.
> (fw_prop): If the variable has reached zero, stop propagating.
> (fwprop_addr): Ditto.
>
> gcc/testsuite/
> PR rtl-optimization/79405
> gcc.dg/pr79405.c: New testcase.
>
> ---
>  gcc/fwprop.c   | 17 
>  gcc/testsuite/gcc.dg/pr79405.c | 45 
> ++
>  2 files changed, 62 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/pr79405.c
>
> diff --git a/gcc/fwprop.c b/gcc/fwprop.c
> index 285fb1a..0ab25ef 100644
> --- a/gcc/fwprop.c
> +++ b/gcc/fwprop.c
> @@ -120,6 +120,13 @@ static vec use_def_ref;
>  static vec reg_defs;
>  static vec reg_defs_stack;
>
> +/* The maximum number of propagations that are still allowed.  If we do
> +   more propagations than originally we had uses, we must have ended up
> +   in a propagation loop, as in PR79405.  Until the algorithm fwprop
> +   uses can obviously not get into such loops we need a workaround like
> +   this.  */
> +static int propagations_left;
> +
>  /* The MD bitmaps are trimmed to include only live registers to cut
> memory usage on testcases like insn-recog.c.  Track live registers
> in the basic block and do not perform forward propagation if the
> @@ -1407,6 +1414,8 @@ forward_propagate_into (df_ref use)
>if (forward_propagate_and_simplify (use, def_insn, def_set)
>|| forward_propagate_subreg (use, def_insn, def_set))
>  {
> +  propagations_left--;
> +
>if (cfun->can_throw_non_call_exceptions
>   && find_reg_note (use_insn, REG_EH_REGION, NULL_RTX)
>   && purge_dead_edges (DF_REF_BB (use)))
> @@ -1434,6 +1443,8 @@ fwprop_init (void)
>active_defs = XNEWVEC (df_ref, max_reg_num ());
>if (flag_checking)
>  active_defs_check = sparseset_alloc (max_reg_num ());
> +
> +  propagations_left = DF_USES_TABLE_SIZE ();
>  }
>
>  static void
> @@ -1480,6 +1491,9 @@ fwprop (void)
>
>for (i = 0; i < DF_USES_TABLE_SIZE (); i++)
>  {
> +  if (!propagations_left)
> +   break;
> +
>df_ref use = DF_USES_GET (i);
>if (use)
> if (DF_REF_TYPE (use) == DF_REF_REG_USE
> @@ -1540,6 +1554,9 @@ fwprop_addr (void)
>   end, and we'll go through them as well.  */
>for (i = 0; i < DF_USES_TABLE_SIZE (); i++)
>  {
> +  if (!propagations_left)
> +   break;
> +
>df_ref use = DF_USES_GET (i);
>if (use)
> if (DF_REF_TYPE (use) != DF_REF_REG_USE
> diff --git a/gcc/testsuite/gcc.dg/pr79405.c b/gcc/testsuite/gcc.dg/pr79405.c
> new file mode 100644
> index 000..c17baff
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr79405.c
> @@ -0,0 +1,45 @@
> +/* PR rtl-optimization/79405 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +char cz;
> +long long int xx, u2;
> +
> +void
> +qv (int js, int wl)
> +{
> +  if (js != 0)
> +{
> +  short int sc;
> +  int *at = (int *)
> +  long long int gx = 0;
> +
> +  for (;;)
> +   {
> + *at = 0;
> + js /= sc;
> +
> + for (wl = 0; wl < 2; ++wl)
> +   {
> + xx = gx;
> + u2 %= xx > 0;
> + cz /= u2;
> +
> + fa:
> + if (cz != u2)
> +   {
> + gx |= js;
> + cz = gx / js;
> +   }
> +   }
> +   }
> +
> + yq:
> +  wl /= 0x8000;
> +  u2 = wl;
> +  u2 |= (wl != 0) | (wl != 0 && gx != 0);
> +  js = u2;
> +  goto fa;
> +}
> +  goto yq;
> +}
> --
> 1.9.3
>


Re: [libstdc++,doc] Strip links to ANSI (web shop)

2017-03-23 Thread Jonathan Wakely

On 18/03/17 19:44 +0100, Gerald Pfeifer wrote:

On Wed, 15 Feb 2017, Jonathan Wakely wrote:

The C++14 standard is:
http://webstore.ansi.org/RecordDetail.aspx?sku=ISO%2fIEC+14882%3a2014


Thanks, Jonathan!


What do you think?

Should we make the FAQ link to the info in the manual, instead of just
removing it?


Great idea.  Unfortunately I do not know the Docbook magic to do
this for the libstdc++ documentation, so I went ahead with the
simple version below.

Can you perhaps help making this a "real" link?


Exactly how it's one on line 126 in that file. I'll commit this later.


commit 90a904a447d5153f1bca007b0b4a599e4d8a8731
Author: Jonathan Wakely 
Date:   Thu Mar 23 15:00:06 2017 +

Add link to the right section of the manual

	* doc/xml/faq.xml: Add link.
	* doc/html/*: Regenerate.

diff --git a/libstdc++-v3/doc/xml/faq.xml b/libstdc++-v3/doc/xml/faq.xml
index db67626..340ba9d 100644
--- a/libstdc++-v3/doc/xml/faq.xml
+++ b/libstdc++-v3/doc/xml/faq.xml
@@ -1177,7 +1177,8 @@
   
   
 
-Please refer to the Contributing section in our manual.
+Please refer to the Contributing
+section in our manual.
  
   
 


[PATCH] fwprop: Prevent infinite looping (PR79405)

2017-03-23 Thread Segher Boessenkool
The algorithm fwprop uses never reconsiders a possible propagation,
although it could succeed if the def (in the def-use to propagate)
has changed.  This causes fwprop to do infinite propagations, like
in the scenario in the PR, where we end up with effectively
  B = A
  A = B
  D = A
where only propagations into the last statement are still tried, and
that loops (it becomes D = B, then back to D = A, etc.)

Fixing this properly isn't easy; this patch instead limits the number
of propagations performed to the number of uses we originally had,
which is the maximum number of propagations that can be done if there
are no such infinite loops.

Bootstrapped and regression checked on powerpc64-linux {-m64,-m32};
is this okay for trunk?


Segher


2017-03-23  Segher Boessenkool  

PR rtl-optimization/79405
* fwprop.c (propagations_left): New variable.
(forward_propagate_into): Decrement it.
(fwprop_init): Initialize it.
(fw_prop): If the variable has reached zero, stop propagating.
(fwprop_addr): Ditto.

gcc/testsuite/
PR rtl-optimization/79405
gcc.dg/pr79405.c: New testcase.

---
 gcc/fwprop.c   | 17 
 gcc/testsuite/gcc.dg/pr79405.c | 45 ++
 2 files changed, 62 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pr79405.c

diff --git a/gcc/fwprop.c b/gcc/fwprop.c
index 285fb1a..0ab25ef 100644
--- a/gcc/fwprop.c
+++ b/gcc/fwprop.c
@@ -120,6 +120,13 @@ static vec use_def_ref;
 static vec reg_defs;
 static vec reg_defs_stack;
 
+/* The maximum number of propagations that are still allowed.  If we do
+   more propagations than originally we had uses, we must have ended up
+   in a propagation loop, as in PR79405.  Until the algorithm fwprop
+   uses can obviously not get into such loops we need a workaround like
+   this.  */
+static int propagations_left;
+
 /* The MD bitmaps are trimmed to include only live registers to cut
memory usage on testcases like insn-recog.c.  Track live registers
in the basic block and do not perform forward propagation if the
@@ -1407,6 +1414,8 @@ forward_propagate_into (df_ref use)
   if (forward_propagate_and_simplify (use, def_insn, def_set)
   || forward_propagate_subreg (use, def_insn, def_set))
 {
+  propagations_left--;
+
   if (cfun->can_throw_non_call_exceptions
  && find_reg_note (use_insn, REG_EH_REGION, NULL_RTX)
  && purge_dead_edges (DF_REF_BB (use)))
@@ -1434,6 +1443,8 @@ fwprop_init (void)
   active_defs = XNEWVEC (df_ref, max_reg_num ());
   if (flag_checking)
 active_defs_check = sparseset_alloc (max_reg_num ());
+
+  propagations_left = DF_USES_TABLE_SIZE ();
 }
 
 static void
@@ -1480,6 +1491,9 @@ fwprop (void)
 
   for (i = 0; i < DF_USES_TABLE_SIZE (); i++)
 {
+  if (!propagations_left)
+   break;
+
   df_ref use = DF_USES_GET (i);
   if (use)
if (DF_REF_TYPE (use) == DF_REF_REG_USE
@@ -1540,6 +1554,9 @@ fwprop_addr (void)
  end, and we'll go through them as well.  */
   for (i = 0; i < DF_USES_TABLE_SIZE (); i++)
 {
+  if (!propagations_left)
+   break;
+
   df_ref use = DF_USES_GET (i);
   if (use)
if (DF_REF_TYPE (use) != DF_REF_REG_USE
diff --git a/gcc/testsuite/gcc.dg/pr79405.c b/gcc/testsuite/gcc.dg/pr79405.c
new file mode 100644
index 000..c17baff
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr79405.c
@@ -0,0 +1,45 @@
+/* PR rtl-optimization/79405 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+char cz;
+long long int xx, u2;
+
+void
+qv (int js, int wl)
+{
+  if (js != 0)
+{
+  short int sc;
+  int *at = (int *)
+  long long int gx = 0;
+
+  for (;;)
+   {
+ *at = 0;
+ js /= sc;
+
+ for (wl = 0; wl < 2; ++wl)
+   {
+ xx = gx;
+ u2 %= xx > 0;
+ cz /= u2;
+
+ fa:
+ if (cz != u2)
+   {
+ gx |= js;
+ cz = gx / js;
+   }
+   }
+   }
+
+ yq:
+  wl /= 0x8000;
+  u2 = wl;
+  u2 |= (wl != 0) | (wl != 0 && gx != 0);
+  js = u2;
+  goto fa;
+}
+  goto yq;
+}
-- 
1.9.3



[CHKP] Fix for PR79990

2017-03-23 Thread Alexander Ivchenko
Hi,

The patch below attempts to fix the PR. I checked that it did not
break any of mpx.exp tests, but I did not run the full testing yet.
Would like to know whether this approach is generally correct or not.

The issue is that we have the hard reg vector variable:

typedef int U __attribute__ ((vector_size (16)));
register U u asm("xmm0");

and chkp tries to instrument access to it:

return u[i];

by doing that:

__bound_tmp.0_4 = __builtin_ia32_bndmk (, 16);

However, you cannot take an address of a register variable (in fact if
you do that, the compiler will give you "address of register variable
‘u’ requested" error), so expand, sensibly, gives an ICE on on 
here. I believe that if there is no pointers, pointer bounds checker
shouldn't get involved into that business. What do you think?




diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c
index 75caf83..e39ec9a 100644
--- a/gcc/tree-chkp.c
+++ b/gcc/tree-chkp.c
@@ -3383,6 +3383,7 @@ chkp_parse_array_and_component_ref (tree node, tree *ptr,
   tree comp_to_narrow = NULL_TREE;
   tree last_comp = NULL_TREE;
   bool array_ref_found = false;
+  bool is_register_var = false;
   tree *nodes;
   tree var;
   int len;
@@ -3440,6 +3441,9 @@ chkp_parse_array_and_component_ref (tree node, tree *ptr,
  || TREE_CODE (var) == STRING_CST
  || TREE_CODE (var) == SSA_NAME);

+  if (VAR_P (var) && DECL_HARD_REGISTER (var))
+   is_register_var = true;
+
   *ptr = chkp_build_addr_expr (var);
 }

@@ -3455,7 +3459,11 @@ chkp_parse_array_and_component_ref (tree node, tree *ptr,

   if (TREE_CODE (var) == ARRAY_REF)
{
- *safe = false;
+ // Mark it as unsafe, unless the array being accessed
+ // has been explicitly placed on a register: in this
+ // case we cannot take a pointer of this variable,
+ // so we don't instrument the access.
+ *safe = is_register_var;
  array_ref_found = true;
  if (flag_chkp_narrow_bounds
  && !flag_chkp_narrow_to_innermost_arrray
@@ -4001,6 +4009,19 @@ chkp_process_stmt (gimple_stmt_iterator *iter, tree node,
bool bitfield;
tree elt;

+   {
+ // We don't instrument accesses to arrays that
+ // are explicitely assigned to hard registers.
+ HOST_WIDE_INT bitsize, bitpos;
+ tree base, offset;
+ machine_mode mode;
+ int unsignedp, reversep, volatilep = 0;
+ base = get_inner_reference (node, , , , ,
+ , , );
+ if (VAR_P (base) && DECL_HARD_REGISTER (base))
+   safe = true;
+   }
+
if (safe)
  {
/* We are not going to generate any checks, so do not

diff --git a/gcc/testsuite/gcc.target/i386/mpx/pr79990.c
b/gcc/testsuite/gcc.target/i386/mpx/pr79990.c
new file mode 100644
index 000..a27734d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/mpx/pr79990.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-fcheck-pointer-bounds -mmpx" } */
+
+typedef int U __attribute__ ((vector_size (16)));
+
+int
+foo (int i)
+{
+#if __SSE2__
+  register
+#endif
+U u
+#if __SSE2__
+  asm ("xmm0")
+#endif
+  ;
+  return u[i];
+}

regards,
Alexander


Re: [PR80025] avoid cselib rtx_equal infinite recursion on XOR

2017-03-23 Thread Jakub Jelinek
On Tue, Mar 21, 2017 at 07:43:51PM -0300, Alexandre Oliva wrote:
> When two VALUEs are recorded in the cselib equivalence table such that
> they are equivalent to each other XORed with the same expression, if
> we started a cselib equivalence test between say the odd one and the
> even one, we'd end up recursing to compare the even one with the odd
> one, and then again, indefinitely.
> 
> This patch cuts off this infinite recursion by recognizing the XOR
> special case (it's the only binary commutative operation in which
> applying one of the operands of an operation to the result of that
> operation brings you back to the other operand) and determining
> whether we have an equivalence without recursing down the infinite
> path.

Is XOR the only special case though?  E.g. PLUS or MINUS with
the most negative constant act the same, and I don't see why if unlucky
enough with reverse ops etc. you couldn't get something similar through
combination thereof, perhaps indirectly through multiple VALUEs.

So I think it is safer to just put a cap on the rtx_equal_for_cselib_1
recursion depth (should be enough to bump the depth only when walking
locs of a VALUE).  I'll test such a patch.

Jakub


C/C++ PATCH to drop references to C_RID_YYCODE

2017-03-23 Thread Marek Polacek
C_RID_YYCODE was deleted a long time ago:

2003-02-18  Geoffrey Keating  

* cp-tree.h (rid_to_yy): Delete.
(C_RID_YYCODE): Delete.
(finish_file): Delete redundant declaration.

so let's 86 the references to it.

Applying to trunk.

2017-03-23  Marek Polacek  

* c-tree.h: Remove a C_RID_YYCODE reference.

* cp-tree.h: Remove a C_RID_YYCODE reference.

diff --git gcc/c/c-tree.h gcc/c/c-tree.h
index 13e40e6..9428d74 100644
--- gcc/c/c-tree.h
+++ gcc/c/c-tree.h
@@ -43,8 +43,7 @@ along with GCC; see the file COPYING3.  If not see
 #define C_TYPE_INCOMPLETE_VARS(TYPE) TYPE_VFIELD (TYPE)
 
 /* In an IDENTIFIER_NODE, nonzero if this identifier is actually a
-   keyword.  C_RID_CODE (node) is then the RID_* value of the keyword,
-   and C_RID_YYCODE is the token number wanted by Yacc.  */
+   keyword.  C_RID_CODE (node) is then the RID_* value of the keyword.  */
 #define C_IS_RESERVED_WORD(ID) TREE_LANG_FLAG_0 (ID)
 
 /* Record whether a type or decl was written with nonconstant size.
diff --git gcc/cp/cp-tree.h gcc/cp/cp-tree.h
index 5be5dfe..9f02a97 100644
--- gcc/cp/cp-tree.h
+++ gcc/cp/cp-tree.h
@@ -346,8 +346,7 @@ identifier_p (tree t)
 }
 
 /* In an IDENTIFIER_NODE, nonzero if this identifier is actually a
-   keyword.  C_RID_CODE (node) is then the RID_* value of the keyword,
-   and C_RID_YYCODE is the token number wanted by Yacc.  */
+   keyword.  C_RID_CODE (node) is then the RID_* value of the keyword.  */
 
 #define C_IS_RESERVED_WORD(ID) TREE_LANG_FLAG_5 (ID)
 

Marek


Re: [PATCH] Fix memory leak in identify_jump_threads()

2017-03-23 Thread Richard Biener
On Thu, Mar 23, 2017 at 12:20 PM, Markus Trippelsdorf
 wrote:
> Valgrind shows:
>
> ==553== 391,488 bytes in 24 blocks are possibly lost in loss record 4,339 of 
> 4,342
> ==553==at 0x4030C15: calloc (vg_replace_malloc.c:711)
> ==553==by 0x1607CA0: xcalloc (xmalloc.c:162)
> ==553==by 0x1004A81: data_alloc (hash-table.h:263)
> ==553==by 0x1004A81: alloc_entries (hash-table.h:650)
> ==553==by 0x1004A81: hash_table (hash-table.h:586)
> ==553==by 0x1004A81: identify_jump_threads (tree-vrp.c:11009)
> ==553==by 0x1004A81: execute_vrp (tree-vrp.c:11684)
> ==553==by 0x1004A81: (anonymous namespace)::pass_vrp::execute(function*) 
> (tree-vrp.c:11777)
> ==553==by 0xC80AA2: execute_one_pass(opt_pass*) (passes.c:2465)
> ==553==by 0xC81370: execute_pass_list_1(opt_pass*) (passes.c:2554)
> ==553==by 0xC81382: execute_pass_list_1(opt_pass*) (passes.c:2555)
> ==553==by 0xC813CC: execute_pass_list(function*, opt_pass*) 
> (passes.c:2565)
> ==553==by 0x9409DC: cgraph_node::expand() (cgraphunit.c:2038)
> ==553==by 0x94217B: expand_all_functions (cgraphunit.c:2174)
> ==553==by 0x94217B: symbol_table::compile() [clone .part.51] 
> (cgraphunit.c:2531)
> ==553==by 0x9448C4: compile (cgraphunit.c:2624)
> ==553==by 0x9448C4: symbol_table::finalize_compilation_unit() 
> (cgraphunit.c:2621)
> ==553==by 0xD57C82: compile_file() (toplev.c:492)
> ==553==by 0x5B8DFF: do_compile (toplev.c:2000)
> ==553==by 0x5B8DFF: toplev::main(int, char**) (toplev.c:2134)
> ==553==
> ==553== 606,520 (2,976 direct, 603,544 indirect) bytes in 62 blocks are 
> definitely lost in loss record 4,342 of 4,342
> ==553==at 0x402F26F: operator new(unsigned long) (vg_replace_malloc.c:334)
> ==553==by 0x1004A37: identify_jump_threads (tree-vrp.c:11009)
> ==553==by 0x1004A37: execute_vrp (tree-vrp.c:11684)
> ==553==by 0x1004A37: (anonymous namespace)::pass_vrp::execute(function*) 
> (tree-vrp.c:11777)
> ==553==by 0xC80AA2: execute_one_pass(opt_pass*) (passes.c:2465)
> ==553==by 0xC81370: execute_pass_list_1(opt_pass*) (passes.c:2554)
> ==553==by 0xC81382: execute_pass_list_1(opt_pass*) (passes.c:2555)
> ==553==by 0xC813CC: execute_pass_list(function*, opt_pass*) 
> (passes.c:2565)
> ==553==by 0x9409DC: cgraph_node::expand() (cgraphunit.c:2038)
> ==553==by 0x94217B: expand_all_functions (cgraphunit.c:2174)
> ==553==by 0x94217B: symbol_table::compile() [clone .part.51] 
> (cgraphunit.c:2531)
> ==553==by 0x9448C4: compile (cgraphunit.c:2624)
> ==553==by 0x9448C4: symbol_table::finalize_compilation_unit() 
> (cgraphunit.c:2621)
> ==553==by 0xD57C82: compile_file() (toplev.c:492)
> ==553==by 0x5B8DFF: do_compile (toplev.c:2000)
> ==553==by 0x5B8DFF: toplev::main(int, char**) (toplev.c:2134)
> ==553==by 0x5BB0DD: main (main.c:39)
>
> The fix is trivial. Tested on ppc64le,
> OK for trunk?

Ok.

Thanks,
Richard.

> Longer term it would be nice to have a smart pointer available in gcc to
> avoid those ugly and error-prone naked news.
> Pedro committed a solution for binutils-gdb last year:
> https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commitdiff;h=da804164742b83965b487bbff5b6334f2e63fe91
>
> * tree-vrp.c (identify_jump_threads): Delete avail_exprs.
>
> diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
> index 26652e3b048a..28d9c175dcd4 100644
> --- a/gcc/tree-vrp.c
> +++ b/gcc/tree-vrp.c
> @@ -11021,6 +11021,7 @@ identify_jump_threads (void)
>   ASSERT_EXPRs are still in the IL and cfg cleanup code does not yet
>   handle ASSERT_EXPRs gracefully.  */
>delete equiv_stack;
> +  delete avail_exprs;
>delete avail_exprs_stack;
>  }
>
> --
> Markus


Re: [PATCH 4/5] tree-inline: implement SIMT privatization, part 3

2017-03-23 Thread Jakub Jelinek
On Thu, Mar 23, 2017 at 02:13:45PM +0300, Alexander Monakov wrote:
> On Thu, 23 Mar 2017, Jakub Jelinek wrote:
> > On Wed, Mar 22, 2017 at 06:46:34PM +0300, Alexander Monakov wrote:
> > > @@ -4730,6 +4746,25 @@ expand_call_inline (basic_block bb, gimple *stmt, 
> > > copy_body_data *id)
> > >if (cfun->gimple_df)
> > >  pt_solution_reset (>gimple_df->escaped);
> > >  
> > > +  /* Add new automatic variables to IFN_GOMP_SIMT_ENTER arguments.  */
> > > +  if (id->dst_simt_vars)
> > > +{
> > > +  size_t nargs = gimple_call_num_args (simtenter_stmt);
> > > +  hash_set *vars = id->dst_simt_vars;
> > > +  auto_vec newargs (nargs + vars->elements ());
> > > +  for (size_t i = 0; i < nargs; i++)
> > > + newargs.quick_push (gimple_call_arg (simtenter_stmt, i));
> > > +  for (hash_set::iterator i = vars->begin (); i != vars->end 
> > > (); ++i)
> > > + newargs.quick_push (build1 (ADDR_EXPR,
> > > + build_pointer_type (TREE_TYPE (*i)), *i));
> > 
> > Traversing a hash table where the traversal affects code generation is
> > -fcompare-debug unfriendly.
> > Do you actually need a hash_set and not say just a vec of the vars?  I can't
> > find where you'd actually do any lookups there, just add and traverse.
> 
> Sorry for missing the IR stability issue.  This code relies on dst_simt_vars
> being a set and thus having no duplicate entries (so the implicit lookup when
> adding an element is needed).
> 
> However, I think I was overly cautious: looking again, I think we can't enter
> copy_decl_for_dup_finish twice with the same 'copy' VAR_DECL.  Changing it to 
> a
> vec should be fine then?

Yeah, callers of copy_decl* should look the orig var in the decl map first,
plus if you look at all copy_decl_for_dup_finish callers, all of them first
create a new tree (usually copy_node) and then pass it as copy to
copy_decl_for_dup_finish, so you'll never get the same copy in there.

So, please change it into a vector.

Jakub


[PATCH] Fix memory leak in identify_jump_threads()

2017-03-23 Thread Markus Trippelsdorf
Valgrind shows:

==553== 391,488 bytes in 24 blocks are possibly lost in loss record 4,339 of 
4,342
==553==at 0x4030C15: calloc (vg_replace_malloc.c:711)
==553==by 0x1607CA0: xcalloc (xmalloc.c:162)
==553==by 0x1004A81: data_alloc (hash-table.h:263)
==553==by 0x1004A81: alloc_entries (hash-table.h:650)
==553==by 0x1004A81: hash_table (hash-table.h:586)
==553==by 0x1004A81: identify_jump_threads (tree-vrp.c:11009)
==553==by 0x1004A81: execute_vrp (tree-vrp.c:11684)
==553==by 0x1004A81: (anonymous namespace)::pass_vrp::execute(function*) 
(tree-vrp.c:11777)
==553==by 0xC80AA2: execute_one_pass(opt_pass*) (passes.c:2465)
==553==by 0xC81370: execute_pass_list_1(opt_pass*) (passes.c:2554)
==553==by 0xC81382: execute_pass_list_1(opt_pass*) (passes.c:2555)
==553==by 0xC813CC: execute_pass_list(function*, opt_pass*) (passes.c:2565)
==553==by 0x9409DC: cgraph_node::expand() (cgraphunit.c:2038)
==553==by 0x94217B: expand_all_functions (cgraphunit.c:2174)
==553==by 0x94217B: symbol_table::compile() [clone .part.51] 
(cgraphunit.c:2531)
==553==by 0x9448C4: compile (cgraphunit.c:2624)
==553==by 0x9448C4: symbol_table::finalize_compilation_unit() 
(cgraphunit.c:2621)
==553==by 0xD57C82: compile_file() (toplev.c:492)
==553==by 0x5B8DFF: do_compile (toplev.c:2000)
==553==by 0x5B8DFF: toplev::main(int, char**) (toplev.c:2134)
==553==
==553== 606,520 (2,976 direct, 603,544 indirect) bytes in 62 blocks are 
definitely lost in loss record 4,342 of 4,342
==553==at 0x402F26F: operator new(unsigned long) (vg_replace_malloc.c:334)
==553==by 0x1004A37: identify_jump_threads (tree-vrp.c:11009)
==553==by 0x1004A37: execute_vrp (tree-vrp.c:11684)
==553==by 0x1004A37: (anonymous namespace)::pass_vrp::execute(function*) 
(tree-vrp.c:11777)
==553==by 0xC80AA2: execute_one_pass(opt_pass*) (passes.c:2465)
==553==by 0xC81370: execute_pass_list_1(opt_pass*) (passes.c:2554)
==553==by 0xC81382: execute_pass_list_1(opt_pass*) (passes.c:2555)
==553==by 0xC813CC: execute_pass_list(function*, opt_pass*) (passes.c:2565)
==553==by 0x9409DC: cgraph_node::expand() (cgraphunit.c:2038)
==553==by 0x94217B: expand_all_functions (cgraphunit.c:2174)
==553==by 0x94217B: symbol_table::compile() [clone .part.51] 
(cgraphunit.c:2531)
==553==by 0x9448C4: compile (cgraphunit.c:2624)
==553==by 0x9448C4: symbol_table::finalize_compilation_unit() 
(cgraphunit.c:2621)
==553==by 0xD57C82: compile_file() (toplev.c:492)
==553==by 0x5B8DFF: do_compile (toplev.c:2000)
==553==by 0x5B8DFF: toplev::main(int, char**) (toplev.c:2134)
==553==by 0x5BB0DD: main (main.c:39)

The fix is trivial. Tested on ppc64le,
OK for trunk?

Longer term it would be nice to have a smart pointer available in gcc to
avoid those ugly and error-prone naked news.
Pedro committed a solution for binutils-gdb last year:
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commitdiff;h=da804164742b83965b487bbff5b6334f2e63fe91

* tree-vrp.c (identify_jump_threads): Delete avail_exprs.

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 26652e3b048a..28d9c175dcd4 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -11021,6 +11021,7 @@ identify_jump_threads (void)
  ASSERT_EXPRs are still in the IL and cfg cleanup code does not yet
  handle ASSERT_EXPRs gracefully.  */
   delete equiv_stack;
+  delete avail_exprs;
   delete avail_exprs_stack;
 }
 
-- 
Markus


Re: [PATCH 3/5] omp-offload: implement SIMT privatization, part 2

2017-03-23 Thread Jakub Jelinek
On Thu, Mar 23, 2017 at 01:53:37PM +0300, Alexander Monakov wrote:
> On Thu, 23 Mar 2017, Jakub Jelinek wrote:
> > > + if (vf != 1)
> > > +   continue;
> > > + unlink_stmt_vdef (stmt);
> > 
> > This is weird.  AFAIK unlink_stmt_vdef just replaces the uses of the vdef
> > of that stmt with the vuse, but it still keeps the vdef (and vuse) around
> > on the stmt, typically it is used when you are removing that stmt, but
> > that is not the case here.  So why are you doing it and not say removing the
> > vdef?
> 
> Maybe I misunderstand your question, but actually the statement is removed
> further below, when we break out of the switch:

Ah, ok, missed that.  Thus, the patch is ok with those 2 nits fixed.

Jakub


Re: [PATCH 4/5] tree-inline: implement SIMT privatization, part 3

2017-03-23 Thread Alexander Monakov
On Thu, 23 Mar 2017, Jakub Jelinek wrote:
> On Wed, Mar 22, 2017 at 06:46:34PM +0300, Alexander Monakov wrote:
> > @@ -4730,6 +4746,25 @@ expand_call_inline (basic_block bb, gimple *stmt, 
> > copy_body_data *id)
> >if (cfun->gimple_df)
> >  pt_solution_reset (>gimple_df->escaped);
> >  
> > +  /* Add new automatic variables to IFN_GOMP_SIMT_ENTER arguments.  */
> > +  if (id->dst_simt_vars)
> > +{
> > +  size_t nargs = gimple_call_num_args (simtenter_stmt);
> > +  hash_set *vars = id->dst_simt_vars;
> > +  auto_vec newargs (nargs + vars->elements ());
> > +  for (size_t i = 0; i < nargs; i++)
> > +   newargs.quick_push (gimple_call_arg (simtenter_stmt, i));
> > +  for (hash_set::iterator i = vars->begin (); i != vars->end (); 
> > ++i)
> > +   newargs.quick_push (build1 (ADDR_EXPR,
> > +   build_pointer_type (TREE_TYPE (*i)), *i));
> 
> Traversing a hash table where the traversal affects code generation is
> -fcompare-debug unfriendly.
> Do you actually need a hash_set and not say just a vec of the vars?  I can't
> find where you'd actually do any lookups there, just add and traverse.

Sorry for missing the IR stability issue.  This code relies on dst_simt_vars
being a set and thus having no duplicate entries (so the implicit lookup when
adding an element is needed).

However, I think I was overly cautious: looking again, I think we can't enter
copy_decl_for_dup_finish twice with the same 'copy' VAR_DECL.  Changing it to a
vec should be fine then?

Thanks.
Alexander


Re: [PATCH 3/5] omp-offload: implement SIMT privatization, part 2

2017-03-23 Thread Alexander Monakov
On Thu, 23 Mar 2017, Jakub Jelinek wrote:
> > +   if (vf != 1)
> > + continue;
> > +   unlink_stmt_vdef (stmt);
> 
> This is weird.  AFAIK unlink_stmt_vdef just replaces the uses of the vdef
> of that stmt with the vuse, but it still keeps the vdef (and vuse) around
> on the stmt, typically it is used when you are removing that stmt, but
> that is not the case here.  So why are you doing it and not say removing the
> vdef?

Maybe I misunderstand your question, but actually the statement is removed
further below, when we break out of the switch:

stmt = lhs ? gimple_build_assign (lhs, rhs) : gimple_build_nop ();
gsi_replace (, stmt, false);

The same tactic is already in place for cleaning up GOMP_SIMT_ORDERED_PRED.

Thus, there's just one place that actually replaces statements; the body of
the switch is only responsible for coming up with a suitable rhs and cleaning
up vdefs for those statements where we know they would be present.

Thanks.
Alexander


Re: [PATCH 5/5] address-taken: optimize SIMT privatized variables

2017-03-23 Thread Jakub Jelinek
On Wed, Mar 22, 2017 at 06:46:35PM +0300, Alexander Monakov wrote:
> This patch implements promotion of SIMT private variables if GOMP_SIMT_ENTER
> is the only remaining statement where their address is taken, by handling it
> similar to ASAN_MARK.
> 
> To avoid rebuilding GOMP_SIMT_ENTER statement from scratch, set argument
> slot to a null pointer when the corresponding variable is optimized.
> 
>   * tree-ssa.c (execute_update_addresses_taken): Handle GOMP_SIMT_ENTER.

Ok for trunk (if the rest is approved).

Jakub


Re: [wwwdocs] gcc-8/porting_to.html

2017-03-23 Thread Thomas Preudhomme

Ack. Please find updated patch as per suggestions.

Best regards,

Thomas

On 23/03/17 06:47, Gerald Pfeifer wrote:

Hi Thomas,

On Wed, 22 Mar 2017, Thomas Preudhomme wrote:

Is this ok for wwwdocs once [1] is committed in GCC 8 cycle?


+ GCC on Microsoft Windows can now be configured via
+   --enable-mingw-wildcard or
+   --disable-mingw-wildcard to force a specific behavior for
+   GCC itself with regards to supporting or not the wildcard character.

Here I would omit the "or not" which I believe does not work well
in English.

+   Prior versions of GCC would follow the configuration of MinGW runtime.

And here add "the" before "MinGW".

This looks fine to me with these two minor changes, thank you.

Gerald
Index: htdocs/gcc-8/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-8/changes.html,v
retrieving revision 1.2
diff -u -r1.2 changes.html
--- htdocs/gcc-8/changes.html	12 Mar 2017 14:25:34 -	1.2
+++ htdocs/gcc-8/changes.html	23 Mar 2017 10:44:35 -
@@ -135,7 +135,16 @@
 
 
 
-
+Windows
+   
+ GCC on Microsoft Windows can now be configured via
+   --enable-mingw-wildcard or
+   --disable-mingw-wildcard to force a specific behavior for
+   GCC itself with regards to supporting the wildcard character.  Prior
+   versions of GCC would follow the configuration of the MinGW runtime.
+   This behavior can still be obtained by not using the above options or by
+   using --enable-mingw-wildcard=platform.
+   
 
 
 


Re: [PATCH 4/5] tree-inline: implement SIMT privatization, part 3

2017-03-23 Thread Jakub Jelinek
On Wed, Mar 22, 2017 at 06:46:34PM +0300, Alexander Monakov wrote:
> @@ -4730,6 +4746,25 @@ expand_call_inline (basic_block bb, gimple *stmt, 
> copy_body_data *id)
>if (cfun->gimple_df)
>  pt_solution_reset (>gimple_df->escaped);
>  
> +  /* Add new automatic variables to IFN_GOMP_SIMT_ENTER arguments.  */
> +  if (id->dst_simt_vars)
> +{
> +  size_t nargs = gimple_call_num_args (simtenter_stmt);
> +  hash_set *vars = id->dst_simt_vars;
> +  auto_vec newargs (nargs + vars->elements ());
> +  for (size_t i = 0; i < nargs; i++)
> + newargs.quick_push (gimple_call_arg (simtenter_stmt, i));
> +  for (hash_set::iterator i = vars->begin (); i != vars->end (); 
> ++i)
> + newargs.quick_push (build1 (ADDR_EXPR,
> + build_pointer_type (TREE_TYPE (*i)), *i));

Traversing a hash table where the traversal affects code generation is
-fcompare-debug unfriendly.
Do you actually need a hash_set and not say just a vec of the vars?  I can't
find where you'd actually do any lookups there, just add and traverse.

Jakub


Re: [PATCH 3/5] omp-offload: implement SIMT privatization, part 2

2017-03-23 Thread Jakub Jelinek
On Wed, Mar 22, 2017 at 06:46:33PM +0300, Alexander Monakov wrote:
> @@ -1669,6 +1672,93 @@ make_pass_oacc_device_lower (gcc::context *ctxt)
>return new pass_oacc_device_lower (ctxt);
>  }
>  
> +
> +

I'd avoid the empty line after ^L.

> @@ -1694,6 +1785,20 @@ execute_omp_device_lower ()
> case IFN_GOMP_USE_SIMT:
>   rhs = vf == 1 ? integer_zero_node : integer_one_node;
>   break;
> +   case IFN_GOMP_SIMT_ENTER:
> + rhs = vf == 1 ? gimple_call_arg (stmt, 0) : NULL_TREE;
> + goto simtreg_enter_exit;
> +   case IFN_GOMP_SIMT_ENTER_ALLOC:
> + if (vf != 1)
> +   ompdevlow_adjust_simt_enter (, );
> + rhs = vf == 1 ? null_pointer_node : NULL_TREE;
> + goto simtreg_enter_exit;
> +   case IFN_GOMP_SIMT_EXIT:
> +simtreg_enter_exit:

Please align the label below case, instead of start of the line.

> + if (vf != 1)
> +   continue;
> + unlink_stmt_vdef (stmt);

This is weird.  AFAIK unlink_stmt_vdef just replaces the uses of the vdef
of that stmt with the vuse, but it still keeps the vdef (and vuse) around
on the stmt, typically it is used when you are removing that stmt, but
that is not the case here.  So why are you doing it and not say removing the
vdef?

Jakub


Re: [PATCH 2/5] omp-low: implement SIMT privatization, part 1

2017-03-23 Thread Jakub Jelinek
On Wed, Mar 22, 2017 at 06:46:32PM +0300, Alexander Monakov wrote:
> This patch adjusts privatization in OpenMP SIMD loops lowered for SIMT 
> targets.
> At lowering time, private variables receive "omp simt private" attribute, get
> mentioned in argument list of GOMP_SIMT_ENTER function, and get a clobbering
> assignment just prior to GOMP_SIMT_EXIT function.
> 
> The following patch will implement the second step: privatized variables are
> converted to fields of a struct allocated by a call to GOMP_SIMT_ENTER_ALLOC.
> This function is similar to __builtin_alloca_with_align, except that it
> obtains per-SIMT-lane storage and implicitly performs target-specific actions;
> on NVPTX that means a transition to per-lane softstacks and inverting the
> uniform-simt mask.

Ok for trunk (if all the other patches are acked).

>   * internal-fn.c (expand_GOMP_SIMT_ENTER): New.
> (expand_GOMP_SIMT_ENTER_ALLOC): New.
> (expand_GOMP_SIMT_EXIT): New.
> * internal-fn.def (GOMP_SIMT_ENTER): New internal function.
> (GOMP_SIMT_ENTER_ALLOC): Ditto.
> (GOMP_SIMT_EXIT): Ditto.
> * target-insns.def (omp_simt_enter): New insn.
> (omp_simt_exit): Ditto.
> * omp-low.c (struct omplow_simd_context): New fields simt_eargs,
> simt_dlist.
> (lower_rec_simd_input_clauses): Implement SIMT privatization.
> (lower_rec_input_clauses): Likewise.
> (lower_lastprivate_clauses): Handle SIMT privatization.

Jakub


Re: [PATCH v3] Fix PR79908 (and PR80136)

2017-03-23 Thread Richard Biener
On Wed, Mar 22, 2017 at 4:43 PM, Bill Schmidt
 wrote:
> Hi,
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79908 shows a case where
> pass_stdarg ICEs attempting to gimplify a COMPLEX_EXPR with side
> effects as an lvalue.  This occurs when the LHS of a VA_ARG has been
> cast away.  The previous patch (reverted) used force_gimple_operand
> to instantiate the necessary side effects rather than gimplify_expr
> using is_gimple_lvalue.  This proved to cause problems on some targets
> where the gimple expression produced by targetm.gimplify_va_arg_expr
> contains un-gimplified side effects using raw var_decls (see
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80136).  After some
> investigation, the right fix seems to be to just call
> gimplify_and_add on the expression.
>
> I've bootstrapped and tested this on powerpc64le-unknown-linux-gnu
> with no regressions.  I've also built a ppc64le->aarch64 cross and
> verified that Christophe's original report from PR80136 is now
> fixed.  I don't have immediate access to an aarch64 native system
> (compile farm authentication issues), so I would appreciate it if
> James could run a native bootstrap to see if his issues are
> resolved.
>
> Provided there are no issues uncovered with aarch64 native bootstrap,
> is this ok for trunk?

Ok.

Richard.

> Thanks,
> Bill
>
>
> [gcc]
>
> 2017-03-21  Bill Schmidt  
> Richard Biener  
>
> PR tree-optimization/79908
> PR tree-optimization/80136
> * tree-stdarg.c (expand_ifn_va_arg_1): For a VA_ARG whose LHS has
> been cast away, gimplify_and_add suffices.
>
> [gcc/testsuite]
>
> 2017-03-21  Bill Schmidt  
> Richard Biener  
>
> PR tree-optimization/79908
> PR tree-optimization/80136
> * gcc.dg/torture/pr79908.c: New file.
>
>
> Index: gcc/testsuite/gcc.dg/torture/pr79908.c
> ===
> --- gcc/testsuite/gcc.dg/torture/pr79908.c  (revision 246334)
> +++ gcc/testsuite/gcc.dg/torture/pr79908.c  (working copy)
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +
> +/* Used to fail in the stdarg pass before fix for PR79908.  */
> +
> +typedef __builtin_va_list __gnuc_va_list;
> +typedef __gnuc_va_list va_list;
> +
> +void testva (int n, ...)
> +{
> +  va_list ap;
> +  _Complex int i = __builtin_va_arg (ap, _Complex int);
> +}
> Index: gcc/tree-stdarg.c
> ===
> --- gcc/tree-stdarg.c   (revision 246334)
> +++ gcc/tree-stdarg.c   (working copy)
> @@ -1058,7 +1058,7 @@ expand_ifn_va_arg_1 (function *fun)
> gimplify_assign (lhs, expr, );
>   }
> else
> - gimplify_expr (, , , is_gimple_lvalue, fb_lvalue);
> + gimplify_and_add (expr, );
>
> input_location = saved_location;
> pop_gimplify_context (NULL);
>


RE: [wwwdocs] ARC's gcc7.x release notes

2017-03-23 Thread Gerald Pfeifer
On Wed, 22 Mar 2017, Claudiu Zissulescu wrote:
> That's great news, thank you Gerald for the quick review. I still have 
> one question, as I never worked with wwwdocs, do I need to perform extra 
> operations (like updating a ChangeLog kind of file) before committing my 
> patch to cvs trunk?

No special magic, just go for it (`cvs commit`). :-)

Gerald


Re: [PATCH][ARM] PR target/71436: Restrict *load_multiple pattern till after LRA

2017-03-23 Thread Ramana Radhakrishnan

On 07/02/17 14:49, Kyrill Tkachov wrote:


On 18/01/17 09:49, Kyrill Tkachov wrote:


On 19/12/16 14:53, Jakub Jelinek wrote:

On Thu, Dec 15, 2016 at 10:00:14AM +, Richard Earnshaw (lists)
wrote:

sorry, pasted the wrong bit of code.

That should read when we generate:

(insn 55 19 67 3 (parallel [
 (set (reg:SI 0 r0)
 (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
 (set (reg:SI 158)
 (mem/u/c:SI (plus:SI (reg/f:SI 147)
 (const_int 4 [0x4])) [2 c+4 S4 A32]))
 ]) "ldm.c":25 404 {*load_multiple}
  (expr_list:REG_UNUSED (reg:SI 0 r0)
 (nil)))

ie when we put a pseudo into the register load list.

We put a pseudo there because the predicate on the insn allows it:
(define_special_predicate "load_multiple_operation"
   (match_code "parallel")
{
  return ldm_stm_operation_p (op, /*load=*/true, SImode,
  /*consecutive=*/false,
  /*return_pc=*/false);
})
and the consecutive = false argument says that (almost) no verification
is performed on the SET_DEST, just that it is a REG and doesn't have
REGNO smaller than the first reg.
That said, RA is still not able to cope with such instructions, because
only the first set is represented with constraints, so if such an insn
needs any kind of reloading, it just will not happen.
So I think the posted patch makes lots of sense, otherwise if you use
such a pattern before reload, you just have to hope no reloading will be
needed on it.


In that case I believe restricting the pattern till after LRA is the
way to go.
However, we should still add the check from [1] to ldm_stm_operation_p
for when
'consecutive' is true as consecutive order has no useful meaning on
pseudos here.



In the interest of fixing this PR I think the patch at
https://gcc.gnu.org/ml/gcc-patches/2016-11/msg03078.html
is the way to go, that is disable this pattern until after reload.
It was intended to handle epilogue pops that is generated after reload
anyway and all the general load/store multiple
patterns are handled by ldmstmd.md so not having this before reload is
not risky.

I'll also propose a variant of
https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01400.html to tighten up
the consecutivity
check when 'consecutive' is passed down as true, but that is
indepdendent of this PR.

Is the patch at https://gcc.gnu.org/ml/gcc-patches/2016-11/msg03078.html
ok for trunk?
It's been in my tree for a couple of months now getting testing and
there's been no fallout.




We shouldn't generate any ldm / stm operations for anything other than 
memcpy before reload and all of those are capped at 4 integer registers 
(MAX_LDM_STM_OPS in arm.h) at a time and that will be handled almost 
entirely by the patterns in ldmstm.md. The way to generate this 
generically this would be through movmem expanders or through the 
load_multiple expander in the backend. However all of this falls through 
the arm_gen_load_multiple interface all of which goes through 
arm_gen_multiple_op and we have an assert in arm_gen_multiple_op that 
the number of registers is <= MAX_LDM_STM_OPS.


Thus OK, please apply but be aware of any fallout

Thanks,
Ramana



Thanks,
Kyrill




[1] https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01400.html



Jakub








[wwwdocs] Remove two company links

2017-03-23 Thread Gerald Pfeifer
My link checker spotted some redirects, and so I noticed these
two old links (2002 and 2000), and since we haven't had such in
any later announcements (and generally avoid them), I applied
the below.

Of course we have, and will keep, deep links such as Nick's blog.

Gerald

Index: news/dfa.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/news/dfa.html,v
retrieving revision 1.7
diff -u -r1.7 dfa.html
--- news/dfa.html   29 Dec 2016 11:49:46 -  1.7
+++ news/dfa.html   23 Mar 2017 08:18:32 -
@@ -10,8 +10,8 @@
 Original April 30, 2002.
 Last Updated May 5, 2002
 
-We are pleased to announce that Vladimir Makarov, of https://www.redhat.com;>Red Hat, has contributed support
+We are pleased to announce that Vladimir Makarov, of
+Red Hat, has contributed support
 for using Deterministic Finite Automata (DFA) to describe structural
 hazards in processor pipelines to the instruction scheduler.   This
 work is based on literature from various sources, including, but not
Index: news/ssa.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/news/ssa.html,v
retrieving revision 1.8
diff -u -r1.8 ssa.html
--- news/ssa.html   29 Dec 2016 11:49:46 -  1.8
+++ news/ssa.html   23 Mar 2017 08:18:32 -
@@ -12,8 +12,7 @@
 
 We are pleased to announce that 
 CodeSourcery, LLC
-and 
-https://www.redhat.com;>Cygnus, a Red Hat company have
+and Cygnus, a Red Hat company have
 contributed an implementation of the static single assignment (SSA)
 representation for the GCC compiler.  SSA is used in many modern
 compilers to facilitate a wide range of powerful optimizations.  Now


Re: [wwwdocs] gcc-8/porting_to.html

2017-03-23 Thread Gerald Pfeifer

Hi Thomas,

On Wed, 22 Mar 2017, Thomas Preudhomme wrote:

Is this ok for wwwdocs once [1] is committed in GCC 8 cycle?


+ GCC on Microsoft Windows can now be configured via
+   --enable-mingw-wildcard or
+   --disable-mingw-wildcard to force a specific behavior for
+   GCC itself with regards to supporting or not the wildcard character.

Here I would omit the "or not" which I believe does not work well
in English.

+   Prior versions of GCC would follow the configuration of MinGW runtime.

And here add "the" before "MinGW".

This looks fine to me with these two minor changes, thank you.

Gerald