Re: [PATCH] Fix PR c++/21802 (two-stage name lookup fails for operators)

2015-12-14 Thread Jason Merrill

OK.

Jason


Re: RFA (hash-*): PATCH for c++/68309

2015-12-14 Thread Trevor Saunders
>  
> +  hash_map (const hash_map &h, bool ggc = false,
> + bool gather_mem_stats = true CXX_MEM_STAT_INFO)

Sorry about the late response, but wouldn't it be better to make this
and the hash_table constructor explicit?  It's probably less important
than for other constructors, but is there a reasonable use for taking or
returning them by value?

Trev


> +: m_table (h.m_table, ggc, gather_mem_stats,
> +HASH_MAP_ORIGIN PASS_MEM_STAT) {}
> +
>/* Create a hash_map in ggc memory.  */
>static hash_map *create_ggc (size_t size, bool gather_mem_stats = true
>  CXX_MEM_STAT_INFO)
> diff --git a/gcc/hash-table.h b/gcc/hash-table.h
> index 192be30..d172841 100644
> --- a/gcc/hash-table.h
> +++ b/gcc/hash-table.h
> @@ -364,6 +364,10 @@ public:
>explicit hash_table (size_t, bool ggc = false, bool gather_mem_stats = 
> true,
>  mem_alloc_origin origin = HASH_TABLE_ORIGIN
>  CXX_MEM_STAT_INFO);
> +  hash_table (const hash_table &, bool ggc = false,
> +   bool gather_mem_stats = true,
> +   mem_alloc_origin origin = HASH_TABLE_ORIGIN
> +   CXX_MEM_STAT_INFO);
>~hash_table ();
>  
>/* Create a hash_table in gc memory.  */
> @@ -580,6 +584,27 @@ hash_table::hash_table (size_t 
> size, bool ggc, bool
>  }
>  
>  template class Allocator>
> +hash_table::hash_table (const hash_table &h, bool ggc,
> +bool gather_mem_stats,
> +mem_alloc_origin origin
> +MEM_STAT_DECL) :
> +m_n_elements (h.m_n_elements), m_n_deleted (h.m_n_deleted),
> +m_searches (0), m_collisions (0),
> +m_ggc (ggc), m_gather_mem_stats (gather_mem_stats)
> +{
> +  size_t size = h.m_size;
> +
> +  if (m_gather_mem_stats)
> +hash_table_usage.register_descriptor (this, origin, ggc
> +   FINAL_PASS_MEM_STAT);
> +
> +  m_entries = alloc_entries (size PASS_MEM_STAT);
> +  memcpy (m_entries, h.m_entries, size * sizeof (value_type));
> +  m_size = size;
> +  m_size_prime_index = h.m_size_prime_index;
> +}
> +
> +template class Allocator>
>  hash_table::~hash_table ()
>  {
>for (size_t i = m_size - 1; i < m_size; i--)
> diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-variadic3.C 
> b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-variadic3.C
> new file mode 100644
> index 000..76b6b3f
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-variadic3.C
> @@ -0,0 +1,9 @@
> +// PR c++/68309
> +// { dg-do compile { target c++11 } }
> +
> +template <typename... Ts> void f(Ts...);
> +template <typename T> T g(T);
> +template <typename... Ts> void print(Ts... args) {
> +  [&] { f(g(args)...); }();
> +}
> +int main() { print(5.2); }



[PATCH] Fix PR c++/21802 (two-stage name lookup fails for operators)

2015-12-14 Thread Patrick Palka
On Mon, 14 Dec 2015, Jason Merrill wrote:

> On 12/12/2015 06:32 PM, Patrick Palka wrote:
>>> >This should use cp_tree_operand_length.
>> Hmm, I don't immediately see how I can use this function here.  It
>> expects a tree but I don't have an appropriate tree to give to it, only a
>> tree_code.
>
> True.  So let's introduce cp_tree_code_length next to cp_tree_operand_length.
>
> Jason
>
>

Like this?  Incremental diff followed by patch v4:

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
 index 3487d77..59c70b2 100644
 --- a/gcc/cp/cp-tree.h
 +++ b/gcc/cp/cp-tree.h
 @@ -6477,6 +6477,7 @@ extern bool is_lambda_ignored_entity(tree);

 /* in tree.c */
 extern int cp_tree_operand_length  (const_tree);
+extern int cp_tree_code_length (enum tree_code);
 void cp_free_lang_data (tree t);
 extern tree force_target_expr  (tree, tree, tsubst_flags_t);
 extern tree build_target_expr_with_type(tree, tree, 
tsubst_flags_t);
diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
 index ca72877..6e671b7 100644
 --- a/gcc/cp/tree.c
 +++ b/gcc/cp/tree.c
 @@ -2766,14 +2766,10 @@ build_min_non_dep_op_overload (enum tree_code op,

   nargs = call_expr_nargs (non_dep);

-  if (op == PREINCREMENT_EXPR
-  || op == PREDECREMENT_EXPR)
-expected_nargs = 1;
-  else if (op == MODOP_EXPR
-  || op == ARRAY_REF)
-expected_nargs = 2;
-  else
-expected_nargs = TREE_CODE_LENGTH (op);
+  expected_nargs = cp_tree_code_length (op);
+  if (op == POSTINCREMENT_EXPR
+  || op == POSTDECREMENT_EXPR)
+expected_nargs += 1;
   gcc_assert (nargs == expected_nargs);

   args = make_tree_vector ();
 @@ -4450,6 +4446,32 @@ cp_tree_operand_length (const_tree t)
 }
 }

+/* Like cp_tree_operand_length, but takes a tree_code CODE.  */
+
+int
+cp_tree_code_length (enum tree_code code)
+{
+  gcc_assert (TREE_CODE_CLASS (code) != tcc_vl_exp);
+
+  switch (code)
+{
+case PREINCREMENT_EXPR:
+case PREDECREMENT_EXPR:
+case POSTINCREMENT_EXPR:
+case POSTDECREMENT_EXPR:
+  return 1;
+
+case ARRAY_REF:
+  return 2;
+
+case EXPR_PACK_EXPANSION:
+  return 1;
+
+default:
+  return TREE_CODE_LENGTH (code);
+}
+}
+
 /* Implement -Wzero_as_null_pointer_constant.  Return true if the
conditions for the warning hold, false otherwise.  */
 bool
diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
 index 2e5e46e..39c1af2 100644
 --- a/gcc/cp/typeck.c
 +++ b/gcc/cp/typeck.c
 @@ -7864,7 +7864,7 @@ build_x_modify_expr (location_t loc, tree lhs, enum 
tree_code modifycode,
{
  if (overload != NULL_TREE)
return (build_min_non_dep_op_overload
-   (MODOP_EXPR, rval, overload, orig_lhs, orig_rhs));
+   (MODIFY_EXPR, rval, overload, orig_lhs, orig_rhs));

  return (build_min_non_dep
  (MODOP_EXPR, rval, orig_lhs, op, orig_rhs));
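
For readers following the incremental diff, here is a minimal stand-alone sketch of the operand-count logic it introduces.  The enum and both function names are hypothetical toy stand-ins, not GCC's enum tree_code or the real cp_tree_code_length, and the reason given for the extra postfix argument (the dummy int parameter of operator++(int)) is an assumption on my part rather than something stated in the thread.

```c
#include <assert.h>

/* Hypothetical stand-ins for a few tree codes; not GCC's enum tree_code.  */
enum toy_code
{
  TOY_PREINCREMENT,
  TOY_POSTINCREMENT,
  TOY_ARRAY_REF,
  TOY_NEGATE,
  TOY_PLUS
};

/* Number of operands the expression itself has, in the spirit of
   cp_tree_code_length.  */
int
toy_code_length (enum toy_code code)
{
  switch (code)
    {
    case TOY_PREINCREMENT:
    case TOY_POSTINCREMENT:
    case TOY_NEGATE:
      return 1;

    case TOY_ARRAY_REF:
    case TOY_PLUS:
      return 2;
    }
  return -1;  /* Not reached for the codes listed above.  */
}

/* Number of arguments expected in the overloaded-operator call.  Postfix
   increment gets one more (assumed here to be the dummy int argument).  */
int
toy_expected_nargs (enum toy_code code)
{
  int n = toy_code_length (code);
  if (code == TOY_POSTINCREMENT)
    n += 1;
  return n;
}
```

The point of the split is that the per-code operand count lives in one place, and the caller only adjusts for the postfix case.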

-- 8< --

Bootstrapped and regtested on x86_64-pc-linux-gnu.

gcc/cp/ChangeLog:

PR c++/21802
PR c++/53223
* cp-tree.h (cp_tree_code_length): Declare.
(build_min_non_dep_op_overload): Declare.
* tree.c (cp_tree_code_length): Define.
(build_min_non_dep_op_overload): Define.
(build_min_non_dep_call_vec): Copy the KOENIG_LOOKUP_P flag.
* typeck.c (build_x_indirect_ref): Use
build_min_non_dep_op_overload when the given expression
has been resolved to an operator overload.
(build_x_binary_op): Likewise.
(build_x_array_ref): Likewise.
(build_x_unary_op): Likewise.
(build_x_compound_expr): Likewise.
(build_x_modify_expr): Likewise.
* decl2.c (grok_array_decl): Likewise.
* call.c (build_new_op_1): If during template processing we
chose an operator overload that is a hidden friend function, set
the call's KOENIG_LOOKUP_P flag to 1.

gcc/testsuite/ChangeLog:

PR c++/21802
PR c++/53223
* g++.dg/cpp0x/pr53223.C: New test.
* g++.dg/lookup/pr21802.C: New test.
* g++.dg/lookup/two-stage4.C: Remove XFAIL.
---
 gcc/cp/call.c|  13 ++
 gcc/cp/cp-tree.h |   2 +
 gcc/cp/decl2.c   |  13 +-
 gcc/cp/tree.c|  89 ++
 gcc/cp/typeck.c  | 100 ---
 gcc/testsuite/g++.dg/cpp0x/pr53223.C |  45 +
 gcc/testsuite/g++.dg/lookup/pr21802.C| 276 +++
 gcc/testsuite/g++.dg/lookup/two-stage4.C |   2 +-
 8 files changed, 517 insertions(+), 23 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/pr53223.C
 create mode 100644 gcc/testsuite/g++.dg/lookup/pr21802.C

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 117dd79..cdfa01a 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -5630,6 +5630,19 @@ build_new_op_1 (location_t loc, enum tree_code 

[PATCH] c/68868 - atomic_init emits an unnecessary fence

2015-12-14 Thread Martin Sebor

The C atomic_init macro is implemented in terms of simple assignment
to the atomic variable pointed to by its first argument.  That's
inefficient since the variable under initialization must not be
accessed by other threads and assignment provides sequentially
consistent semantics.  The inefficiency is apparent in the generated
dumps (e.g. the gimple dump contains calls to __atomic_store (...,
memory_order_seq_cst), and the assembly dump contains the fence
instruction).

The attached patch changes the macro to use atomic_store with relaxed
consistency semantics and adds a test verifying that invocations of
the atomic_init macro emit __atomic_store_N with a zero last argument
(memory_order_relaxed).

This brings GCC on par with Clang.

Tested on powerpc64le and x86_64.
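
A minimal sketch of the difference, assuming a C11 compiler that provides <stdatomic.h>.  The helper names init_relaxed, init_standard and load_value are made up for illustration; only atomic_init, atomic_store_explicit, atomic_load_explicit and memory_order_relaxed come from the standard header.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical helper: initialize *P the way the patched macro is meant
   to, with a relaxed store rather than a sequentially consistent
   assignment.  */
void
init_relaxed (atomic_int *p, int val)
{
  atomic_store_explicit (p, val, memory_order_relaxed);
}

/* Hypothetical helper: the standard generic function, for comparison.  */
void
init_standard (atomic_int *p, int val)
{
  atomic_init (p, val);
}

/* Hypothetical helper: read the value back without ordering constraints.  */
int
load_value (atomic_int *p)
{
  return atomic_load_explicit (p, memory_order_relaxed);
}
```

Both helpers leave the same value in the object; what should differ is only the ordering requested from the compiler, which is what the gimple dump scan in the new test looks at.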

Martin
gcc/ChangeLog
2015-12-14  Martin Sebor  

	PR c/68868
	* ginclude/stdatomic.h (atomic_init): Use atomic_store instead
	of plain assignment.

gcc/testsuite/ChangeLog
2015-12-14  Martin Sebor  

	PR c/68868
	* testsuite/gcc.dg/atomic/stdatomic-init.c: New test.

Index: ginclude/stdatomic.h
===
--- ginclude/stdatomic.h	(revision 231532)
+++ ginclude/stdatomic.h	(working copy)
@@ -77,13 +77,11 @@
 
 
 #define ATOMIC_VAR_INIT(VALUE)	(VALUE)
-#define atomic_init(PTR, VAL)			\
-  do		\
-{		\
-  *(PTR) = (VAL);\
-}		\
-  while (0)
 
+/* Initialize an atomic object pointed to by PTR with VAL.  */
+#define atomic_init(PTR, VAL)   \
+  atomic_store_explicit (PTR, VAL, __ATOMIC_RELAXED)
+
 #define kill_dependency(Y)			\
   __extension__	\
   ({		\
Index: testsuite/gcc.dg/atomic/stdatomic-init.c
===
--- testsuite/gcc.dg/atomic/stdatomic-init.c	(revision 0)
+++ testsuite/gcc.dg/atomic/stdatomic-init.c	(working copy)
@@ -0,0 +1,121 @@
+/* Test the atomic_init generic function.  Verify that __atomic_store_N
+   is called with the last argument of memory_order_relaxed (i.e., 0)
+   for each invocation of the atomic_init() macro in the test and that
+   there are no calls to __atomic_store_N with a non-zero last argument.  */
+/* { dg-do compile } */
+/* { dg-options "-fdump-tree-gimple -std=c11 -pedantic-errors" } */
+/* { dg-final { scan-tree-dump-times "__atomic_store_. \\(\[^\n\r]*, 0\\)" 54 "gimple" } } */
+/* { dg-final { scan-tree-dump-not "__atomic_store_. \\(\[^\n\r]*, \[1-5\]\\)" "gimple" } } */
+
+#include <stdatomic.h>
+
+struct Atomic {
+  /* Volatile to prevent re-initialization from being optimized away.  */
+  volatile atomic_bool   b;
+  volatile atomic_char   c;
+  volatile atomic_schar  sc;
+  volatile atomic_uchar  uc;
+  volatile atomic_short  ss;
+  volatile atomic_ushort us;
+  volatile atomic_int    si;
+  volatile atomic_uint   ui;
+  volatile atomic_long   sl;
+  volatile atomic_ulong  ul;
+  volatile atomic_llong  sll;
+  volatile atomic_ullong ull;
+  volatile atomic_size_t sz;
+};
+
+struct Value {
+  _Bool              b;
+  char               c;
+  signed char        sc;
+  unsigned char      uc;
+  short              ss;
+  unsigned short     us;
+  int                si;
+  unsigned int       ui;
+  long               sl;
+  unsigned long      ul;
+  long long          sll;
+  unsigned long long ull;
+  __SIZE_TYPE__      sz;
+};
+
+/* Exercise the atomic_init() macro with a literal argument.  */
+
+void atomic_init_lit (struct Atomic *pa)
+{
+  atomic_init (&pa->b, 0);
+  atomic_init (&pa->b, 1);
+
+  atomic_init (&pa->c, 'x');
+  atomic_init (&pa->c, 0);
+  atomic_init (&pa->c, 1);
+  atomic_init (&pa->c, 255);
+
+  atomic_init (&pa->sc, (signed char)'x');
+  atomic_init (&pa->sc, (signed char)0);
+  atomic_init (&pa->sc, (signed char)1);
+  atomic_init (&pa->sc, (signed char)__SCHAR_MAX__);
+
+  atomic_init (&pa->uc, (unsigned char)'x');
+  atomic_init (&pa->uc, (unsigned char)0);
+  atomic_init (&pa->uc, (unsigned char)1);
+  atomic_init (&pa->sc, (unsigned char)__SCHAR_MAX__);
+
+  atomic_init (&pa->ss, (signed short)0);
+  atomic_init (&pa->ss, (signed short)1);
+  atomic_init (&pa->ss, (signed short)__SHRT_MAX__);
+
+  atomic_init (&pa->us, (unsigned short)0);
+  atomic_init (&pa->us, (unsigned short)1);
+  atomic_init (&pa->us, (unsigned short)__SHRT_MAX__);
+
+  atomic_init (&pa->si, (signed int)0);
+  atomic_init (&pa->si, (signed int)1);
+  atomic_init (&pa->si, (signed int)__INT_MAX__);
+
+  atomic_init (&pa->ui, (unsigned int)0);
+  atomic_init (&pa->ui, (unsigned int)1);
+  atomic_init (&pa->ui, (unsigned int)__INT_MAX__);
+
+  atomic_init (&pa->sl, (signed long)0);
+  atomic_init (&pa->sl, (signed long)1);
+  atomic_init (&pa->sl, (signed long)__LONG_MAX__);
+
+  atomic_init (&pa->ul, (unsigned long)0);
+  atomic_init (&pa->ul, (unsigned long)1);
+  atomic_init (&pa->ul, (unsigned long)__LONG_MAX__);
+
+  atomic_init (&pa->sll, (signed long long)0);
+  atomic_init (&pa->sll, (signed long long)1);
+  atomic_init (&pa->sll, (signed long long)__LONG_LONG_MAX__);
+
+  atomic_init (&pa->ull, (unsigned long long)0);
+  

Re: [PATCH][combine] Don't create LSHIFTRT of zero bits in change_zero_ext

2015-12-14 Thread Kyrill Tkachov


On 11/12/15 01:26, Segher Boessenkool wrote:

On Thu, Dec 10, 2015 at 05:05:12PM +0100, Bernd Schmidt wrote:

On 12/10/2015 03:36 PM, Kyrill Tkachov wrote:

I'm okay with delaying this for next stage 1 if people prefer, though I
think it's
pretty low risk.

I think this is something we should fix now.

I agree.


+ x = XEXP (x, 0);
+ if (start > 0)
+   x = gen_rtx_LSHIFTRT (mode, x, GEN_INT (start));

I think this should just use simplify_shift_const. gen_rtx_FOO should be
avoided.

A lot of combine does that, it really is stuck in the 80's.  I wouldn't
use simplify_shift_const here, but simply simplify_gen_binary.


In change_zero_ext it also creates the final AND with gen_rtx_AND.
Perhaps that should also be changed to simplify_gen_binary.
But I haven't seen any cases where it causes trouble yet, so we
could fix it up separately.



The patch is okay with or without that change.


Thanks for the suggestions.
I'll go with simplify_gen_binary.
Here's what I'll be committing.

Thanks,
Kyrill



2015-12-10  Kyrylo Tkachov  

* combine.c (change_zero_ext): Do not create a shift of zero length.



Segher



diff --git a/gcc/combine.c b/gcc/combine.c
index 9465e5927145e768f1a5fc43ce7c3621033d8aef..8601d8983ce345e2129dd047b3520d98c0582842 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -11037,7 +11037,8 @@ change_zero_ext (rtx *src)
 	  if (BITS_BIG_ENDIAN)
 	start = GET_MODE_PRECISION (mode) - size - start;
 
-	  x = gen_rtx_LSHIFTRT (mode, XEXP (x, 0), GEN_INT (start));
+	  x = simplify_gen_binary (LSHIFTRT, mode,
+   XEXP (x, 0), GEN_INT (start));
 	}
   else if (GET_CODE (x) == ZERO_EXTEND
 	   && GET_CODE (XEXP (x, 0)) == SUBREG
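
To illustrate why a simplifying constructor sidesteps the zero-length shift, here is a toy model in plain C.  None of these names exist in GCC: toy_expr, toy_gen_shift and toy_simplify_gen_shift are hypothetical, and the real simplify_gen_binary handles far more than this single case.  The sketch only assumes that a shift by zero folds to its operand.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical expression codes; not GCC's rtx codes.  */
enum toy_code { TOY_REG, TOY_LSHIFTRT };

/* Hypothetical expression node.  */
struct toy_expr
{
  enum toy_code code;
  const struct toy_expr *operand;  /* Shifted value; NULL for TOY_REG.  */
  unsigned amount;                 /* Shift count; 0 for TOY_REG.  */
};

/* Raw constructor: always builds a shift node, even for a zero count.  */
struct toy_expr
toy_gen_shift (const struct toy_expr *x, unsigned amount)
{
  struct toy_expr e = { TOY_LSHIFTRT, x, amount };
  return e;
}

/* Simplifying constructor: a shift by zero is just the operand itself.  */
struct toy_expr
toy_simplify_gen_shift (const struct toy_expr *x, unsigned amount)
{
  if (amount == 0)
    return *x;
  return toy_gen_shift (x, amount);
}
```

With the raw constructor the caller has to remember the "if (start > 0)" guard; with the simplifying one the guard is unnecessary.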


Re: [PATCH 2/4][AArch64] Increase the loop peeling limit

2015-12-14 Thread James Greenhalgh
On Thu, Dec 03, 2015 at 03:07:43PM -0600, Evandro Menezes wrote:
> On 11/20/2015 05:53 AM, James Greenhalgh wrote:
> >On Thu, Nov 19, 2015 at 04:04:41PM -0600, Evandro Menezes wrote:
> >>On 11/05/2015 02:51 PM, Evandro Menezes wrote:
> >>>2015-11-05  Evandro Menezes 
> >>>
> >>>   gcc/
> >>>
> >>>   * config/aarch64/aarch64.c (aarch64_override_options_internal):
> >>>   Increase loop peeling limit.
> >>>
> >>>This patch increases the limit for the number of peeled insns.
> >>>With this change, I noticed no major regression in either
> >>>Geekbench v3 or SPEC CPU2000 while some benchmarks, typically FP
> >>>ones, improved significantly.
> >>>
> >>>I tested this tuning on Exynos M1 and on A57.  ThunderX seems to
> >>>benefit from this tuning too.  However, I'd appreciate comments
> >>>from other stakeholders.
> >>
> >>Ping.
> >I'd like to leave this for a call from the port maintainers. I can see why
> >this leads to more opportunities for vectorization, but I'm concerned about
> >the wider impact on code size. Certainly I wouldn't expect this to be our
> >default at -O2 and below.
> >
> >My gut feeling is that this doesn't really belong in the back-end (there are
> >presumably good reasons why the default for this parameter across GCC has
> >fluctuated from 400 to 100 to 200 over recent years), but as I say, I'd
> >like Marcus or Richard to make the call as to whether or not we take this
> >patch.
> 
> Please, correct me if I'm wrong, but loop peeling is enabled only
> with loop unrolling (and with PGO).  If so, then extra code size is
> not a concern, for this heuristic is only active when unrolling
> loops, when code size is already of secondary importance.

My understanding was that loop peeling is enabled from -O2 upwards, and
is also used to partially peel unaligned loops for vectorization (allowing
the vector code to be well aligned), or to completely peel inner loops which
may then become amenable to SLP vectorization.

If I'm wrong then I take back these objections. But I was sure this
parameter was used in a number of situations outside of just
-funroll-loops/-funroll-all-loops . Certainly I remember seeing performance
sensitivities to this parameter at -O3 in some internal workloads I was
analysing.

Thanks,
James



Re: [v4] avoid alignment of static variables affecting stack's

2015-12-14 Thread Bernd Schmidt

On 12/14/2015 10:07 AM, Richard Biener wrote:


Note that we also record alignment to make sure we can spill to properly
aligned stack slots.



I don't see why we don't need to do that for used statics/externs.  That is
are you sure we never need to spill a var of their type?


Why would they be different from other global variables declared outside 
a function? We don't have to worry about those.



Bernd



[PATCH][combine] PR rtl-optimization/68651 Try changing rtx from (r + r) to (r << 1) to aid recognition

2015-12-14 Thread Kyrill Tkachov

Hi all,

For this PR I want to teach combine to deal with unrecognisable patterns that 
contain a sub-expression like
(x + x) by transforming it into (x << 1) and trying to match the result. This 
is because some instruction
sets like arm and aarch64 can combine shifts with other arithmetic operations 
or have shifts in their RTL representation
of more complex operations (like the aarch64 UBFIZ instruction which can be 
expressed as a zero_extend+ashift pattern).

Due to a change in rtx costs for -mcpu=cortex-a53 in GCC 5 we no longer expand an 
expression like x * 2 as x << 1
but rather as x + x, which hurts combination opportunities due to this 
deficiency.

This patch addresses the issue in the recog_for_combine function in combine.c 
in a similar way to the change_zero_ext
trick. That is, if recog_for_combine fails to match a pattern it replaces 
all instances of x + x in the
rtx with x << 1 and tries again.

This way I've been able to get combine to more aggressively generate the 
arithmetic+shift forms of instructions for
-mcpu=cortex-a53 on aarch64 as well as instructions like ubfiz and sbfiz that 
contain shift-by-immediate sub-expressions.

This patch shouldn't affect rtxes that already match, so it should have no 
fallout on other cases.

Bootstrapped and tested on arm, aarch64, x86_64.

For the testcase:
int
foo (int x)
{
  return (x * 2) & 65535;
}

int
bar (int x, int y)
{
  return (x * 2) | y;
}

with -O2 -mcpu=cortex-a53 for aarch64 we now generate:
foo:
ubfiz   w0, w0, 1, 15
ret

bar:
orr w0, w1, w0, lsl 1
ret

whereas without this patch we generate:
foo:
add w0, w0, w0
uxth    w0, w0
ret

bar:
add w0, w0, w0
orr w0, w0, w1
ret
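
The arithmetic behind the foo case can be checked in plain C.  This is only a sketch: the function names are made up, and unsigned 32-bit arithmetic is used to model the wrap-around without signed overflow, whereas the testcase above uses int.  It assumes the usual reading of "ubfiz w0, w0, 1, 15": take the low 15 bits of the source and place them at bit position 1, with every other bit zero.

```c
#include <assert.h>
#include <stdint.h>

/* What the source expresses after expansion: x + x, then keep the low
   16 bits.  */
uint32_t
double_then_mask (uint32_t x)
{
  return (x + x) & 65535u;
}

/* Model of "ubfiz w0, w0, 1, 15": low 15 bits of X inserted at bit 1.  */
uint32_t
ubfiz_1_15 (uint32_t x)
{
  return (x & 0x7fffu) << 1;
}

/* The rewrite the patch tries when recognition fails: x << 1 for x + x.  */
uint32_t
shift_by_one (uint32_t x)
{
  return x << 1;
}
```

Since x + x and x << 1 agree for every 32-bit value, masking either with 65535 keeps bits 1..15 of the doubled value, which is exactly the 15-bit field that ubfiz inserts.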


PR 68651 is a code quality regression for GCC 5 and GCC 6 that was introduced 
due to updated rtx costs
for -mcpu=cortex-a53 that affected expansion.  The costs changes were correct 
(to the extent that rtx
costs have any meaning) and I think this is a deficiency in combine that should 
be fixed.

I wouldn't propose to backport this to GCC 5.

P.S. For the ubfiz testcase above to combine successfully it needs an aarch64 
rtx costs issue to be resolved
that I proposed a fix for in 
https://gcc.gnu.org/ml/gcc-patches/2015-12/msg00526.html.
Otherwise the backend wrongly rejects it on the grounds of wrong costs.

Is this ok for trunk once the costs issue at 
https://gcc.gnu.org/ml/gcc-patches/2015-12/msg00526.html
gets resolved?

Thanks,
Kyrill

2015-12-14  Kyrylo Tkachov  

PR rtl-optimization/68651
* combine.c (change_shift_by_one): New function.
(change_rtx_with_func): Likewise.
(recog_for_combine): Use the above to transform reg + reg
sub-expressions into reg << 1 within non-recognisable patterns
and try to match the result.

2015-12-14  Kyrylo Tkachov  

PR rtl-optimization/68651
* gcc.target/aarch64/pr68651_1.c: New test.
diff --git a/gcc/combine.c b/gcc/combine.c
index c008d2a786ebeaa7560acbd60c7c2e8cdacdc9aa..9465e5927145e768f1a5fc43ce7c3621033d8aef 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -11063,6 +11063,60 @@ change_zero_ext (rtx *src)
   return changed;
 }
 
+/* Replace in SRC all sub-rtxes of the form x + x
+   with x << 1 to help recognition on targets with combined
+   shift operations.  Return true iff any such change was made.  */
+
+static bool
+change_shift_by_one (rtx *src)
+{
+  bool changed = false;
+
+  subrtx_ptr_iterator::array_type array;
+  FOR_EACH_SUBRTX_PTR (iter, array, src, NONCONST)
+{
+  rtx x = **iter;
+  machine_mode mode = GET_MODE (x);
+
+  if (GET_CODE (x) == PLUS && GET_MODE_CLASS (mode) == MODE_INT
+	  && !side_effects_p (XEXP (x, 0))
+	  && rtx_equal_p (XEXP (x, 0), XEXP (x, 1)))
+	x = simplify_gen_binary (ASHIFT, mode, XEXP (x, 0), const1_rtx);
+  else
+	continue;
+
+  SUBST (**iter, x);
+  changed = true;
+}
+
+  return changed;
+}
+
+/* Modify the sources of all sets in PAT using the function FUNC that takes
+   a pointer to the rtx to modify and returns true iff it made any
+   modifications.  Used by recog_for_combine to twiddle non-matching patterns
+   into something recognisable.  */
+
+static bool
+change_rtx_with_func (rtx pat, bool (*func) (rtx *))
+{
+  bool changed = false;
+
+  if (GET_CODE (pat) == SET)
+changed = func (&SET_SRC (pat));
+  else if (GET_CODE (pat) == PARALLEL)
+{
+  int i;
+  for (i = 0; i < XVECLEN (pat, 0); i++)
+	{
+	  rtx set = XVECEXP (pat, 0, i);
+	  if (GET_CODE (set) == SET)
+	changed |= func (&SET_SRC (set));
+	}
+}
+  return changed;
+}
+
 /* Like recog, but we receive the address of a pointer to a new pattern.
We try to match the rtx that the pointer points to.
If that fails, we may try to modify or replace the pattern,
@@ -11073,6 +11127,9 @@ change_zero_ext (rtx *src)
to the equivalent AND and perhaps LSHIFTRT patterns, and try with 

Re: [v4] avoid alignment of static variables affecting stack's

2015-12-14 Thread Richard Biener
On Mon, Dec 14, 2015 at 9:44 AM, Jan Beulich  wrote:
 On 14.12.15 at 09:35,  wrote:
>> On Fri, Dec 11, 2015 at 2:54 PM, Bernd Schmidt  wrote:
>>> On 12/11/2015 02:48 PM, Jan Beulich wrote:

 Function (or more narrow) scope static variables (as well as others not
 placed on the stack) should also not have any effect on the stack
 alignment. I noticed the issue first with Linux's dynamic_pr_debug()
 construct using an 8-byte aligned sub-file-scope local variable.

 According to my checking bad behavior started with 4.6.x (4.5.3 was
 still okay), but generated code got quite a bit worse as of 4.9.0.

 [v4: Bail early, using is_global_var(), as requested by Bernd.]
>>>
>>>
>>> In case I haven't made it obvious, this is OK.
>>
>> But I wonder if it makes sense because shortly after the early-out we check
>>
>>   if (TREE_STATIC (var)
>>   || DECL_EXTERNAL (var)
>>   || (TREE_CODE (origvar) == SSA_NAME && use_register_for_decl 
>> (var)))
>>
>> so either there are obvious cleanup opportunities left or the patch is
>> broken.
>
> Looks like a cleanup opportunity I overlooked when following
> Bernd's advice.

Well, looking at callers it doesn't sound so obvious ...

/* Expand all variables used in the function.  */

static rtx_insn *
expand_used_vars (void)
{
..
  len = vec_safe_length (cfun->local_decls);
  FOR_EACH_LOCAL_DECL (cfun, i, var)
{
...
  /* We didn't set a block for static or extern because it's hard
 to tell the difference between a global variable (re)declared
 in a local scope, and one that's really declared there to
 begin with.  And it doesn't really matter much, since we're
 not giving them stack space.  Expand them now.  */
  else if (TREE_STATIC (var) || DECL_EXTERNAL (var))
expand_now = true;
...
  if (expand_now)
expand_one_var (var, true, true);

but then you'll immediately return.

So to make sense of your patch a larger refactoring is necessary.  expand_one_var
also later contains

  else if (DECL_EXTERNAL (var))
;
  else if (DECL_HAS_VALUE_EXPR_P (var))
;
  else if (TREE_STATIC (var))
;

so expects externals/statics after recording alignment.  Which means
recording alignment is necessary.

Note that we also record alignment to make sure we can spill to properly
aligned stack slots.

I don't see why we don't need to do that for used statics/externs.  That is
are you sure we never need to spill a var of their type?

Richard.

> Jan
>


[committed] Adjust nvptx-related test annotations

2015-12-14 Thread Alexander Monakov
Hello,

I have committed the following testsuite patch as obvious.

On NVPTX, __builtin_apply does not work because in emitted assembly all calls
(including indirect) are annotated with expected function prototype.  As a
result, "untyped_assembly" effective-target test was added to skip tests with
__builtin_apply on NVPTX.  This patch corrects two instances where a wrong
test was used: one test checked for "alloca" instead, and the other hard-coded
the target triplet.

Noticed while testing NVPTX with alloca provided under -msoft-stack.

* gcc.dg/builtin-return-1.c: Correct effective-target test.
* gcc.dg/stack-usage-2.c: Use effective-target test.

Index: gcc.dg/builtin-return-1.c
===
--- gcc.dg/builtin-return-1.c   (revision 231608)
+++ gcc.dg/builtin-return-1.c   (working copy)
@@ -2,7 +2,7 @@
 /* Originator: Andrew Church  */
 /* { dg-do run } */
 /* { dg-xfail-run-if "PR36571 untyped return is char register" { "avr-*-*" } { 
"*" } { "" } } */
-/* { dg-require-effective-target alloca } */
+/* { dg-require-effective-target untyped_assembly } */
 /* This used to fail on SPARC because the (undefined) return
value of 'bar' was overwriting that of 'foo'.  */
 
Index: gcc.dg/stack-usage-2.c
===
--- gcc.dg/stack-usage-2.c  (revision 231608)
+++ gcc.dg/stack-usage-2.c  (working copy)
@@ -1,7 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-Wstack-usage=512" } */
-/* nvptx gets very upset with mismatched function types.  */
-/* { dg-skip-if "" { nvptx-*-* } { "*" } { "" } } */
+/* { dg-require-effective-target untyped_assembly } */
 
 int foo1 (void)  /* { dg-bogus "stack usage" } */
 {



Re: [build] Only support -gstabs on Mac OS X if assember supports it (PR target/67973)

2015-12-14 Thread Rainer Orth
Hi Iain,

>> However, I'm not really comfortable with this solution.  Initially, I
>> forgot to wrap the -Q option to as in %{gstabs*:...}, which led to a
>> bootstrap failure: the gas- and LLVM-based assemblers differ in a
>> number of other ways, as can be seen when comparing gcc/auto-host.h:
>
> FAOD, 
> the changes below only occur if you omit the guard on “-Q” ?
> or they are present always?

they are from previous builds, one with the LLVM-based /usr/bin/as, the
other configured with --with-as=/vol/gcc/bin/as-6.4 (gas-based as from
Xcode 6.4).

>> Given this, I'd rather have us not support stabs at all than via this
>> half-hearted approach.
>
> I agree, we should just disable it if the “default” assembler doesn’t
> support it, so most of your patch would be unchanged.
> The actual practical outcome is really not terribly severe, it only affects
> folks trying to target x86-darwin8 ( the last OS to use stabs as the first
> choice), and even those systems support some debugging with DWARF.
>> 
>> What do you think?
>
> My view is it should still be a configure test, rather than a blanket
> disable for d15+ (simply because I’ve got back-ports of the newer tools
> that should be useable across the Darwin range, and hopefully those
> back-ports will be published before Xmas).  The same issues would generally
> apply there.

Fine with me, it's certainly the safest approach.  I'll update the patch
accordingly.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH][AArch64] PR target/68696 FAIL: gcc.target/aarch64/vbslq_u64_1.c scan-assembler-times bif\tv 1

2015-12-14 Thread Kyrill Tkachov

Ping.
https://gcc.gnu.org/ml/gcc-patches/2015-12/msg00826.html

Thanks,
Kyrill
On 08/12/15 09:21, Kyrill Tkachov wrote:

Hi all,

The test gcc.target/aarch64/vbslq_u64_1.c started failing recently due to some 
tree-level changes.
This just exposed a deficiency in our xor-and-xor pattern for the vector 
bit-select pattern:
aarch64_simd_bsl_internal.

We now fail to match the rtx:
(set (reg:V4SI 79)
(xor:V4SI (and:V4SI (xor:V4SI (reg:V4SI 32 v0 [ a ])
(reg/v:V4SI 77 [ b ]))
(reg:V4SI 34 v2 [ mask ]))
(reg/v:V4SI 77 [ b ])))

whereas before combine attempted:
(set (reg:V4SI 79)
(xor:V4SI (and:V4SI (xor:V4SI (reg/v:V4SI 77 [ b ])
(reg:V4SI 32 v0 [ a ]))
(reg:V4SI 34 v2 [ mask ]))
(reg/v:V4SI 77 [ b ])))

Note that just the order of the operands of the inner XOR has changed.
This could be solved by making the second operand of the outer XOR a 4th operand
of the pattern, enforcing that it should be equal to operand 2 or 3 in the 
pattern
condition and performing the appropriate swapping in the output template.
However, the aarch64_simd_bsl_internal pattern is expanded to by other
places in aarch64-simd.md and updating all the callsites to add a 4th operand is
wasteful and makes them harder to understand.
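
As a sanity check on the two rtx forms quoted above, here is a small scalar sketch in C.  The function names are hypothetical and a 64-bit integer stands in for one vector lane; it shows that the xor-and-xor form selects bits of a where the mask is set and bits of b elsewhere, and that swapping the operands of the inner XOR does not change the result.

```c
#include <assert.h>
#include <stdint.h>

/* Reference bit-select: bits of A where MASK is 1, bits of B elsewhere.  */
uint64_t
bsl_reference (uint64_t mask, uint64_t a, uint64_t b)
{
  return (a & mask) | (b & ~mask);
}

/* xor-and-xor form with the inner XOR written as (a ^ b).  */
uint64_t
bsl_xor_ab (uint64_t mask, uint64_t a, uint64_t b)
{
  return ((a ^ b) & mask) ^ b;
}

/* Same form with the inner XOR operands swapped, as in the old rtx.  */
uint64_t
bsl_xor_ba (uint64_t mask, uint64_t a, uint64_t b)
{
  return ((b ^ a) & mask) ^ b;
}
```

Both orderings compute the same value, so the pattern only has to cope with the operand that is repeated in the outer XOR appearing in either inner position.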

Therefore this patch adds a new define_insn with the match_dup of operand 2 in
the outer XOR.  I also had to update the alternatives/constraints in the pattern
and the output template. Basically it involves swapping operands 2 and 3 around 
in the
constraints and output templates.

The test now combines to a single vector bif instruction again.

Bootstrapped and tested on aarch64.

Ok for trunk?

Thanks,
Kyrill

2015-12-08  Kyrylo Tkachov  

PR target/68696
* config/aarch64/aarch64-simd.md (*aarch64_simd_bsl_alt):
New pattern.
(aarch64_simd_bsl_internal): Update comment to reflect
the above.




Re: [RFC] Dump ssaname info for default defs

2015-12-14 Thread Tom de Vries

On 14/12/15 09:47, Richard Biener wrote:

On Fri, Dec 11, 2015 at 6:05 PM, Tom de Vries  wrote:

Hi,

atm, we dump ssa-name info for lhs-es of statements. That leaves out the ssa
names with default defs.

This proof-of-concept patch prints the ssa-name info for default defs, in
the following format:
...
__attribute__((noclone, noinline))
bar (intD.6 * cD.1755, intD.6 * dD.1756)
# PT = nonlocal
# DEFAULT_DEF c_2(D)
# PT = { D.1762 } (nonlocal)
# ALIGN = 4, MISALIGN = 0
# DEFAULT_DEF d_4(D)
{
;;   basic block 2, loop depth 0, count 0, freq 1, maybe hot
;;prev block 0, next block 1, flags: (NEW, REACHABLE)
;;pred:   ENTRY [100.0%]  (FALLTHRU,EXECUTABLE)
   # .MEM_3 = VDEF <.MEM_1(D)>
   *c_2(D) = 1;
   # .MEM_5 = VDEF <.MEM_3>
   *d_4(D) = 2;
   # VUSE <.MEM_5>
   return;
;;succ:   EXIT [100.0%]  (EXECUTABLE)

}
...

Good idea? Any further comments, f.i. on formatting?


I've had a similar patch in my dev tree for quite some while but never
pushed it because
of "formatting"...

That said,

+  if (gimple_in_ssa_p (fun))

Please add flags & TDF_ALIAS here to avoid issues with dump-file scanning.



Done.


+{
+  arg = DECL_ARGUMENTS (fndecl);
+  while (arg)
+   {
+ tree def = ssa_default_def (fun, arg);
+ if (flags & TDF_ALIAS)
+   dump_ssaname_info_to_file (file, def);
+ fprintf (file, "# DEFAULT_DEF ");
+ print_generic_expr (file, def, dump_flags);

Rather than

# DEFAULT_DEF d_4(D)

I'd print

d_4(D) = GIMPLE_NOP;

(or how gimple-nop is printed - that is, just print the def-stmt).

My local patch simply adjusted the dumping of function
locals, thus I amended the existing

   if (gimple_in_ssa_p (cfun))
 for (ix = 1; ix < num_ssa_names; ++ix)
   {
 tree name = ssa_name (ix);
 if (name && !SSA_NAME_VAR (name))
   {

loop.  Of course that intermixed default-defs with other anonymous
SSA vars which might be a little confusing.

But prepending the list of locals with

type d_4(D) = NOP();

together with SSA info might be the best.


Done.

In addition, I added printing of SSA_NAME_VAR(def) as argument to NOP.
I think that's even more clear: the var is printed in the same format as 
in the arguments list, so it makes it easier to relate the two:

...
__attribute__((noclone, noinline))
bar (intD.6 * cD.1755, intD.6 * dD.1756)
{
  # PT = nonlocal
  intD.6 * c_2(D) = NOP(cD.1755);
  # PT = { D.1762 } (nonlocal)
  # ALIGN = 4, MISALIGN = 0
  intD.6 * d_4(D) = NOP(dD.1756);
  intD.6 _6;
  intD.6 _7;
...


 Note there is also the
static chain and the result decl (if DECL_BY_REFERENCE) to print.



Done, though I haven't tested that bit yet.

Thanks,
- Tom
Dump default defs for arguments, static chain and decl-by-reference

2015-12-14  Tom de Vries  

	* gimple-pretty-print.c (dump_ssaname_info_to_file): New function.
	* gimple-pretty-print.h (dump_ssaname_info_to_file): Declare.
	* tree-cfg.c (dump_default_def): New function.
	(dump_function_to_file): Dump default defs for arguments, static chain,
	and decl-by-reference.

---
 gcc/gimple-pretty-print.c | 11 +++
 gcc/gimple-pretty-print.h |  1 +
 gcc/tree-cfg.c| 46 ++
 3 files changed, 58 insertions(+)

diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c
index f1abf5c..01e9b6b 100644
--- a/gcc/gimple-pretty-print.c
+++ b/gcc/gimple-pretty-print.c
@@ -1887,6 +1887,17 @@ dump_ssaname_info (pretty_printer *buffer, tree node, int spc)
 }
 }
 
+/* As dump_ssaname_info, but dump to FILE.  */
+
+void
+dump_ssaname_info_to_file (FILE *file, tree node, int spc)
+{
+  pretty_printer buffer;
+  pp_needs_newline (&buffer) = true;
+  buffer.buffer->stream = file;
+  dump_ssaname_info (&buffer, node, spc);
+  pp_flush (&buffer);
+}
 
 /* Dump a PHI node PHI.  BUFFER, SPC and FLAGS are as in pp_gimple_stmt_1.
The caller is responsible for calling pp_flush on BUFFER to finalize
diff --git a/gcc/gimple-pretty-print.h b/gcc/gimple-pretty-print.h
index 1ab24b8..740965f 100644
--- a/gcc/gimple-pretty-print.h
+++ b/gcc/gimple-pretty-print.h
@@ -34,5 +34,6 @@ extern void print_gimple_expr (FILE *, gimple *, int, int);
 extern void pp_gimple_stmt_1 (pretty_printer *, gimple *, int, int);
 extern void gimple_dump_bb (FILE *, basic_block, int, int);
 extern void gimple_dump_bb_for_graph (pretty_printer *, basic_block);
+extern void dump_ssaname_info_to_file (FILE *, tree, int);
 
 #endif /* ! GCC_GIMPLE_PRETTY_PRINT_H */
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 0c624aa..163fef6 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -7312,6 +7312,23 @@ move_sese_region_to_fn (struct function *dest_cfun, basic_block entry_bb,
   return bb;
 }
 
+/* Dump default def DEF to file FILE using FLAGS and indentation
+   SPC.  */
+
+static void
+dump_default_def (FILE *file, tree def, int spc, int flags)
+{
+  for (int i = 0; i < spc; ++i)

[testsuite] Tweak gcc.dg/torture/pr68264.c for Solaris

2015-12-14 Thread Eric Botcazou
The testcase was already tweaked for Glibc but it needs to be further tweaked 
because of bugs in Solaris' libm.  With the attached patch the test now passes 
on my SPARC/Solaris 10 and my x86/Solaris 10 test machines.

OK for the mainline?


2015-12-14  Eric Botcazou  

* gcc.dg/torture/pr68264.c: Tweak for Solaris.

-- 
Eric Botcazou

Index: gcc.dg/torture/pr68264.c
===
--- gcc.dg/torture/pr68264.c	(revision 231605)
+++ gcc.dg/torture/pr68264.c	(working copy)
@@ -68,14 +68,24 @@ test (void)
   TEST (cosh (d), LARGE_ERANGE);
   TEST (sinh (d), LARGE_ERANGE);
   TEST (log (d), LARGE_NEG_EDOM);
-  TEST (log2 (d), LARGE_NEG_EDOM);
+#if defined (__sun__) && defined (__unix__)
+  /* Disabled due to a bug in Solaris libm.  */
+  if (0)
+#endif
+TEST (log2 (d), LARGE_NEG_EDOM);
   TEST (log10 (d), LARGE_NEG_EDOM);
   /* Disabled due to glibc PR 6792, fixed in Apr 2015.  */
   if (0)
 TEST (log1p (d), LARGE_NEG_EDOM);
   TEST (exp (d), POWER_ERANGE);
-  TEST (exp2 (d), POWER_ERANGE);
-  TEST (expm1 (d), POWER_ERANGE);
+#if defined (__sun__) && defined (__unix__)
+  /* Disabled due to a bug in Solaris libm.  */
+  if (0)
+#endif
+{
+  TEST (exp2 (d), POWER_ERANGE);
+  TEST (expm1 (d), POWER_ERANGE);
+}
   TEST (sqrt (d), LARGE_NEG_EDOM);
   TEST (pow (100.0, d), POWER_ERANGE);
   TEST (pow (i, d), POWER_ERANGE);
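
The `#if` / `if (0)` / `#endif` idiom used in the patch above can be seen in isolation in the following sketch. `SKIP_PLATFORM`, `check` and `run_checks` are hypothetical names that stand in for the `__sun__`/`__unix__` test and the testcase's `TEST` macro; nothing here is taken from pr68264.c itself.

```c
#include <stdio.h>

/* Number of checks actually executed by run_checks.  */
static int checks_run;

/* Hypothetical stand-in for the testcase's TEST macro.  */
static void
check (const char *name)
{
  printf ("checking %s\n", name);
  checks_run++;
}

/* Run three checks; the middle one is skipped when SKIP_PLATFORM is
   defined (an assumed macro, playing the role of the Solaris test).  */
int
run_checks (void)
{
  checks_run = 0;
  check ("log");
#ifdef SKIP_PLATFORM
  /* The guard contributes only an `if (0)`, so the statement below is
     still compiled everywhere but never executed on that platform.  */
  if (0)
#endif
    check ("log2");
  check ("log10");
  return checks_run;
}
```

Without `-DSKIP_PLATFORM` all three checks run; with it, the `log2` check is still parsed and type-checked but becomes dead code, which is what keeps the disabled tests from bit-rotting.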


Re: [v3 PATCH] Document the implementation of Logical Operator Type Traits.

2015-12-14 Thread Jonathan Wakely

On 12/12/15 22:34 +0200, Ville Voutilainen wrote:

   Document the implementation of Logical Operator Type Traits.

   * doc/html/index.html: Regenerate.
   * doc/html/manual/status.html: Likewise.
   * doc/xml/manual/status_cxx2017.xml: Add P0013R1 to C++ 201z
   and to Library Fundamentals 2 TS.


OK, thanks.

N.B. there's no need to include the HTML diff in the patch posted for
review, because it's generated (same rule as for autoconf-generated
files).

The docbook HTML output is barely readable, it's just a wall of
markup, so I doubt anyone looks at it anyway.



Re: [build] Only support -gstabs on Mac OS X if assember supports it (PR target/67973)

2015-12-14 Thread Iain Sandoe
Hi Rainer, 

Thanks for looking at this!

> On 14 Dec 2015, at 10:40, Rainer Orth  wrote:
> 
> As described in PR target/67973, newer assemblers on Mac OS X, which
> are based on LLVM instead of gas, don't support .stab* directives any
> longer.  The following patch detects this situation and tries to fall
> back to the older gas-based as if it is still accessible via as -Q.
> 
> Tested on x86_64-apple-darwin15.2.0 and as expected the -gstabs* tests
> now pass.
> 
> However, I'm not really comfortable with this solution.  Initially, I
> forgot to wrap the -Q option to as in %{gstabs*:...}, which led to a
> bootstrap failure: the gas- and LLVM-based assemblers differ in a
> number of other ways, as can be seen when comparing gcc/auto-host.h:

FAOD, do the changes below only occur if you omit the guard on “-Q”,
or are they always present?

> --- build.llvm-as/gcc/auto-host.h 2015-11-27 10:53:31.0 +0100
> +++ build.gas/gcc/auto-host.h 2015-12-04 20:25:30.0 +0100
> @@ -351 +357 @@
> -/* #undef HAVE_AS_GDWARF2_DEBUG_FLAG */
> +#define HAVE_AS_GDWARF2_DEBUG_FLAG 1
> @@ -369 +375 @@
> -/* #undef HAVE_AS_GSTABS_DEBUG_FLAG */
> +#define HAVE_AS_GSTABS_DEBUG_FLAG 1
> @@ -388 +394 @@
> -/* #undef HAVE_AS_IX86_FFREEP */
> +#define HAVE_AS_IX86_FFREEP 1
> @@ -412 +418 @@
> -#define HAVE_AS_IX86_INTERUNIT_MOVQ 1
> +#define HAVE_AS_IX86_INTERUNIT_MOVQ 0
> @@ -424 +430 @@
> -#define HAVE_AS_IX86_REP_LOCK_PREFIX 1
> +/* #undef HAVE_AS_IX86_REP_LOCK_PREFIX */
> @@ -1176 +1182 @@

> -#define HAVE_GAS_CFI_PERSONALITY_DIRECTIVE 1
> +#define HAVE_GAS_CFI_PERSONALITY_DIRECTIVE 0
> @@ -1179 +1185 @@
> -#define HAVE_GAS_CFI_SECTIONS_DIRECTIVE 1
> +#define HAVE_GAS_CFI_SECTIONS_DIRECTIVE 0

^^^ These two are definitely not safe without other changes (I have a local 
patch, but it might not be stage3 stuff).

> @@ -1183 +1189 @@
> -#define HAVE_GAS_DISCRIMINATOR 1
> +/* #undef HAVE_GAS_DISCRIMINATOR */
> @@ -1298 +1304 @@
> -#define HAVE_GNU_AS 0
> +#define HAVE_GNU_AS 1
> 
> So, we can be pretty certain to hit cases where some file compiles and
> assembles without -gstabs, but fails to assemble with -gstabs.  Not
> exactly the user experience I prefer.
> 
> Given this, I'd rather have us not support stabs at all than via this
> half-hearted approach.

I agree, we should just disable it if the “default” assembler doesn’t support
it, so most of your patch would be unchanged.
The actual practical outcome is really not terribly severe: it only affects
folks trying to target x86-darwin8 (the last OS to use stabs as the first
choice), and even those systems support some debugging with DWARF.

> 
> What do you think?

My view is it should still be a configure test, rather than a blanket disable 
for d15+ (simply because I’ve got back-ports of the newer tools that should be 
useable across the Darwin range, and hopefully those back-ports will be 
published before Xmas).  The same issues would generally apply there.

Iain

> 
>   Rainer
> 
> 
> 2015-12-11  Rainer Orth  
> 
>   PR target/67973
>   * configure.ac (gcc_cv_as_stabs_directive): New test.
>   (gcc_cv_as_darwin_stabs_Q): New test.
>   * configure: Regenerate.
>   * config.in: Regenerate.
>   * config/darwin.h (DBX_DEBUGGING_INFO): Wrap in
>   HAVE_AS_STABS_DIRECTIVE.
>   (PREFERRED_DEBUGGING_TYPE): Likewise.
>   * config/i386/darwin.h (PREFERRED_DEBUGGING_TYPE): Only include
>   DBX_DEBUG if HAVE_AS_STABS_DIRECTIVE.
>   (ASM_STABS_Q_SPEC): Define.
>   (ASM_SPEC): Use it.
> 
> # HG changeset patch
> # Parent  7029fd86ac40d7ff34e1c43d729c6bf469416643
> Only support -gstabs on Mac OS X if assember supports it (PR target/67973)
> 
> diff --git a/gcc/config/darwin.h b/gcc/config/darwin.h
> --- a/gcc/config/darwin.h
> +++ b/gcc/config/darwin.h
> @@ -400,12 +400,13 @@ extern GTY(()) int darwin_ms_struct;
> 
> #define ASM_DEBUG_SPEC  "%{g*:%{!g0:%{!gdwarf*:--gstabs}}}"
> 
> -/* We still allow output of STABS.  */
> -
> +/* We still allow output of STABS if the assembler supports it.  */
> +#ifdef HAVE_AS_STABS_DIRECTIVE
> #define DBX_DEBUGGING_INFO 1
> +#define PREFERRED_DEBUGGING_TYPE DBX_DEBUG
> +#endif
> 
> #define DWARF2_DEBUGGING_INFO 1
> -#define PREFERRED_DEBUGGING_TYPE DBX_DEBUG
> 
> #define DEBUG_FRAME_SECTION   "__DWARF,__debug_frame,regular,debug"
> #define DEBUG_INFO_SECTION"__DWARF,__debug_info,regular,debug"
> diff --git a/gcc/config/i386/darwin.h b/gcc/config/i386/darwin.h
> --- a/gcc/config/i386/darwin.h
> +++ b/gcc/config/i386/darwin.h
> @@ -111,9 +111,16 @@ extern int darwin_emit_branch_islands;
>   %{g: %{!fno-eliminate-unused-debug-symbols: 
> -feliminate-unused-debug-symbols }} " \
>   DARWIN_CC1_SPEC
> 
> +/* Pass -Q to assembler if necessary for stabs support.  */
> +#ifdef HAVE_AS_DARWIN_STABS_Q
> +#define ASM_STABS_Q_SPEC " %{gstabs*:-Q}"
> +#else
> +#define ASM_STABS_Q_SPEC ""
> +#endif
> +
> #undef ASM_SPEC
> #define 

Re: [PATCH][ARM] PR target/68648: Fold NOT of CONST_INT in andsi_iorsi3_notsi splitter

2015-12-14 Thread Kyrill Tkachov

Ping.
https://gcc.gnu.org/ml/gcc-patches/2015-12/msg00708.html

Thanks,
Kyrill

On 07/12/15 10:39, Kyrill Tkachov wrote:

Hi all,

In this PR we ICE because during post-reload splitting we generate the insn:
(insn 27 26 11 2 (set (reg:SI 0 r0 [orig:121 D.4992 ] [121])
(and:SI (not:SI (const_int 1 [0x1]))
(reg:SI 0 r0 [orig:121 D.4992 ] [121])))
 (nil))


The splitter at fault is andsi_iorsi3_notsi, which accepts a const_int in
operands[3] and outputs (not (match_dup 3)).  It should really be trying to
constant-fold the result first.  This patch does that by calling
simplify_gen_unary to generate the complement of operands[3] if it's a
register, or the appropriate const_int rtx with the correctly folded result,
which will still properly match the arm bic-immediate instruction.
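
In plain C terms, the folding described above amounts to the following. This is only an arithmetic illustration under the assumption of 32-bit SImode values; the function names are hypothetical and it does not use RTL or simplify_gen_unary.

```c
#include <stdint.h>

/* Fold the complement of a 32-bit constant up front, which is what the
   splitter should do instead of emitting (not (const_int ...)).  */
uint32_t
fold_not_si (uint32_t c)
{
  return ~c;
}

/* What a bic-style instruction computes: R with the bits of C cleared.  */
uint32_t
bic (uint32_t r, uint32_t c)
{
  return r & ~c;
}

/* The same result expressed as a plain AND with the pre-folded constant,
   so no NOT of a constant ever appears in the operation itself.  */
uint32_t
and_folded (uint32_t r, uint32_t c)
{
  uint32_t folded = fold_not_si (c);
  return r & folded;
}
```

For the constant 1 from the insn in this report, the folded operand is 0xfffffffe, and both forms clear bit 0 of the register.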

Bootstrapped and tested on arm-none-eabi.

Is this ok for trunk?

This appears on GCC 4.9 and GCC 5 and I'll be testing the fix there as well.
Ok for those branches if testing is successful?

Thanks,
Kyrill

2015-12-07  Kyrylo Tkachov  

PR target/68648
* config/arm/arm.md (*andsi_iorsi3_notsi): Try to simplify
the complement of operands[3] during splitting.

2015-12-07  Kyrylo Tkachov  

PR target/68648
* gcc.c-torture/execute/pr68648.c: New test.




Re: [build] Only support -gstabs on Mac OS X if assember supports it (PR target/67973)

2015-12-14 Thread Rainer Orth
Hi Iain,

>> On 14 Dec 2015, at 11:13, Rainer Orth  wrote:
>> 
>>>> However, I'm not really comfortable with this solution.  Initially, I
>>>> forgot to wrap the -Q option to as in %{gstabs*:...}, which led to a
>>>> bootstrap failure: the gas- and LLVM-based assemblers differ in a
>>>> number of other ways, as can be seen when comparing gcc/auto-host.h:
>>> 
>>> FAOD, 
>>> the changes below only occur if you omit the guard on “-Q” ?
>>> or they are present always?
>> 
>> they are from previous builds, one with the LLVM-based /usr/bin/as, the
>> other configured with --with-as=/vol/gcc/bin/as-6.4 (gas-based as from
>> Xcode 6.4).
>
> Hrm, this needs more investigation, and will affect 10.10 too, since xc7 is
> the default there.
> (separate issue, let’s start a new PR, or at least a new thread).

right, but it's only an issue if you switch assemblers (or linkers) used
by gcc without rebuilding.  This has never been safe on any platform.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH 0/3] Re: [PATCH] PR c/68473: sanitize source range-printing within certain macro expansions

2015-12-14 Thread Bernd Schmidt

On 12/11/2015 07:45 PM, David Malcolm wrote:

The third patch in the kit is the earlier workaround for the bug; as
before it degrades diagnostic-printing to just print a caret for the
awkward cases, rather than attempting to print a range.


I'm a little confused now, do the first two patches actually help or are 
they just cleanups? If they have no immediate effect, let's postpone 
them to stage 1 and just do your initial workaround for now.



Bernd


Re: [v4] avoid alignment of static variables affecting stack's

2015-12-14 Thread Jan Beulich
>>> On 14.12.15 at 10:07,  wrote:
> Note that we also record alignment to make sure we can spill to properly
> aligned stack slots.
> 
> I don't see why we don't need to do that for used statics/externs.  That is
> are you sure we never need to spill a var of their type?

No, I'm not, but note that the discussion on v1/v2 of this patch never
really led anywhere, which prompted me to resend the patch after
several months of silence. Also I'm not convinced that hypothetical
spilling needs should lead to unconditional stack alignment increases.
I.e. either at the time alignment gets recorded it is known that a spill
is needed, or the spill gets avoided when stack alignment isn't large
enough (after all such spilling should only be an optimization, not a
correctness requirement aiui). For me to really look into this situation
I'd need to know conditions that would result in such a spill to actually
occur (I've never observed one in practice).

In any event (and again taking into consideration the long period of
silence on the previous discussion thread) I don't mind my change
being reverted if only the problem finally gets taken care of. Globally
changing very many functions' stack alignment in e.g. the Linux kernel
just because of a function local static debugging variable getting
emitted in certain not uncommon configurations is not acceptable imo.
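
To make the scenario concrete, here is a small sketch of the kind of function being discussed: a function-local static with a large alignment. The names are hypothetical and the 64-byte figure is just an example. The static lives in static storage, so its alignment is satisfied there and not by the frame of the function that declares it.

```c
#include <stdint.h>

/* Return the address of a hypothetical over-aligned debugging counter.
   The object has static storage duration, so nothing here requires the
   stack frame of debug_counter to be 64-byte aligned.  */
int *
debug_counter (void)
{
  static int hits __attribute__ ((aligned (64)));
  return &hits;
}

/* Increment the counter and return the new value.  */
int
bump (void)
{
  return ++*debug_counter ();
}
```

Whether the compiler nevertheless raises the function's stack alignment because of `hits` is the behaviour the patch is about.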

Jan



[PATCH 1/2] mark *-knetbsd-* as obsolete

2015-12-14 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2015-12-14  Trevor Saunders  

* config.gcc: Mark knetbsd targets as obsolete.
---
 gcc/config.gcc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 882e413..59f77da 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -237,7 +237,7 @@ md_file=
 # Obsolete configurations.
 case ${target} in
 # Currently there are no obsolete targets.
- nothing   \
+ *-knetbsd-*   \
  )
 if test "x$enable_obsolete" != xyes; then
   echo "*** Configuration ${target} is obsolete." >&2
-- 
2.4.0



[PATCH 0/2] obsolete some old targets

2015-12-14 Thread tbsaunde+gcc
From: Trevor Saunders 

Hi,

http://gcc.gnu.org/ml/gcc-patches/2015-12/msg00365.html reminded me I hadn't
gotten around to marking *-knetbsd and openbsd 2/3 obsolete as I offered to do
back in the spring.

I tested that I could still build on x86_64-linux-gnu, and that I could only
cross compile to i386-openbsd2, i386-openbsd3 and x86_64-knetbsd-gnu with
--enable-obsolete.  Given how late in the cycle we are, I'm not sure whether we
should remove these targets as soon as stage 1 opens, but we might as well
obsolete them, I guess.  OK to commit?

Trev


Trevor Saunders (2):
  mark *-knetbsd-* as obsolete
  obsolete openbsd 2.0 and 3.X

 gcc/config.gcc | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

-- 
2.4.0



[PATCH 2/2] obsolete openbsd 2.0 and 3.X

2015-12-14 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2015-12-14  Trevor Saunders  

* config.gcc: Mark openbsd 2.0 and 3.X as obsolete.
---
 gcc/config.gcc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 59f77da..35ae048 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -238,6 +238,8 @@ md_file=
 case ${target} in
 # Currently there are no obsolete targets.
  *-knetbsd-*   \
+ | *-openbsd2* \
+ | *-openbsd3* \
  )
 if test "x$enable_obsolete" != xyes; then
   echo "*** Configuration ${target} is obsolete." >&2
-- 
2.4.0



Re: [PATCH] doc: discourage use of __attribute__((optimize())) in production code

2015-12-14 Thread Trevor Saunders
On Mon, Dec 14, 2015 at 10:01:27AM +0100, Richard Biener wrote:
> On Sun, Dec 13, 2015 at 9:03 PM, Andi Kleen  wrote:
> > Markus Trippelsdorf  writes:
> >
> >> Many developers are still using __attribute__((optimize())) in
> >> production code, although it is quite broken.
> >
> > Who reads documentation? @) If you want to discourage it better warn once
> > at runtime.
> 
> We're also quite heavily using it in LTO internally now.

Besides that, does this really make sense?  I suspect very few people are
using this for the fun of it.  I'd guess most usage is to disable
optimizations to work around bugs, or maybe to try to get a very hot
function optimized more.  Either way I suspect it's only used by people
with good reason, and this would just really irritate them.
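
For readers who haven't met the attribute, the typical "work around a bug" use looks roughly like the sketch below. `NO_OPT` and `sum_to` are hypothetical names; the attribute is GCC-specific, so the macro expands to nothing on other compilers.

```c
/* Apply the attribute only where it is known to be understood; other
   compilers would otherwise warn about or ignore it.  */
#if defined (__GNUC__) && !defined (__clang__)
# define NO_OPT __attribute__ ((optimize ("O0")))
#else
# define NO_OPT
#endif

/* Hypothetical function compiled without optimization regardless of the
   command-line -O level, e.g. to sidestep a suspected miscompilation.  */
NO_OPT int
sum_to (int n)
{
  int total = 0;
  for (int i = 1; i <= n; i++)
    total += i;
  return total;
}
```

The function's result is the same either way; only the code generated for it is meant to differ.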

Trev

> 
> Richard.
> 
> > -Andi


[build] Only support -gstabs on Mac OS X if assember supports it (PR target/67973)

2015-12-14 Thread Rainer Orth
As described in PR target/67973, newer assemblers on Mac OS X, which
are based on LLVM instead of gas, don't support .stab* directives any
longer.  The following patch detects this situation and tries to fall
back to the older gas-based as if it is still accessible via as -Q.

Tested on x86_64-apple-darwin15.2.0 and as expected the -gstabs* tests
now pass.
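
Besides the -Q fallback, the patch makes the default debug format depend on a configure result. A minimal sketch of that compile-time selection follows, assuming HAVE_AS_STABS_DIRECTIVE is left undefined (as it would be with an LLVM-based assembler); the enum and function names are hypothetical, and the real code uses macros in config/i386/darwin.h, not a function.

```c
/* Simplified stand-in for the debug formats named in the patch.  */
enum sketch_debug_type { DWARF2_DEBUG, DBX_DEBUG };

/* Pick the preferred debug format.  With stabs support in the assembler,
   32-bit targets keep stabs and 64-bit targets use DWARF; without it,
   DWARF is the only choice.  */
enum sketch_debug_type
preferred_debugging_type (int target_64bit)
{
#ifdef HAVE_AS_STABS_DIRECTIVE
  return target_64bit ? DWARF2_DEBUG : DBX_DEBUG;
#else
  (void) target_64bit;	/* Unused when stabs is unavailable.  */
  return DWARF2_DEBUG;
#endif
}
```

Building the sketch with -DHAVE_AS_STABS_DIRECTIVE restores the old 32-bit default of stabs.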

However, I'm not really comfortable with this solution.  Initially, I
forgot to wrap the -Q option to as in %{gstabs*:...}, which led to a
bootstrap failure: the gas- and LLVM-based assemblers differ in a
number of other ways, as can be seen when comparing gcc/auto-host.h:

--- build.llvm-as/gcc/auto-host.h   2015-11-27 10:53:31.0 +0100
+++ build.gas/gcc/auto-host.h   2015-12-04 20:25:30.0 +0100
@@ -351 +357 @@
-/* #undef HAVE_AS_GDWARF2_DEBUG_FLAG */
+#define HAVE_AS_GDWARF2_DEBUG_FLAG 1
@@ -369 +375 @@
-/* #undef HAVE_AS_GSTABS_DEBUG_FLAG */
+#define HAVE_AS_GSTABS_DEBUG_FLAG 1
@@ -388 +394 @@
-/* #undef HAVE_AS_IX86_FFREEP */
+#define HAVE_AS_IX86_FFREEP 1
@@ -412 +418 @@
-#define HAVE_AS_IX86_INTERUNIT_MOVQ 1
+#define HAVE_AS_IX86_INTERUNIT_MOVQ 0
@@ -424 +430 @@
-#define HAVE_AS_IX86_REP_LOCK_PREFIX 1
+/* #undef HAVE_AS_IX86_REP_LOCK_PREFIX */
@@ -1176 +1182 @@
-#define HAVE_GAS_CFI_PERSONALITY_DIRECTIVE 1
+#define HAVE_GAS_CFI_PERSONALITY_DIRECTIVE 0
@@ -1179 +1185 @@
-#define HAVE_GAS_CFI_SECTIONS_DIRECTIVE 1
+#define HAVE_GAS_CFI_SECTIONS_DIRECTIVE 0
@@ -1183 +1189 @@
-#define HAVE_GAS_DISCRIMINATOR 1
+/* #undef HAVE_GAS_DISCRIMINATOR */
@@ -1298 +1304 @@
-#define HAVE_GNU_AS 0
+#define HAVE_GNU_AS 1

So, we can be pretty certain to hit cases where some file compiles and
assembles without -gstabs, but fails to assemble with -gstabs.  Not
exactly the user experience I prefer.

Given this, I'd rather have us not support stabs at all than via this
half-hearted approach.

What do you think?

Rainer


2015-12-11  Rainer Orth  

PR target/67973
* configure.ac (gcc_cv_as_stabs_directive): New test.
(gcc_cv_as_darwin_stabs_Q): New test.
* configure: Regenerate.
* config.in: Regenerate.
* config/darwin.h (DBX_DEBUGGING_INFO): Wrap in
HAVE_AS_STABS_DIRECTIVE.
(PREFERRED_DEBUGGING_TYPE): Likewise.
* config/i386/darwin.h (PREFERRED_DEBUGGING_TYPE): Only include
DBX_DEBUG if HAVE_AS_STABS_DIRECTIVE.
(ASM_STABS_Q_SPEC): Define.
(ASM_SPEC): Use it.

# HG changeset patch
# Parent  7029fd86ac40d7ff34e1c43d729c6bf469416643
Only support -gstabs on Mac OS X if assember supports it (PR target/67973)

diff --git a/gcc/config/darwin.h b/gcc/config/darwin.h
--- a/gcc/config/darwin.h
+++ b/gcc/config/darwin.h
@@ -400,12 +400,13 @@ extern GTY(()) int darwin_ms_struct;
 
 #define ASM_DEBUG_SPEC  "%{g*:%{!g0:%{!gdwarf*:--gstabs}}}"
 
-/* We still allow output of STABS.  */
-
+/* We still allow output of STABS if the assembler supports it.  */
+#ifdef HAVE_AS_STABS_DIRECTIVE
 #define DBX_DEBUGGING_INFO 1
+#define PREFERRED_DEBUGGING_TYPE DBX_DEBUG
+#endif
 
 #define DWARF2_DEBUGGING_INFO 1
-#define PREFERRED_DEBUGGING_TYPE DBX_DEBUG
 
 #define DEBUG_FRAME_SECTION	"__DWARF,__debug_frame,regular,debug"
 #define DEBUG_INFO_SECTION	"__DWARF,__debug_info,regular,debug"
diff --git a/gcc/config/i386/darwin.h b/gcc/config/i386/darwin.h
--- a/gcc/config/i386/darwin.h
+++ b/gcc/config/i386/darwin.h
@@ -111,9 +111,16 @@ extern int darwin_emit_branch_islands;
   %{g: %{!fno-eliminate-unused-debug-symbols: -feliminate-unused-debug-symbols }} " \
   DARWIN_CC1_SPEC
 
+/* Pass -Q to assembler if necessary for stabs support.  */
+#ifdef HAVE_AS_DARWIN_STABS_Q
+#define ASM_STABS_Q_SPEC " %{gstabs*:-Q}"
+#else
+#define ASM_STABS_Q_SPEC ""
+#endif
+
 #undef ASM_SPEC
 #define ASM_SPEC "-arch %(darwin_arch) -force_cpusubtype_ALL \
-  %{static}"
+  %{static}" ASM_STABS_Q_SPEC
 
 #define DARWIN_ARCH_SPEC "%{m64:x86_64;:i386}"
 #define DARWIN_SUBARCH_SPEC DARWIN_ARCH_SPEC
@@ -226,7 +233,11 @@ do {	\
compiles default to stabs+.  darwin9+ defaults to dwarf-2.  */
 #ifndef DARWIN_PREFER_DWARF
 #undef PREFERRED_DEBUGGING_TYPE
+#ifdef HAVE_AS_STABS_DIRECTIVE
 #define PREFERRED_DEBUGGING_TYPE (TARGET_64BIT ? DWARF2_DEBUG : DBX_DEBUG)
+#else
+#define PREFERRED_DEBUGGING_TYPE DWARF2_DEBUG
+#endif
 #endif
 
 /* Darwin uses the standard DWARF register numbers but the default
diff --git a/gcc/configure.ac b/gcc/configure.ac
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -2909,6 +2909,11 @@ AC_DEFINE_UNQUOTED(HAVE_GAS_SHF_MERGE,
   [`if test $gcc_cv_as_shf_merge = yes; then echo 1; else echo 0; fi`],
 [Define 0/1 if your assembler supports marking sections with SHF_MERGE flag.])
 
+gcc_GAS_CHECK_FEATURE([stabs directive], gcc_cv_as_stabs_directive, ,,
+[.stabs "gcc2_compiled.",60,0,0,0],,
+[AC_DEFINE(HAVE_AS_STABS_DIRECTIVE, 1,
+  [Define if your assembler supports .stabs.])])
+
 gcc_GAS_CHECK_FEATURE([COMDAT group support (GNU as)],
  

Re: [Patch, libstdc++/68863] Let lookahead regex use captured contents

2015-12-14 Thread Jonathan Wakely

On 11/12/15 22:11 -0800, Tim Shen wrote:

This is a one-line quick fix for correctness.

I bootstrapped trunk and tested on x86_64-pc-linux-gnu, and I would like to
backport it at least to gcc-5-branch.


I don't fully understand the patch, but it's OK for trunk, and if
you're confident it's definitely correct and safe it's OK for the
gcc-5 and gcc-4_9 branches too.

Was it just completely wrong before, creating a vector of
default-constructed match results, that were not matched?



Re: [build] Only support -gstabs on Mac OS X if assember supports it (PR target/67973)

2015-12-14 Thread Iain Sandoe
Hi Rainer,

> On 14 Dec 2015, at 11:13, Rainer Orth  wrote:
> 
>>> However, I'm not really comfortable with this solution.  Initially, I
>>> forgot to wrap the -Q option to as in %{gstabs*:...}, which led to a
>>> bootstrap failure: the gas- and LLVM-based assemblers differ in a
>>> number of other ways, as can be seen when comparing gcc/auto-host.h:
>> 
>> FAOD, 
>> the changes below only occur if you omit the guard on “-Q” ?
>> or they are present always?
> 
> they are from previous builds, one with the LLVM-based /usr/bin/as, the
> other configured with --with-as=/vol/gcc/bin/as-6.4 (gas-based as from
> Xcode 6.4).

Hrm, this needs more investigation, and will affect 10.10 too, since xc7 is the 
default there.
(separate issue, let’s start a new PR, or at least a new thread).

Iain



Re: [Patch AArch64] Reinstate CANNOT_CHANGE_MODE_CLASS to fix pr67609

2015-12-14 Thread Marcus Shawcroft
On 14 December 2015 at 11:01, James Greenhalgh  wrote:
> On Wed, Dec 09, 2015 at 01:13:20PM +, Marcus Shawcroft wrote:
>> On 27 November 2015 at 13:01, James Greenhalgh  
>> wrote:
>>
>> > 2015-11-27  James Greenhalgh  
>> >
>> > * config/aarch64/aarch64-protos.h
>> > (aarch64_cannot_change_mode_class): Bring back.
>> > * config/aarch64/aarch64.c
>> > (aarch64_cannot_change_mode_class): Likewise.
>> > * config/aarch64/aarch64.h (CANNOT_CHANGE_MODE_CLASS): Likewise.
>> > * config/aarch64/aarch64.md (aarch64_movdi_low): Use
>> > zero_extract rather than truncate.
>> > (aarch64_movdi_high): Likewise.
>> >
>> > 2015-11-27  James Greenhalgh  
>> >
>> > * gcc.dg/torture/pr67609.c: New.
>> >
>>
>> + detailed dicussion.  In all other cases, we want to be premissive
>>
>> s/premissive/permissive/
>>
>> OK /Marcus
>
> Thanks.
>
> This has had a week or so to soak on trunk now, is it OK to backport to GCC
> 5 and 4.9?
>
> The patch applies as-good-as clean, with only a little bit to fix up in
> aarch64-protos.h to keep alphabetical order, and I've bootstrapped and tested
> the backports with no issue.

OK /Marcus


[PATCH] Fix PR68852

2015-12-14 Thread Richard Biener

The following fixes PR68852 - so I finally needed to sit down and
fix the "build-from-scalars" hack in the SLP vectorizer by pretending
we'd have a sane vectorizer IL.  Basically I now mark the SLP node
with a proper vect_def_type but I have to push that down to the
stmt-info level whenever sth would look at it.

It's a bit ugly but not too much yet ;)

Anyway, the proper fix is to have a sane data structure, nothing for
GCC 6 though.
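
The core idea (tag each SLP node with a def type instead of representing "built from scalars" as a NULL child) can be sketched outside the vectorizer as follows. This is a heavily simplified illustration: the struct, the two-value enum and `count_internal` are hypothetical, and the real `_slp_tree` and `vect_def_type` have more members.

```c
#include <stddef.h>

/* Simplified stand-in for the vectorizer's def types.  */
enum sketch_def_type { sketch_external_def, sketch_internal_def };

/* Simplified SLP node: every child is a real node, and nodes that are
   built up from scalars are marked external instead of being NULL.  */
struct sketch_slp_node
{
  enum sketch_def_type def_type;
  size_t n_children;
  struct sketch_slp_node **children;
};

/* Count the nodes that need vector code.  Walkers test the def type of
   a node where they previously had to test the child pointer for NULL.  */
unsigned
count_internal (const struct sketch_slp_node *node)
{
  if (node->def_type != sketch_internal_def)
    return 0;
  unsigned n = 1;
  for (size_t i = 0; i < node->n_children; i++)
    n += count_internal (node->children[i]);
  return n;
}
```

With this shape, a walker never dereferences a missing child, and the information about why a child is not vectorized stays on the node itself.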

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Verified SPEC CPU 2006 is happy with the patch.

Richard.

2015-12-14  Richard Biener  

PR tree-optimization/68852
* tree-vectorizer.h (struct _slp_tree): Add def_type member.
(SLP_TREE_DEF_TYPE): New accessor.
* tree-vect-stmts.c (vect_is_simple_use): Remove BB vectorization
hack.
* tree-vect-slp.c (vect_create_new_slp_node): Initialize
SLP_TREE_DEF_TYPE.
(vect_build_slp_tree): When a node is to be built up from scalars
do not push a NULL as child but instead set its def_type to
vect_external_def.
(vect_analyze_slp_cost_1): Check for child def-type instead
of NULL.
(vect_detect_hybrid_slp_stmts): Likewise.
(vect_bb_slp_scalar_cost): Likewise.
(vect_get_slp_defs): Likewise.
(vect_slp_analyze_node_operations): Likewise.  Before
processing node push the children def-types to the underlying
stmts vinfo and restore it afterwards.
(vect_schedule_slp_instance): Likewise.
(vect_slp_analyze_bb_1): Do not mark stmts not in SLP instances
as not vectorizable.

* g++.dg/torture/pr68852.C: New testcase.

Index: gcc/tree-vectorizer.h
===
*** gcc/tree-vectorizer.h   (revision 231552)
--- gcc/tree-vectorizer.h   (working copy)
*** struct _slp_tree {
*** 107,112 
--- 107,114 
unsigned int vec_stmts_size;
/* Whether the scalar computations use two different operators.  */
bool two_operators;
+   /* The DEF type of this node.  */
+   enum vect_def_type def_type;
  };
  
  
*** typedef struct _slp_instance {
*** 139,144 
--- 141,147 
  #define SLP_TREE_NUMBER_OF_VEC_STMTS(S)  (S)->vec_stmts_size
  #define SLP_TREE_LOAD_PERMUTATION(S) (S)->load_permutation
  #define SLP_TREE_TWO_OPERATORS(S)  (S)->two_operators
+ #define SLP_TREE_DEF_TYPE(S)   (S)->def_type
  
  
  
Index: gcc/tree-vect-stmts.c
===
*** gcc/tree-vect-stmts.c   (revision 231552)
--- gcc/tree-vect-stmts.c   (working copy)
*** vect_is_simple_use (tree operand, vec_in
*** 8649,8658 
else
  {
stmt_vec_info stmt_vinfo = vinfo_for_stmt (*def_stmt);
!   if (is_a <bb_vec_info> (vinfo) && !STMT_VINFO_VECTORIZABLE (stmt_vinfo))
!   *dt = vect_external_def;
!   else
!   *dt = STMT_VINFO_DEF_TYPE (stmt_vinfo);
  }
  
if (dump_enabled_p ())
--- 8652,8658 
else
  {
stmt_vec_info stmt_vinfo = vinfo_for_stmt (*def_stmt);
!   *dt = STMT_VINFO_DEF_TYPE (stmt_vinfo);
  }
  
if (dump_enabled_p ())
Index: gcc/testsuite/g++.dg/torture/pr68852.C
===
--- gcc/testsuite/g++.dg/torture/pr68852.C  (revision 0)
+++ gcc/testsuite/g++.dg/torture/pr68852.C  (working copy)
@@ -0,0 +1,51 @@
+/* { dg-do compile } */
+
+struct A {
+double x, y, z, w;
+A() {}
+A(double, double p2, double p3, double) : y(p2), z(p3) {}
+void m_fn1();
+};
+
+struct B {
+double x, y;
+};
+struct D : A {
+D() {}
+D(double p1, double p2, double p3, double p4) : A(p1, p2, p3, p4) {}
+};
+
+class C {
+public:
+float _11, _12, _13, _14;
+float _21, _22, _23, _24;
+float _31, _32, _33, _34;
+float _41, _42, _43, _44;
+D m_fn2(B p1) {
+   double z(p1.x + _43);
+   return *this * D(p1.x, p1.y, z, 1);
+}
+int ProjectRectBounds_next;
+B __trans_tmp_3;
+int m_fn3(int) {
+   B a, b;
+   D c[1];
+   b = __trans_tmp_3;
+   c[2] = m_fn2(b);
+   c[3] = m_fn2(a);
+   c[ProjectRectBounds_next].m_fn1();
+}
+D operator*(D p1) {
+   D d;
+   d.x = p1.x * _11 + p1.y * _21 + p1.z * _31 + _41;
+   d.y = p1.x * _12 + p1.y * _22 + p1.z * _32 + _42;
+   d.z = p1.x * _13 + p1.y * _23 + p1.z * _33 + _43;
+   d.w = p1.x * _14 + p1.y * _24 + p1.z * _34 + _44;
+   return d;
+}
+};
+
+void fn1() {
+C e;
+int f = e.m_fn3(f);
+}
Index: gcc/tree-vect-slp.c
===
*** gcc/tree-vect-slp.c (revision 231610)
--- gcc/tree-vect-slp.c (working copy)
*** vect_free_slp_tree (slp_tree node)
*** 51,59 
int i;
slp_tree child;
  
-   if (!node)
- return;
- 

Re: [PATCH] Fix PR68852

2015-12-14 Thread Richard Biener
On Mon, 14 Dec 2015, Richard Biener wrote:

> 
> The following fixes PR68852 - so I finally needed to sit down and
> fix the "build-from-scalars" hack in the SLP vectorizer by pretending
> we'd have a sane vectorizer IL.  Basically I now mark the SLP node
> with a proper vect_def_type but I have to push that down to the
> stmt-info level whenever sth would look at it.
> 
> It's a bit ugly but not too much yet ;)
> 
> Anyway, the proper fix is to have a sane data structure, nothing for
> GCC 6 though.
> 
> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> 
> Verified SPEC CPU 2006 is happy with the patch.

Ick.  I reverted the accidentally applied fix for PR68707 that went
with this patch.  The other unrelated hunk was already applied
as fix for PR68775.

Richard.

> Richard.
> 
> 2015-12-14  Richard Biener  
> 
>   PR tree-optimization/68852
>   * tree-vectorizer.h (struct _slp_tree): Add def_type member.
>   (SLP_TREE_DEF_TYPE): New accessor.
>   * tree-vect-stmts.c (vect_is_simple_use): Remove BB vectorization
>   hack.
>   * tree-vect-slp.c (vect_create_new_slp_node): Initialize
>   SLP_TREE_DEF_TYPE.
>   (vect_build_slp_tree): When a node is to be built up from scalars
>   do not push a NULL as child but instead set its def_type to
>   vect_external_def.
>   (vect_analyze_slp_cost_1): Check for child def-type instead
>   of NULL.
>   (vect_detect_hybrid_slp_stmts): Likewise.
>   (vect_bb_slp_scalar_cost): Likewise.
>   (vect_get_slp_defs): Likewise.
>   (vect_slp_analyze_node_operations): Likewise.  Before
>   processing node push the children def-types to the underlying
>   stmts vinfo and restore it afterwards.
>   (vect_schedule_slp_instance): Likewise.
>   (vect_slp_analyze_bb_1): Do not mark stmts not in SLP instances
>   as not vectorizable.
> 
>   * g++.dg/torture/pr68852.C: New testcase.
> 
> Index: gcc/tree-vectorizer.h
> ===
> *** gcc/tree-vectorizer.h (revision 231552)
> --- gcc/tree-vectorizer.h (working copy)
> *** struct _slp_tree {
> *** 107,112 
> --- 107,114 
> unsigned int vec_stmts_size;
> /* Whether the scalar computations use two different operators.  */
> bool two_operators;
> +   /* The DEF type of this node.  */
> +   enum vect_def_type def_type;
>   };
>   
>   
> *** typedef struct _slp_instance {
> *** 139,144 
> --- 141,147 
>   #define SLP_TREE_NUMBER_OF_VEC_STMTS(S)  (S)->vec_stmts_size
>   #define SLP_TREE_LOAD_PERMUTATION(S) (S)->load_permutation
>   #define SLP_TREE_TWO_OPERATORS(S)(S)->two_operators
> + #define SLP_TREE_DEF_TYPE(S) (S)->def_type
>   
>   
>   
> Index: gcc/tree-vect-stmts.c
> ===
> *** gcc/tree-vect-stmts.c (revision 231552)
> --- gcc/tree-vect-stmts.c (working copy)
> *** vect_is_simple_use (tree operand, vec_in
> *** 8649,8658 
> else
>   {
> stmt_vec_info stmt_vinfo = vinfo_for_stmt (*def_stmt);
> !   if (is_a <bb_vec_info> (vinfo) && !STMT_VINFO_VECTORIZABLE 
> (stmt_vinfo))
> ! *dt = vect_external_def;
> !   else
> ! *dt = STMT_VINFO_DEF_TYPE (stmt_vinfo);
>   }
>   
> if (dump_enabled_p ())
> --- 8652,8658 
> else
>   {
> stmt_vec_info stmt_vinfo = vinfo_for_stmt (*def_stmt);
> !   *dt = STMT_VINFO_DEF_TYPE (stmt_vinfo);
>   }
>   
> if (dump_enabled_p ())
> Index: gcc/testsuite/g++.dg/torture/pr68852.C
> ===
> --- gcc/testsuite/g++.dg/torture/pr68852.C(revision 0)
> +++ gcc/testsuite/g++.dg/torture/pr68852.C(working copy)
> @@ -0,0 +1,51 @@
> +/* { dg-do compile } */
> +
> +struct A {
> +double x, y, z, w;
> +A() {}
> +A(double, double p2, double p3, double) : y(p2), z(p3) {}
> +void m_fn1();
> +};
> +
> +struct B {
> +double x, y;
> +};
> +struct D : A {
> +D() {}
> +D(double p1, double p2, double p3, double p4) : A(p1, p2, p3, p4) {}
> +};
> +
> +class C {
> +public:
> +float _11, _12, _13, _14;
> +float _21, _22, _23, _24;
> +float _31, _32, _33, _34;
> +float _41, _42, _43, _44;
> +D m_fn2(B p1) {
> + double z(p1.x + _43);
> + return *this * D(p1.x, p1.y, z, 1);
> +}
> +int ProjectRectBounds_next;
> +B __trans_tmp_3;
> +int m_fn3(int) {
> + B a, b;
> + D c[1];
> + b = __trans_tmp_3;
> + c[2] = m_fn2(b);
> + c[3] = m_fn2(a);
> + c[ProjectRectBounds_next].m_fn1();
> +}
> +D operator*(D p1) {
> + D d;
> + d.x = p1.x * _11 + p1.y * _21 + p1.z * _31 + _41;
> + d.y = p1.x * _12 + p1.y * _22 + p1.z * _32 + _42;
> + d.z = p1.x * _13 + p1.y * _23 + p1.z * _33 + _43;
> + d.w = p1.x * _14 + p1.y * _24 + p1.z * _34 + 

Re: [PATCH] S/390: Allow to use r1 to r4 as literal pool base.

2015-12-14 Thread Ulrich Weigand
Dominik Vogt wrote:

> The attached patch enables using r1 to r4 as the literal pool base pointer if
> one of them is unused in a leaf function.  The unpatched code supports only r5
> and r13.

I don't think that r1 is actually safe here.  Note that it may be used
(unconditionally) as temp register in s390_emit_prologue in certain cases;
the upcoming split-stack code will also need to use r1 in some cases.

r2 through r4 should be fine.  [ Not sure if there will be many (any?) cases
where one of those is unused but r5 isn't, however. ]

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



Re: [PIING][PATCH, 9/16] Add pass_parallelize_loops_oacc_kernels

2015-12-14 Thread Richard Biener
On Sun, Dec 13, 2015 at 5:58 PM, Tom de Vries  wrote:
> On 24/11/15 13:24, Tom de Vries wrote:
>>
>> On 16/11/15 12:59, Tom de Vries wrote:
>>>
>>> On 09/11/15 20:52, Tom de Vries wrote:

 On 09/11/15 16:35, Tom de Vries wrote:
>
> Hi,
>
> this patch series for stage1 trunk adds support to:
> - parallelize oacc kernels regions using parloops, and
> - map the loops onto the oacc gang dimension.
>
> The patch series contains these patches:
>
>   1Insert new exit block only when needed in
>  transform_to_exit_first_loop_alt
>   2Make create_parallel_loop return void
>   3Ignore reduction clause on kernels directive
>   4Implement -foffload-alias
>   5Add in_oacc_kernels_region in struct loop
>   6Add pass_oacc_kernels
>   7Add pass_dominator_oacc_kernels
>   8Add pass_ch_oacc_kernels
>   9Add pass_parallelize_loops_oacc_kernels
>  10Add pass_oacc_kernels pass group in passes.def
>  11Update testcases after adding kernels pass group
>  12Handle acc loop directive
>  13Add c-c++-common/goacc/kernels-*.c
>  14Add gfortran.dg/goacc/kernels-*.f95
>  15Add libgomp.oacc-c-c++-common/kernels-*.c
>  16Add libgomp.oacc-fortran/kernels-*.f95
>
> The first 9 patches are more or less independent, but patches 10-16 are
> intended to be committed at the same time.
>
> Bootstrapped and reg-tested on x86_64.
>
> Build and reg-tested with nvidia accelerator, in combination with a
> patch that enables accelerator testing (which is submitted at
> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg01771.html ).
>
> I'll post the individual patches in reply to this message.


 This patch adds pass_parallelize_loops_oacc_kernels.

 There are a number of things we do differently in parloops for oacc
 kernels:
 - in normal parloops, we generate code to choose between a parallel
version of the loop, and a sequential (low iteration count) version.
Since the code in oacc kernels region is supposed to run on the
accelerator anyway, we skip this check, and don't add a low iteration
count loop.
 - in normal parloops, we generate an #pragma omp parallel /
GIMPLE_OMP_RETURN pair to delimit the region which we will split off
into a thread function. Since the oacc kernels region is already
split off, we don't add this pair.
 - we indicate the parallelization factor by setting the oacc function
attributes
 - we generate an #pragma oacc loop instead of an #pragma omp for, and
we add the gang clause
 - in normal parloops, we rewrite the variable accesses in the loop into
accesses relative to a thread function parameter. For the
oacc kernels region, that rewrite has already been done at omp-lower,
so we skip this.
 - we need to ensure that the entire kernels region can be run in
parallel. The loop independence check is already present, so for oacc
kernels we add a check between blocks outside the loop and the entire
region.
 - we guard stores in the blocks outside the loop with gang_pos == 0.
There's no need for each gang to write to a single location, we can
do this in just one gang. (Typically this is the write of the final
value of the iteration variable if that one is copied back to the
host).

>>>
>>> Reposting with loop optimizer init added in
>>> pass_parallelize_loops_oacc_kernels::execute.
>>>
>>
>> Reposting with loop_optimizer_finalize,scev_initialize and scev_finalize
>>   added in pass_parallelize_loops_oacc_kernels::execute.
>>
>
> Ping.
>
> Anything I can do to facilitate the review?

Document new functions, avoid if (1).

Ideally some refactoring would avoid some of the if (!oacc_kernels_p) spaghetti
but I'm considering tree-parloops.c (and its bugs) yours.

Can the pass not just use a pass parameter to switch between oacc/non-oacc?

Richard.

> Thanks,
>  Tom
>>
>>
>


Re: [PATCH] Fix -Werror= handling with aliases (PR c/68833)

2015-12-14 Thread Bernd Schmidt

On 12/11/2015 09:18 PM, Jakub Jelinek wrote:


Unfortunately, my patch broke some cases with warning aliases that happened
to work (by accident) and left some other warning alias cases broken.

This patch attempts to fix that (and add Warning keyword to two warning
aliases that didn't have it), so that -Werror= works even for them again.
As we do nothing beyond cancelling -Werror for -Wno-error=, there is no need
to deal with neg_alias_arg, just alias_arg is enough.


Took me a while to figure out what you were saying with that last 
sentence, but in the end I agree.



2015-12-11  Jakub Jelinek  

PR c/68833
* common.opt (Wmissing-noreturn): Add Warning option.
* opts-common.c (control_warning_option): If opt is
alias_target with alias_arg, set arg to it.

* c.opt (Wmissing-format-attribute, Wnormalized): Add Warning option.

* c-c++-common/pr68833-1.c: New test.
* c-c++-common/pr68833-2.c: New test.


Ok.


Bernd


[PATCH] Fix PR68892

2015-12-14 Thread Richard Biener

The following fixes PR68892, the BB vectorizer now happily creates
a load of dead vector loads (we had a similar bug with loop
single-element interleaving support in the past).  Fixed as a side-effect
of making the SLP load cost reflect reality.
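
As a rough illustration of the adjusted load count in the patch below: the
number of vector loads for a permuted group is the group size minus its gap,
divided by the number of vector elements and rounded up, then scaled by the
unrolling factor.  The following is only a standalone sketch of that
arithmetic; the function name and the plain unsigned parameters are
illustrative stand-ins, not vectorizer API.

```c
#include <assert.h>

/* Hypothetical helper mirroring the ncopies_for_cost arithmetic:
   ceil ((group_size - group_gap) / nunits) * unrolling_factor.  */
unsigned
permuted_load_copies (unsigned group_size, unsigned group_gap,
                      unsigned nunits, unsigned unrolling_factor)
{
  assert (nunits > 0 && group_gap <= group_size);
  /* Adding nunits - 1 before dividing rounds the quotient up.  */
  unsigned ncopies = (group_size - group_gap + nunits - 1) / nunits;
  return ncopies * unrolling_factor;
}
```

For instance, a group of 6 elements with a gap of 1 and 4-element vectors
needs 2 vector loads per unrolled copy.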

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2015-12-14  Richard Biener  

PR tree-optimization/68892
* tree-vect-slp.c (vect_analyze_slp_cost_1): Properly compute
cost for permuted loads.

* gcc.dg/vect/bb-slp-pr68892.c: New testcase.

Index: gcc/tree-vect-slp.c
===
*** gcc/tree-vect-slp.c (revision 231617)
--- gcc/tree-vect-slp.c (working copy)
*** vect_analyze_slp_cost_1 (slp_instance in
*** 1405,1414 
  {
unsigned i;
slp_tree child;
!   gimple *stmt, *s;
stmt_vec_info stmt_info;
tree lhs;
-   unsigned group_size = SLP_INSTANCE_GROUP_SIZE (instance);
  
/* Recurse down the SLP tree.  */
FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
--- 1405,1413 
  {
unsigned i;
slp_tree child;
!   gimple *stmt;
stmt_vec_info stmt_info;
tree lhs;
  
/* Recurse down the SLP tree.  */
FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
*** vect_analyze_slp_cost_1 (slp_instance in
*** 1427,1470 
   node, prologue_cost_vec, body_cost_vec);
else
{
- int i;
  gcc_checking_assert (DR_IS_READ (STMT_VINFO_DATA_REF (stmt_info)));
- /* If the load is permuted then the alignment is determined by
-the first group element not by the first scalar stmt DR.  */
  if (SLP_TREE_LOAD_PERMUTATION (node).exists ())
{
  stmt = GROUP_FIRST_ELEMENT (stmt_info);
  stmt_info = vinfo_for_stmt (stmt);
}
  vect_model_load_cost (stmt_info, ncopies_for_cost, false,
node, prologue_cost_vec, body_cost_vec);
- /* If the load is permuted record the cost for the permutation.
-???  Loads from multiple chains are let through here only
-for a single special case involving complex numbers where
-in the end no permutation is necessary.  */
- FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, s)
-   if ((STMT_VINFO_GROUP_FIRST_ELEMENT (vinfo_for_stmt (s))
-== STMT_VINFO_GROUP_FIRST_ELEMENT (stmt_info))
-   && vect_get_place_in_interleaving_chain
-(s, STMT_VINFO_GROUP_FIRST_ELEMENT (stmt_info)) != i)
- {
-   record_stmt_cost (body_cost_vec, group_size, vec_perm,
- stmt_info, 0, vect_body);
-   break;
- }
}
  }
!   else
  {
record_stmt_cost (body_cost_vec, ncopies_for_cost, vector_stmt,
stmt_info, 0, vect_body);
!   if (SLP_TREE_TWO_OPERATORS (node))
!   {
! record_stmt_cost (body_cost_vec, ncopies_for_cost, vector_stmt,
!   stmt_info, 0, vect_body);
! record_stmt_cost (body_cost_vec, ncopies_for_cost, vec_perm,
!   stmt_info, 0, vect_body);
!   }
  }
  
/* Scan operands and account for prologue cost of constants/externals.
--- 1426,1464 
   node, prologue_cost_vec, body_cost_vec);
else
{
  gcc_checking_assert (DR_IS_READ (STMT_VINFO_DATA_REF (stmt_info)));
  if (SLP_TREE_LOAD_PERMUTATION (node).exists ())
{
+ /* If the load is permuted then the alignment is determined by
+the first group element not by the first scalar stmt DR.  */
  stmt = GROUP_FIRST_ELEMENT (stmt_info);
  stmt_info = vinfo_for_stmt (stmt);
+ /* Record the cost for the permutation.  */
+ record_stmt_cost (body_cost_vec, ncopies_for_cost, vec_perm,
+   stmt_info, 0, vect_body);
+ /* And adjust the number of loads performed.  */
+ unsigned nunits
+   = TYPE_VECTOR_SUBPARTS (STMT_VINFO_VECTYPE (stmt_info));
+ ncopies_for_cost
+   = (GROUP_SIZE (stmt_info) - GROUP_GAP (stmt_info)
+  + nunits - 1) / nunits;
+ ncopies_for_cost *= SLP_INSTANCE_UNROLLING_FACTOR (instance);
}
+ /* Record the cost for the vector loads.  */
  vect_model_load_cost (stmt_info, ncopies_for_cost, false,
node, prologue_cost_vec, body_cost_vec);
}
+   return;
  }
! 
!   record_stmt_cost (body_cost_vec, ncopies_for_cost, vector_stmt,
!   stmt_info, 0, vect_body);
!   if (SLP_TREE_TWO_OPERATORS (node))
  {
record_stmt_cost (body_cost_vec, ncopies_for_cost, vector_stmt,

Re: [PATCH, 4/16] Implement -foffload-alias

2015-12-14 Thread Tom de Vries

On 14/12/15 14:26, Richard Biener wrote:

On Sun, 13 Dec 2015, Tom de Vries wrote:


On 11/12/15 14:00, Richard Biener wrote:

On Fri, 11 Dec 2015, Tom de Vries wrote:


On 13/11/15 12:39, Jakub Jelinek wrote:

We simply have some compiler internal interface between the caller and
callee of the outlined regions, each interface in between those has
its own structure type used to communicate the info;
we can attach attributes on the fields, or some flags to indicate some
properties interesting from aliasing POV.  We don't really need to
perform
full IPA-PTA, perhaps it would be enough to a) record somewhere in
cgraph
the relationship in between such callers and callees (for offloading
regions
we already have "omp target entrypoint" attribute on the callee and a
single caller), tell LTO if possible not to split those into different
partitions if easily possible, and then just for these pairs perform
aliasing/points-to analysis in the caller and the result record using
cliques/special attributes/whatever to the callee side, so that the
callee
(outlined OpenMP/OpenACC/Cilk+ region) can then improve its alias
analysis.


Hi,

This work-in-progress patch allows me to use IPA PTA information in the
kernels pass group.

Since:
-  I'm running IPA PTA before ealias, and IPA PTA does not interpret
 restrict, and
- compute_may_alias doesn't run if IPA PTA information is present
I needed to convince ealias to do the restrict clique/base annotation.

It would be more logical to fit IPA PTA after ealias, but one is an IPA
pass,
the other a regular one-function pass, so I would have to split the
containing
pass groups pass_all_early_optimizations and
pass_local_optimization_passes.
I'll give that a try now.



I've tried this approach, but realized that this changes the order in which
non-openacc functions are processed in the compiler, so I've abandoned this
idea.


Any comments?


I don't think you want to run IPA PTA before early
optimizations, it (and ealias) rely on some initial cleanup to
do anything meaningful with well-spent resources.

The local PTA "hack" also looks more like a waste of resources, but well
... teaching IPA PTA to honor restrict might be an impossible task
though I didn't think much about it other than handling it only for
nonlocal_p functions (for others we should see all incoming args
if IPA PTA works optimally).  The restrict tags will leak all over
the place of course and in the end no meaningful cliques may remain.



This patch:
- moves the kernels pass group to the first position in the pass list
   after ealias where we're back in ipa mode
- inserts a new ipa pass to contain the gimple pass group called
   pass_oacc_ipa
- inserts a version of ipa-pta before the pass group.


In principle I like this a lot, but

+  NEXT_PASS (pass_ipa_pta_oacc_kernels);
+  NEXT_PASS (pass_oacc_ipa);
+  PUSH_INSERT_PASSES_WITHIN (pass_oacc_ipa)

I think you can put pass_ipa_pta_oacc_kernels into the pass_oacc_ipa
group and thus just "clone" ipa_pta?


Done. But using a clone means using the same gate function, and that
means that this pass_ipa_pta instance no longer runs by default for
openacc.


I've added enabling-by-default of fipa-pta for fopenacc in 
default_options_optimization to fix that.



sub-passes of IPA passes can
be both ipa passes and non-ipa passes.


Right. It does mean that I need yet another pass (pass_ipa_oacc_kernels) 
to do the IPA/non-IPA transition at pass/sub-pass boundary:

...
  NEXT_PASS (pass_ipa_oacc);
  PUSH_INSERT_PASSES_WITHIN (pass_ipa_oacc)
  NEXT_PASS (pass_ipa_pta);
  NEXT_PASS (pass_ipa_oacc_kernels);
  PUSH_INSERT_PASSES_WITHIN (pass_ipa_oacc_kernels)
 /* out-of-ipa */
 NEXT_PASS (pass_oacc_kernels);
 PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
...

OK for stage3 if bootstrap and reg-test succeed?

Thanks,
- Tom

Add pass_oacc_ipa

2015-12-14  Tom de Vries  

	* opts.c (default_options_optimization): Set fipa-pta on by default for
	fopenacc.
	* passes.def: Move kernels pass group to pass_ipa_oacc.
	* tree-pass.h (make_pass_oacc_kernels2): Remove.
	(make_pass_ipa_oacc, make_pass_ipa_oacc_kernels): Declare.
	* tree-ssa-loop.c (pass_oacc_kernels2, make_pass_oacc_kernels2): Remove.
	(pass_ipa_oacc, pass_ipa_oacc_kernels): New pass.
	(make_pass_ipa_oacc, make_pass_ipa_oacc_kernels): New function.
	* tree-ssa-structalias.c (pass_ipa_pta::clone): New function.

	* g++.dg/ipa/devirt-37.C: Update for new fre2 pass.
	* g++.dg/ipa/devirt-40.C: Same.
	* g++.dg/tree-ssa/pr61034.C: Same.
	* gcc.dg/ipa/ipa-pta-13.c: Same.
	* gcc.dg/ipa/ipa-pta-3.c: Same.
	* gcc.dg/ipa/ipa-pta-4.c: Same.

---
 gcc/opts.c  |  9 
 gcc/passes.def  | 41 ++
 gcc/testsuite/g++.dg/ipa/devirt-37.C| 10 ++---
 gcc/testsuite/g++.dg/ipa/devirt-40.C|  4 +-
 gcc/testsuite/g++.dg/tree-ssa/pr61034.C | 10 ++---
 gcc/testsuite/gcc.dg/ipa/ipa-pta-13.c   |  4 

[PATCH 1/3, libgomp] Resolve libgomp plugin deadlock on exit, libgomp proper parts

2015-12-14 Thread Chung-Lin Tang
[sorry, forgot to C gcc-patches in last send]

Hi Jakub,
these patches are a revision of 
https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01701.html
since that patch set has bitrotted by now.

To recap the original situation, due to the way that device locks are held
when entering plugin code, a GOMP_PLUGIN_fatal() call will deadlock when the
GOMP_unregister_var() exit destructor tries to obtain the same device lock.

This patch set revises many functions in the libgomp plugin interface to return
false on error, so that control goes back to libgomp, which releases the lock
and calls gomp_fatal() there.
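
The ordering can be sketched roughly as follows.  This is not the libgomp
code: the struct, the hook functions and copy_host2dev are hypothetical names
for illustration, and a plain flag stands in for the device mutex so that the
example stays self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a device descriptor.  A real one would hold
   a mutex; a flag is enough to show the ordering.  */
struct device
{
  bool locked;
  bool (*host2dev) (void *dst, const void *src, size_t n);
  int fatal_reports;
};

/* Example hooks: one that always fails, one that copies.  */
bool
failing_host2dev (void *dst, const void *src, size_t n)
{
  (void) dst; (void) src; (void) n;
  return false;
}

bool
working_host2dev (void *dst, const void *src, size_t n)
{
  memcpy (dst, src, n);
  return true;
}

/* Called with the device lock held.  If the hook reports failure, the
   lock is dropped first and only then is the error reported, so an exit
   handler that takes the same lock cannot deadlock.  */
bool
copy_host2dev (struct device *dev, void *dst, const void *src, size_t n)
{
  assert (dev->locked);
  if (!dev->host2dev (dst, src, n))
    {
      dev->locked = false;
      dev->fatal_reports++;   /* libgomp would call gomp_fatal () here */
      fprintf (stderr, "copy to device failed\n");
      return false;
    }
  return true;
}
```

On success the lock stays held and the caller releases it as usual; only the
failure path changes.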

This first patch contains the changes for the machine-independent libgomp
proper. The entire patch set was tested without regressions. Is this okay for
trunk?

Thanks,
Chung-Lin

2015-12-14  Chung-Lin Tang  

* target.c (gomp_device_copy): New function.
(gomp_copy_host2dev): Likewise.
(gomp_copy_dev2host): Likewise.
(gomp_free_device_memory): Likewise.
(gomp_map_vars_existing): Adjust to call gomp_copy_host2dev().
(gomp_map_pointer): Likewise.
(gomp_map_vars): Adjust to call gomp_copy_host2dev(), handle
NULL value from alloc_func plugin hook.
(gomp_unmap_tgt): Adjust to call gomp_free_device_memory().
(gomp_copy_from_async): Adjust to call gomp_copy_dev2host().
(gomp_unmap_vars): Likewise.
(gomp_update): Adjust to call gomp_copy_dev2host() and
gomp_copy_host2dev() functions.
(gomp_init_device): Handle false value from init_device_func
plugin hook.
(gomp_fini_device): Handle false value from fini_device_func
plugin hook.
(gomp_exit_data): Adjust to call gomp_copy_dev2host().
(omp_target_free): Adjust to call gomp_free_device_memory().
(omp_target_memcpy): Handle return values from host2dev_func,
dev2host_func, and dev2dev_func plugin hooks.
(omp_target_memcpy_rect_worker): Likewise.
* libgomp.h (struct gomp_device_descr): Adjust return type of
init_device_func, fini_device_func, free_func, dev2host_func,
host2dev_func, and dev2dev_func plugin hooks from 'void *' to
bool.
* oacc-host.c (host_init_device): Change return type to bool.
(host_fini_device): Likewise.
(host_free): Likewise.
(host_dev2host): Likewise.
(host_host2dev): Likewise.
* oacc-mem.c (acc_free): Handle plugin hook fatal error case.
(acc_memcpy_to_device): Likewise.
(acc_memcpy_from_device): Likewise.
(delete_copyout): Add libfnname parameter, handle free_func
hook fatal error case.
(acc_delete): Adjust delete_copyout call.
(acc_copyout): Likewise.



Index: libgomp/libgomp.h
===
--- libgomp/libgomp.h	(revision 231613)
+++ libgomp/libgomp.h	(working copy)
@@ -914,16 +914,17 @@ struct gomp_device_descr
   unsigned int (*get_caps_func) (void);
   int (*get_type_func) (void);
   int (*get_num_devices_func) (void);
-  void (*init_device_func) (int);
-  void (*fini_device_func) (int);
+  bool (*init_device_func) (int);
+  bool (*fini_device_func) (int);
   unsigned (*version_func) (void);
   int (*load_image_func) (int, unsigned, const void *, struct addr_pair **);
   void (*unload_image_func) (int, unsigned, const void *);
   void *(*alloc_func) (int, size_t);
-  void (*free_func) (int, void *);
-  void *(*dev2host_func) (int, void *, const void *, size_t);
-  void *(*host2dev_func) (int, void *, const void *, size_t);
-  void *(*dev2dev_func) (int, void *, const void *, size_t);
+  bool (*free_func) (int, void *);
+  bool (*dev2host_func) (int, void *, const void *, size_t);
+  bool (*host2dev_func) (int, void *, const void *, size_t);
+  /*xxx*/
+  bool (*dev2dev_func) (int, void *, const void *, size_t);
   void (*run_func) (int, void *, void *);
   void (*async_run_func) (int, void *, void *, void *);
 
Index: libgomp/oacc-host.c
===
--- libgomp/oacc-host.c	(revision 231613)
+++ libgomp/oacc-host.c	(working copy)
@@ -60,14 +60,16 @@ host_get_num_devices (void)
   return 1;
 }
 
-static void
+static bool
 host_init_device (int n __attribute__ ((unused)))
 {
+  return true;
 }
 
-static void
+static bool
 host_fini_device (int n __attribute__ ((unused)))
 {
+  return true;
 }
 
 static unsigned
@@ -98,28 +100,29 @@ host_alloc (int n __attribute__ ((unused)), size_t
   return gomp_malloc (s);
 }
 
-static void
+static bool
 host_free (int n __attribute__ ((unused)), void *p)
 {
   free (p);
+  return true;
 }
 
-static void *
+static bool
 host_dev2host (int n __attribute__ ((unused)),
 	   void *h __attribute__ ((unused)),
 	   const void *d __attribute__ ((unused)),
 	   size_t s __attribute__ ((unused)))
 {
-  return NULL;
+  return true;
 }
 
-static void *
+static bool
 host_host2dev (int n __attribute__ 

Re: [PATCH 2/2] obsolete openbsd 2.0 and 3.X

2015-12-14 Thread Mike Stump
On Dec 14, 2015, at 7:55 PM, tbsaunde+...@tbsaunde.org wrote:
>   * config.gcc: Makr openbsd 2.0 and 3.X as obsolete.

English:

Mark...



Re: [PATCH 1/2] mark *-knetbsd-* as obsolete

2015-12-14 Thread Mike Stump
On Dec 14, 2015, at 7:55 PM, tbsaunde+...@tbsaunde.org wrote:
>   * config.gcc: mark knetbsd targets as obsolete.

English:

  Mark...


Re: [Patch, libstdc++/68863] Let lookahead regex use captured contents

2015-12-14 Thread Tim Shen
On Mon, Dec 14, 2015 at 10:03 AM, Jonathan Wakely  wrote:
> OK then I do understand it and it's definitely OK to commit :-)
>
> Thanks.
>

Committed to trunk, gcc 5 and gcc 4.9.


-- 
Regards,
Tim Shen


[PATCH] IRA: Fix % constraint modifier handling on disabled alternatives.

2015-12-14 Thread Andreas Krebbel
Hi,

the constraint modifier % applies to all the alternatives of a pattern
and hence is mostly added to the first constraint of an operand.  IRA
currently ignores it if the alternative with the % gets disabled by
using the `enabled' attribute or if it is not among the preferred
alternatives.

Fixed with the attached patch by moving the % check to the first loop
which walks unconditionally over all the constraints.
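
A simplified sketch of the idea, outside of IRA: scan every constraint string
in full, across all alternatives, and remember the first operand that carries
a %.  The function name and the array-of-strings interface are made up for
this example; ira_setup_alts works on recog_data instead.

```c
#include <assert.h>
#include <stddef.h>

/* Return the index of the first operand whose constraint string contains
   the commutative modifier '%', or -1 if there is none.  Every
   alternative (comma-separated) is scanned, whether or not it would later
   be disabled or rejected.  */
int
find_commutative_operand (const char *const *constraints, int n_operands)
{
  assert (constraints != NULL && n_operands >= 0);
  for (int nop = 0; nop < n_operands; nop++)
    for (const char *p = constraints[nop]; *p; p++)
      if (*p == '%')
        return nop;
  return -1;
}
```

Because the scan does not look at which alternatives are enabled, a % that
sits in a disabled alternative is still found.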

Ok for mainline?

Bye,

-Andreas-

gcc/ChangeLog:

2015-12-14  Andreas Krebbel  

* ira.c (ira_setup_alts): Move the scan for commutative modifier
to the first loop to make it work even with disabled alternatives.

diff --git a/gcc/ira.c b/gcc/ira.c
index 97edf8c..9824e4a 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -1800,7 +1800,13 @@ ira_setup_alts (rtx_insn *insn, HARD_REG_SET )
{
  insn_constraints[nop * recog_data.n_alternatives + nalt] = p;
  while (*p && *p != ',')
-   p++;
+   {
+ /* We only support one commutative marker, the first
+one.  We already set commutative above.  */
+ if (*p == '%' && commutative < 0)
+   commutative = nop;
+ p++;
+   }
  if (*p)
p++;
}
@@ -1831,11 +1837,7 @@ ira_setup_alts (rtx_insn *insn, HARD_REG_SET )
break;
  
  case '%':
-   /* We only support one commutative marker, the
-  first one.  We already set commutative
-  above.  */
-   if (commutative < 0)
- commutative = nop;
+   /* The commutative modifier is handled above.  */
break;
 
  case '0':  case '1':  case '2':  case '3':  case '4':



Re: [PATCH] rs6000: Fix a mistake in cstore_si_as_di (PR68865, PR68879)

2015-12-14 Thread David Edelsohn
On Mon, Dec 14, 2015 at 2:04 AM, Segher Boessenkool
 wrote:
> convert_move does not know how to zero-extend a constant integer to the
> target mode -- simply because it does not know the target mode.  As a
> result, 32-bit SImode with the high bit set would be effectively sign-
> extended instead of zero-extended.
>
> This patch fixes it.  Is this okay for trunk?  (bootstrap+regtest in
> progress, on powerpc64-linux).
>
>
> Segher
>
>
> 2015-12-14  Segher Boessenkool  
>
> PR target/68865
> PR target/68879
> * rs6000.md (cstore_si_as_di): Force all operands into registers.

Okay.

Thanks, David
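
The difference described in the quoted message can be reproduced in plain C.
This is only an illustration of the two extensions for a 32-bit value with
the high bit set, not the rs6000 code; the helper names are made up for the
example.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extend a 32-bit value to 64 bits: the upper half becomes 0.  */
uint64_t
zero_extend_si (uint32_t x)
{
  return (uint64_t) x;
}

/* Sign-extend a 32-bit value to 64 bits: the upper half is filled with
   copies of bit 31.  Written with unsigned arithmetic only, so it does
   not rely on implementation-defined conversions.  */
uint64_t
sign_extend_si (uint32_t x)
{
  return ((uint64_t) x ^ UINT64_C (0x80000000)) - UINT64_C (0x80000000);
}
```

The two agree whenever bit 31 is clear and differ in the upper 32 bits
otherwise, which is why the problem only shows up for constants with the high
bit set.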


Re: [build] Only support -gstabs on Mac OS X if assember supports it (PR target/67973)

2015-12-14 Thread Rainer Orth
Hi Iain,

>>> Hrm, this needs more investigation, and will affect 10.10 too, since xc7 is
>>> the default there.
>>> (separate issue, let’s start a new PR, or at least a new thread).
>> 
>> right, but it's only an issue if you switch assemblers (or linkers) used
>> by gcc without rebuilding.  This has never been safe on any platform.
>
> The issue that worries me is that the new assembler supports .cfi_xxx
> (YAY!), but the Darwin port is not 100% ready for it yet (BOO!) (I have
> patches, and expect to make them available for folks to try in the next ~ 2
> weeks).  However, still not sure that they would exactly be stage3 stuff.

ah, I see.  The platform maintainers have some leeway even in stage 3,
so let's see when you post the patches.

> Did you say that bootstrap fails if -Q is jammed in everywhere?
> (that would be a short-term safety net, falling back to the cctools 
> assembler).

Yes, but only if you run the configure test with the LLVM assembler, but
always call as -Q during the build.  It should be possible to perform
the assembler tests with as -Q instead, so there shouldn't be an
inconsistency and you get the same result as if configuring with a copy
of the Xcode 6.4 as.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] Allow embedded timestamps by C/C++ macros to be set externally (2)

2015-12-14 Thread Dhole
Hi,

The copyright assignment process is now complete :)
Let me know if I'm required to do anything else regarding the patch I sent.

Best regards,
Dhole


[PATCH] Fix PR68775

2015-12-14 Thread Richard Biener

The following fixes PR68775, a miscompile of 465.tonto on ppc64le.  The
issue is somewhat hard to trigger as it requires operand swapping to
trigger in SLP, thus no testcase (in fact the issue went latent on
trunk recently).

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Richard.

2015-12-14  Richard Biener  

PR tree-optimization/68775
* tree-vect-slp.c (vect_build_slp_tree): Make sure to apply
operand swapping even if replacing the op with scalars.

Index: gcc/tree-vect-slp.c
===
--- gcc/tree-vect-slp.c (revision 230855)
+++ gcc/tree-vect-slp.c (working copy)
@@ -1078,6 +1078,20 @@ vect_build_slp_tree (vec_info *vinfo,
   tem, npermutes, _tree_size,
   max_tree_size))
{
+ /* ... so if successful we can apply the operand swapping
+to the GIMPLE IL.  This is necessary because for example
+vect_get_slp_defs uses operand indexes and thus expects
+canonical operand order.  This is also necessary even
+if we end up building the operand from scalars as
+we'll continue to process swapped operand two.  */
+ for (j = 0; j < group_size; ++j)
+   if (!matches[j])
+ {
+   gimple *stmt = SLP_TREE_SCALAR_STMTS (*node)[j];
+   swap_ssa_operands (stmt, gimple_assign_rhs1_ptr (stmt),
+  gimple_assign_rhs2_ptr (stmt));
+ }
+
  /* If we have all children of child built up from scalars then
 just throw that away and build it up this node from scalars.  */
  if (!SLP_TREE_CHILDREN (child).is_empty ())
@@ -1107,17 +1121,6 @@ vect_build_slp_tree (vec_info *vinfo,
}
}
 
- /* ... so if successful we can apply the operand swapping
-to the GIMPLE IL.  This is necessary because for example
-vect_get_slp_defs uses operand indexes and thus expects
-canonical operand order.  */
- for (j = 0; j < group_size; ++j)
-   if (!matches[j])
- {
-   gimple *stmt = SLP_TREE_SCALAR_STMTS (*node)[j];
-   swap_ssa_operands (stmt, gimple_assign_rhs1_ptr (stmt),
-  gimple_assign_rhs2_ptr (stmt));
- }
  oprnd_info->def_stmts = vNULL;
  SLP_TREE_CHILDREN (*node).quick_push (child);
  continue;



[gomp-nvptx] nvptx backend: implement alloca with -msoft-stack

2015-12-14 Thread Alexander Monakov
This patch implements variable stack allocation for alloca/VLA on NVPTX if
-msoft-stack is enabled.  In addition to moving the stack pointer, we need to
copy the updated pointer into __nvptx_stacks[tid.y].

* config/nvptx/nvptx.c (nvptx_declare_function_name): Emit %outargs
using .local %outargs_ar only if not TARGET_SOFT_STACK.  Emit %outargs
under TARGET_SOFT_STACK by offsetting from %frame.
(nvptx_get_drap_rtx): Return %argp as the DRAP if needed.
* config/nvptx/nvptx.md (nvptx_register_operand): Allow %outargs under
TARGET_SOFT_STACK.
(nvptx_nonimmediate_operand): Ditto.
(allocate_stack): Implement for TARGET_SOFT_STACK.  Remove unused code.
(allocate_stack_): Remove unused pattern.
(set_softstack_insn): New pattern.
(restore_stack_block): Handle for TARGET_SOFT_STACK.
---

I have committed this patch to the gomp-nvptx branch.  Bernd, Nathan, I would
appreciate if you could comment on 'define_predicate' changes in nvptx.md.
There are three predicates that start like this:

  if (REG_P (op))
return !HARD_REGISTER_P (op);
  if (GET_CODE (op) == SUBREG && MEM_P (SUBREG_REG (op)))
return false;
  if (GET_CODE (op) == SUBREG)
return false;

For stack adjustments I need to allow operations on the stack pointer.  For
now I've implemented that as a fairly straightforward shortcut, but I guess it
doesn't look very nice.  What is the reason to reject "hard registers" there,
in the first place?  In any case, I'd like your input if you see a better way
to handle it.

Also, note that there's either a bug or a cleanup opportunity: the third "if"
statement is clearly more general than the second.

No regressions on check-c testsuite (with 'alloca' effective-target enabled).

Thanks.
Alexander

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index b12a7a8..599e460 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -883,7 +883,7 @@ nvptx_declare_function_name (FILE *file, const char *name, const_tree decl)
   HOST_WIDE_INT sz = crtl->outgoing_args_size;
   if (sz == 0)
 sz = 1;
-  if (cfun->machine->has_call_with_varargs)
+  if (!TARGET_SOFT_STACK && cfun->machine->has_call_with_varargs)
 {
   fprintf (file, "\t.reg.u%d %%outargs;\n"
   "\t.local.align 8 .b8 %%outargs_ar["
@@ -897,7 +897,8 @@ nvptx_declare_function_name (FILE *file, const char *name, const_tree decl)
   sz = get_frame_size ();
   if (sz == 0 && cfun->machine->has_call_with_sc)
 sz = 1;
-  if (sz > 0)
+  bool need_sp = cfun->calls_alloca || cfun->machine->has_call_with_varargs;
+  if (sz > 0 || TARGET_SOFT_STACK && need_sp)
 {
   int alignment = crtl->stack_alignment_needed / BITS_PER_UNIT;
 
@@ -923,10 +924,15 @@ nvptx_declare_function_name (FILE *file, const char *name, const_tree decl)
  if (alignment > keep_align)
fprintf (file, "\tand.b%d %%frame, %%frame, %d;\n",
 bits, -alignment);
+ fprintf (file, "\t.reg.u%d %%outargs;\n", bits);
+ sz = crtl->outgoing_args_size;
+ gcc_assert (sz % keep_align == 0);
+ fprintf (file, "\tsub.u%d %%outargs, %%frame, "
+  HOST_WIDE_INT_PRINT_DEC ";\n", bits, sz);
  /* crtl->is_leaf is not initialized because RA is not run.  */
  if (!leaf_function_p ())
{
- fprintf (file, "\tst.shared.u%d [%%fstmp2], %%frame;\n", bits);
+ fprintf (file, "\tst.shared.u%d [%%fstmp2], %%outargs;\n", bits);
  cfun->machine->using_softstack = true;
}
  need_softstack_decl = true;
@@ -996,6 +1002,8 @@ nvptx_function_ok_for_sibcall (tree, tree)
 static rtx
 nvptx_get_drap_rtx (void)
 {
+  if (TARGET_SOFT_STACK && stack_realign_drap)
+return arg_pointer_rtx;
   return NULL_RTX;
 }
 
diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index ae1909d..130c809 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -69,6 +69,8 @@ (define_attr "divergent" "false,true"
 (define_predicate "nvptx_register_operand"
   (match_code "reg,subreg")
 {
+  if (TARGET_SOFT_STACK && op == stack_pointer_rtx)
+return true;
   if (REG_P (op))
 return !HARD_REGISTER_P (op);
   if (GET_CODE (op) == SUBREG && MEM_P (SUBREG_REG (op)))
@@ -123,6 +125,8 @@ (define_predicate "nvptx_general_operand"
 (define_predicate "nvptx_nonimmediate_operand"
   (match_code "reg,subreg,mem")
 {
+  if (TARGET_SOFT_STACK && op == stack_pointer_rtx)
+return true;
   if (REG_P (op))
 return (op != frame_pointer_rtx
&& op != arg_pointer_rtx
@@ -1061,31 +1065,41 @@ (define_expand "allocate_stack"
(match_operand 1 "nvptx_register_operand")]
   ""
 {
+  if (TARGET_SOFT_STACK)
+{
+  emit_move_insn (stack_pointer_rtx,
+ gen_rtx_MINUS (Pmode, stack_pointer_rtx, operands[1]));
+  emit_insn (gen_set_softstack_insn (stack_pointer_rtx));
+  

Re: [v4] avoid alignment of static variables affecting stack's

2015-12-14 Thread Richard Biener
On Mon, Dec 14, 2015 at 1:09 PM, Bernd Schmidt  wrote:
> On 12/14/2015 10:07 AM, Richard Biener wrote:
>
>> Note that we also record alignment to make sure we can spill to properly
>> aligned stack slots.
>
>
>> I don't see why we don't need to do that for used statics/externs.  That
>> is
>> are you sure we never need to spill a var of their type?
>
>
> Why would they be different from other global variables declared outside a
> function? We don't have to worry about those.

No idea.

But there must be a reason statics/externals are expected and handled in all the
callers (and the function).

The new early-out just makes the code even more confusing than before.

Richard.

>
> Bernd
>


Re: ipa-cp heuristics fixes

2015-12-14 Thread Martin Jambor
Hi,

On Fri, Dec 11, 2015 at 10:20:20PM +0100, Jan Hubicka wrote:
> Actually I added
>   if (!ipa_is_param_used (info, i))
>     continue;
> shortcut to gather_context_independent_values which prevents
> us from recording context_independent_aggregate_values for unused
> aggregate parameters. Perhaps that is causing the issue?
> We can simply record them and just avoid returning true if
> all propagations happen to those.

No, it's a different thing changed by the patch.  The following patch
makes the testcase pass but of course it is not a good fix.  We were
not performing any IPA-CP on the testcase before, now we are, so I
suppose this has uncovered a debug info deficiency.

Martin


diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 8087f66..8c44b5a 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -2507,7 +2507,7 @@ estimate_local_effects (struct cgraph_node *node)
   known_aggs_ptrs = agg_jmp_p_vec_for_t_vec (known_aggs);
   int devirt_bonus = devirtualization_time_bonus (node, known_csts,
   known_contexts, known_aggs_ptrs);
-  if (always_const || devirt_bonus || removable_params_cost)
+  if (always_const || devirt_bonus)/* || removable_params_cost)*/
 {
   struct caller_statistics stats;
   inline_hints hints;


[visium] skip block move insn test on gr5

2015-12-14 Thread Olivier Hainque
Block move insns are specific to gr6. They aren't
available at all on gr5.

Committing to mainline with offline agreement from Eric.

Olivier

2015-12-14  Olivier Hainque  

testsuite/
*  gcc.target/visium/block_move.c: Skip for gr5.


[gomp4] Fix handling of kernel launch mappings.

2015-12-14 Thread James Norris

Hi,

The attached patch fixes an issue in the managing of
the page-locked buffer which holds the kernel launch
mappings. In the process of fixing the issue, I discovered
that 'struct map' was no longer needed, so that has
been removed as well.

Committed to gomp-4_0-branch.

Thanks,
Jim
Index: libgomp/ChangeLog.gomp
===
--- libgomp/ChangeLog.gomp	(revision 231614)
+++ libgomp/ChangeLog.gomp	(working copy)
@@ -1,3 +1,10 @@
+2015-12-14  James Norris  
+
+	* plugin/plugin-nvptx.c (struct map): Removed.
+	(map_init, map_pop): Remove use of struct map. (map_push):
+	Likewise and change argument list.
+	* testsuite/libgomp.oacc-c-c++-common/mapping-1.c: New
+
 2015-12-09  James Norris  
 
 	* oacc-parallel.c (handle_ftn_pointers): New function.
Index: libgomp/plugin/plugin-nvptx.c
===
--- libgomp/plugin/plugin-nvptx.c	(revision 231614)
+++ libgomp/plugin/plugin-nvptx.c	(working copy)
@@ -91,13 +91,6 @@
   struct ptx_device *ptx_dev;
 };
 
-struct map
-{
-  int async;
-  size_t  size;
-  char mappings[0];
-};
-
 static void
 map_init (struct ptx_stream *s)
 {
@@ -140,17 +133,13 @@
 static void
 map_pop (struct ptx_stream *s)
 {
-  struct map *m;
-
   assert (s != NULL);
   assert (s->h_next);
   assert (s->h_prev);
   assert (s->h_tail);
 
-  m = s->h_tail;
+  s->h_tail = s->h_next;
 
-  s->h_tail += m->size;
-
   if (s->h_tail >= s->h_end)
 s->h_tail = s->h_begin + (int) (s->h_tail - s->h_end);
 
@@ -167,16 +156,14 @@
 }
 
 static void
-map_push (struct ptx_stream *s, int async, size_t size, void **h, void **d)
+map_push (struct ptx_stream *s, size_t size, void **h, void **d)
 {
   int left;
   int offset;
-  struct map *m;
 
   assert (s != NULL);
 
   left = s->h_end - s->h_next;
-  size += sizeof (struct map);
 
   assert (s->h_prev);
   assert (s->h_next);
@@ -183,22 +170,14 @@
 
   if (size >= left)
 {
-  m = s->h_prev;
-  m->size += left;
-  s->h_next = s->h_begin;
-
-  if (s->h_next + size > s->h_end)
-	GOMP_PLUGIN_fatal ("unable to push map");
+  assert (s->h_next == s->h_prev);
+  s->h_next = s->h_prev = s->h_tail = s->h_begin;
 }
 
   assert (s->h_next);
 
-  m = s->h_next;
-  m->async = async;
-  m->size = size;
+  offset = s->h_next - s->h;
 
-  offset = (void *)&m->mappings[0] - s->h;
-
   *d = (void *)(s->d + offset);
   *h = (void *)(s->h + offset);
 
@@ -925,7 +904,7 @@
   /* This reserves a chunk of a pre-allocated page of memory mapped on both
  the host and the device. HP is a host pointer to the new chunk, and DP is
  the corresponding device pointer.  */
-  map_push (dev_str, async, mapnum * sizeof (void *), &hp, &dp);
+  map_push (dev_str, mapnum * sizeof (void *), &hp, &dp);
 
   GOMP_PLUGIN_debug (0, "  %s: prepare mappings\n", __FUNCTION__);
 
Index: libgomp/testsuite/libgomp.oacc-c-c++-common/mapping-1.c
===
--- libgomp/testsuite/libgomp.oacc-c-c++-common/mapping-1.c	(revision 0)
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/mapping-1.c	(working copy)
@@ -0,0 +1,63 @@
+/* { dg-do run } */
+
+#include 
+#include 
+#include 
+
+/* Exercise the kernel launch argument mapping.  */
+
+int
+main (int argc, char **argv)
+{
+  int a[256], b[256], c[256], d[256], e[256], f[256];
+  int i;
+  int n;
+
+  /* 48 is the size of the mappings for the first parallel construct.  */
+  n = sysconf (_SC_PAGESIZE) / 48 - 1;
+
+  i = 0;
+
+  for (i = 0; i < n; i++)
+{
+  #pragma acc parallel copy (a, b, c, d)
+	{
+	  int j;
+
+	  for (j = 0; j < 256; j++)
+	{
+	  a[j] = j;
+	  b[j] = j;
+	  c[j] = j;
+	  d[j] = j;
+	}
+	}
+}
+
+#pragma acc parallel copy (a, b, c, d, e, f)
+  {
+int j;
+
+for (j = 0; j < 256; j++)
+  {
+	a[j] = j;
+	b[j] = j;
+	c[j] = j;
+	d[j] = j;
+	e[j] = j;
+	f[j] = j;
+  }
+  }
+
+  for (i = 0; i < 256; i++)
+   {
+ if (a[i] != i) abort();
+ if (b[i] != i) abort();
+ if (c[i] != i) abort();
+ if (d[i] != i) abort();
+ if (e[i] != i) abort();
+ if (f[i] != i) abort();
+   }
+
+  exit (0);
+}
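
For illustration, the simplified chunk reservation in map_push can be sketched roughly as below. This is a minimal standalone sketch, not the plugin's code: `struct chunk_buf`, `chunk_buf_init` and `chunk_buf_push` are hypothetical names, and the sketch assumes that all earlier chunks have been released by the time the buffer wraps (the patch asserts a similar condition).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the stream's page-locked buffer.  */
struct chunk_buf
{
  char *begin;   /* start of the storage */
  char *end;     /* one past the last byte */
  char *next;    /* where the next chunk will be placed */
};

static void
chunk_buf_init (struct chunk_buf *b, char *storage, size_t size)
{
  b->begin = storage;
  b->end = storage + size;
  b->next = storage;
}

/* Reserve SIZE bytes and return the chunk's offset from the start of the
   buffer.  If the remaining space is too small, start over from the
   beginning (assumption: earlier chunks are no longer in use).  */
static size_t
chunk_buf_push (struct chunk_buf *b, size_t size)
{
  size_t left = (size_t) (b->end - b->next);
  size_t offset;

  /* A single chunk must fit into the buffer.  */
  assert (size < (size_t) (b->end - b->begin));

  if (size >= left)
    b->next = b->begin;

  offset = (size_t) (b->next - b->begin);
  b->next += size;
  return offset;
}
```

In the plugin the same offset is applied to both the host and the device base pointer, which is why only one offset needs to be computed per chunk.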


Re: [RFC] Dump ssaname info for default defs

2015-12-14 Thread Richard Biener
On Mon, Dec 14, 2015 at 11:50 AM, Tom de Vries  wrote:
> On 14/12/15 09:47, Richard Biener wrote:
>>
>> On Fri, Dec 11, 2015 at 6:05 PM, Tom de Vries 
>> wrote:
>>>
>>> Hi,
>>>
>>> atm, we dump ssa-name info for lhs-es of statements. That leaves out the
>>> ssa names with default defs.
>>>
>>> This proof-of-concept patch prints the ssa-name info for default defs, in
>>> the following format:
>>> ...
>>> __attribute__((noclone, noinline))
>>> bar (intD.6 * cD.1755, intD.6 * dD.1756)
>>> # PT = nonlocal
>>> # DEFAULT_DEF c_2(D)
>>> # PT = { D.1762 } (nonlocal)
>>> # ALIGN = 4, MISALIGN = 0
>>> # DEFAULT_DEF d_4(D)
>>> {
>>> ;;   basic block 2, loop depth 0, count 0, freq 1, maybe hot
>>> ;;prev block 0, next block 1, flags: (NEW, REACHABLE)
>>> ;;pred:   ENTRY [100.0%]  (FALLTHRU,EXECUTABLE)
>>># .MEM_3 = VDEF <.MEM_1(D)>
>>>*c_2(D) = 1;
>>># .MEM_5 = VDEF <.MEM_3>
>>>*d_4(D) = 2;
>>># VUSE <.MEM_5>
>>>return;
>>> ;;succ:   EXIT [100.0%]  (EXECUTABLE)
>>>
>>> }
>>> ...
>>>
>>> Good idea? Any further comments, f.i. on formatting?
>>
>>
>> I've had a similar patch in my dev tree for quite some while but never
>> pushed it because
>> of "formatting"...
>>
>> That said,
>>
>> +  if (gimple_in_ssa_p (fun))
>>
>> Please add flags & TDF_ALIAS here to avoid issues with dump-file scanning.
>>
>
> Done.
>
>
>> +{
>> +  arg = DECL_ARGUMENTS (fndecl);
>> +  while (arg)
>> +   {
>> + tree def = ssa_default_def (fun, arg);
>> + if (flags & TDF_ALIAS)
>> +   dump_ssaname_info_to_file (file, def);
>> + fprintf (file, "# DEFAULT_DEF ");
>> + print_generic_expr (file, def, dump_flags);
>>
>> Rather than
>>
>> # DEFAULT_DEF d_4(D)
>>
>> I'd print
>>
>> d_4(D) = GIMPLE_NOP;
>>
>> (or how gimple-nop is printed - that is, just print the def-stmt).
>>
>> My local patch simply adjusted the dumping of function
>> locals, thus I amended the existing
>>
>>if (gimple_in_ssa_p (cfun))
>>  for (ix = 1; ix < num_ssa_names; ++ix)
>>{
>>  tree name = ssa_name (ix);
>>  if (name && !SSA_NAME_VAR (name))
>>{
>>
>> loop.  Of course that intermixed default-defs with other anonymous
>> SSA vars which might be a little confusing.
>>
>> But prepending the list of locals with
>>
>> type d_4(D) = NOP();
>>
>> together with SSA info might be the best.
>
>
> Done.
>
> In addition, I added printing of SSA_NAME_VAR(def) as argument to NOP.
> I think that's even more clear: the var is printed in the same format as in
> the arguments list, so it makes it easier to relate the two:

Indeed.  Note that I'd drop the "NOP()" then, an assignment from the
parameter decl is clear enough a "nop".

Ok with that change.

Richard.

> ...
> __attribute__((noclone, noinline))
> bar (intD.6 * cD.1755, intD.6 * dD.1756)
> {
>   # PT = nonlocal
>   intD.6 * c_2(D) = NOP(cD.1755);
>   # PT = { D.1762 } (nonlocal)
>   # ALIGN = 4, MISALIGN = 0
>   intD.6 * d_4(D) = NOP(dD.1756);
>   intD.6 _6;
>   intD.6 _7;
> ...
>
>>  Note there is also the
>> static chain and the result decl (if DECL_BY_REFERENCE) to print.
>>
>
> Done, though I haven't tested that bit yet.
>
> Thanks,
> - Tom


[PATCH 2/3, libgomp] Resolve libgomp plugin deadlock on exit, nvptx parts

2015-12-14 Thread Chung-Lin Tang
These are the nvptx parts.

Thanks,
Chung-Lin

* plugin/plugin-nvptx.c (CUDA_CALL_ERET): New convenience macro.
(CUDA_CALL): Likewise.
(CUDA_CALL_ASSERT): Likewise.
(map_init): Change return type to bool, use CUDA_CALL* macros.
(map_fini): Likewise.
(init_streams_for_device): Change return type to bool, adjust
call to map_init.
(fini_streams_for_device): Change return type to bool, adjust
call to map_fini.
(select_stream_for_async): Release stream_lock before calls to
GOMP_PLUGIN_fatal, adjust call to map_init.
(nvptx_init): Use CUDA_CALL* macros.
(nvptx_attach_host_thread_to_device): Change return type to bool,
use CUDA_CALL* macros.
(nvptx_open_device): Use CUDA_CALL* macros.
(nvptx_close_device): Change return type to bool, use CUDA_CALL*
macros.
(nvptx_get_num_devices): Use CUDA_CALL* macros.
(link_ptx): Change return type to bool, use CUDA_CALL* macros.
(nvptx_exec): Use CUDA_CALL* macros.
(nvptx_alloc): Use CUDA_CALL* macros.
(nvptx_free): Change return type to bool, use CUDA_CALL* macros.
(nvptx_host2dev): Likewise.
(nvptx_dev2host): Likewise.
(nvptx_wait): Use CUDA_CALL* macros.
(nvptx_wait_async): Likewise.
(nvptx_wait_all): Likewise.
(nvptx_wait_all_async): Likewise.
(nvptx_set_cuda_stream): Adjust order of stream_lock acquire,
use CUDA_CALL* macros, adjust call to map_fini.
(GOMP_OFFLOAD_init_device): Change return type to bool,
adjust code accordingly.
(GOMP_OFFLOAD_fini_device): Likewise.
(GOMP_OFFLOAD_load_image): Adjust calls to
nvptx_attach_host_thread_to_device/link_ptx to handle errors,
use CUDA_CALL* macros.
(GOMP_OFFLOAD_alloc): Adjust calls to code to handle error return.
(GOMP_OFFLOAD_free): Change return type to bool, adjust calls to
handle error return.
(GOMP_OFFLOAD_dev2host): Likewise.
(GOMP_OFFLOAD_host2dev): Likewise.
(GOMP_OFFLOAD_openacc_register_async_cleanup): Use CUDA_CALL* macros.
(GOMP_OFFLOAD_openacc_create_thread_data): Likewise.
Index: libgomp/plugin/plugin-nvptx.c
===
--- libgomp/plugin/plugin-nvptx.c	(revision 231613)
+++ libgomp/plugin/plugin-nvptx.c	(working copy)
@@ -63,6 +63,34 @@ cuda_error (CUresult r)
   return desc;
 }
 
+/* Convenience macros for the frequently used CUDA library call and
+   error handling sequence.  This does not capture all the cases we
+   use in this file, but is common enough.  */
+
+#define CUDA_CALL_ERET(ERET, FN, ...)		\
+  do {		\
+unsigned __r = FN (__VA_ARGS__);		\
+if (__r != CUDA_SUCCESS)			\
+  {		\
+	GOMP_PLUGIN_error (#FN " error: %s",	\
+			   cuda_error (__r));	\
+	return ERET;\
+  }		\
+  } while (0)
+
+#define CUDA_CALL(FN, ...)			\
+  CUDA_CALL_ERET (false, (FN), __VA_ARGS__)
+
+#define CUDA_CALL_ASSERT(FN, ...)		\
+  do {		\
+unsigned __r = FN (__VA_ARGS__);		\
+if (__r != CUDA_SUCCESS)			\
+  {		\
+	GOMP_PLUGIN_fatal (#FN " error: %s",	\
+			   cuda_error (__r));	\
+  }		\
+  } while (0)
+
 static unsigned int instantiated_devices = 0;
 static pthread_mutex_t ptx_dev_lock = PTHREAD_MUTEX_INITIALIZER;
 
@@ -98,25 +126,18 @@ struct map
   char mappings[0];
 };
 
-static void
+static bool
 map_init (struct ptx_stream *s)
 {
-  CUresult r;
-
   int size = getpagesize ();
 
   assert (s);
   assert (!s->d);
   assert (!s->h);
 
-  r = cuMemAllocHost (&s->h, size);
-  if (r != CUDA_SUCCESS)
-    GOMP_PLUGIN_fatal ("cuMemAllocHost error: %s", cuda_error (r));
+  CUDA_CALL (cuMemAllocHost, &s->h, size);
+  CUDA_CALL (cuMemHostGetDevicePointer, &s->d, s->h, 0);
 
-  r = cuMemHostGetDevicePointer (&s->d, s->h, 0);
-  if (r != CUDA_SUCCESS)
-GOMP_PLUGIN_fatal ("cuMemHostGetDevicePointer error: %s", cuda_error (r));
-
   assert (s->h);
 
   s->h_begin = s->h;
@@ -125,16 +146,14 @@ map_init (struct ptx_stream *s)
 
   assert (s->h_next);
   assert (s->h_end);
+  return true;
 }
 
-static void
+static bool
 map_fini (struct ptx_stream *s)
 {
-  CUresult r;
-
-  r = cuMemFreeHost (s->h);
-  if (r != CUDA_SUCCESS)
-GOMP_PLUGIN_fatal ("cuMemFreeHost error: %s", cuda_error (r));
+  CUDA_CALL (cuMemFreeHost, s->h);
+  return true;
 }
 
 static void
@@ -325,7 +344,7 @@ nvptx_thread (void)
   return (struct nvptx_thread *) GOMP_PLUGIN_acc_thread ();
 }
 
-static void
+static bool
 init_streams_for_device (struct ptx_device *ptx_dev, int concurrency)
 {
   int i;
@@ -337,9 +356,10 @@ init_streams_for_device (struct ptx_device *ptx_de
   null_stream->multithreaded = true;
   null_stream->d = (CUdeviceptr) NULL;
   null_stream->h = NULL;
-  map_init (null_stream);
-  ptx_dev->null_stream = null_stream;
+  if (!map_init (null_stream))
+return false;
 
+  ptx_dev->null_stream = 
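
For illustration, the "call, check, report, return an error value" pattern behind the CUDA_CALL* macros can be sketched without CUDA as below. This is a minimal sketch under stated assumptions: `fake_call`, `STATUS_OK`, `CALL_ERET`, `CALL` and `init_two_steps` are hypothetical names, and `fprintf` stands in for GOMP_PLUGIN_error.

```c
#include <stdbool.h>
#include <stdio.h>

enum { STATUS_OK = 0 };

/* Hypothetical stand-in for a library call: it simply returns the status
   it is given, so that failures can be simulated.  */
static unsigned
fake_call (unsigned status)
{
  return status;
}

/* Call FN with the given arguments; on failure report the error and make
   the enclosing function return ERET.  */
#define CALL_ERET(ERET, FN, ...)			\
  do {							\
    unsigned r_ = FN (__VA_ARGS__);			\
    if (r_ != STATUS_OK)				\
      {							\
	fprintf (stderr, #FN " error: %u\n", r_);	\
	return ERET;					\
      }							\
  } while (0)

/* Variant for functions that return bool.  */
#define CALL(FN, ...) CALL_ERET (false, FN, __VA_ARGS__)

/* Returns true only if both simulated calls succeed.  */
static bool
init_two_steps (unsigned first, unsigned second)
{
  CALL (fake_call, first);
  CALL (fake_call, second);
  return true;
}
```

Because the macro returns from the enclosing function instead of aborting, the caller gets a chance to release locks and clean up before reporting the failure.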

[PATCH 3/3, libgomp] Resolve libgomp plugin deadlock on exit, intelmic parts

2015-12-14 Thread Chung-Lin Tang
Hi Ilya,
thanks for the prior review
(https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01893.html).
This version is mostly like the prior one, with some minor code updates.

Thanks,
Chung-Lin

2015-12-14  Chung-Lin Tang  

* plugin/libgomp-plugin-intelmic.cpp (offload): Change return type
to bool, adjust return code.
(GOMP_OFFLOAD_init_device): Likewise.
(GOMP_OFFLOAD_fini_device): Likewise.
(get_target_table): Likewise.
(offload_image): Likewise.
(GOMP_OFFLOAD_load_image): Adjust call to offload_image(), change
exit() to return error.
(GOMP_OFFLOAD_alloc): Change return type to bool, change to use out
parameter to return allocated pointer.
(GOMP_OFFLOAD_free): Change return type to bool, adjust return code.
(GOMP_OFFLOAD_host2dev): Likewise.
(GOMP_OFFLOAD_dev2host): Likewise.
(GOMP_OFFLOAD_dev2dev): Likewise.
Index: liboffloadmic/plugin/libgomp-plugin-intelmic.cpp
===
--- liboffloadmic/plugin/libgomp-plugin-intelmic.cpp	(revision 231613)
+++ liboffloadmic/plugin/libgomp-plugin-intelmic.cpp	(working copy)
@@ -205,7 +205,7 @@ GOMP_OFFLOAD_get_num_devices (void)
   return num_devices;
 }
 
-static void
+static bool
 offload (const char *file, uint64_t line, int device, const char *name,
 	 int num_vars, VarDesc *vars, const void **async_data)
 {
@@ -213,20 +213,21 @@ offload (const char *file, uint64_t line, int devi
   if (ofld)
 {
   if (async_data == NULL)
-	__offload_offload1 (ofld, name, 0, num_vars, vars, NULL, 0, NULL, NULL);
+	return __offload_offload1 (ofld, name, 0, num_vars, vars, NULL, 0,
+   NULL, NULL);
   else
 	{
 	  OffloadFlags flags;
 	  flags.flags = 0;
 	  flags.bits.omp_async = 1;
-	  __offload_offload3 (ofld, name, 0, num_vars, vars, NULL, 0, NULL,
-			  async_data, 0, NULL, flags, NULL);
+	  return __offload_offload3 (ofld, name, 0, num_vars, vars, NULL, 0,
+ NULL, async_data, 0, NULL, flags, NULL);
 	}
 }
   else
 {
-  fprintf (stderr, "%s:%d: Offload target acquire failed\n", file, line);
-  exit (1);
+  GOMP_PLUGIN_error ("%s:%d: Offload target acquire failed\n", file, line);
+  return false;
 }
 }
 
@@ -256,24 +257,25 @@ register_main_image ()
 
 /* liboffloadmic loads and runs offload_target_main on all available devices
during a first call to offload ().  */
-extern "C" void
+extern "C" bool
 GOMP_OFFLOAD_init_device (int device)
 {
   TRACE ("(device = %d)", device);
   pthread_once (&main_image_is_registered, register_main_image);
-  offload (__FILE__, __LINE__, device, "__offload_target_init_proc", 0, NULL,
-	   NULL);
+  return offload (__FILE__, __LINE__, device, "__offload_target_init_proc", 0,
+		  NULL, NULL);
 }
 
-extern "C" void
+extern "C" bool
 GOMP_OFFLOAD_fini_device (int device)
 {
   TRACE ("(device = %d)", device);
   /* Unreachable for GOMP_OFFLOAD_CAP_OPENMP_400.  */
   abort ();
+  return true;
 }
 
-static void
+static bool
 get_target_table (int device, int &num_funcs, int &num_vars, void **&table)
 {
   VarDesc vd1[2] = { vd_tgt2host, vd_tgt2host };
@@ -282,8 +284,9 @@ get_target_table (int device, int &num_funcs, int
   vd1[1].ptr = &num_vars;
   vd1[1].size = sizeof (num_vars);
 
-  offload (__FILE__, __LINE__, device, "__offload_target_table_p1", 2, vd1,
-	   NULL);
+  if (!offload (__FILE__, __LINE__, device, "__offload_target_table_p1", 2,
+		vd1, NULL))
+return false;
 
   int table_size = num_funcs + 2 * num_vars;
   if (table_size > 0)
@@ -295,15 +298,16 @@ get_target_table (int device, int &num_funcs, int
   vd2.ptr = table;
   vd2.size = table_size * sizeof (void *);
 
-  offload (__FILE__, __LINE__, device, "__offload_target_table_p2", 1, &vd2,
-	   NULL);
+  return offload (__FILE__, __LINE__, device, "__offload_target_table_p2",
+		  1, &vd2, NULL);
 }
+  return true;
 }
 
 /* Offload TARGET_IMAGE to all available devices and fill address_table with
corresponding target addresses.  */
 
-static void
+static bool
 offload_image (const void *target_image)
 {
   void *image_start = ((void **) target_image)[0];
@@ -317,8 +321,8 @@ offload_image (const void *target_image)
 		   + image_size);
   if (!image)
 {
-  fprintf (stderr, "%s: Can't allocate memory\n", __FILE__);
-  exit (1);
+  GOMP_PLUGIN_error ("%s: Can't allocate memory\n", __FILE__);
+  return false;
 }
 
   image->size = image_size;
@@ -333,13 +337,14 @@ offload_image (const void *target_image)
 
   /* Receive tables for target_image from all devices.  */
   DevAddrVect dev_table;
+  bool ret = true;
   for (int dev = 0; dev < num_devices; dev++)
 {
   int num_funcs = 0;
   int num_vars = 0;
   void **table = NULL;
 
-  get_target_table (dev, num_funcs, num_vars, table);
+  ret &= get_target_table (dev, num_funcs, num_vars, table);
 
   AddrVect curr_dev_table;
 
@@ 

Re: [PATCH, 4/16] Implement -foffload-alias

2015-12-14 Thread Richard Biener
On Sun, 13 Dec 2015, Tom de Vries wrote:

> On 11/12/15 14:00, Richard Biener wrote:
> > On Fri, 11 Dec 2015, Tom de Vries wrote:
> > 
> > > On 13/11/15 12:39, Jakub Jelinek wrote:
> > > > We simply have some compiler internal interface between the caller and
> > > > callee of the outlined regions, each interface in between those has
> > > > its own structure type used to communicate the info;
> > > > we can attach attributes on the fields, or some flags to indicate some
> > > > properties interesting from aliasing POV.  We don't really need to
> > > > perform full IPA-PTA, perhaps it would be enough to a) record somewhere
> > > > in cgraph the relationship in between such callers and callees (for
> > > > offloading regions we already have "omp target entrypoint" attribute on
> > > > the callee and a single caller), tell LTO if possible not to split those
> > > > into different partitions if easily possible, and then just for these
> > > > pairs perform aliasing/points-to analysis in the caller and the result
> > > > record using cliques/special attributes/whatever to the callee side, so
> > > > that the callee (outlined OpenMP/OpenACC/Cilk+ region) can then improve
> > > > its alias analysis.
> > > 
> > > Hi,
> > > 
> > > This work-in-progress patch allows me to use IPA PTA information in the
> > > kernels pass group.
> > > 
> > > Since:
> > > -  I'm running IPA PTA before ealias, and IPA PTA does not interpret
> > > restrict, and
> > > - compute_may_alias doesn't run if IPA PTA information is present
> > > I needed to convince ealias to do the restrict clique/base annotation.
> > > 
> > > It would be more logical to fit IPA PTA after ealias, but one is an IPA
> > > pass, the other a regular one-function pass, so I would have to split
> > > the containing pass groups pass_all_early_optimizations and
> > > pass_local_optimization_passes.
> > > I'll give that a try now.
> > > 
> 
> I've tried this approach, but realized that this changes the order in which
> non-openacc functions are processed in the compiler, so I've abandoned this
> idea.
> 
> > > Any comments?
> > 
> > I don't think you want to run IPA PTA before early
> > optimizations, it (and ealias) rely on some initial cleanup to
> > do anything meaningful with well-spent resources.
> > 
> > The local PTA "hack" also looks more like a waste of resources, but well
> > ... teaching IPA PTA to honor restrict might be an impossible task
> > though I didn't think much about it other than handling it only for
> > nonlocal_p functions (for others we should see all incoming args
> > if IPA PTA works optimally).  The restrict tags will leak all over
> > the place of course and in the end no meaningful cliques may remain.
> > 
> 
> This patch:
> - moves the kernels pass group to the first position in the pass list
>   after ealias where we're back in ipa mode
> - inserts a new ipa pass to contain the gimple pass group called
>   pass_oacc_ipa
> - inserts a version of ipa-pta before the pass group.

In principle I like this a lot, but

+  NEXT_PASS (pass_ipa_pta_oacc_kernels);
+  NEXT_PASS (pass_oacc_ipa);
+  PUSH_INSERT_PASSES_WITHIN (pass_oacc_ipa)

I think you can put pass_ipa_pta_oacc_kernels into the pass_oacc_ipa
group and thus just "clone" ipa_pta?  sub-passes of IPA passes can
be both ipa passes and non-ipa passes.

Thanks,
Richard.


Re: [gomp-nvptx] nvptx backend: implement alloca with -msoft-stack

2015-12-14 Thread Nathan Sidwell

On 12/14/15 08:50, Alexander Monakov wrote:
> I have committed this patch to the gomp-nvptx branch.  Bernd, Nathan, I would
> appreciate if you could comment on 'define_predicate' changes in nvptx.md.
> There are three predicates that start like this:
>
>    if (REG_P (op))
>      return !HARD_REGISTER_P (op);
>    if (GET_CODE (op) == SUBREG && MEM_P (SUBREG_REG (op)))
>      return false;
>    if (GET_CODE (op) == SUBREG)
>      return false;
>
> For stack adjustments I need to allow operations on the stack pointer.  For
> now I've implemented that as a fairly straightforward shortcut, but I guess it
> doesn't look very nice.  What is the reason to reject "hard registers" there,
> in the first place?  In any case, I'd like your input if you see a better way
> to handle it.

just a quick note that moving onto the MD file is on my todo this week.

> Also, note that there's either a bug or a cleanup opportunity: the third "if"
> statement is clearly more general than the second.

correct, I think there's a bunch of such cleanups.


nathan



Re: [v4] avoid alignment of static variables affecting stack's

2015-12-14 Thread Bernd Schmidt

On 12/14/2015 03:20 PM, Richard Biener wrote:
> But there must be a reason statics/externals are expected and handled in all the
> callers (and the function).

At the time this was written (git rev 60d031234), expand_one_var looked
very different. It called a subroutine expand_one_static_var which did
things like call rest_of_decl_compilation. That stuff was then removed
by Jan in a patch to drop non-unit-at-a-time compilation.

> The new early-out just makes the code even more confusing than before.

Ok, so let's bail out even earlier. The whole expand_now thing for
statics is likely to be just a historical artifact at this point.



Bernd


Re: adding -Wshadow-local and -Wshadow-compatible-local ?

2015-12-14 Thread Diego Novillo
On Fri, Dec 11, 2015 at 6:41 PM, Jim Meyering  wrote:
>
> Hi Diego,
>
> I noticed this patch that adds support for improved -Wshadow-related options:
>
>   [google] Add two new -Wshadow warnings (issue4452058)
>https://gcc.gnu.org/ml/gcc-patches/2011-04/msg02317.html
>https://codereview.appspot.com/4452058/
>
> Here are the proposed descriptions:
>
> -Wshadow-local which warns if a local variable shadows another local
> variable or parameter,
>
> -Wshadow-compatible-local which warns if a local variable shadows another
> local variable or parameter whose type is compatible with that of the
> shadowing variable.
>
> Yet, I see no further discussion of them, other than Jason's review feedback.
> Was this change deemed unsuitable for upstream gcc?

TBH, I do not remember.  That patch is available in the google
branches, IIRC.   I have no plans to pursue it for trunk.  Feel free
to propose it again.
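
For reference, the two kinds of shadowing that the quoted descriptions distinguish can be sketched as below. This is a hypothetical example: the function names are made up, and which option would flag which function is inferred only from the descriptions quoted above.

```c
/* A local variable shadows another local of a compatible type.  Going by
   the quoted descriptions, both -Wshadow-local and
   -Wshadow-compatible-local would be expected to flag this.  */
static int
compatible_shadow (int limit)
{
  int count = 0;
  for (int i = 0; i < limit; i++)
    {
      int count = i * 2;	/* shadows the outer 'count' (also int) */
      (void) count;
    }
  return count;			/* the outer 'count' is still 0 */
}

/* A local variable shadows another local of an incompatible type.  Going
   by the quoted descriptions, only -Wshadow-local would be expected to
   flag this.  */
static int
incompatible_shadow (int limit)
{
  int value = limit;
  {
    const char *value = "text";	/* shadows the 'int value' above */
    (void) value;
  }
  return value;			/* the outer 'value' is unchanged */
}
```

The first function shows the case that is most likely to hide a real bug: the inner assignment silently never reaches the variable that is returned.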


Diego.


Re: [PATCH] doc: discourage use of __attribute__((optimize())) in production code

2015-12-14 Thread Manuel López-Ibáñez

On 14/12/15 16:40, Markus Trippelsdorf wrote:
> On 2015.12.14 at 11:20 -0500, Trevor Saunders wrote:
>> On Mon, Dec 14, 2015 at 10:01:27AM +0100, Richard Biener wrote:
>>> On Sun, Dec 13, 2015 at 9:03 PM, Andi Kleen  wrote:
>>>> Markus Trippelsdorf  writes:
>>>>
>>>>> Many developers are still using __attribute__((optimize())) in
>>>>> production code, although it is quite broken.
>>>>
>>>> Who reads documentation? @) If you want to discourage it better warn once
>>>> at runtime.
>>>
>>> We're also quite heavily using it in LTO internally now.
>>
>> besides that does this really make sense?  I suspect very few people are
>> using this for the fun of it.  I'd guess most usage is to disable
>> optimizations to work around bugs, or maybe trying to get a very hot
>> function optimized more.  Either way I suspect it's only used by people
>> with good reason and this would just really irritate them.
>
> Well, if you look at bugzilla you'll find several wrong code bugs caused
> by this attribute, e.g.: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59262
>
> Also Richi stated in the past (I quote):
> »I consider the optimize attribute code seriously broken and
> unmaintained (but sometimes useful for debugging - and only that).«
>
> https://gcc.gnu.org/ml/gcc/2012-07/msg00201.html


It is even a FAQ: https://gcc.gnu.org/wiki/FAQ#optimize_attribute_broken
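
For readers unfamiliar with the attribute under discussion, a typical use looks roughly like the sketch below. `NO_OPT` and `add_unoptimized` are hypothetical names; the guard is there only so that the sketch also builds with compilers that do not implement the attribute.

```c
/* Apply the GCC 'optimize' function attribute only when compiling with
   GCC itself; other compilers get an empty macro.  */
#if defined (__GNUC__) && !defined (__clang__)
# define NO_OPT __attribute__ ((optimize ("O0")))
#else
# define NO_OPT
#endif

/* Ask GCC to compile just this function at -O0, e.g. while tracking down
   a suspected miscompilation.  As the thread notes, this is best kept to
   debugging rather than production code.  */
NO_OPT static int
add_unoptimized (int a, int b)
{
  return a + b;
}
```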

Cheers,

Manuel.




Re: Ping [PATCH] c++/42121 - diagnose invalid flexible array members

2015-12-14 Thread Jakub Jelinek
On Mon, Dec 14, 2015 at 09:45:16AM -0700, Martin Sebor wrote:
> --- a/gcc/testsuite/g++.dg/compat/struct-layout-1_generate.c
> +++ b/gcc/testsuite/g++.dg/compat/struct-layout-1_generate.c
> @@ -605,8 +605,11 @@ getrandll (void)
>return ret;
>  }
>  
> +/* Generate a subfield.  The object pointed to by FLEX is set to a non-zero
> +   value when the generated field is a flexible array member.  When set, it
> +   prevents subsequent fields from being generated (a flexible array mem*/
>  int
> -subfield (struct entry *e, char *letter)
> +subfield (struct entry *e, char *letter, int* flex)

Formatting, space before * instead of after it.

Jakub


Ping [PATCH] c++/42121 - diagnose invalid flexible array members

2015-12-14 Thread Martin Sebor

Ping:

The most recent patch revealed a problem in the test suite where
the g++.dg/compat/struct-layout-1_generate.c program generates
structs with invalid flexible array members.  The attached patch
fixes the generator to avoid that.

Jason,

Are there any further changes you'd like to suggest for this patch
or is it okay to commit?

Thanks
Martin

On 12/08/2015 09:59 AM, Martin Sebor wrote:

Thanks for the review and the helpful hints!

I've reworked and simplified the diagnostic part of the patch and
corrected the remaining issues I uncovered while testing the new
version (failing to reject some invalid flexible array members in
base classes).  Please find the new version in the attachment.

FWIW, in the patch, I tried to address only the reported problems
with flexible array members without changing how zero-length arrays
are treated.  That means that the latter are accepted in more cases
(with a pedantic warning) than the former.  For example, the
following is accepted:

 struct A {
 int a[0];   // pedantic warning: zero-size array member
 int n;  // not at end of struct A
 };

while this is rejected

 struct B {
 int a[];// hard error: flexible array member not at
 int n;  // end of struct B
 };

It would be easy to change the patch to treat zero-length arrays
more like flexible array members if that's viewed as desirable.

I also tried to avoid rejecting flexible array members that are
currently accepted and that are safe.  For example, GCC currently
accepts both flexible array members and zero-length arrays in base
classes (even polymorphic ones).  The patch continues to accept
those for compatibility with code that relies on it as long as
the flexible array members didn't overlap other members in derived
classes.  For example, this is still accepted:

 struct A { int x; };
 struct B { int n, a[]; };
 struct C: A, B { };

but this is rejected:

 struct D: B, A { };
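
For context, the usual C idiom that flexible array members exist for can be sketched as below. This is a minimal sketch in C rather than C++, and `struct int_vec`, `int_vec_new` and `int_vec_sum` are hypothetical names. It also shows why the member has to come last: the array's storage is whatever was allocated past the end of the struct.

```c
#include <assert.h>
#include <stdlib.h>

/* A length-prefixed array: the flexible array member must be the last
   member, because its elements live directly after the struct.  */
struct int_vec
{
  int n;
  int a[];
};

/* Allocate a vector with room for N elements and fill it with 0..N-1.
   Returns NULL if the allocation fails.  */
static struct int_vec *
int_vec_new (int n)
{
  struct int_vec *v;

  assert (n >= 0);
  v = malloc (sizeof *v + (size_t) n * sizeof v->a[0]);
  if (v == NULL)
    return NULL;

  v->n = n;
  for (int i = 0; i < n; i++)
    v->a[i] = i;
  return v;
}

/* Sum all elements of V.  */
static int
int_vec_sum (const struct int_vec *v)
{
  int sum = 0;
  for (int i = 0; i < v->n; i++)
    sum += v->a[i];
  return sum;
}
```

If another member followed `a`, writes to `a[i]` would overlap it, which is the overlap the patch is trying to diagnose for derived classes as well.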

My replies to your comments are below.

On 12/04/2015 10:51 AM, Jason Merrill wrote:

On 12/03/2015 11:42 PM, Martin Sebor wrote:

+  if (next && TREE_CODE (next) == FIELD_DECL)


This will break if there's a non-field between the array and the next
field.


You're right, I missed that case in my testing.  Fixed.


@@ -4114,7 +4115,10 @@ walk_subobject_offsets (tree type,

   /* Avoid recursing into objects that are not interesting.  */
   if (!CLASS_TYPE_P (element_type)
-  || !CLASSTYPE_CONTAINS_EMPTY_CLASS_P (element_type))
+  || !CLASSTYPE_CONTAINS_EMPTY_CLASS_P (element_type)
+  || !domain
+  /* Flexible array members have no upper bound.  */
+  || !TYPE_MAX_VALUE (domain))


Why is this desirable?  We do want to avoid empty bases at the same
address as a flexible array of the same type.


As we discussed on IRC, this bit is fine.  I added a few tests for
the layout to make sure the offset of flexible array members matches
the size of the containing class.  While adding these tests I found
a couple of regressions unrelated to my changes (68727 and 68711) so
it was time well spent.


+  && !tree_int_cst_equal (size_max_node, TYPE_MAX_VALUE (dom)))


This can be integer_minus_onep or integer_all_onesp.


Thanks.




+ its fields.  The recursive call to the function will
+ either return 0 or the flexible array member whose


Let's say NULL_TREE here rather than 0.


Sure.




+  {
+bool dummy = false;
+check_flexarrays (t, TYPE_FIELDS (t), &dummy);
+  }


This should be called from check_bases_and_members, or even integrated
into check_field_decls.


I tried moving it to check_bases_and_members but with more
testing found out that calling it there was too early. In order
to detect invalid flexible array members in virtual base classes
without rejecting valid ones, the primary base class needs to
have been determined.  That happens in layout_class_type()
called later on in finish_struct_1(). So I moved it just past
that call.




-  else if (name)
-pedwarn (input_location, OPT_Wpedantic, "ISO C++ forbids
zero-size array %qD", name);


Why?


At one point, the diagnostic would emit a badly messed up name
in some corner cases.  I think it might have been when I set
TYPE_DOMAIN to NULL_TREE rather than with the current approach
(I can't reproduce it anymore). I've restored the else block.


Can we leave TYPE_DOMAIN null for flexible arrays so you don't need to
add special new handling all over the place?


This was my initial approach until I noticed that it diverged
from C where TYPE_DOMAIN is set to the range [0, NULL_TREE], so
I redid it for consistency.




-tree decl;
+tree decl = NULL_TREE;


Why?


To avoid an ICE later on.  I didn't spend too much time trying
to understand how the control flow changed to trigger it but my
guess is that it has to do with the change to the upper bound.

/home/msebor/scm/fsf/gcc-42121/gcc/testsuite/g++.dg/ext/flexary2.C:16:9:
internal 

Re: [PATCH] doc: discourage use of __attribute__((optimize())) in production code

2015-12-14 Thread Markus Trippelsdorf
On 2015.12.14 at 11:20 -0500, Trevor Saunders wrote:
> On Mon, Dec 14, 2015 at 10:01:27AM +0100, Richard Biener wrote:
> > On Sun, Dec 13, 2015 at 9:03 PM, Andi Kleen  wrote:
> > > Markus Trippelsdorf  writes:
> > >
> > >> Many developers are still using __attribute__((optimize())) in
> > >> production code, although it is quite broken.
> > >
> > > Who reads documentation? @) If you want to discourage it better warn once
> > > at runtime.
> > 
> > We're also quite heavily using it in LTO internally now.
> 
> besides that does this really make sense?  I suspect very few people are
> using this for the fun of it.  I'd guess most usage is to disable
> optimizations to work around bugs, or maybe trying to get a very hot
> function optimized more.  Either way I suspect it's only used by people
> with good reason and this would just really irritate them.

Well, if you look at bugzilla you'll find several wrong code bugs caused
by this attribute, e.g.: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59262

Also Richi stated in the past (I quote):
»I consider the optimize attribute code seriously broken and
unmaintained (but sometimes useful for debugging - and only that).«

https://gcc.gnu.org/ml/gcc/2012-07/msg00201.html

-- 
Markus
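
For reference, the kind of usage being debated looks roughly like the
sketch below.  The function demo_sum and the macro DEMO_NO_OPT are
hypothetical; the attribute itself is GCC-specific, so the sketch
guards it and compiles to a plain function elsewhere.

```c
#include <stddef.h>

/* Assumption: only GCC proper understands this attribute, so other
   compilers get an empty macro.  */
#if defined(__GNUC__) && !defined(__clang__)
# define DEMO_NO_OPT __attribute__((optimize("O0")))
#else
# define DEMO_NO_OPT
#endif

/* Hypothetical function built without optimization, as one might do
   to work around a suspected miscompile.  */
DEMO_NO_OPT
long
demo_sum (const int *v, size_t n)
{
  long total = 0;
  for (size_t i = 0; i < n; i++)
    total += v[i];
  return total;
}
```

Given the concerns raised in this thread, moving such a function into
its own translation unit and compiling that file with different
options is probably the more robust choice.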


Re: [gomp4.5] Handle #pragma omp declare target link

2015-12-14 Thread Ilya Verbin
On Mon, Nov 30, 2015 at 21:49:02 +0100, Jakub Jelinek wrote:
> On Mon, Nov 30, 2015 at 11:29:34PM +0300, Ilya Verbin wrote:
> > > This looks wrong, both of these clearly could affect anything with
> > > DECL_HAS_VALUE_EXPR_P, not just the link vars.
> > > So, if you need to handle the "omp declare target link" vars specially,
> > > you should only handle those specially and nothing else.  And please try to
> > > explain why.
> > 
> > Actually these ifndefs are not needed, because assemble_decl never will be
> > called by accel compiler for original link vars.  I've added a check into
> > output_in_order, but missed a second place where assemble_decl is called -
> > symbol_table::output_variables.  So, fixed now.
> 
> Great.
> 
> > > Do we need to do anything in gomp_unload_image_from_device ?
> > > I mean at least in questionable programs that for link vars don't decrement
> > > the refcount of the var that replaced the link var to 0 first before
> > > dlclosing the library.
> > > At least host_var_table[j * 2 + 1] will have the MSB set, so we need to
> > > handle it differently.  Perhaps for that case perform a lookup, and if we
> > > get something which has link_map non-NULL, first perform as if there is
> > > target exit data delete (var) on it first?
> > 
> > You're right, it doesn't deallocate memory on the device if DSO leaves nonzero
> > refcount.  And currently host compiler doesn't set MSB in host_var_table, it's
> > set only by accel compiler.  But it's possible to do splay_tree_lookup for each
> > var to determine whether it is linked or not, like in the patch below.
> > Or do you prefer to set the bit in host compiler too?  It requires
> > lookup_attribute ("omp declare target link") for all vars in the table during
> > compilation, but allows to do splay_tree_lookup at run-time only for vars with
> > MSB set in host_var_table.
> > Unfortunately, calling gomp_exit_data from gomp_unload_image_from_device works
> > only for DSO, but it crashed when an executable leaves nonzero refcount, because
> > target device may be already uninitialized from plugin's __run_exit_handlers
> > (and it is in case of intelmic), so gomp_exit_data cannot run free_func.
> > Is it possible to add some atexit (...) to libgomp, which will set shutting_down
> > flag, and just do nothing in gomp_unload_image_from_device if it is set?
> 
> Sorry, I didn't mean you should call gomp_exit_data, what I meant was that
> you perform the same action as would delete(var) do in that case.
> Calling gomp_exit_data e.g. looks it up again etc.
> Supposedly having the MSB in host table too is useful, so if you could
> handle that, it would be nice.  And splay_tree_lookup only if the MSB is
> set.
> So,
> if (!host_data_has_msb_set)
>   splay_tree_remove (>mem_map, );
> else
>   {
> splay_tree_key n = splay_tree_lookup (>mem_map, );
> if (n->link_key)
> {
>   n->refcount = 0;
>   n->link_key = NULL;
>   splay_tree_remove (>mem_map, n);
>   if (n->tgt->refcount > 1)
> n->tgt->refcount--;
>   else
> gomp_unmap_tgt (n->tgt);
> }
>   else
> splay_tree_remove (>mem_map, n);
>   }
> or so.

Here is an updated patch.  Now MSB is set in both tables, and
gomp_unload_image_from_device is changed.  I've verified, using a simple DSO
testcase, that memory on the target is freed after dlclose.
Bootstrap and make check on x86_64-linux passed.
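
The idea of marking a table entry through the most significant bit of
its size word can be sketched as follows.  This is only an
illustration: DEMO_LINK_BIT and the demo_* helpers are hypothetical
names, not the ones used in the patch.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag: the top bit of the size word marks a "link" var.  */
#define DEMO_LINK_BIT \
  ((uintptr_t) 1 << (sizeof (uintptr_t) * CHAR_BIT - 1))

/* Tag a size as belonging to a link variable.  */
uintptr_t
demo_mark_link (uintptr_t size)
{
  return size | DEMO_LINK_BIT;
}

/* True if the table entry carries the link flag.  */
bool
demo_is_link (uintptr_t entry)
{
  return (entry & DEMO_LINK_BIT) != 0;
}

/* Recover the real size by masking the flag off.  */
uintptr_t
demo_size (uintptr_t entry)
{
  return entry & ~DEMO_LINK_BIT;
}
```

With such a scheme the run-time only needs the extra lookup for
entries where demo_is_link () returns true, which is the trade-off
discussed above.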


gcc/c-family/
* c-common.c (c_common_attribute_table): Handle "omp declare target
link" attribute.
gcc/
* cgraphunit.c (output_in_order): Do not assemble "omp declare target
link" variables in ACCEL_COMPILER.
* gimplify.c (gimplify_adjust_omp_clauses): Do not remove mapping of
"omp declare target link" variables.
* lto/lto.c: Include stringpool.h and fold-const.h.
(offload_handle_link_vars): New static function.
(lto_main): Call offload_handle_link_vars.
* omp-low.c (scan_sharing_clauses): Do not remove mapping of "omp
declare target link" variables.
(add_decls_addresses_to_decl_constructor): For "omp declare target link"
variables output address of the artificial pointer instead of address of
the variable.  Set most significant bit of the size to mark them.
(pass_data_omp_target_link): New pass_data.
(pass_omp_target_link): New class.
(find_link_var_op): New static function.
(make_pass_omp_target_link): New function.
* passes.def: Add pass_omp_target_link.
* tree-pass.h (make_pass_omp_target_link): Declare.
* varpool.c (symbol_table::output_variables): Do not assemble "omp
declare target link" variables in ACCEL_COMPILER.
libgomp/
* libgomp.h (REFCOUNT_LINK): Define.
(struct splay_tree_key_s): Add link_key.
* target.c (gomp_map_vars): Treat 

Re: [gomp4.5] Handle #pragma omp declare target link

2015-12-14 Thread Ilya Verbin
On Fri, Dec 11, 2015 at 18:27:13 +0100, Jakub Jelinek wrote:
> On Tue, Dec 08, 2015 at 05:45:59PM +0300, Ilya Verbin wrote:
> > @@ -356,6 +361,11 @@ gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum,
> >  }
> >  
> >gomp_mutex_lock (&devicep->lock);
> > +  if (devicep->state == GOMP_DEVICE_FINALIZED)
> > +{
> > +  gomp_mutex_unlock (&devicep->lock);
> 
> You need to free (tgt); here I think to avoid leaking memory.

Done.

> > +  return NULL;
> > +}
> >  
> >for (i = 0; i < mapnum; i++)
> >  {
> > @@ -834,6 +844,11 @@ gomp_unmap_vars (struct target_mem_desc *tgt, bool do_copyfrom)
> >  }
> >  
> >gomp_mutex_lock (&devicep->lock);
> > +  if (devicep->state == GOMP_DEVICE_FINALIZED)
> > +{
> > +  gomp_mutex_unlock (&devicep->lock);
> > +  return;
> > +  return;
> 
> Supposedly you want at least free (tgt->array); free (tgt); here.

Done.

> Plus the question is if the mappings shouldn't be removed from the splay tree
> before that.

This code can be executed only at program shutdown, so I think that removing
from the splay tree isn't necessary here, it will only consume time.
Besides, we do not remove at shutdown those vars, which have non-zero refcount.

> > +/* This function finalizes all initialized devices.  */
> > +
> > +static void
> > +gomp_target_fini (void)
> > +{
> > +  int i;
> > +  for (i = 0; i < num_devices; i++)
> > +{
> > +  struct gomp_device_descr *devicep = [i];
> > +  gomp_mutex_lock (&devicep->lock);
> > +  if (devicep->state == GOMP_DEVICE_INITIALIZED)
> > +   {
> > + devicep->fini_device_func (devicep->target_id);
> > + devicep->state = GOMP_DEVICE_FINALIZED;
> > +   }
> > +  gomp_mutex_unlock (&devicep->lock);
> > +}
> > +}
> 
> The question is what will this do if there are async target tasks still
> running on some of the devices at this point (forgotten #pragma omp taskwait
> or similar if target nowait regions are started outside of parallel region,
> or exit inside of parallel, etc.).  But perhaps it can be handled incrementally.
> Also there is the question that the 
> So I think the patch is ok with the above mentioned changes.

Here is what I've committed to trunk.


libgomp/
* libgomp.h (gomp_device_state): New enum.
(struct gomp_device_descr): Replace is_initialized with state.
(gomp_fini_device): Remove declaration.
* oacc-host.c (host_dispatch): Use state instead of is_initialized.
* oacc-init.c (acc_init_1): Use state instead of is_initialized.
(acc_shutdown_1): Likewise.  Inline gomp_fini_device.
(acc_set_device_type): Use state instead of is_initialized.
(acc_set_device_num): Likewise.
* target.c (resolve_device): Use state instead of is_initialized.
Do not initialize finalized device.
(gomp_map_vars): Do nothing if device is finalized.
(gomp_unmap_vars): Likewise.
(gomp_update): Likewise.
(GOMP_offload_register_ver): Use state instead of is_initialized.
(GOMP_offload_unregister_ver): Likewise.
(gomp_init_device): Likewise.
(gomp_unload_device): Likewise.
(gomp_fini_device): Remove.
(gomp_get_target_fn_addr): Do nothing if device is finalized.
(GOMP_target): Go to host fallback if device is finalized.
(GOMP_target_ext): Likewise.
(gomp_exit_data): Do nothing if device is finalized.
(gomp_target_task_fn): Go to host fallback if device is finalized.
(gomp_target_fini): New static function.
(gomp_target_init): Use state instead of is_initialized.
Call gomp_target_fini at exit.
liboffloadmic/
* plugin/libgomp-plugin-intelmic.cpp (unregister_main_image): Remove.
(register_main_image): Do not call unregister_main_image at exit.
(GOMP_OFFLOAD_fini_device): Allow for OpenMP.  Unregister main image.
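
The state handling summarized in the ChangeLog above can be sketched
as a small state machine.  This is a simplified illustration with
hypothetical demo_* names; it leaves out the per-device locking and
the plugin callbacks that the real code uses.

```c
#include <stdbool.h>

/* Hypothetical mirror of the three device states.  */
enum demo_state
{
  DEMO_UNINITIALIZED,
  DEMO_INITIALIZED,
  DEMO_FINALIZED
};

struct demo_device
{
  enum demo_state state;
  int fini_calls;		/* Counts simulated fini_device calls.  */
};

/* Return true if D may be used for offloading, initializing it lazily.
   A finalized device is never re-initialized; the caller is expected
   to fall back to the host in that case.  */
bool
demo_resolve (struct demo_device *d)
{
  if (d->state == DEMO_FINALIZED)
    return false;
  if (d->state == DEMO_UNINITIALIZED)
    d->state = DEMO_INITIALIZED;
  return true;
}

/* Finalize every initialized device, as an exit handler would.  */
void
demo_fini_all (struct demo_device *devs, int n)
{
  for (int i = 0; i < n; i++)
    if (devs[i].state == DEMO_INITIALIZED)
      {
	devs[i].fini_calls++;
	devs[i].state = DEMO_FINALIZED;
      }
}
```

Devices that were never initialized are left untouched by
demo_fini_all, matching the check on GOMP_DEVICE_INITIALIZED in the
quoted gomp_target_fini.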


diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index c467f97..9d9949f 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -888,6 +888,14 @@ typedef struct acc_dispatch_t
   } cuda;
 } acc_dispatch_t;
 
+/* Various state of the accelerator device.  */
+enum gomp_device_state
+{
+  GOMP_DEVICE_UNINITIALIZED,
+  GOMP_DEVICE_INITIALIZED,
+  GOMP_DEVICE_FINALIZED
+};
+
 /* This structure describes accelerator device.
It contains name of the corresponding libgomp plugin, function handlers for
interaction with the device, ID-number of the device, and information about
@@ -933,8 +941,10 @@ struct gomp_device_descr
   /* Mutex for the mutable data.  */
   gomp_mutex_t lock;
 
-  /* Set to true when device is initialized.  */
-  bool is_initialized;
+  /* Current state of the device.  OpenACC allows to move from INITIALIZED state
+ back to UNINITIALIZED state.  OpenMP allows only to move from INITIALIZED
+ to FINALIZED state (at program shutdown).  */
+  enum gomp_device_state state;
 
   /* OpenACC-specific data and functions.  */
   /* This is mutable because of its mutable data_environ and 

Re: [build] Only support -gstabs on Mac OS X if assembler supports it (PR target/67973)

2015-12-14 Thread Mike Stump
On Dec 14, 2015, at 2:40 AM, Rainer Orth  wrote:
> As described in PR target/67973, newer assemblers on Mac OS X, which
> are based on LLVM instead of gas, don't support .stab* directives any
> longer.  The following patch detects this situation and tries to fall
> back to the older gas-based as if it is still accessible via as -Q.
> 
> Tested on x86_64-apple-darwin15.2.0 and as expected the -gstabs* tests
> now pass.
> 
> However, I'm not really comfortable with this solution.

When I proposed automagically adding -Q, it sounded like a good idea.  :-(

Yeah, hard to disagree with your intuition.  If a future assembler had or added
stabs that had or added all these features, it would come first on the path,
and it would all just work out nicely with just a configure check to disable
stabs if it didn’t work.  That simple check should be reliable and work well.

> Initially, I
> forgot to wrap the -Q option to as in %{gstabs*:...}, which led to a
> bootstrap failure: the gas- and LLVM-based assemblers differ in a
> number of other ways

Yeah, having the feature set be a dynamic property when our software decides on
a static basis is bound to hurt.  It seems that the most likely patch would be to
just turn off stabs in a way that the test suite disables the tests by itself,
or to just quiet the test suite.

[PATCH 2/2] [graphite] update required isl versions

2015-12-14 Thread Sebastian Pop
we now check the isl version, as there are no real differences in existing files
in between isl 0.14 and isl 0.15.
---
 config/isl.m4| 29 +++
 configure| 23 +--
 gcc/config.in| 12 
 gcc/configure| 61 ++--
 gcc/configure.ac | 23 ---
 gcc/graphite-isl-ast-to-gimple.c | 10 ++-
 gcc/graphite-optimize-isl.c  | 30 ++--
 gcc/graphite-poly.c  |  8 --
 gcc/graphite-sese-to-poly.c  |  8 --
 gcc/graphite.h   |  1 +
 10 files changed, 39 insertions(+), 166 deletions(-)

diff --git a/config/isl.m4 b/config/isl.m4
index 459fac1..7387ff2 100644
--- a/config/isl.m4
+++ b/config/isl.m4
@@ -19,23 +19,23 @@
 
 # ISL_INIT_FLAGS ()
 # -
-# Provide configure switches for ISL support.
+# Provide configure switches for isl support.
 # Initialize isllibs/islinc according to the user input.
 AC_DEFUN([ISL_INIT_FLAGS],
 [
   AC_ARG_WITH([isl-include],
 [AS_HELP_STRING(
   [--with-isl-include=PATH],
-  [Specify directory for installed ISL include files])])
+  [Specify directory for installed isl include files])])
   AC_ARG_WITH([isl-lib],
 [AS_HELP_STRING(
   [--with-isl-lib=PATH],
-  [Specify the directory for the installed ISL library])])
+  [Specify the directory for the installed isl library])])
 
   AC_ARG_ENABLE(isl-version-check,
 [AS_HELP_STRING(
   [--disable-isl-version-check],
-  [disable check for ISL version])],
+  [disable check for isl version])],
 ENABLE_ISL_CHECK=$enableval,
 ENABLE_ISL_CHECK=yes)
   
@@ -58,15 +58,15 @@ AC_DEFUN([ISL_INIT_FLAGS],
   if test "x${with_isl_lib}" != x; then
 isllibs="-L$with_isl_lib"
   fi
-  dnl If no --with-isl flag was specified and there is in-tree ISL
+  dnl If no --with-isl flag was specified and there is in-tree isl
   dnl source, set up flags to use that and skip any version tests
-  dnl as we cannot run them before building ISL.
+  dnl as we cannot run them before building isl.
   if test "x${islinc}" = x && test "x${isllibs}" = x \
  && test -d ${srcdir}/isl; then
 isllibs='-L$$r/$(HOST_SUBDIR)/isl/'"$lt_cv_objdir"' '
 islinc='-I$$r/$(HOST_SUBDIR)/isl/include -I$$s/isl/include'
 ENABLE_ISL_CHECK=no
-AC_MSG_WARN([using in-tree ISL, disabling version check])
+AC_MSG_WARN([using in-tree isl, disabling version check])
   fi
 
   isllibs="${isllibs} -lisl"
@@ -75,7 +75,7 @@ AC_DEFUN([ISL_INIT_FLAGS],
 
 # ISL_REQUESTED (ACTION-IF-REQUESTED, ACTION-IF-NOT)
 # 
-# Provide actions for failed ISL detection.
+# Provide actions for failed isl detection.
 AC_DEFUN([ISL_REQUESTED],
 [
   AC_REQUIRE([ISL_INIT_FLAGS])
@@ -106,12 +106,17 @@ AC_DEFUN([ISL_CHECK_VERSION],
 LDFLAGS="${_isl_saved_LDFLAGS} ${isllibs}"
 LIBS="${_isl_saved_LIBS} -lisl"
 
-AC_MSG_CHECKING([for compatible ISL])
-AC_LINK_IFELSE([AC_LANG_PROGRAM([[#include ]], [[;]])],
-   [gcc_cv_isl=yes],
-   [gcc_cv_isl=no])
+AC_MSG_CHECKING([for isl 0.15 (or deprecated 0.14)])
+AC_TRY_LINK([#include ],
+[isl_ctx_get_max_operations (isl_ctx_alloc ());],
+[gcc_cv_isl=yes],
+[gcc_cv_isl=no])
 AC_MSG_RESULT([$gcc_cv_isl])
 
+if test "${gcc_cv_isl}" = no ; then
+  AC_MSG_RESULT([recommended isl version is 0.15, minimum required isl version 0.14 is deprecated])
+fi
+
 CFLAGS=$_isl_saved_CFLAGS
 LDFLAGS=$_isl_saved_LDFLAGS
 LIBS=$_isl_saved_LIBS
diff --git a/configure b/configure
index 090615f..a6495c4 100755
--- a/configure
+++ b/configure
@@ -1492,7 +1492,7 @@ Optional Features:
   build static libjava [default=no]
   --enable-bootstrap  enable bootstrapping [yes if native build]
   --disable-isl-version-check
-  disable check for ISL version
+  disable check for isl version
   --enable-ltoenable link time optimization support
   --enable-linker-plugin-configure-flags=FLAGS
   additional flags for configuring linker plugins
@@ -1553,8 +1553,8 @@ Optional Packages:
   package. Equivalent to
   --with-isl-include=PATH/include plus
   --with-isl-lib=PATH/lib
-  --with-isl-include=PATH Specify directory for installed ISL include files
-  --with-isl-lib=PATH Specify the directory for the installed ISL library
+  --with-isl-include=PATH Specify directory for installed isl include files
+  --with-isl-lib=PATH Specify the directory for the installed isl library
   --with-build-sysroot=SYSROOT
   use sysroot as the system root during the build
   --with-debug-prefix-map='A=B C=D ...'
@@ -6003,8 +6003,8 @@ fi
 

[PATCH 1/2] [graphite] add more dumps on data dependence graph

2015-12-14 Thread Sebastian Pop
---
 gcc/graphite-dependences.c| 31 +++
 gcc/graphite-poly.c   | 15 ++-
 gcc/graphite-scop-detection.c | 21 -
 3 files changed, 57 insertions(+), 10 deletions(-)

diff --git a/gcc/graphite-dependences.c b/gcc/graphite-dependences.c
index bb81ae3..7b7912a 100644
--- a/gcc/graphite-dependences.c
+++ b/gcc/graphite-dependences.c
@@ -89,8 +89,16 @@ scop_get_reads (scop_p scop, vec pbbs)
if (pdr_read_p (pdr))
  {
if (dump_file)
- print_pdr (dump_file, pdr);
+ {
+   fprintf (dump_file, "Adding read to depedence graph: ");
+   print_pdr (dump_file, pdr);
+ }
res = isl_union_map_add_map (res, add_pdr_constraints (pdr, pbb));
+   if (dump_file)
+ {
+   fprintf (dump_file, "Reads depedence graph: ");
+   print_isl_union_map (dump_file, res);
+ }
  }
 }
 
@@ -114,8 +122,16 @@ scop_get_must_writes (scop_p scop, vec pbbs)
if (pdr_write_p (pdr))
  {
if (dump_file)
- print_pdr (dump_file, pdr);
+ {
+   fprintf (dump_file, "Adding must write to depedence graph: ");
+   print_pdr (dump_file, pdr);
+ }
res = isl_union_map_add_map (res, add_pdr_constraints (pdr, pbb));
+   if (dump_file)
+ {
+   fprintf (dump_file, "Must writes depedence graph: ");
+   print_isl_union_map (dump_file, res);
+ }
  }
 }
 
@@ -139,9 +155,16 @@ scop_get_may_writes (scop_p scop, vec pbbs)
if (pdr_may_write_p (pdr))
  {
if (dump_file)
- print_pdr (dump_file, pdr);
-
+ {
+   fprintf (dump_file, "Adding may write to depedence graph: ");
+   print_pdr (dump_file, pdr);
+ }
res = isl_union_map_add_map (res, add_pdr_constraints (pdr, pbb));
+   if (dump_file)
+ {
+   fprintf (dump_file, "May writes depedence graph: ");
+   print_isl_union_map (dump_file, res);
+ }
  }
 }
 
diff --git a/gcc/graphite-poly.c b/gcc/graphite-poly.c
index f4bdd40..6c01a4c 100644
--- a/gcc/graphite-poly.c
+++ b/gcc/graphite-poly.c
@@ -31,13 +31,15 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree.h"
 #include "gimple.h"
 #include "cfghooks.h"
-#include "gimple-pretty-print.h"
 #include "diagnostic-core.h"
 #include "fold-const.h"
 #include "gimple-iterator.h"
 #include "tree-ssa-loop.h"
 #include "cfgloop.h"
 #include "tree-data-ref.h"
+#include "pretty-print.h"
+#include "gimple-pretty-print.h"
+#include "tree-dump.h"
 
 #include 
 #include 
@@ -147,6 +149,17 @@ new_poly_dr (poly_bb_p pbb, gimple *stmt, enum poly_dr_type type,
   pdr->subscript_sizes = subscript_sizes;
   PDR_TYPE (pdr) = type;
   PBB_DRS (pbb).safe_push (pdr);
+
+  if (dump_file)
+{
+  fprintf (dump_file, "Converting dr: ");
+  print_pdr (dump_file, pdr);
+  fprintf (dump_file, "To polyhedral representation:\n");
+  fprintf (dump_file, "  - access functions: ");
+  print_isl_map (dump_file, acc);
+  fprintf (dump_file, "  - subscripts: ");
+  print_isl_set (dump_file, subscript_sizes);
+}
 }
 
 /* Free polyhedral data reference PDR.  */
diff --git a/gcc/graphite-scop-detection.c b/gcc/graphite-scop-detection.c
index 729a5fd..23562d1 100644
--- a/gcc/graphite-scop-detection.c
+++ b/gcc/graphite-scop-detection.c
@@ -1684,9 +1684,9 @@ build_cross_bb_scalars_def (scop_p scop, tree def, basic_block def_bb,
 if (def_bb != gimple_bb (use_stmt) && !is_gimple_debug (use_stmt))
   {
writes->safe_push (def);
-   DEBUG_PRINT (dp << "Adding scalar write:\n";
+   DEBUG_PRINT (dp << "Adding scalar write: ";
 print_generic_expr (dump_file, def, 0);
-dp << "From stmt:\n";
+dp << "\nFrom stmt: ";
 print_gimple_stmt (dump_file,
SSA_NAME_DEF_STMT (def), 0, 0));
/* This is required by the FOR_EACH_IMM_USE_STMT when we want to break
@@ -1713,9 +1713,9 @@ build_cross_bb_scalars_use (scop_p scop, tree use, gimple *use_stmt,
   gimple *def_stmt = SSA_NAME_DEF_STMT (use);
   if (gimple_bb (def_stmt) != gimple_bb (use_stmt))
 {
-  DEBUG_PRINT (dp << "Adding scalar read:";
+  DEBUG_PRINT (dp << "Adding scalar read: ";
   print_generic_expr (dump_file, use, 0);
-  dp << "\nFrom stmt:";
+  dp << "\nFrom stmt: ";
   print_gimple_stmt (dump_file, use_stmt, 0, 0));
   reads->safe_push (std::make_pair (use_stmt, use));
 }
@@ -1879,7 +1879,18 @@ gather_bbs::before_dom_children (basic_block bb)
   int i;
   data_reference_p dr;
   FOR_EACH_VEC_ELT (gbb->data_refs, i, dr)
-

[PATCH] Fix -fcompare-debug issue in tree-cfgcleanup (PR tree-optimization/66688)

2015-12-14 Thread Jakub Jelinek
Hi!

As the testcase used to show on ppc64le with slightly older trunk,
cleanup_control_flow_bb can be called on a bb with newly noreturn
call followed by debug stmts.  With -g0, cleanup_control_flow_bb
removes the fallthru edge, so we need to do it even if followed by debug
stmts.
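
The problem and the fix can be illustrated with a toy model of a
basic block.  The sketch below uses hypothetical demo_* names and a
plain array instead of GIMPLE; it only shows why looking at the last
non-debug statement makes the result independent of debug statements.

```c
/* Hypothetical statement kinds for the toy model.  */
enum demo_stmt_kind
{
  DEMO_ASSIGN,
  DEMO_NORETURN_CALL,
  DEMO_DEBUG_STMT
};

struct demo_bb
{
  enum demo_stmt_kind stmts[8];
  int n;			/* Number of statements in use.  */
  int has_fallthru;		/* Nonzero if a fallthru edge exists.  */
};

/* Index of the last statement that is not a debug stmt, or -1.  */
int
demo_last_nondebug (const struct demo_bb *bb)
{
  int i = bb->n - 1;
  while (i >= 0 && bb->stmts[i] == DEMO_DEBUG_STMT)
    i--;
  return i;
}

/* If the block ends in a noreturn call (ignoring debug stmts), drop
   the trailing debug stmts and the fallthru edge.  Returns nonzero
   if the edge was removed.  */
int
demo_cleanup (struct demo_bb *bb)
{
  int last = demo_last_nondebug (bb);
  if (last < 0 || bb->stmts[last] != DEMO_NORETURN_CALL)
    return 0;
  bb->n = last + 1;
  if (!bb->has_fallthru)
    return 0;
  bb->has_fallthru = 0;
  return 1;
}
```

A block with trailing debug statements and one without end up in the
same state after demo_cleanup, which is the property -fcompare-debug
checks for.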

This patch is one possible way to fix this, another one is attached to the
PR.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2015-12-14  Jakub Jelinek  

PR tree-optimization/66688
* tree-cfgcleanup.c (cleanup_control_flow_bb): Handle
noreturn call followed only by debug stmts by removing
the debug stmts and handling it the same as if the noreturn
call is the last stmt.

* gcc.dg/pr66688.c: New test.

--- gcc/tree-cfgcleanup.c.jj   2015-12-02 20:26:59.0 +0100
+++ gcc/tree-cfgcleanup.c   2015-12-14 17:34:10.748487811 +0100
@@ -186,7 +186,7 @@ cleanup_control_flow_bb (basic_block bb)
  we need to prune cfg.  */
   retval |= gimple_purge_dead_eh_edges (bb);
 
-  gsi = gsi_last_bb (bb);
+  gsi = gsi_last_nondebug_bb (bb);
   if (gsi_end_p (gsi))
 return retval;
 
@@ -197,7 +197,10 @@ cleanup_control_flow_bb (basic_block bb)
 
   if (gimple_code (stmt) == GIMPLE_COND
   || gimple_code (stmt) == GIMPLE_SWITCH)
-retval |= cleanup_control_expr_graph (bb, gsi);
+{
+  gcc_checking_assert (gsi_stmt (gsi_last_bb (bb)) == stmt);
+  retval |= cleanup_control_expr_graph (bb, gsi);
+}
   else if (gimple_code (stmt) == GIMPLE_GOTO
   && TREE_CODE (gimple_goto_dest (stmt)) == ADDR_EXPR
   && (TREE_CODE (TREE_OPERAND (gimple_goto_dest (stmt), 0))
@@ -210,6 +213,7 @@ cleanup_control_flow_bb (basic_block bb)
   edge_iterator ei;
   basic_block target_block;
 
+  gcc_checking_assert (gsi_stmt (gsi_last_bb (bb)) == stmt);
   /* First look at all the outgoing edges.  Delete any outgoing
 edges which do not go to the right block.  For the one
 edge which goes to the right block, fix up its flags.  */
@@ -242,9 +246,15 @@ cleanup_control_flow_bb (basic_block bb)
   /* Check for indirect calls that have been turned into
  noreturn calls.  */
   else if (is_gimple_call (stmt)
-   && gimple_call_noreturn_p (stmt)
-   && remove_fallthru_edge (bb->succs))
-retval = true;
+  && gimple_call_noreturn_p (stmt))
+{
+  /* If there are debug stmts after the noreturn call, remove them
+now, they should be all unreachable anyway.  */
+  for (gsi_next (&gsi); !gsi_end_p (gsi); )
+   gsi_remove (&gsi, true);
+  if (remove_fallthru_edge (bb->succs))
+   retval = true;
+}
 
   return retval;
 }
--- gcc/testsuite/gcc.dg/pr66688.c.jj   2015-12-14 14:51:43.652481658 +0100
+++ gcc/testsuite/gcc.dg/pr66688.c  2015-12-14 14:51:05.0 +0100
@@ -0,0 +1,39 @@
+/* PR tree-optimization/66688 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-reorder-blocks -fcompare-debug" } */
+
+struct fdt_header { unsigned magic; } *a;
+
+int d;
+
+int
+__fswab32 (int p1)
+{
+  return __builtin_bswap32 (p1);
+}
+
+void
+fdt_set_magic (int p1)
+{
+  struct fdt_header *b = a;
+  b->magic = __builtin_constant_p (p1) ? : __fswab32 (p1);
+}
+
+int
+_fdt_sw_check_header ()
+{
+  int c = ((struct fdt_header *) 1)->magic;
+  if (c)
+return 1;
+  return 0;
+}
+
+int
+fdt_finish ()
+{
+  if (_fdt_sw_check_header ())
+if (d)
+  return 0;
+  fdt_set_magic (0);
+  return 0;
+}

Jakub


Re: PING^1: [PATCH] Add TYPE_EMPTY_RECORD for C++ empty class

2015-12-14 Thread Jason Merrill

On 12/12/2015 01:42 PM, Marc Glisse wrote:

On Sat, 12 Dec 2015, Jakub Jelinek wrote:


On Sat, Dec 12, 2015 at 09:51:23AM -0500, Jason Merrill wrote:

On 12/11/2015 06:52 PM, H.J. Lu wrote:

On Thu, Dec 10, 2015 at 3:24 AM, Richard Biener
 wrote:

On Wed, Dec 9, 2015 at 10:31 PM, Markus Trippelsdorf
 wrote:

On 2015.12.09 at 10:53 -0800, H.J. Lu wrote:


Empty C++ class is a corner case which isn't covered in psABI nor
C++ ABI.
There is no mention of "empty record" in GCC documentation.  But
there are
plenty of "empty class" in gcc/cp.  This change affects all
targets.  C++ ABI
should specify how it should be passed.



About this patch, aren't we supposed to enable new C++ ABIs with
-fabi-version=42 (or whatever the next number is)?


Yes, the patch should definitely make this conditional on abi_version_at_least.



There is a C++ ABI mailinglist, where you could discuss this issue:
http://sourcerytools.com/cgi-bin/mailman/listinfo/cxx-abi-dev


Yep.  As long as the ABI doesn't state how to pass those I'd rather
_not_ change GCCs way.


It is agreed that GCC is wrong on this:

http://sourcerytools.com/pipermail/cxx-abi-dev/2015-December/002876.html



Yes, I think this is just a (nasty) bug on some GCC targets.


Well, the argument in that thread is weird, because C and C++ empty structs
are different, so it isn't surprising they are passed differently.
C++ makes those sizeof == 1, while C has them sizeof == 0.


Maybe it isn't surprising, but it isn't particularly helpful either. It
increases the number of places where the 2 are incompatible.
(I personally don't care about empty C structs)


Yep.  The C standard doesn't have empty structs; it's a GNU extension.
But in any case argument passing can be compatible between C and C++, so
it really should be.


Jason
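
The size difference mentioned in this thread is easy to observe.  The
sketch below uses hypothetical names (demo_empty, demo_empty_size) and
relies on the GNU extension that allows an empty struct in C; compiled
as GNU C the size is 0, while the same declaration compiled as C++
has size 1.

```c
#include <stddef.h>

/* An empty struct: a GNU extension in C, not allowed by ISO C.  */
struct demo_empty
{
};

/* Report the size the compiler assigns to the empty struct.  */
size_t
demo_empty_size (void)
{
  return sizeof (struct demo_empty);
}
```

Whatever the size, the thread's point is that the calling convention
for such a type can still be made to agree between the two languages.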



Re: Ping [PATCH] c++/42121 - diagnose invalid flexible array members

2015-12-14 Thread Jason Merrill

On 12/14/2015 11:45 AM, Martin Sebor wrote:

+  if (NULL_TREE == size)


Usually NULL_TREE goes on the right.


@@ -8744,6 +8748,7 @@ compute_array_index_type (tree name, tree size, tsubst_flags_t complain)
  else
pedwarn (input_location, OPT_Wpedantic, "ISO C++ forbids zero-size array");
}
+
 }


Unnecessary blank line.


+   if (TREE_CODE (type) != ARRAY_TYPE
+   || !COMPLETE_TYPE_P (TREE_TYPE (type)))
+ {
+
if (unqualified_id)


Here too.

OK with those changes and the one Jakub suggested.

Jason



Re: [BUILDROBOT] "error: null argument where non-null required" on multiple targets

2015-12-14 Thread Jan-Benedict Glaw
On Mon, 2015-12-14 18:54:28 +, Moore, Catherine 
 wrote:
> > avr-rtems   
> > http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=478544
> > mipsel-elf  
> > http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=478844
> > mipsisa64r2-sde-elf 
> > http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=478855
> > mipsisa64sb1-elf
> > http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=478865
> > mips-rtems  
> > http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=478877
> > powerpc-eabialtivec 
> > http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=478922
> > powerpc-eabispe 
> > http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=478932
> > powerpc-rtems   
> > http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=478956
> > ppc-elf 
> > http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=478968
> > sh-superh-elf   
> > http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=479077
> 
> Is there an easy way to reproduce the MIPS problems that you
> reported?  I don't seem to be able to do it with a cross-compiler
> targeting mipsel-elf.

What's your build compiler? For these builds, where it showed up, I'm
using a freshly compiled HEAD/master version. So basically, compile a
current GCC for your build machine:

.../configure --prefix=/tmp/foo --disable-multilib \
--enable-languages=all,ada,go
make
make install

...and then put that compiler into your $PATH and build a cross compiler:

.../configure --target=mipsel-elf --enable-werror-always \
--enable-languages=all,ada,go
make all-gcc

(This is how contrib/config-list.mk does its test builds, which is
what I'm calling.)

Regards, JBG

-- 
  Jan-Benedict Glaw  jbg...@lug-owl.de  +49-172-7608481
Signature of:  Everything should be made as simple as possible.
the second  :  But not simpler.  (Einstein)




[PATCH] Fix -fcompare-debug issue in cross-jumping (PR rtl-optimization/65980)

2015-12-14 Thread Jakub Jelinek
Hi!

rtx_renumbered_equal_p considers two LABEL_REFs equivalent if they
have the same next_real_insn, unfortunately next_real_insn doesn't ignore
debug insns.  It ignores BARRIERs/JUMP_TABLE_DATA insns too, which is IMHO
not desirable either, so this patch uses next_nonnote_nondebug_insn instead
(which stops at CODE_LABEL) and keeps iterating if CODE_LABELs are found.
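
The comparison described above can be sketched on a toy instruction
chain.  The types and functions below are hypothetical stand-ins for
the RTL ones: two labels count as equivalent when skipping notes,
debug insns and further labels leads to the same insn.

```c
#include <stddef.h>

/* Hypothetical insn kinds for the toy chain.  */
enum demo_kind
{
  DEMO_INSN,
  DEMO_NOTE,
  DEMO_DEBUG,
  DEMO_LABEL
};

struct demo_insn
{
  enum demo_kind kind;
  struct demo_insn *next;
};

/* First insn after I that is neither a note nor a debug insn.  The
   result may be a label, or NULL at the end of the chain.  */
struct demo_insn *
demo_next_nonnote_nondebug (struct demo_insn *i)
{
  for (i = i->next; i; i = i->next)
    if (i->kind != DEMO_NOTE && i->kind != DEMO_DEBUG)
      break;
  return i;
}

/* Position a label refers to: keep iterating past further labels.  */
struct demo_insn *
demo_label_target (struct demo_insn *label)
{
  struct demo_insn *i = demo_next_nonnote_nondebug (label);
  while (i && i->kind == DEMO_LABEL)
    i = demo_next_nonnote_nondebug (i);
  return i;
}

/* Nonzero if labels X and Y point at the same position.  */
int
demo_labels_equivalent (struct demo_insn *x, struct demo_insn *y)
{
  return demo_label_target (x) == demo_label_target (y);
}
```

Because debug insns are skipped in both walks, inserting or removing
them between the labels does not change the answer.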

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2015-12-14  Jakub Jelinek  

PR rtl-optimization/65980
* jump.c (rtx_renumbered_equal_p) : Use
next_nonnote_nondebug_insn instead of next_real_insn and
skip over CODE_LABELs too.

* gcc.dg/pr65980.c: New test.

--- gcc/jump.c.jj   2015-11-04 11:12:18.0 +0100
+++ gcc/jump.c  2015-12-14 17:17:08.859741360 +0100
@@ -1802,8 +1802,16 @@ rtx_renumbered_equal_p (const_rtx x, con
 
   /* Two label-refs are equivalent if they point at labels
 in the same position in the instruction stream.  */
-  return (next_real_insn (LABEL_REF_LABEL (x))
- == next_real_insn (LABEL_REF_LABEL (y)));
+  else
+   {
+ rtx_insn *xi = next_nonnote_nondebug_insn (LABEL_REF_LABEL (x));
+ rtx_insn *yi = next_nonnote_nondebug_insn (LABEL_REF_LABEL (y));
+ while (xi && LABEL_P (xi))
+   xi = next_nonnote_nondebug_insn (xi);
+ while (yi && LABEL_P (yi))
+   yi = next_nonnote_nondebug_insn (yi);
+ return xi == yi;
+   }
 
 case SYMBOL_REF:
   return XSTR (x, 0) == XSTR (y, 0);
--- gcc/testsuite/gcc.dg/pr65980.c.jj   2015-12-14 17:07:54.398479666 +0100
+++ gcc/testsuite/gcc.dg/pr65980.c  2015-12-14 17:08:32.616950620 +0100
@@ -0,0 +1,30 @@
+/* PR rtl-optimization/65980 */
+/* { dg-do compile } */
+/* { dg-options "-O3 -fcompare-debug" } */
+
+typedef struct { int b; } A;
+void (*a) (int);
+int b;
+
+int
+foo (A *v)
+{
+  asm goto ("" : : "m" (v->b) : : l);
+  return 0;
+l:
+  return 1;
+}
+
+int
+bar (void)
+{
+  if (b)
+{
+  if (foo (0) && a)
+   a (0);
+  return 0;
+}
+  if (foo (0) && a)
+a (0);
+  return 0;
+}

Jakub


Re: [PATCH] Fix -fcompare-debug issue in tree-cfgcleanup (PR tree-optimization/66688)

2015-12-14 Thread Richard Biener
On December 14, 2015 9:11:39 PM GMT+01:00, Jakub Jelinek wrote:
>Hi!
>
>As the testcase used to show on ppc64le with slightly older trunk,
>cleanup_control_flow_bb can be called on a bb with newly noreturn
>call followed by debug stmts.  With -g0, cleanup_control_flow_bb
>removes the fallthru edge, so we need to do it even if followed by debug
>stmts.
>
>This patch is one possible way to fix this, another one is attached to the
>PR.
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.

>2015-12-14  Jakub Jelinek  
>
>   PR tree-optimization/66688
>   * tree-cfgcleanup.c (cleanup_control_flow_bb): Handle
>   noreturn call followed only by debug stmts by removing
>   the debug stmts and handling it the same as if the noreturn
>   call is the last stmt.
>
>   * gcc.dg/pr66688.c: New test.
>
>--- gcc/tree-cfgcleanup.c.jj   2015-12-02 20:26:59.0 +0100
>+++ gcc/tree-cfgcleanup.c  2015-12-14 17:34:10.748487811 +0100
>@@ -186,7 +186,7 @@ cleanup_control_flow_bb (basic_block bb)
>  we need to prune cfg.  */
>   retval |= gimple_purge_dead_eh_edges (bb);
> 
>-  gsi = gsi_last_bb (bb);
>+  gsi = gsi_last_nondebug_bb (bb);
>   if (gsi_end_p (gsi))
> return retval;
> 
>@@ -197,7 +197,10 @@ cleanup_control_flow_bb (basic_block bb)
> 
>   if (gimple_code (stmt) == GIMPLE_COND
>   || gimple_code (stmt) == GIMPLE_SWITCH)
>-retval |= cleanup_control_expr_graph (bb, gsi);
>+{
>+  gcc_checking_assert (gsi_stmt (gsi_last_bb (bb)) == stmt);
>+  retval |= cleanup_control_expr_graph (bb, gsi);
>+}
>   else if (gimple_code (stmt) == GIMPLE_GOTO
>  && TREE_CODE (gimple_goto_dest (stmt)) == ADDR_EXPR
>  && (TREE_CODE (TREE_OPERAND (gimple_goto_dest (stmt), 0))
>@@ -210,6 +213,7 @@ cleanup_control_flow_bb (basic_block bb)
>   edge_iterator ei;
>   basic_block target_block;
> 
>+  gcc_checking_assert (gsi_stmt (gsi_last_bb (bb)) == stmt);
>   /* First look at all the outgoing edges.  Delete any outgoing
>edges which do not go to the right block.  For the one
>edge which goes to the right block, fix up its flags.  */
>@@ -242,9 +246,15 @@ cleanup_control_flow_bb (basic_block bb)
>   /* Check for indirect calls that have been turned into
>  noreturn calls.  */
>   else if (is_gimple_call (stmt)
>-   && gimple_call_noreturn_p (stmt)
>-   && remove_fallthru_edge (bb->succs))
>-retval = true;
>+ && gimple_call_noreturn_p (stmt))
>+{
>+  /* If there are debug stmts after the noreturn call, remove them
>+   now, they should be all unreachable anyway.  */
>+  for (gsi_next (&gsi); !gsi_end_p (gsi); )
>+  gsi_remove (&gsi, true);
>+  if (remove_fallthru_edge (bb->succs))
>+  retval = true;
>+}
> 
>   return retval;
> }
>--- gcc/testsuite/gcc.dg/pr66688.c.jj  2015-12-14 14:51:43.652481658 +0100
>+++ gcc/testsuite/gcc.dg/pr66688.c 2015-12-14 14:51:05.0 +0100
>@@ -0,0 +1,39 @@
>+/* PR tree-optimization/66688 */
>+/* { dg-do compile } */
>+/* { dg-options "-O2 -fno-reorder-blocks -fcompare-debug" } */
>+
>+struct fdt_header { unsigned magic; } *a;
>+
>+int d;
>+
>+int
>+__fswab32 (int p1)
>+{
>+  return __builtin_bswap32 (p1);
>+}
>+
>+void
>+fdt_set_magic (int p1)
>+{
>+  struct fdt_header *b = a;
>+  b->magic = __builtin_constant_p (p1) ? : __fswab32 (p1);
>+}
>+
>+int
>+_fdt_sw_check_header ()
>+{
>+  int c = ((struct fdt_header *) 1)->magic;
>+  if (c)
>+return 1;
>+  return 0;
>+}
>+
>+int
>+fdt_finish ()
>+{
>+  if (_fdt_sw_check_header ())
>+if (d)
>+  return 0;
>+  fdt_set_magic (0);
>+  return 0;
>+}
>
>   Jakub
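
To make the -fcompare-debug point concrete, here is a minimal sketch of the idea in the patch above. It is a toy model, not GCC code: the `Kind`, `Block`, `last_nondebug` and `cleanup_block` names are hypothetical, and a block is reduced to a vector of statement kinds plus a fallthru flag. The only property it tries to show is that a block ending in a noreturn call should be cleaned up identically whether or not debug statements trail the call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical statement kinds; real GIMPLE codes are far richer.
enum class Kind { Assign, Debug, NoreturnCall };

// Toy basic block: a statement list plus one fallthru edge flag.
struct Block {
  std::vector<Kind> stmts;
  bool has_fallthru_edge = true;
};

// Index of the last statement that is not a debug statement, or -1 if
// the block is empty or holds only debug statements.
int last_nondebug (const Block &bb)
{
  for (int i = static_cast<int> (bb.stmts.size ()) - 1; i >= 0; --i)
    if (bb.stmts[static_cast<std::size_t> (i)] != Kind::Debug)
      return i;
  return -1;
}

// If the last non-debug statement is a noreturn call, drop any debug
// statements after it and remove the fallthru edge.  Returns true if an
// edge was removed, so -g and -g0 runs report the same result.
bool cleanup_block (Block &bb)
{
  int i = last_nondebug (bb);
  if (i < 0 || bb.stmts[static_cast<std::size_t> (i)] != Kind::NoreturnCall)
    return false;

  // Everything after the noreturn call is unreachable debug info.
  bb.stmts.resize (static_cast<std::size_t> (i) + 1);
  assert (bb.stmts.back () == Kind::NoreturnCall);

  bool removed = bb.has_fallthru_edge;
  bb.has_fallthru_edge = false;
  return removed;
}
```

With this model, a block compiled "with -g" (trailing debug statements) and the same block compiled "with -g0" end up in the same state after cleanup, which is the invariant -fcompare-debug checks.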




RE: [PATCH][ARC] Refurbish emitting DWARF2 for epilogue.

2015-12-14 Thread Claudiu Zissulescu
Patches submitted.

Thanks,
Claudiu

> -Original Message-
> From: Joern Wolfgang Rennecke [mailto:g...@amylaar.uk]
> Sent: Saturday, December 12, 2015 3:49 PM
> To: Claudiu Zissulescu; gcc-patches@gcc.gnu.org
> Cc: francois.bed...@synopsys.com; jeremy.benn...@embecosm.com
> Subject: Re: [PATCH][ARC] Refurbish emitting DWARF2 for epilogue.
> 
> 
> 
> On 11/12/15 10:29, Claudiu Zissulescu wrote:
> > I did some testing here. For size, I used the CSiBE testbench, and for
> > speed, I used coremark and dhrystone. Using a blockage or not doesn't
> > affect the size or speed figures. However, using the
> > TARGET_NO_SPECULATION_IN_DELAY_SLOTS_P hook improves the size figures
> > (not much, just .1%), and improves coremark by 2% and Dhrystone by 1%.
> >
> > Hence, in the light of the new figures, I favor the above two-patch
> > solution. Both patches are checked using dg.exp and compile.exp. Ok to
> > submit?
> Both patches are OK.


Re: [Patch AArch64] Reinstate CANNOT_CHANGE_MODE_CLASS to fix pr67609

2015-12-14 Thread James Greenhalgh
On Wed, Dec 09, 2015 at 01:13:20PM +, Marcus Shawcroft wrote:
> On 27 November 2015 at 13:01, James Greenhalgh wrote:
> 
> > 2015-11-27  James Greenhalgh  
> >
> > * config/aarch64/aarch64-protos.h
> > (aarch64_cannot_change_mode_class): Bring back.
> > * config/aarch64/aarch64.c
> > (aarch64_cannot_change_mode_class): Likewise.
> > * config/aarch64/aarch64.h (CANNOT_CHANGE_MODE_CLASS): Likewise.
> > * config/aarch64/aarch64.md (aarch64_movdi_low): Use
> > zero_extract rather than truncate.
> > (aarch64_movdi_high): Likewise.
> >
> > 2015-11-27  James Greenhalgh  
> >
> > * gcc.dg/torture/pr67609.c: New.
> >
> 
> + detailed dicussion.  In all other cases, we want to be premissive
> 
> s/premissive/permissive/
> 
> OK /Marcus

Thanks.

This has had a week or so to soak on trunk now, is it OK to backport to GCC
5 and 4.9?

The patch applies as-good-as clean, with only a little bit to fix up in
aarch64-protos.h to keep alphabetical order, and I've bootstrapped and tested
the backports with no issue.

Cheers,
James



Re: [build] Only support -gstabs on Mac OS X if assember supports it (PR target/67973)

2015-12-14 Thread Iain Sandoe

> On 14 Dec 2015, at 11:21, Rainer Orth  wrote:
> 
> Hi Iain,
> 
>>> On 14 Dec 2015, at 11:13, Rainer Orth  wrote:
>>> 
>>>>> However, I'm not really comfortable with this solution.  Initially, I
>>>>> forgot to wrap the -Q option to as in %{gstabs*:...}, which led to a
>>>>> bootstrap failure: the gas- and LLVM-based assemblers differ in a
>>>>> number of other ways, as can be seen when comparing gcc/auto-host.h:
>>>> 
>>>> FAOD, 
>>>> the changes below only occur if you omit the guard on “-Q” ?
>>>> or they are present always?
>>> 
>>> they are from previous builds, one with the LLVM-based /usr/bin/as, the
>>> other configure with --with-as=/vol/gcc/bin/as-6.4 (gas-based as from
>>> Xcode 6.4).
>> 
>> Hrm, this needs more investigation, and will affect 10.10 too, since xc7 is
>> the default there.
>> (separate issue, let’s start a new PR, or at least a new thread).
> 
> right, but it's only an issue if you switch assemblers (or linkers) used
> by gcc without rebuilding.  This has never been safe on any platform.

The issue that worries me is that the new assembler supports .cfi_xxx (YAY!), 
but the Darwin port is not 100% ready for it yet (BOO!) (I have patches, and 
expect to make them available for folks to try in the next ~ 2 weeks).  
However, still not sure that they would exactly be stage3 stuff.

Did you say that bootstrap fails if -Q is jammed in everywhere?
(that would be a short-term safety net, falling back to the cctools assembler).

Iain



Fix omnetpp miscompilation

2015-12-14 Thread Jan Hubicka
Hi,
this patch is a not-so-nice fix for a quite nasty problem with devirtualization and
decl merging.  Once we determine the call target, we use
possible_polymorphic_call_target_p to check if this is type consistent and if
not we redirect to builtin_unreachable and do not account this as a hint
to the inlining/cloning.

The testing is done by collecting all polymorphic targets of a given virtual call
and checking that the determined target is in the list.  What happens in
omnetpp is that it has external vtables which do not have symbol nodes
associated with them (as they are not used by the code, only by the devirt
machinery). Consequently they are not seen by lto-symtab.c and are kept unmerged.
Now ipa-cp has a different copy of the same vtable than the one used by
ipa-devirt, and each copy holds different unmerged external function decls.
Since ipa-cp pulls out one decl and possible_polymorphic_call_target_p
has the other decl and we do not have the alias link between them, we
incorrectly think the function is not in the list.

The patch makes the virtuals to be always merged, so ODR rule is represented
correctly. I am also working on a patch that makes the transparent alias links
for duplicated decls at node creation time that will also fix the wrong code,
but this patch also improves the hitrate of devirtualization code because
it now can get to the vtable constructors more often (on Firefox from 800
to 3900 devirtualizations), so this patch still makes sense.
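
As an illustration of why unmerged decls break the membership test, here is a small sketch. It is not the lto-symtab.c implementation: `Decl`, `PrevailingTable` and `in_target_list` are hypothetical names, and a decl is reduced to its assembler name. The assumption it models is only that two decls for the same function compare by pointer identity unless both are first mapped to one prevailing decl.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a declaration; only the assembler name matters.
struct Decl { std::string asm_name; };

// Maps each assembler name to one prevailing declaration.  The first
// decl seen for a name prevails (an assumption of this sketch).
class PrevailingTable {
  std::map<std::string, const Decl *> prevailing_;
public:
  const Decl *prevail (const Decl *d)
  {
    assert (d != nullptr);
    return prevailing_.emplace (d->asm_name, d).first->second;
  }
};

// Membership test by pointer identity, as a target-list check would do.
bool in_target_list (const std::vector<const Decl *> &targets, const Decl *d)
{
  for (const Decl *t : targets)
    if (t == d)
      return true;
  return false;
}
```

Two unmerged copies of the same function fail the identity check even though they name the same symbol; once both sides go through `prevail`, the check succeeds.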

An alternative solution I will try again (probably only next stage1) is to
record all those vtables into the symbol table at streaming time. I did that in
the past but the extra streaming overhead did not seem to pay back for the extra
devirtualizations achieved.

Note that the bug to some degree exists on the release branches too.  These
still won't merge declarations that have no symbol tables at all.  I will try
to make a testcase.

Bootstrapped/regtested x86_64-linux, will commit it after re-testing at Firefox
and libreoffice.

Honza
PR lto/68878
* lto-symtab.c (lto_symtab_prevailing_virtual_decl): New function.
* lto-symtab.h (lto_symtab_prevailing_virtual_decl): Declare.
(lto_symtab_prevailing_decl): Use it.
Index: lto/lto-symtab.c
===
--- lto/lto-symtab.c(revision 231581)
+++ lto/lto-symtab.c(working copy)
@@ -968,3 +975,33 @@ lto_symtab_merge_symbols (void)
}
 }
 }
+
+/* Virtual tables may matter for code generation even if they are not
+   directly referenced by the code because they may be used for devirtualization.
+   For this reason it is important to merge even virtual tables that have no
+   associated symbol table entries.  Without doing so we lose optimization
+   opportunities by losing track of the vtable constructor.
+   FIXME: we probably ought to introduce explicit symbol table entries for
+   those before streaming.  */
+
+tree
+lto_symtab_prevailing_virtual_decl (tree decl)
+{
+  gcc_checking_assert (!type_in_anonymous_namespace_p (DECL_CONTEXT (decl))
+  && DECL_ASSEMBLER_NAME_SET_P (decl));
+
+  symtab_node *n = symtab_node::get_for_asmname
+(DECL_ASSEMBLER_NAME (decl));
+  while (n && ((!DECL_EXTERNAL (n->decl) && !TREE_PUBLIC (n->decl))
+  || !DECL_VIRTUAL_P (n->decl)))
+n = n->next_sharing_asm_name;
+  if (n)
+{
+  lto_symtab_prevail_decl (n->decl, decl);
+  decl = n->decl;
+}
+  else
+symtab_node::get_create (decl);
+
+  return decl;
+}
Index: lto/lto-symtab.h
===
--- lto/lto-symtab.h(revision 231581)
+++ lto/lto-symtab.h(working copy)
@@ -20,6 +20,7 @@ along with GCC; see the file COPYING3.
 extern void lto_symtab_merge_decls (void);
 extern void lto_symtab_merge_symbols (void);
 extern tree lto_symtab_prevailing_decl (tree decl);
+extern tree lto_symtab_prevailing_virtual_decl (tree decl);
 
 /* Mark DECL to be previailed by PREVAILING.
Use DECL_ABSTRACT_ORIGIN and DECL_CHAIN as special markers; those do not
@@ -31,6 +32,7 @@ inline void
 lto_symtab_prevail_decl (tree prevailing, tree decl)
 {
   gcc_checking_assert (DECL_ABSTRACT_ORIGIN (decl) != error_mark_node);
+  gcc_assert (TREE_PUBLIC (decl) || DECL_EXTERNAL (decl));
   DECL_CHAIN (decl) = prevailing;
   DECL_ABSTRACT_ORIGIN (decl) = error_mark_node;
 }
@@ -43,5 +45,12 @@ lto_symtab_prevailing_decl (tree decl)
   if (DECL_ABSTRACT_ORIGIN (decl) == error_mark_node)
 return DECL_CHAIN (decl);
   else
-return decl;
+{
+  if ((TREE_CODE (decl) == VAR_DECL || TREE_CODE (decl) == FUNCTION_DECL)
+ && DECL_VIRTUAL_P (decl)
+ && (TREE_PUBLIC (decl) || DECL_EXTERNAL (decl))
+ && !symtab_node::get (decl))
+   return lto_symtab_prevailing_virtual_decl (decl);
+  return decl;
+}
 }


Re: [PR 68851] Do not collect thunks in callect_callers

2015-12-14 Thread Jan Hubicka
> Hi,
> 
> in PR 68851, IPA-CP decides to clone for all known contexts, even when
> it is not local because the code is not supposed to grow anyway.  The
> code doing that uses collect_callers method of cgraph_edge to find all
> the edges which are to be redirected in such case.  However, there is
> also an edge from a thunk to the cloned node and that gets collected
> and redirected too.  Later on, this inconsistency (a thunk calling a
> wrong node) leads to an assert in comdat handling, but it can lead to
> all sorts of trouble.
> 
> The following patch fixes it by checking that thunks are not added
> into the vector in that method (which is only used by IPA-CP at this
> one spot and IPA-SRA so it should be fine).  Bootstrapped and tested
> on x86_64-linux.  OK for trunk?  And perhaps for the gcc-5 branch too?
> 
> Thanks,
> 
> Martin
> 
> 
> 2015-12-14  Martin Jambor  
> 
>   PR ipa/68851
>   * cgraph.c (collect_callers_of_node_1): Do not collect thunks.
>   * cgraph.h (cgraph_node): Change comment of collect_callers.
> 
> testsuite/
>   * g++.dg/ipa/pr68851.C: New test.

This is OK (for branches too if it won't cause issues for a week)
thanks!
Honza


Re: [Fortran, Patch] Memory sync after coarray image control statements and assignment

2015-12-14 Thread Tobias Burnus

Alessandro Fanfarillo wrote:

In attachment the patch for gcc5-branch.


Committed as Rev. 231626.

Tobias


2015-12-10 10:03 GMT+01:00 Tobias Burnus :

Hi Alessandro (off list),

On Thu, Dec 10, 2015 at 09:44:16AM +0100, Alessandro Fanfarillo wrote:

Yes, the patch should be applied to GCC 5 too.

Can you create a patch? Requires a rediff plus removing the bits which
do not exist on GCC 5 - like events.


Index: gcc/fortran/ChangeLog
===
--- gcc/fortran/ChangeLog	(Revision 231625)
+++ gcc/fortran/ChangeLog	(Arbeitskopie)
@@ -1,3 +1,19 @@
+2015-12-09  Tobias Burnus  
+	Alessandro Fanfarillo 
+
+	Backport from mainline.
+	2015-12-09  Tobias Burnus  
+	Alessandro Fanfarillo 
+
+	* trans.c (gfc_allocate_using_lib,gfc_deallocate_with_status):
+	Introducing __asm__ __volatile__ ("":::"memory")
+	after image control statements.
+	* trans-stmt.c 	(gfc_trans_sync, gfc_trans_event_post_wait,
+	gfc_trans_lock_unlock, gfc_trans_critical): Ditto.
+	* trans-intrinsic.c (gfc_conv_intrinsic_caf_get,
+	conv_caf_send): Introducing __asm__ __volatile__ ("":::"memory")
+	after send, before get and around sendget.
+
 2015-12-04  Release Manager
 
 	* GCC 5.3.0 released.
Index: gcc/fortran/trans-intrinsic.c
===
--- gcc/fortran/trans-intrinsic.c	(Revision 231625)
+++ gcc/fortran/trans-intrinsic.c	(Arbeitskopie)
@@ -1221,12 +1221,22 @@
   /* No overlap possible as we have generated a temporary.  */
   if (lhs == NULL_TREE)
 may_require_tmp = boolean_false_node;
+  
+  /* It guarantees memory consistency within the same segment */
+  tmp = gfc_build_string_const (strlen ("memory")+1, "memory"),
+tmp = build5_loc (input_location, ASM_EXPR, void_type_node,
+		  gfc_build_string_const (1, ""),
+		  NULL_TREE, NULL_TREE,
+		  tree_cons (NULL_TREE, tmp, NULL_TREE),
+		  NULL_TREE);
+  ASM_VOLATILE_P (tmp) = 1;
+  gfc_add_expr_to_block (&se->pre, tmp);
 
   tmp = build_call_expr_loc (input_location, gfor_fndecl_caf_get, 9,
 			 token, offset, image_index, argse.expr, vec,
 			 dst_var, kind, lhs_kind, may_require_tmp);
   gfc_add_expr_to_block (&se->pre, tmp);
-
+  
   if (se->ss)
 gfc_advance_se_ss_chain (se);
 
@@ -1386,6 +1396,16 @@
 {
   tree rhs_token, rhs_offset, rhs_image_index;
 
+  /* It guarantees memory consistency within the same segment */
+  tmp = gfc_build_string_const (strlen ("memory")+1, "memory"),
+	tmp = build5_loc (input_location, ASM_EXPR, void_type_node,
+			  gfc_build_string_const (1, ""),
+			  NULL_TREE, NULL_TREE,
+			  tree_cons (NULL_TREE, tmp, NULL_TREE),
+			  NULL_TREE);
+  ASM_VOLATILE_P (tmp) = 1;
+  gfc_add_expr_to_block (&block, tmp);
+
   caf_decl = gfc_get_tree_for_caf_expr (rhs_expr);
   if (TREE_CODE (TREE_TYPE (caf_decl)) == REFERENCE_TYPE)
 	caf_decl = build_fold_indirect_ref_loc (input_location, caf_decl);
@@ -1401,6 +1421,17 @@
   gfc_add_expr_to_block (&block, tmp);
   gfc_add_block_to_block (&block, &lhs_se.post);
   gfc_add_block_to_block (&block, &rhs_se.post);
+
+  /* It guarantees memory consistency within the same segment */
+  tmp = gfc_build_string_const (strlen ("memory")+1, "memory"),
+tmp = build5_loc (input_location, ASM_EXPR, void_type_node,
+		  gfc_build_string_const (1, ""),
+		  NULL_TREE, NULL_TREE,
+		  tree_cons (NULL_TREE, tmp, NULL_TREE),
+		  NULL_TREE);
+  ASM_VOLATILE_P (tmp) = 1;
+  gfc_add_expr_to_block (&block, tmp);
+
   return gfc_finish_block (&block);
 }
 
Index: gcc/fortran/trans-stmt.c
===
--- gcc/fortran/trans-stmt.c	(Revision 231625)
+++ gcc/fortran/trans-stmt.c	(Arbeitskopie)
@@ -829,6 +829,17 @@
    errmsg, errmsg_len);
   gfc_add_expr_to_block (, tmp);
 
+  /* It guarantees memory consistency within the same segment */
+  tmp = gfc_build_string_const (strlen ("memory")+1, "memory"),
+	tmp = build5_loc (input_location, ASM_EXPR, void_type_node,
+			  gfc_build_string_const (1, ""),
+			  NULL_TREE, NULL_TREE,
+			  tree_cons (NULL_TREE, tmp, NULL_TREE),
+			  NULL_TREE);
+  ASM_VOLATILE_P (tmp) = 1;
+
+  gfc_add_expr_to_block (, tmp);
+
   if (stat2 != NULL_TREE)
 	gfc_add_modify (, stat2,
 			fold_convert (TREE_TYPE (stat2), stat));
@@ -931,6 +942,20 @@
 			   fold_convert (integer_type_node, images));
 }
 
+  /* Per F2008, 8.5.1, a SYNC MEMORY is implied by calling the
+ image control statements SYNC IMAGES and SYNC ALL.  */
+  if (flag_coarray == GFC_FCOARRAY_LIB)
+{
+  tmp = gfc_build_string_const (strlen ("memory")+1, "memory"),
+	tmp = build5_loc (input_location, ASM_EXPR, void_type_node,
+			  gfc_build_string_const (1, ""),
+			  NULL_TREE, NULL_TREE,
+			  tree_cons (NULL_TREE, tmp, NULL_TREE),
+			  NULL_TREE);
+  ASM_VOLATILE_P (tmp) = 1;
+ 

[PTX] parameters and return values

2015-12-14 Thread Nathan Sidwell
This patch further cleans up the parameter passing and return machinery.  Now 
both the PTX prototype emission and the regular gcc hooks use the same 
underlying functions (or the former uses the gcc hooks directly).  There were a 
few inconsistencies with promotion of QI & HI mode registers -- for instance, 
PROMOTE_MODE promoted them, but the parameter passing and return didn't always 
appear to do that.  This changes things to consistently always promote, which 
apart from being simpler, is more in keeping with C expectations.  PARM_BOUNDARY 
was set at 1 byte, and nvptx_function_arg_boundary did some rather funky 
calculations, again resolved by setting  PARM boundary to 4 bytes and removing 
the special boundary handling.
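
A rough sketch of the two rules described above, under stated assumptions: `promoted_size` and `align_up_bits` are hypothetical helpers, not functions from nvptx.c, and sizes stand in for machine modes (1 byte for QImode, 2 for HImode, 4 for SImode). The sketch only shows "always promote sub-word integers to 4 bytes" and "round parameter offsets up to a 32-bit PARM_BOUNDARY".

```cpp
#include <cassert>

// Assumed parameter boundary in bits, matching the new PARM_BOUNDARY.
constexpr unsigned parm_boundary_bits = 32;

// Promote an integer argument size (in bytes) to at least 4 bytes, so
// 1- and 2-byte values are always widened; larger sizes are unchanged.
constexpr unsigned promoted_size (unsigned bytes)
{
  return bytes < 4 ? 4 : bytes;
}

// Round an offset (in bits) up to the next multiple of the boundary.
constexpr unsigned align_up_bits (unsigned offset_bits,
                                  unsigned boundary_bits = parm_boundary_bits)
{
  return (offset_bits + boundary_bits - 1) / boundary_bits * boundary_bits;
}

// Compile-time sanity check of the simplest case.
static_assert (promoted_size (1) == 4, "QI-sized values promote to 4 bytes");
```

Having one promotion helper used by both the prototype writer and the calling-convention hooks is what keeps the emitted PTX prototype and the generated code in agreement.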


The parameter and return code was nearly unconditionally just using the mode 
to determine any promotion behavior -- except for some cases of checking for an 
aggregate type.  This checks the type more rigorously, to prevent passing more 
complex types (such as vectors) that happen to get a simple mode from being 
passed as the integer type the mode corresponds to.


Finally, figured out the C++ named return value case.  For some types returned 
by additional parameter, GCC may also return a pointer to that object in the 
regular return register.  Whether it does so is optimization-dependent.  This 
causes problems for PTX because it'll mean the PTX prototype would be 
optimization-dependent, which is clearly wrong.  AFAICT, because of the 
optimization-dependence, no actual code can make use of the returned pointer 
itself -- even in TUs containing both the caller and callee.  So of the two 
alternatives that occurred to me,  I chose the one that doesn't mention the 
returned pointer type, and inhibits the copy of the retval register to the 
(non-existent) param region.  (The other alternative was to always declare such 
a return parameter on functions that could be optimized to return one, and have 
it contain garbage  in the unoptimized case).


Added a bunch of C and C++ testcases.

nathan
2015-12-14  Nathan Sidwell  

	gcc/
	* config/nvptx/nvptx.h (PARM_BOUNDARY): Set to 32.
	* config/nvptx/nvptx.c (PASS_IN_REG_P, RETURN_IN_REG_P): Delete.
	(pass_in_memory, promote_arg, promote_return): New.
	(nvptx_function_arg_boundary): Delete.
	(nvptx_function_value): Use promote_return.
	(nvptx_pass_by_reference): Use pass_in_memory.
	(nvptx_return_in_memory): Use pass_in_memory.
	(nvptx_promote_function_mode): Use promote_arg.
	(write_arg): Adjust arg splitting logic.
	(write_return): Check and clear ret_reg_mode, if needed.
	(write_fn_proto, nvptx_declare_function_name): Adjust write_return
	calls.
	(TARGET_FUNCTION_ARG_BOUNDARY,
	TARGET_FUNCTION_ARG_ROUND_BOUNDARY): Don't override.

	gcc/testsuite/
	* g++.dg/abi/nvptx-nrv1.C: New.
	* g++.dg/abi/nvptx-ptrmem1.C: New.
	* gcc.target/nvptx/abi-complex-arg.c: New.
	* gcc.target/nvptx/abi-complex-ret.c: New.
	* gcc.target/nvptx/abi-enum-arg.c: New.
	* gcc.target/nvptx/abi-enum-ret.c: New.
	* gcc.target/nvptx/abi-knr-arg.c: New.
	* gcc.target/nvptx/abi-knr-ret.c: New.
	* gcc.target/nvptx/abi-scalar-arg.c: New.
	* gcc.target/nvptx/abi-scalar-ret.c: New.
	* gcc.target/nvptx/abi-struct-arg.c: New.
	* gcc.target/nvptx/abi-struct-ret.c: New.
	* gcc.target/nvptx/abi-vararg-1.c: New.
	* gcc.target/nvptx/abi-vararg-2.c: New.
	* gcc.target/nvptx/abi-vect-arg.c: New.
	* gcc.target/nvptx/abi-vect-ret.c: New.

Index: gcc/config/nvptx/nvptx.h
===
--- gcc/config/nvptx/nvptx.h	(revision 231624)
+++ gcc/config/nvptx/nvptx.h	(working copy)
@@ -46,7 +46,8 @@
 /* Chosen such that we won't have to deal with multi-word subregs.  */
 #define UNITS_PER_WORD 8
 
-#define PARM_BOUNDARY 8
+/* Alignments in bits.  */
+#define PARM_BOUNDARY 32
 #define STACK_BOUNDARY 64
 #define FUNCTION_BOUNDARY 32
 #define BIGGEST_ALIGNMENT 64
Index: gcc/config/nvptx/nvptx.c
===
--- gcc/config/nvptx/nvptx.c	(revision 231624)
+++ gcc/config/nvptx/nvptx.c	(working copy)
@@ -365,18 +365,6 @@ nvptx_emit_joining (unsigned mask, bool
 }
 }
 
-#define PASS_IN_REG_P(MODE, TYPE)\
-  ((GET_MODE_CLASS (MODE) == MODE_INT\
-|| GET_MODE_CLASS (MODE) == MODE_FLOAT			\
-|| ((GET_MODE_CLASS (MODE) == MODE_COMPLEX_INT		\
-	 || GET_MODE_CLASS (MODE) == MODE_COMPLEX_FLOAT)	\
-	&& !AGGREGATE_TYPE_P (TYPE)))\
-   && (MODE) != TImode)
-
-#define RETURN_IN_REG_P(MODE)			\
-  ((GET_MODE_CLASS (MODE) == MODE_INT		\
-|| GET_MODE_CLASS (MODE) == MODE_FLOAT)	\
-   && GET_MODE_SIZE (MODE) <= 8)
 
 /* Perform a mode promotion for a function argument with MODE.  Return
the promoted mode.  */
@@ -389,6 +377,61 @@ arg_promotion (machine_mode mode)
   return mode;
 }
 
+/* Determine whether MODE and TYPE (possibly NULL) should be passed or
+   returned in memory.  Integer and floating types supported by the
+   machine are passed in registers, 

[PR 68851] Do not collect thunks in callect_callers

2015-12-14 Thread Martin Jambor
Hi,

in PR 68851, IPA-CP decides to clone for all known contexts, even when
it is not local because the code is not supposed to grow anyway.  The
code doing that uses collect_callers method of cgraph_edge to find all
the edges which are to be redirected in such case.  However, there is
also an edge from a thunk to the cloned node and that gets collected
and redirected too.  Later on, this inconsistency (a thunk calling a
wrong node) leads to an assert in comdat handling, but it can lead to
all sorts of trouble.

The following patch fixes it by checking that thunks are not added
into the vector in that method (which is only used by IPA-CP at this
one spot and IPA-SRA so it should be fine).  Bootstrapped and tested
on x86_64-linux.  OK for trunk?  And perhaps for the gcc-5 branch too?
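
For readers unfamiliar with the call graph, the filter can be sketched as follows. This is a simplified model, not cgraph.c: `Node`, `Edge` and this `collect_callers` are hypothetical reductions that keep only the two flags the patch below tests.

```cpp
#include <cassert>
#include <vector>

// Hypothetical reductions of cgraph_node and cgraph_edge.
struct Node { bool is_thunk; };
struct Edge {
  const Node *caller;
  bool indirect_inlining_edge;
};

// Collect the caller edges that may be redirected to a clone: skip
// indirect-inlining edges and, per the fix, edges whose caller is a thunk.
std::vector<const Edge *> collect_callers (const std::vector<Edge> &callers)
{
  std::vector<const Edge *> redirectable;
  for (const Edge &e : callers)
    {
      assert (e.caller != nullptr);
      if (!e.indirect_inlining_edge && !e.caller->is_thunk)
        redirectable.push_back (&e);
    }
  return redirectable;
}
```

A thunk must keep calling the node it was created for, so leaving its edge out of the vector keeps the thunk and its target consistent after cloning.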

Thanks,

Martin


2015-12-14  Martin Jambor  

PR ipa/68851
* cgraph.c (collect_callers_of_node_1): Do not collect thunks.
* cgraph.h (cgraph_node): Change comment of collect_callers.

testsuite/
* g++.dg/ipa/pr68851.C: New test.
---
 gcc/cgraph.c   |  3 ++-
 gcc/cgraph.h   |  2 +-
 gcc/testsuite/g++.dg/ipa/pr68851.C | 29 +
 3 files changed, 32 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ipa/pr68851.C

diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index c8c3370..5a9c2a2 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -2592,7 +2592,8 @@ collect_callers_of_node_1 (cgraph_node *node, void *data)
 
   if (avail > AVAIL_INTERPOSABLE)
 for (cs = node->callers; cs != NULL; cs = cs->next_caller)
-  if (!cs->indirect_inlining_edge)
+  if (!cs->indirect_inlining_edge
+ && !cs->caller->thunk.thunk_p)
 redirect_callers->safe_push (cs);
   return false;
 }
diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 0a09391..ba14215 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -1070,7 +1070,7 @@ public:
   cgraph_edge *get_edge (gimple *call_stmt);
 
   /* Collect all callers of cgraph_node and its aliases that are known to lead
- to NODE (i.e. are not overwritable).  */
+ to NODE (i.e. are not overwritable) and that are not thunks.  */
   vec<cgraph_edge *> collect_callers (void);
 
   /* Remove all callers from the node.  */
diff --git a/gcc/testsuite/g++.dg/ipa/pr68851.C b/gcc/testsuite/g++.dg/ipa/pr68851.C
new file mode 100644
index 000..659e4cd
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ipa/pr68851.C
@@ -0,0 +1,29 @@
+// { dg-do compile }
+// { dg-options "-O3" }
+
+class A;
+class B {
+public:
+  operator A *() const;
+};
+class A {
+public:
+  virtual bool isFormControlElement() const {}
+};
+class C {
+  struct D {
+B element;
+  };
+  bool checkPseudoClass(const D &, int &) const;
+};
+class F {
+  virtual bool isFormControlElement() const;
+};
+class G : A, F {
+  bool isFormControlElement() const {}
+};
+bool C::checkPseudoClass(const D &p1, int &) const {
+  A &a = *p1.element;
+  a.isFormControlElement();
+  a.isFormControlElement() || a.isFormControlElement();
+}
-- 
2.6.3



Re: [PATCH] doc: discourage use of __attribute__((optimize())) in production code

2015-12-14 Thread Trevor Saunders
On Mon, Dec 14, 2015 at 05:40:57PM +0100, Markus Trippelsdorf wrote:
> On 2015.12.14 at 11:20 -0500, Trevor Saunders wrote:
> > On Mon, Dec 14, 2015 at 10:01:27AM +0100, Richard Biener wrote:
> > > On Sun, Dec 13, 2015 at 9:03 PM, Andi Kleen  wrote:
> > > > Markus Trippelsdorf  writes:
> > > >
> > > >> Many developers are still using __attribute__((optimize())) in
> > > >> production code, although it is quite broken.
> > > >
> > > > Who reads documentation? @) If you want to discourage it better warn once
> > > > at runtime.
> > > 
> > > We're also quite heavily using it in LTO internally now.
> > 
> > besides that does this really make sense?  I suspect very few people are
> > using this for the fun of it.  I'd guess most usage is to disable
> > optimizations to work around bugs, or maybe trying to get a very hot
> > function optimized more.  Either way I suspect it's only used by people
> > with good reason and this would just really irritate them.
> 
> Well, if you look at bugzilla you'll find several wrong code bugs caused
> by this attribute, e.g.: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59262

is that wrong code? it looks to me like somebody thinks it's a missed
optimization, and personally I'd say it's not a bug, just not fully
expected behavior.

> Also Richi stated in the past (I quote):
> »I consider the optimize attribute code seriously broken and
> unmaintained (but sometimes useful for debugging - and only that).«
> 
> https://gcc.gnu.org/ml/gcc/2012-07/msg00201.html

I'm certainly not recommending its use, and noting in docs that its
results can be surprising seems reasonable, but runtime warnings seem
likely to annoy users.

Trev

> 
> -- 
> Markus


Re: [Patch, libstdc++/68863] Let lookahead regex use captured contents

2015-12-14 Thread Tim Shen
On Mon, Dec 14, 2015 at 3:00 AM, Jonathan Wakely  wrote:
> I don't fully understand the patch, but it's OK for trunk, and if
> you're confident it's definitely correct and safe it's OK for the
> gcc-5 and gcc-4_9 branches too.
>
> Was it just completely wrong before, creating a vector of
> default-constructed match results, that were not matched?
>

Yes, that's the case. I'm not sure why I missed this. Perhaps all I
was focusing on was getting the captures in the lookahead sub-expression
out of it, so that the user can use them later; I didn't think about the
other way around.
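
For illustration, "the other way around" means a lookahead that refers back to a group captured before it. The snippet below is a minimal sketch of that case using the default ECMAScript grammar; the function name `starts_with_doubled_char` and the pattern are made up for this example and are not taken from the patch or its testcase.

```cpp
#include <cassert>
#include <regex>
#include <string>

// True if the first two characters of S are equal.  Group 1 captures the
// first character; the lookahead (?=\1) then has to see that captured
// content again at the next position without consuming it.
bool starts_with_doubled_char (const std::string &s)
{
  static const std::regex re ("(.)(?=\\1).*");
  std::smatch m;
  bool ok = std::regex_match (s, m, re);
  // On a successful match the capture is exactly one character long.
  assert (!ok || m[1].length () == 1);
  return ok;
}
```

If the lookahead started from default-constructed (unmatched) sub-matches, the back-reference `\1` inside it would not see the character captured by group 1, which is the behavior the patch is described as correcting.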

-- 
Regards,
Tim Shen


Re: [Patch, libstdc++/68863] Let lookahead regex use captured contents

2015-12-14 Thread Jonathan Wakely

On 14/12/15 09:58 -0800, Tim Shen wrote:

On Mon, Dec 14, 2015 at 3:00 AM, Jonathan Wakely  wrote:

I don't fully understand the patch, but it's OK for trunk, and if
you're confident it's definitely correct and safe it's OK for the
gcc-5 and gcc-4_9 branches too.

Was it just completely wrong before, creating a vector of
default-constructed match results, that were not matched?



Yes, that's the case. I'm not sure why I missed this. Perhaps all I
was focusing on was getting the captures in the lookahead sub-expression
out of it, so that the user can use them later; I didn't think about the
other way around.


OK then I do understand it and it's definitely OK to commit :-)

Thanks.



RE: [BUILDROBOT] "error: null argument where non-null required" on multiple targets

2015-12-14 Thread Moore, Catherine


> -Original Message-
> From: Jan-Benedict Glaw [mailto:jbg...@lug-owl.de]
> Sent: Sunday, December 06, 2015 4:49 PM
> To: Denis Chertykov; Moore, Catherine; Eric Christopher; Matthew Fortune;
> David Edelsohn; Alexandre Oliva; Kaz Kojima; Oleg Endo
> Cc: Jonathan Wakely; gcc-patches
> Subject: [BUILDROBOT] "error: null argument where non-null required" on
> multiple targets
> 
> Hi!
> 
> I'm not 100% sure, but I *think* that this patch
> 
>   2015-11-15  Jonathan Wakely  
> 
>   PR libstdc++/68353
>   * include/bits/basic_string.h: Test value of
> _GLIBCXX_USE_C99_WCHAR
>   not whether it is defined.
>   * include/ext/vstring.h: Likewise.
> 
> uncovered errors like this:
> 
> /---
> | g++ -fno-PIE -c  -DIN_GCC_FRONTEND -DIN_GCC_FRONTEND -g -O2 -
> DIN_GCC  -DCROSS_DIRECTORY_STRUCTURE   -fno-exceptions -fno-rtti -
> fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -
> Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic -
> Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror -fno-
> common  -DHAVE_CONFIG_H -I. -Ic-family -I../../../gcc/gcc -
> I../../../gcc/gcc/c-family -I../../../gcc/gcc/../include -
> I../../../gcc/gcc/../libcpp/include -I/opt/cfarm/mpc/include  -
> I../../../gcc/gcc/../libdecnumber -I../../../gcc/gcc/../libdecnumber/dpd -
> I../libdecnumber -I../../../gcc/gcc/../libbacktrace   -o c-family/c-common.o -
> MT c-family/c-common.o -MMD -MP -MF c-family/.deps/c-common.TPo
> ../../../gcc/gcc/c-family/c-common.c
> | In file included from ../../../gcc/gcc/c-family/c-common.c:31:0:
> | ../../../gcc/gcc/c-family/c-common.c: In function ‘void
> c_common_nodes_and_builtins()’:
> | ../../../gcc/gcc/stringpool.h:39:53: error: null argument where non-null
> required (argument 1) [-Werror=nonnull]
> |  ? get_identifier_with_length ((str), strlen (str))  \
> |  ^
> |
> | ../../../gcc/gcc/c-family/c-common.c:5501:22: note: in expansion of macro
> ‘get_identifier’
> |char32_type_node = get_identifier (CHAR32_TYPE);
> |   ^~
> |
> | cc1plus: all warnings being treated as errors
> | Makefile:1085: recipe for target 'c-family/c-common.o' failed
> \---
> 
> Shows up when using a newly built compiler to build the
> contrib/config-list.mk targets. So these targets probably need some
> small touch-ups.
> 
> avr-rtems             http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=478544
> mipsel-elf            http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=478844
> mipsisa64r2-sde-elf   http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=478855
> mipsisa64sb1-elf      http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=478865
> mips-rtems            http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=478877
> powerpc-eabialtivec   http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=478922
> powerpc-eabispe       http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=478932
> powerpc-rtems         http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=478956
> ppc-elf               http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=478968
> sh-superh-elf         http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=479077
> 

Is there an easy way to reproduce the MIPS problems that you reported?  I don't 
seem to be able to do it with a cross-compiler targeting mipsel-elf.
Thanks,
Catherine



Re: [Fortran, Patch} Fix ICE for coarray Critical inside module procedure

2015-12-14 Thread Tobias Burnus

Dear Alessandro,

Alessandro Fanfarillo wrote:

the compiler returns an ICE when a coarray critical section is used
inside a module procedure.
The symbols related with the lock variables were left uncommitted
inside resolve_critical(). A gfc_commit_symbol after each symbol or a
gfc_commit_symbols at the end of resolve_critical() fixed the issue.

The latter solution is proposed in the attached patch.
Built and regtested on x86_64-pc-linux-gnu


Looks good to me.


PS: This patch should be also included in GCC 5.


Yes, that's fine with me.

Tobias

PS: I saw that you now have a GCC account, which you can use to commit 
to both the trunk and gcc-5-branch. See https://gcc.gnu.org/svnwrite.html.
Additionally, you should update MAINTAINERS (trunk only) by adding 
yourself under "Write After Approval"; you can simply commit this patch 
yourself, but you should write an email to gcc-patches with the patch - 
like Alan did at https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02390.html


Re: [PATCH] Handle sizes and kinds params of GOACC_paralllel in find_func_clobbers

2015-12-14 Thread Tom de Vries

On 11/12/15 13:31, Richard Biener wrote:

On Fri, 11 Dec 2015, Tom de Vries wrote:


Hi,

while testing the oacc kernels patch series on top of trunk, using the optimal
handling of BUILT_IN_GOACC_PARALLEL in fipa-pta, I ran into a failure where
the stores to the omp_data_sizes array were removed by dse.

The call bb in the failing testcase normally looks like this:
...
   :
   .omp_data_arr.10.D.2550 = c.2_18;
   .omp_data_arr.10.c = 
   .omp_data_arr.10.D.2553 = b.1_15;
   .omp_data_arr.10.b = 
   .omp_data_arr.10.D.2556 = a.0_11;
   .omp_data_arr.10.a = 
D.2572 = n_6(D);
   .omp_data_arr.10.n = 
   .omp_data_sizes.11[0] = _8;
   .omp_data_sizes.11[1] = 0;
   .omp_data_sizes.11[2] = _8;
   .omp_data_sizes.11[3] = 0;
   .omp_data_sizes.11[4] = _8;
   .omp_data_sizes.11[5] = 0;
   .omp_data_sizes.11[6] = 4;
   __builtin_GOACC_parallel_keyed (-1, foo._omp_fn.0, 7,
   &.omp_data_arr.10,
   &.omp_data_sizes.11,
   &.omp_data_kinds.12, 0);
...

Dse removed the stores, because omp_data_sizes was not marked as used by
__builtin_GOACC_parallel_keyed.

We pretend in fipa-pta that __builtin_GOACC_parallel_keyed is never called,
and instead handle the call foo._omp_fn.0 (&.omp_data_arr.10). That means the
use of omp_data_sizes by __builtin_GOACC_parallel_keyed is ignored.

This patch fixes that (for both sizes and kinds arrays), as confirmed with a
test run of target-libgomp c.exp on the accelerator.

OK for stage3 if bootstrap and reg-test succeeds?


Ok, though technically they are used by the OMP runtime (but this we
could only represent by letting them escape).  I wonder what can of
worms we'd open if you LTO the OMP runtime in ... (and thus
builtins map to real functions!)


I guess for fipa-pta, when encountering a call to the built-in, we could 
add the equivalent of initial constraints to the runtime function.


I'd also imagine we don't want the built-in to be inlined, since that 
would break the optimal treatment of the built-in.


Thanks,
- Tom



Re: PING^1: [PATCH] Add TYPE_EMPTY_RECORD for C++ empty class

2015-12-14 Thread H.J. Lu
On Mon, Dec 14, 2015 at 12:43 PM, Jason Merrill  wrote:
> On 12/14/2015 03:39 PM, H.J. Lu wrote:
>>
>> On Mon, Dec 14, 2015 at 12:16 PM, Jason Merrill  wrote:
>>>
>>> On 12/12/2015 01:42 PM, Marc Glisse wrote:


 On Sat, 12 Dec 2015, Jakub Jelinek wrote:

> On Sat, Dec 12, 2015 at 09:51:23AM -0500, Jason Merrill wrote:
>>
>>
>> On 12/11/2015 06:52 PM, H.J. Lu wrote:
>>>
>>>
>>> On Thu, Dec 10, 2015 at 3:24 AM, Richard Biener
>>>  wrote:


 On Wed, Dec 9, 2015 at 10:31 PM, Markus Trippelsdorf
  wrote:
>
>
> On 2015.12.09 at 10:53 -0800, H.J. Lu wrote:
>>
>>
>>
>> Empty C++ class is a corner case which isn't covered in psABI nor
>> C++ ABI.
>> There is no mention of "empty record" in GCC documentation.  But
>> there are
>> plenty of "empty class" in gcc/cp.  This change affects all
>> targets.  C++ ABI
>> should specify how it should be passed.




 About this patch, aren't we supposed to enable new C++ ABIs with
 -fabi-version=42 (or whatever the next number is)?
>>>
>>>
>>>
>>> Yes, the patch should definitely make this conditional on
>>> abi_version_at_least.
>>>
> There is a C++ ABI mailinglist, where you could discuss this issue:
> http://sourcerytools.com/cgi-bin/mailman/listinfo/cxx-abi-dev



 Yep.  As long as the ABI doesn't state how to pass those I'd rather
 _not_ change GCCs way.
>>>
>>>
>>>
>>> It is agreed that GCC is wrong on this:
>>>
>>>
>>>
>>> http://sourcerytools.com/pipermail/cxx-abi-dev/2015-December/002876.html
>>>
>>
>> Yes, I think this is just a (nasty) bug on some GCC targets.
>
>
>
> Well, the argument in that thread is weird, because C and C++ empty
> structs
> are different, so it isn't surprising they are passed differently.
> C++ makes those sizeof == 1, while C has them sizeof == 0.



 Maybe it isn't surprising, but it isn't particularly helpful either. It
 increases the number of places where the 2 are incompatible.
 (I personally don't care about empty C structs)
>>>
>>>
>>>
>>> Yep.  The C standard doesn't have empty structs; it's a GNU extension.
>>> But
>>> in any case argument passing can be compatible between C and C++, so it
>>> really should be.
>>>
>>>
>>
>> Before I make any changes, I'd like to ask if we should make
>> argument passing compatible between C and C++ for
>> all targets GCC supports or just x86.
>
>
> All.

Here is the patch to guard this ABI change with the ABI level 10,
which is updated in GCC 6.  OK for master if there is no regression
on x86?

The patch for non-x86 targets is at

https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01063.html


-- 
H.J.
From fccd449a091589fedaf6ee4998271a16d93147fc Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Sun, 15 Nov 2015 13:19:05 -0800
Subject: [PATCH] Add TYPE_EMPTY_RECORD for C++ empty class

Empty record should be returned and passed the same way in C and C++.
This patch overloads a bit, side_effects_flag, in tree_base for C++
empty class.  Middle-end and x86 backend are updated to ignore empty
records for parameter passing and function value return.  Other targets
may need similar changes.

get_ref_base_and_extent is changed to set bitsize to 0 for empty records
so that when ref_maybe_used_by_call_p_1 calls get_ref_base_and_extent to
get 0 as the maximum size on empty record.  Otherwise, find_tail_calls
won't perform tail call optimization for functions with empty record
parameters, as shown in g++.dg/pr60336-1.C and g++.dg/pr60336-2.C.

This ABI change is enabled only if the ABI level is at least 10, which
is updated in GCC 6.

gcc/

	PR c++/60336
	PR middle-end/67239
	PR target/68355
	* calls.c (initialize_argument_information): Warn empty record
	if they are used in a variable argument list or aren't the last
	arguments.  Replace targetm.calls.function_arg,
	targetm.calls.function_incoming_arg and
	targetm.calls.function_arg_advance with function_arg,
	function_incoming_arg and function_arg_advance.
	(expand_call): Replace targetm.calls.function_arg,
	targetm.calls.function_incoming_arg and
	targetm.calls.function_arg_advance with function_arg,
	function_incoming_arg and function_arg_advance.
	(emit_library_call_value_1): Likewise.
	(store_one_arg): Use 0 for empty record size.  Don't
	push 0 size argument onto stack.
	(must_pass_in_stack_var_size_or_pad): Return false for empty
	record.
	* dse.c (get_call_args): Replace targetm.calls.function_arg
	and targetm.calls.function_arg_advance with function_arg and
	function_arg_advance.
	* expr.c (block_move_libcall_safe_for_call_parm): Likewise.
	* function.c (aggregate_value_p): Replace
	

Re: [gomp4] [WIP] OpenACC bind, nohost clauses

2015-12-14 Thread Cesar Philippidis
On 12/08/2015 11:55 AM, Thomas Schwinge wrote:

Just for clarification, we're implementing the bind clause with the
semantics defined in OpenACC 2.5, correct? The 2.0a semantics aren't clear.

> On Sat, 14 Nov 2015 09:36:36 +0100, I wrote:
>> Initial support for the OpenACC bind and nohost clauses (routine
>> directive) for C, C++.  Fortran to follow.  Middle end handling and more
>> complete testsuite coverage also to follow once we got a few details
>> clarified.  OK for trunk?
> 
> (Has not yet been reviewed.)  Meanwhile, I continued working on the
> implementation, focussing on C.  See also my question "How to rewrite
> call targets (OpenACC bind clause)",
> .
> 
> To enable Cesar to help with the C++ and Fortran front ends (thanks!), in
> r231423, I just committed "[WIP] OpenACC bind, nohost clauses" to
> gomp-4_0-branch.  (There has already been initial support, parsing only,
> on gomp-4_0-branch.)  I'll try to make progress with the generic middle
> end bits, but will appreciate any review comments, so before inlining the
> complete patch, first a few questions/comments:
> 
> In the OpenACC bind(Y) clause attached to a routine(X) directive, Y can
> be an identifier or a string.  In the front ends, I canonicalize that
> into a string, as we -- at least currently -- don't have any use for the
> identifier (or decl?) later on:
> 
> --- gcc/tree-core.h
> +++ gcc/tree-core.h
> @@ -461,7 +461,7 @@ enum omp_clause_code {
> -  /* OpenACC clause: bind ( identifer | string ).  */
> +  /* OpenACC clause: bind (string).  */
>OMP_CLAUSE_BIND,

So what happens in c++ then? E.g. Say that we have a function sum which
is overloaded as follows:

  int sum (int a, int b) { return a + b; }
  double sum (double a, double b) { return a + b; }

  #pragma acc routine (sum) bind (cuda_sum)

First of all, does this bind apply to both int sum and double sum, or
just the double sum? Second, if the identifier gets canonicalized as a
string, will that prevent the name from being mangled, and hence disable
function overloading?

Also, while I'm asking about c++, is it possible to apply bind individually
to each overload of a function? E.g.

 #pragma acc routine (sum) bind (cuda_sum_int)
 int sum (int a, int b) { return a + b; }

 #pragma acc routine (sum) bind (cuda_sum_double)
 double sum (double a, double b) { return a + b; }
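
To make the overloading concern concrete, here is a minimal sketch in plain C++ (no OpenACC involved). It only shows that the two `sum` overloads are distinct functions that can be told apart solely by signature, which is the information a bare string would not carry. The helper names `pick_int_sum` and `pick_double_sum` are hypothetical and exist only for this illustration; nothing here models how the front end actually handles the bind clause.

```cpp
// Two overloads sharing one source-level name, as in the example above.
int sum(int a, int b) { return a + b; }
double sum(double a, double b) { return a + b; }

// A bare name is not enough to pick one of them; a signature is needed.
using int_sum_fn = int (*)(int, int);
using double_sum_fn = double (*)(double, double);

// Hypothetical helpers: select an overload by its signature.
inline int_sum_fn pick_int_sum() { return static_cast<int_sum_fn>(&sum); }
inline double_sum_fn pick_double_sum() { return static_cast<double_sum_fn>(&sum); }
```

Whatever the bind clause ends up meaning for overloaded routines, it would presumably have to identify one of these two entities in a similar way.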

> All the following are unreachable for OMP_CLAUSE_BIND, OMP_CLAUSE_NOHOST;
> document that to make it obvious/expected:
> 
> --- gcc/cp/pt.c
> +++ gcc/cp/pt.c
> @@ -14501,6 +14501,8 @@ tsubst_omp_clauses (tree clauses, bool 
> declare_simd, bool allow_fields,
>   }
>   }
>   break;
> +   case OMP_CLAUSE_BIND:
> +   case OMP_CLAUSE_NOHOST:
> default:
>   gcc_unreachable ();
> }
> --- gcc/gimplify.c
> +++ gcc/gimplify.c
> @@ -7413,6 +7413,8 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq 
> *pre_p,
>   ctx->default_kind = OMP_CLAUSE_DEFAULT_KIND (c);
>   break;
>  
> +   case OMP_CLAUSE_BIND:
> +   case OMP_CLAUSE_NOHOST:
> default:
>   gcc_unreachable ();
> }
> @@ -8104,6 +8106,8 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, 
> gimple_seq body, tree *list_p,
> case OMP_CLAUSE_DEVICE_TYPE:
>   break;
>  
> +   case OMP_CLAUSE_BIND:
> +   case OMP_CLAUSE_NOHOST:
> default:
>   gcc_unreachable ();
> }
> --- gcc/omp-low.c
> +++ gcc/omp-low.c
> @@ -2279,6 +2279,8 @@ scan_sharing_clauses (tree clauses, omp_context 
> *ctx)
>   sorry ("Clause not supported yet");
>   break;
>  
> +   case OMP_CLAUSE_BIND:
> +   case OMP_CLAUSE_NOHOST:
> default:
>   gcc_unreachable ();
> }
> @@ -2453,6 +2455,8 @@ scan_sharing_clauses (tree clauses, omp_context 
> *ctx)
>   sorry ("Clause not supported yet");
>   break;
>  
> +   case OMP_CLAUSE_BIND:
> +   case OMP_CLAUSE_NOHOST:
> default:
>   gcc_unreachable ();
> }
> --- gcc/tree-nested.c
> +++ gcc/tree-nested.c
> @@ -1200,6 +1200,8 @@ convert_nonlocal_omp_clauses (tree *pclauses, 
> struct walk_stmt_info *wi)
> case OMP_CLAUSE_SEQ:
>   break;
>  
> +   case OMP_CLAUSE_BIND:
> +   case OMP_CLAUSE_NOHOST:
> default:
>   gcc_unreachable ();
> }
> @@ -1882,6 +1884,8 @@ convert_local_omp_clauses (tree *pclauses, struct 
> walk_stmt_info *wi)
> case OMP_CLAUSE_SEQ:
>   break;
>  
> +   case OMP_CLAUSE_BIND:
> +   case OMP_CLAUSE_NOHOST:
> default:
> 

Re: PING^1: [PATCH] Add TYPE_EMPTY_RECORD for C++ empty class

2015-12-14 Thread H.J. Lu
On Mon, Dec 14, 2015 at 12:16 PM, Jason Merrill  wrote:
> On 12/12/2015 01:42 PM, Marc Glisse wrote:
>>
>> On Sat, 12 Dec 2015, Jakub Jelinek wrote:
>>
>>> On Sat, Dec 12, 2015 at 09:51:23AM -0500, Jason Merrill wrote:

 On 12/11/2015 06:52 PM, H.J. Lu wrote:
>
> On Thu, Dec 10, 2015 at 3:24 AM, Richard Biener
>  wrote:
>>
>> On Wed, Dec 9, 2015 at 10:31 PM, Markus Trippelsdorf
>>  wrote:
>>>
>>> On 2015.12.09 at 10:53 -0800, H.J. Lu wrote:


 Empty C++ class is a corner case which isn't covered in psABI nor
 C++ ABI.
 There is no mention of "empty record" in GCC documentation.  But
 there are
 plenty of "empty class" in gcc/cp.  This change affects all
 targets.  C++ ABI
 should specify how it should be passed.
>>
>>
>>
>> About this patch, aren't we supposed to enable new C++ ABIs with
>> -fabi-version=42 (or whatever the next number is)?
>
>
> Yes, the patch should definitely make this conditional on
> abi_version_at_least.
>
>>> There is a C++ ABI mailinglist, where you could discuss this issue:
>>> http://sourcerytools.com/cgi-bin/mailman/listinfo/cxx-abi-dev
>>
>>
>> Yep.  As long as the ABI doesn't state how to pass those I'd rather
>> _not_ change GCCs way.
>
>
> It is agreed that GCC is wrong on this:
>
>
> http://sourcerytools.com/pipermail/cxx-abi-dev/2015-December/002876.html
>

 Yes, I think this is just a (nasty) bug on some GCC targets.
>>>
>>>
>>> Well, the argument in that thread is weird, because C and C++ empty
>>> structs
>>> are different, so it isn't surprising they are passed differently.
>>> C++ makes those sizeof == 1, while C has them sizeof == 0.
>>
>>
>> Maybe it isn't surprising, but it isn't particularly helpful either. It
>> increases the number of places where the 2 are incompatible.
>> (I personally don't care about empty C structs)
>
>
> Yep.  The C standard doesn't have empty structs; it's a GNU extension. But
> in any case argument passing can be compatible between C and C++, so it
> really should be.
>
>

Before I make any changes, I'd like to ask if we should make
argument passing compatible between C and C++ for
all targets GCC supports or just x86.


-- 
H.J.


Re: PING^1: [PATCH] Add TYPE_EMPTY_RECORD for C++ empty class

2015-12-14 Thread Jason Merrill

On 12/14/2015 03:39 PM, H.J. Lu wrote:

On Mon, Dec 14, 2015 at 12:16 PM, Jason Merrill  wrote:

On 12/12/2015 01:42 PM, Marc Glisse wrote:


On Sat, 12 Dec 2015, Jakub Jelinek wrote:


On Sat, Dec 12, 2015 at 09:51:23AM -0500, Jason Merrill wrote:


On 12/11/2015 06:52 PM, H.J. Lu wrote:


On Thu, Dec 10, 2015 at 3:24 AM, Richard Biener
 wrote:


On Wed, Dec 9, 2015 at 10:31 PM, Markus Trippelsdorf
 wrote:


On 2015.12.09 at 10:53 -0800, H.J. Lu wrote:



Empty C++ class is a corner case which isn't covered in psABI nor
C++ ABI.
There is no mention of "empty record" in GCC documentation.  But
there are
plenty of "empty class" in gcc/cp.  This change affects all
targets.  C++ ABI
should specify how it should be passed.




About this patch, aren't we supposed to enable new C++ ABIs with
-fabi-version=42 (or whatever the next number is)?



Yes, the patch should definitely make this conditional on
abi_version_at_least.


There is a C++ ABI mailinglist, where you could discuss this issue:
http://sourcerytools.com/cgi-bin/mailman/listinfo/cxx-abi-dev



Yep.  As long as the ABI doesn't state how to pass those I'd rather
_not_ change GCCs way.



It is agreed that GCC is wrong on this:


http://sourcerytools.com/pipermail/cxx-abi-dev/2015-December/002876.html



Yes, I think this is just a (nasty) bug on some GCC targets.



Well, the argument in that thread is weird, because C and C++ empty
structs
are different, so it isn't surprising they are passed differently.
C++ makes those sizeof == 1, while C has them sizeof == 0.



Maybe it isn't surprising, but it isn't particularly helpful either. It
increases the number of places where the 2 are incompatible.
(I personally don't care about empty C structs)



Yep.  The C standard doesn't have empty structs; it's a GNU extension. But
in any case argument passing can be compatible between C and C++, so it
really should be.




Before I make any changes, I'd like to ask if we should make
argument passing compatible between C and C++ for
all targets GCC supports or just x86.


All.

Jason
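
For readers following the size argument in this thread, a small C++ sketch of the C++ side only: an empty class still has a nonzero size, even though it carries no data that a callee could read. The names `Empty`, `empty_size` and `take_empty` are made up for the illustration, and the `sizeof == 1` value is what common implementations give, not something this sketch proves about any particular ABI. The GNU C case (sizeof == 0) cannot be shown in C++ and is only mentioned in a comment.

```cpp
#include <cstddef>

// An empty class: no non-static data members, no virtual functions.
struct Empty {};

// C++ requires every complete object type to have a nonzero size.
// (An empty struct in GNU C, by contrast, has size 0.)
static_assert(sizeof(Empty) >= 1, "a C++ empty class is not zero-sized");

// Hypothetical helper so the size can be inspected at run time.
constexpr std::size_t empty_size() { return sizeof(Empty); }

// Passing an empty class by value: the callee has nothing to read from it,
// which is why an ABI may choose not to pass it at all.
inline int take_empty(Empty, int x) { return x; }
```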



[Patch, libstdc++/68877] Reimplement __is_[nothrow_]swappable

2015-12-14 Thread Daniel Krügler
This is a reimplementation of __is_swappable and
__is_nothrow_swappable according to

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4511.html

and also adds a missing usage of __is_nothrow_swappable in the swap
overload for arrays. Strictly speaking the latter change differs from
the Standard specification which requires the expression
noexcept(swap(*a, *b)) to be used. On the other hand the Standard is
broken in this regard, as pointed out by

http://cplusplus.github.io/LWG/lwg-active.html#2554

Thanks,

- Daniel


changelog.patch
Description: Binary data


68877.patch
Description: Binary data
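
As an illustration of the noexcept question discussed above (not the code in the attached patch), here is a minimal sketch of a trait built directly on the expression noexcept(swap(a, b)) with std::swap brought into scope. The names `sketch::toy_is_nothrow_swappable` and `NoisySwap` are hypothetical. Unlike the traits the patch reimplements, this toy version is not SFINAE-friendly: it is a hard error for types that are not swappable at all.

```cpp
#include <type_traits>
#include <utility>

// A type with its own swap that is deliberately not declared noexcept.
struct NoisySwap { int v; };
inline void swap(NoisySwap &a, NoisySwap &b) { int t = a.v; a.v = b.v; b.v = t; }

namespace sketch {
using std::swap;  // make std::swap a candidate next to ADL-found overloads

// Toy trait: true when the unqualified swap call on two lvalues is noexcept.
template <class T>
struct toy_is_nothrow_swappable
    : std::integral_constant<bool,
                             noexcept(swap(std::declval<T &>(),
                                           std::declval<T &>()))> {};
}  // namespace sketch
```

With this, `int` and `int[3]` report true, while `NoisySwap` and arrays of it report false, because the array overload of std::swap is only as non-throwing as swapping its elements.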


Re: [PATCH] Fix PR c++/21802 (two-stage name lookup fails for operators)

2015-12-14 Thread Jason Merrill

On 12/12/2015 06:32 PM, Patrick Palka wrote:

>This should use cp_tree_operand_length.

Hmm, I don't immediately see how I can use this function here.  It
expects a tree but I don't have an appropriate tree to give to it, only a
tree_code.


True.  So let's introduce cp_tree_code_length next to 
cp_tree_operand_length.


Jason



Re: extend shift count warnings to vector types

2015-12-14 Thread Jan Beulich
>>> On 11.12.15 at 21:40,  wrote:
> On 12/11/2015 12:28 AM, Jan Beulich wrote:
>> gcc/c/
>> 2015-12-10  Jan Beulich  
>>
>>  * c-fold.c (c_fully_fold_internal): Also emit shift count
>>  warnings for vector types.
>>  * c-typeck.c (build_binary_op): Likewise.
> Needs testcases for the added warnings.
> 
> My additional concern here would be that in build_binary_op, after your 
> change, we'll be setting doing_shift to true.  That in turn will enable 
> ubsan instrumentation of the shift.  Does ubsan work properly for vector 
> shifts?

You say that it may be safe with that other patch you replied to a
little later. I have no idea myself.

Jan



Re: [PR 68064] Testcase and an assert for an already fixed bug

2015-12-14 Thread Richard Biener
On Fri, Dec 11, 2015 at 12:16 PM, Martin Jambor  wrote:
> Hi,
>
> PR 68064 has been fixed by Richi's revision 231246.  I would still
> like to add the testcase to the testsuite and add a checking assert
> so that if we ever get zero alignment again, we catch it in the analysis
> part of IPA-CP (which with LTO means in compilation and not linking
> phase which makes a big difference for debugging).
>
> I have tossed this into a bootstrap and test run on an x86_64-linux
> and found no issues.  I believe the patch is quite obvious and so will
> go ahead and commit it to trunk.
>
> Thanks,
>
> Martin
>
>
> Add assert and testcase for PR 68064
>
> 2015-12-09  Martin Jambor  
>
> * ipa-prop.c (ipa_compute_jump_functions_for_edge): Add checking
> assert that align is nonzero.
>
> testsuite/
> * g++.dg/torture/pr68064.C: New test.
> ---
>  gcc/ipa-prop.c |  1 +
>  gcc/testsuite/g++.dg/torture/pr68064.C | 35 
> ++
>  2 files changed, 36 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/torture/pr68064.C
>
> diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
> index f379ea7..d0a3501 100644
> --- a/gcc/ipa-prop.c
> +++ b/gcc/ipa-prop.c
> @@ -1646,6 +1646,7 @@ ipa_compute_jump_functions_for_edge (struct 
> ipa_func_body_info *fbi,
>   && align % BITS_PER_UNIT == 0
>   && hwi_bitpos % BITS_PER_UNIT == 0)
> {
> + gcc_checking_assert (align != 0);
>   jfunc->alignment.known = true;
>   jfunc->alignment.align = align / BITS_PER_UNIT;

don't you want to assert align / BITS_PER_UNIT is not 0?
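
A small sketch of how the two candidate asserts relate, under the assumption that BITS_PER_UNIT is 8 and looking only at the `align % BITS_PER_UNIT == 0` part of the quoted guard. The helper names are hypothetical. For values that pass that guard, `align != 0` and `align / BITS_PER_UNIT != 0` agree; they only differ for values the guard already rejects.

```cpp
// Assumption for this sketch: 8 bits per addressable unit.
constexpr unsigned BITS_PER_UNIT = 8;

// Mirrors one condition of the quoted guard: whole multiples only.
inline bool passes_guard(unsigned align_bits) {
  return align_bits % BITS_PER_UNIT == 0;
}

// The value that would be stored as the byte alignment.
inline unsigned byte_align(unsigned align_bits) {
  return align_bits / BITS_PER_UNIT;
}

// True when asserting on the bit value and on the byte value are equivalent.
inline bool checks_agree(unsigned align_bits) {
  return (align_bits != 0) == (byte_align(align_bits) != 0);
}
```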

>   jfunc->alignment.misalign = hwi_bitpos / BITS_PER_UNIT;
> diff --git a/gcc/testsuite/g++.dg/torture/pr68064.C 
> b/gcc/testsuite/g++.dg/torture/pr68064.C
> new file mode 100644
> index 000..59b6897
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/torture/pr68064.C
> @@ -0,0 +1,35 @@
> +// { dg-do compile }
> +
> +template  class A {
> +public:
> +  class B;
> +  typedef typename Config::template D::type TypeHandle;
> +  static A *Tagged() { return B::New(B::kTagged); }
> +  static TypeHandle Union(TypeHandle);
> +  static TypeHandle Representation(TypeHandle, typename Config::Region *);
> +  bool Is();
> +};
> +
> +template  class A::B {
> +  friend A;
> +  enum { kTaggedPointer = 1 << 31, kTagged = kTaggedPointer };
> +  static A *New(int p1) { return Config::from_bitset(p1); }
> +};
> +
> +struct C {
> +  typedef int Region;
> +  template  struct D { typedef A *type; };
> +  static A *from_bitset(unsigned);
> +};
> +A *C::from_bitset(unsigned p1) { return reinterpret_cast(p1); }
> +
> +namespace {
> +int *a;
> +void fn1(A *p1) { A::Union(A::Representation(p1, a)); }
> +}
> +
> +void fn2() {
> +  A b;
> +  A *c = b.Is() ? 0 : A::Tagged();
> +  fn1(c);
> +}
> --
> 2.6.3
>


Re: [v4] avoid alignment of static variables affecting stack's

2015-12-14 Thread Richard Biener
On Fri, Dec 11, 2015 at 2:54 PM, Bernd Schmidt  wrote:
> On 12/11/2015 02:48 PM, Jan Beulich wrote:
>>
>> Function (or more narrow) scope static variables (as well as others not
>> placed on the stack) should also not have any effect on the stack
>> alignment. I noticed the issue first with Linux'es dynamic_pr_debug()
>> construct using an 8-byte aligned sub-file-scope local variable.
>>
>> According to my checking bad behavior started with 4.6.x (4.5.3 was
>> still okay), but generated code got quite a bit worse as of 4.9.0.
>>
>> [v4: Bail early, using is_global_var(), as requested by Bernd.]
>
>
> In case I haven't made it obvious, this is OK.

But I wonder if it makes sense because shortly after the early-out we check

  if (TREE_STATIC (var)
  || DECL_EXTERNAL (var)
  || (TREE_CODE (origvar) == SSA_NAME && use_register_for_decl (var)))

so either there are obvious cleanup opportunities left or the patch is broken.

Richard.

>
> Bernd


Re: [PATCH] Fix warnings from including fdl.texi into gnat-style.texi

2015-12-14 Thread Tom de Vries

On 08/12/15 19:10, Gerald Pfeifer wrote:

Hi Tom,

On Tue, 8 Dec 2015, Tom de Vries wrote:

Can you approve the fdl part?

Let's assume I can.  Okay.

was the 'Okay' above:
- a figure of speech (as I read it), or
- an actual approval (conditional on the adding of the comment)
?


I should have written this as "Let's assume I can: Okay." or
better "Let's assume I can. -> Okay."

Yes, please consider this approved.

Sorry if you have been waiting due to this!



Np :) , and thanks for the review.

- Tom



Re: [v4] avoid alignment of static variables affecting stack's

2015-12-14 Thread Jan Beulich
>>> On 14.12.15 at 09:35,  wrote:
> On Fri, Dec 11, 2015 at 2:54 PM, Bernd Schmidt  wrote:
>> On 12/11/2015 02:48 PM, Jan Beulich wrote:
>>>
>>> Function (or more narrow) scope static variables (as well as others not
>>> placed on the stack) should also not have any effect on the stack
>>> alignment. I noticed the issue first with Linux'es dynamic_pr_debug()
>>> construct using an 8-byte aligned sub-file-scope local variable.
>>>
>>> According to my checking bad behavior started with 4.6.x (4.5.3 was
>>> still okay), but generated code got quite a bit worse as of 4.9.0.
>>>
>>> [v4: Bail early, using is_global_var(), as requested by Bernd.]
>>
>>
>> In case I haven't made it obvious, this is OK.
> 
> But I wonder if it makes sense because shortly after the early-out we check
> 
>   if (TREE_STATIC (var)
>   || DECL_EXTERNAL (var)
>   || (TREE_CODE (origvar) == SSA_NAME && use_register_for_decl (var)))
> 
> so either there are obvious cleanup opportunities left or the patch is 
> broken.

Looks like a cleanup opportunity I overlooked when following
Bernd's advice.

Jan
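
A toy model of the redundancy discussed above, assuming that is_global_var amounts to a static-or-external test, as the thread implies. The `ToyVar` struct and `classify` function are invented for the illustration and do not correspond to real GCC data structures. Once the early-out has filtered static and external variables, only the register part of the later condition can still be true.

```cpp
// Toy stand-in for a variable declaration; not a real GCC tree.
struct ToyVar {
  bool is_static;
  bool is_external;
  bool in_register;
};

// Modeled on the early-out: static storage or external declaration.
inline bool toy_is_global_var(const ToyVar &v) {
  return v.is_static || v.is_external;
}

enum class Outcome { EarlyOut, LaterCheckTaken, StackCandidate };

inline Outcome classify(const ToyVar &v) {
  if (toy_is_global_var(v))
    return Outcome::EarlyOut;
  // After the early-out, the first two disjuncts can never be true here.
  if (v.is_static || v.is_external || v.in_register)
    return Outcome::LaterCheckTaken;
  return Outcome::StackCandidate;
}
```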



Re: [RFC] Dump ssaname info for default defs

2015-12-14 Thread Richard Biener
On Fri, Dec 11, 2015 at 6:05 PM, Tom de Vries  wrote:
> Hi,
>
> atm, we dump ssa-name info for lhs-es of statements. That leaves out the ssa
> names with default defs.
>
> This proof-of-concept patch prints the ssa-name info for default defs, in
> the following format:
> ...
> __attribute__((noclone, noinline))
> bar (intD.6 * cD.1755, intD.6 * dD.1756)
> # PT = nonlocal
> # DEFAULT_DEF c_2(D)
> # PT = { D.1762 } (nonlocal)
> # ALIGN = 4, MISALIGN = 0
> # DEFAULT_DEF d_4(D)
> {
> ;;   basic block 2, loop depth 0, count 0, freq 1, maybe hot
> ;;prev block 0, next block 1, flags: (NEW, REACHABLE)
> ;;pred:   ENTRY [100.0%]  (FALLTHRU,EXECUTABLE)
>   # .MEM_3 = VDEF <.MEM_1(D)>
>   *c_2(D) = 1;
>   # .MEM_5 = VDEF <.MEM_3>
>   *d_4(D) = 2;
>   # VUSE <.MEM_5>
>   return;
> ;;succ:   EXIT [100.0%]  (EXECUTABLE)
>
> }
> ...
>
> Good idea? Any further comments, f.i. on formatting?

I've had a similar patch in my dev tree for quite some while but never
pushed it because
of "formatting"...

That said,

+  if (gimple_in_ssa_p (fun))

Please add flags & TDF_ALIAS here to avoid issues with dump-file scanning.

+{
+  arg = DECL_ARGUMENTS (fndecl);
+  while (arg)
+   {
+ tree def = ssa_default_def (fun, arg);
+ if (flags & TDF_ALIAS)
+   dump_ssaname_info_to_file (file, def);
+ fprintf (file, "# DEFAULT_DEF ");
+ print_generic_expr (file, def, dump_flags);

Rather than

# DEFAULT_DEF d_4(D)

I'd print

d_4(D) = GIMPLE_NOP;

(or how gimple-nop is printed - that is, just print the def-stmt).

My local patch simply adjusted the dumping of function
locals, thus I amended the existing

  if (gimple_in_ssa_p (cfun))
for (ix = 1; ix < num_ssa_names; ++ix)
  {
tree name = ssa_name (ix);
if (name && !SSA_NAME_VAR (name))
  {

loop.  Of course that intermixed default-defs with other anonymous
SSA vars which might be a little confusing.

But prepending the list of locals with

   type d_4(D) = NOP();

together with SSA info might be the best.  Note there is also the
static chain and the result decl (if DECL_BY_REFERENCE) to print.

Richard.

+ fprintf (file, "\n");
+ arg = DECL_CHAIN (arg);
+   }
+}


- Tom


Re: RFA (hash-*): PATCH for c++/68309

2015-12-14 Thread Richard Biener
On Fri, Dec 11, 2015 at 8:05 PM, Jason Merrill  wrote:
> On 12/11/2015 05:10 AM, Richard Biener wrote:
>>
>> On Thu, Dec 10, 2015 at 11:03 PM, Jason Merrill  wrote:
>>>
>>> The C++ front end uses a temporary hash table to remember specializations
>>> of
>>> local variables during template instantiations.  In a nested function
>>> such
>>> as a lambda or local class member function, we need to retain the
>>> elements
>>> from the enclosing function's local_specializations table; otherwise the
>>> testcase crashes because we don't find a local specialization for the
>>> non-captured use of 'args' in the decltype.
>>>
>>> This patch addresses that by making a copy of the enclosing
>>> local_specializations table if it exists; to enable that I've added copy
>>> constructors to hash_table and hash_map.
>>>
>>> Tested x86_64-pc-linux-gnu.  OK for trunk?
>>
>>
>> I don't think  you can copy the elements with memcpy they may be C++
>> classes
>> that are not copyable.
>
>
> True.  Fixed thus:
>
>> +  for (size_t i = 0; i < size; ++i)
>> +{
>> +  value_type  = h.m_entries[i];
>> +  if (is_deleted (entry))
>> +   mark_deleted (m_entries[i]);
>> +  else if (!is_empty (entry))
>> +   m_entries[i] = entry;
>> +}
>
>
>> Also watch out for the bool gather_mem_stats = true
>> to bool gather_mem_stats = GATHER_STATISTICS change if that crosses your
>> change.
>
>
> OK.
>
>> I also think copying hash tables should be discouraged ;)  I wonder if you
>> can get around the copying by adding a generation count (to easily
>> "backtrack")
>> instead.
>
>
> I considered having a chain of tables to check, to handle generations, but I
> figured that the tables in question were small enough (only containing local
> variables for a single function) that a simple copy was reasonable.

Looks good to me now.  Needs adjustment to use GATHER_STATISTICS
as default-arg for gather_mem_stats now.

Thanks,
Richard.

> Jason
>
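
To illustrate the memcpy concern and the slot-by-slot fix quoted above, here is a minimal standalone sketch. It does not use GCC's hash_table; `Slot` and `copy_slots` are hypothetical, and std::vector stands in for the entries array. The point is only that an element type holding a std::string is not trivially copyable, so the copy has to go through the element's own assignment.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <vector>

// Toy table slot with the three states a slot can be in.
struct Slot {
  enum State { Empty, Deleted, Live };
  State state = Empty;
  std::string value;  // makes the slot non-trivially-copyable
};

static_assert(!std::is_trivially_copyable<Slot>::value,
              "memcpy would not be a valid way to copy this slot");

// Copy slot by slot: keep deleted markers, copy live entries,
// leave empty slots default-constructed.
inline std::vector<Slot> copy_slots(const std::vector<Slot> &src) {
  std::vector<Slot> dst(src.size());
  for (std::size_t i = 0; i < src.size(); ++i) {
    if (src[i].state == Slot::Deleted)
      dst[i].state = Slot::Deleted;
    else if (src[i].state == Slot::Live)
      dst[i] = src[i];
  }
  return dst;
}
```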


Re: [RFC] Request for comments on ivopts patch

2015-12-14 Thread Richard Biener
On Sat, Dec 12, 2015 at 12:48 AM, Steve Ellcey  wrote:
> On Wed, 2015-12-09 at 11:24 +0100, Richard Biener wrote:
>
>> > This second case (without the preference for the original IV)
>> > generates better code on MIPS because the final assembly
>> > has the increment instructions between the loads and the tests
>> > of the values being loaded and so there is no delay (or less delay)
>> > between the load and use.  It seems like this could easily be
>> > the case for other platforms too so I was wondering what people
>> > thought of this patch:
>>
>> You don't comment on the comment you remove ... debugging
>> programs is also important!
>>
>> So if then the cost of both cases should be distinguished
>> somewhere else, like granting a bonus for increment before
>> exit test or so.
>>
>> Richard.
>
> Here is new patch that tries to do that.  It accomplishes the same thing
> as my original patch but by checking different features.  Basically, for
> machines with no autoinc/autodec it has a preference for IVs that don't
> change during loop (i.e. var_before == var_after).
>
> What do you think about this approach?
>
> Steve Ellcey
> sell...@imgtec.com
>
>
> 2015-12-11  Steve Ellcey  
>
> * tree-ssa-loop-ivopts.c (determine_iv_cost): Add cost to ivs that
> need to be updated during loop.
>
>
> diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
> index 98dc451..ecf9737 100644
> --- a/gcc/tree-ssa-loop-ivopts.c
> +++ b/gcc/tree-ssa-loop-ivopts.c
> @@ -5826,6 +5826,14 @@ determine_iv_cost (struct ivopts_data *data, struct 
> iv_cand *cand)
>|| DECL_ARTIFICIAL (SSA_NAME_VAR (cand->var_before)))
>  cost++;
>
> +  /* If we are not using autoincrement or autodecrement, prefer ivs that
> + do not have to be incremented/decremented during the loop.  This can
> + move loads ahead of the instructions that update the address.  */
> +  if (cand->pos != IP_BEFORE_USE
> +  && cand->pos != IP_AFTER_USE
> +  && cand->var_before != cand->var_after)
> +cost++;
> +

I don't know enough to assess the effect of this but

 1) not all archs can do auto-incdec so either the comment is misleading
or the test should probably be amended
 2) I wonder why with the comment ("during the loop") you exclude IP_NORMAL/END

that said, the comment needs to explain the situation better.

Of course all such patches need some code-gen effect investigation
on more than one arch.

[I wonder if a IV cost adjust target hook makes sense at some point]

Richard.

>/* Prefer not to insert statements into latch unless there are some
>   already (so that we do not create unnecessary jumps).  */
>if (cand->pos == IP_END
>
>
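
For discussion, a toy model of which candidates the quoted condition bumps. The enum and struct below are simplified stand-ins invented for this sketch (the position names are taken from the quoted hunk and the reply, the rest is hypothetical), and integers replace the SSA names. It shows only the predicate itself, not its code-gen effect on any target.

```cpp
// Simplified stand-ins; not the real ivopts data structures.
enum ToyPos { IP_NORMAL, IP_END, IP_BEFORE_USE, IP_AFTER_USE, IP_ORIGINAL };

struct ToyCand {
  ToyPos pos;
  int var_before;  // stand-in for the SSA name before the increment
  int var_after;   // stand-in for the SSA name after the increment
};

// Mirrors the added condition: one extra cost unit when the candidate is
// not an autoinc/autodec position and its value changes inside the loop.
inline unsigned extra_cost(const ToyCand &c) {
  if (c.pos != IP_BEFORE_USE && c.pos != IP_AFTER_USE
      && c.var_before != c.var_after)
    return 1;
  return 0;
}
```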


Re: [PATCH] doc: discourage use of __attribute__((optimize())) in production code

2015-12-14 Thread Richard Biener
On Sun, Dec 13, 2015 at 9:03 PM, Andi Kleen  wrote:
> Markus Trippelsdorf  writes:
>
>> Many developers are still using __attribute__((optimize())) in
>> production code, although it is quite broken.
>
> Who reads documentation? @) If you want to discourage it better warn once
> at runtime.

We're also quite heavily using it in LTO internally now.

Richard.

> -Andi

