date:20160125

[PATCH] Low-hanging C++-lexer speedup (PR c++/24208)

2016-01-25 Thread Patrick Palka

Within cp/parser.c, cp_lexer_peek_token and the rest of the
token-related functions are bloated by lexer-debugging code, code that
is completely dead unless calls to the functions
cp_lexer_[start|stop]_debugging are deliberately inserted somewhere in
the parser source code for temporary debugging purposes.

The compiler doesn't fold away this dead code at compile time because it
cannot prove that the flag lexer->debugging_p doesn't change.  So we end
up with this dead debugging code, guarded by cp_lexer_debugging_p, in
the release binary.  This is especially wasteful with code like

 while (cp_lexer_next_token_is_not (parser->lexer, CPP_EQ)
&& cp_lexer_next_token_is_not (parser->lexer, CPP_COMMA)
&& cp_lexer_next_token_is_not (parser->lexer, CPP_CLOSE_PAREN)
&& cp_lexer_next_token_is_not (parser->lexer, CPP_EOF))

which, after inlining, ought to be equivalent to:

 token = parser->lexer->token;
 while (token != CPP_EQ
&& token != CPP_COMMA
&& token != CPP_CLOSE_PAREN
&& token != CPP_EOF)

but because of the lexer-debugging stuff getting in the way of
inlining/CSE, the final code is much worse.

This patch helps the compiler to fold away calls to cp_lexer_debugging_p
when the lexer is not being debugged, by adding a new macro that
short-circuits the cp_lexer_debugging_p predicate.
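
The effect can be sketched outside GCC as follows.  This is only a minimal
illustration, not the code from parser.c: demo_lexer,
DEMO_DEBUGGING_ENABLED_P, demo_peek and demo_count_until are hypothetical
names.  Because the guard is a compile-time constant, the compiler can see
that the predicate always returns false and is free to drop the debug-only
branch in every caller.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for cp_lexer; only the debugging flag matters here.
struct demo_lexer
{
  bool debugging_p;
  int tokens_seen;
};

// Compile-time switch, analogous in spirit to LEXER_DEBUGGING_ENABLED_P.
#define DEMO_DEBUGGING_ENABLED_P false

inline bool
demo_lexer_debugging_p (const demo_lexer *lexer)
{
  // With the macro set to false this condition is a constant, so the
  // function folds to "return false" and the runtime flag is never read.
  if (!DEMO_DEBUGGING_ENABLED_P)
    return false;
  return lexer->debugging_p;
}

inline int
demo_peek (demo_lexer *lexer, const int *tokens)
{
  assert (lexer != nullptr);
  // Debug-only branch: provably dead while the macro is false.
  if (demo_lexer_debugging_p (lexer))
    std::fprintf (stderr, "peeking at token %d\n", tokens[lexer->tokens_seen]);
  return tokens[lexer->tokens_seen];
}

// Counts the tokens before the first occurrence of STOP.
// TOKENS must contain STOP.
inline int
demo_count_until (demo_lexer *lexer, const int *tokens, int stop)
{
  while (demo_peek (lexer, tokens) != stop)
    lexer->tokens_seen++;
  return lexer->tokens_seen;
}
```

Setting the hypothetical macro to true restores the runtime check, which
mirrors how the real macro is meant to be flipped for a debugging session.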

This change reduces the size of parser.o by 3.5% -- from 6060 Kb to 5852
Kb.  This change also reduces the time it takes to compile a dummy C++
file of mine from 1.95s to 1.85s, a reduction of 5%.

Bootstrapped + regtested on x86_64-pc-linux-gnu.  Does this patch look
OK to commit?

gcc/cp/ChangeLog:

PR c++/24208
* parser.c (LEXER_DEBUGGING_ENABLED_P): New macro.
(cp_lexer_debugging_p): Use it.
(cp_lexer_start_debugging): Likewise.
(cp_lexer_stop_debugging): Likewise.
---
 gcc/cp/parser.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 33f1df3..d03b0c9 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -706,11 +706,21 @@ cp_lexer_destroy (cp_lexer *lexer)
   ggc_free (lexer);
 }
 
+/* This needs to be set to TRUE before the lexer-debugging infrastructure can
+   be used.  The point of this flag is to help the compiler to fold away calls
+   to cp_lexer_debugging_p within this source file at compile time, when the
+   lexer is not being debugged.  */
+
+#define LEXER_DEBUGGING_ENABLED_P false
+
 /* Returns nonzero if debugging information should be output.  */
 
 static inline bool
 cp_lexer_debugging_p (cp_lexer *lexer)
 {
+  if (!LEXER_DEBUGGING_ENABLED_P)
+return false;
+
   return lexer->debugging_p;
 }
 
@@ -1296,6 +1306,10 @@ debug (cp_token *ptr)
 static void
 cp_lexer_start_debugging (cp_lexer* lexer)
 {
+  if (!LEXER_DEBUGGING_ENABLED_P)
+fatal_error (input_location,
+"LEXER_DEBUGGING_ENABLED_P is not set to true");
+
   lexer->debugging_p = true;
   cp_lexer_debug_stream = stderr;
 }
@@ -1305,6 +1319,10 @@ cp_lexer_start_debugging (cp_lexer* lexer)
 static void
 cp_lexer_stop_debugging (cp_lexer* lexer)
 {
+  if (!LEXER_DEBUGGING_ENABLED_P)
+fatal_error (input_location,
+"LEXER_DEBUGGING_ENABLED_P is not set to true");
+
   lexer->debugging_p = false;
   cp_lexer_debug_stream = NULL;
 }
-- 
2.7.0.134.gf5046bd.dirty

[PATCH, libstdc++-v3] Fix import of wide character related symbols in stdlib.h wrapper

2016-01-25 Thread Andris Pavenis


include/c_compatibility/stdlib.h imports wide character related symbols
into the global namespace unconditionally, which causes the libstdc++-v3 build
to fail when one or both of _GLIBCXX_USE_WCHAR_T and _GLIBCXX_HAVE_MBSTATE_T
are not defined.

The included patch changes it to import them into the global namespace only
when they are defined in cstdlib.
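
The pattern can be sketched with made-up names.  DEMO_USE_WCHAR_T, the
demo_std namespace and both functions below are hypothetical and only stand
in for the real configuration macros and cstdlib declarations.  The point is
that the global using-declaration sits under the same condition as the
declaration it refers to.

```cpp
#include <cassert>

// Hypothetical feature macro standing in for _GLIBCXX_USE_WCHAR_T.
// Remove this line to model a target without wide-character support.
#define DEMO_USE_WCHAR_T 1

namespace demo_std
{
  // Always available.
  inline long
  narrow_length (const char *s)
  {
    long n = 0;
    while (s[n] != '\0')
      ++n;
    return n;
  }

#ifdef DEMO_USE_WCHAR_T
  // Only declared when the feature macro is defined.
  inline long
  wide_length (const wchar_t *s)
  {
    long n = 0;
    while (s[n] != L'\0')
      ++n;
    return n;
  }
#endif
}

// Import into the global namespace under the same conditions, so that a
// missing declaration never turns into a hard error here.
using demo_std::narrow_length;
#ifdef DEMO_USE_WCHAR_T
using demo_std::wide_length;
#endif
```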

Andris

2016-01-26  Andris Pavenis  

* include/c_compatibility/stdlib.h: Include wide character related
definitions only when they are available in cstdlib.

>From 17778d89abe4f51f929806e67d2e2352b6b4376e Mon Sep 17 00:00:00 2001
From: Andris Pavenis 
Date: Tue, 26 Jan 2016 06:24:48 +0200
Subject: [PATCH] [PATCH,libstdc++-v3] Fix use of wide character related
 symbols in stdlib.h wrapper

* include/c_compatibility/stdlib.h: include wide character related
definitions only when they are available in cstdlib
---
 libstdc++-v3/include/c_compatibility/stdlib.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/libstdc++-v3/include/c_compatibility/stdlib.h b/libstdc++-v3/include/c_compatibility/stdlib.h
index bd72580..31e7e5f 100644
--- a/libstdc++-v3/include/c_compatibility/stdlib.h
+++ b/libstdc++-v3/include/c_compatibility/stdlib.h
@@ -62,9 +62,11 @@ using std::getenv;
 using std::labs;
 using std::ldiv;
 using std::malloc;
+#ifdef _GLIBCXX_HAVE_MBSTATE_T
 using std::mblen;
 using std::mbstowcs;
 using std::mbtowc;
+#endif // _GLIBCXX_HAVE_MBSTATE_T
 using std::qsort;
 using std::rand;
 using std::realloc;
@@ -73,8 +75,10 @@ using std::strtod;
 using std::strtol;
 using std::strtoul;
 using std::system;
+#ifdef _GLIBCXX_USE_WCHAR_T
 using std::wcstombs;
 using std::wctomb;
+#endif // _GLIBCXX_USE_WCHAR_T
 #endif
 
 #endif
-- 
2.5.0

Re: [PATCH] OpenACC use_device clause ICE fix

2016-01-25 Thread Chung-Lin Tang

On 2016/1/25 7:06 PM, Jakub Jelinek wrote:
> The following ICEs without the patch and works with it, so I think it is
> better:
> 
> 2016-01-25  Jakub Jelinek  
> 
>   * omp-low.c (lower_omp_target) : Set
>   DECL_VALUE_EXPR of new_var even for the non-array case.  Look
>   through DECL_VALUE_EXPR for expansion.
> 
>   * c-c++-common/goacc/use_device-1.c: New test.

Thanks, the test was indeed just a reduction of a whole example program, which
I'm not sure we're at liberty to directly include in the testsuite. I've
verified that the patch allows the program to build and run correctly.

Thanks,
Chung-Lin

Re: Speedup configure and build with system.h

2016-01-25 Thread Michael Matz

Hi,

On Mon, 25 Jan 2016, Uros Bizjak wrote:

> This patch caused bootstrap failure on non-c++11 bootstrap compiler
> [1], e.g. CentOS 5.11.
> 
> The problem is with std::swap, which was defined in header <algorithm>
> until c++11 [2].
> 
> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69464
> [2] http://en.cppreference.com/w/cpp/algorithm/swap

Meh.  Can you try the attached patch with a configure test (it includes
the generated files)?  It works for me with 4.3.4, and should make your
build include <algorithm> always.
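
For readers without the tree at hand, the idea behind the system.h change can
be sketched as a standalone file.  This is an illustration only:
HAVE_SWAP_IN_UTILITY is assumed to come from configure (it is deliberately
left undefined here, which models the pre-C++11 host compiler), and
demo_swap_ints is a hypothetical helper.

```cpp
#include <cassert>

// When configure did not find std::swap in <utility>, fall back to the
// header that declared it before C++11.
#if !defined (HAVE_SWAP_IN_UTILITY)
# include <algorithm>
#endif
#include <utility>

// Hypothetical helper that only needs std::swap to be declared somewhere.
inline void
demo_swap_ints (int &a, int &b)
{
  std::swap (a, b);
}
```

Either way the translation unit sees a declaration of std::swap, and newer
hosts do not have to pull in <algorithm> unless they ask for it.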


Ciao,
Michael.

Index: configure.ac
===
--- configure.ac	(revision 232675)
+++ configure.ac	(working copy)
@@ -416,6 +416,15 @@ struct X { typedef long long
 ]], [[X::t x;]])],[],[AC_MSG_ERROR([error verifying int64_t uses long long])])
 fi
 
+AC_CACHE_CHECK(for std::swap in <utility>, ac_cv_std_swap_in_utility, [
+AC_COMPILE_IFELSE([AC_LANG_PROGRAM([[
+#include <utility>
+]], [[int a, b; std::swap(a,b);]])],[ac_cv_std_swap_in_utility=yes],[ac_cv_std_swap_in_utility=no])])
+if test $ac_cv_std_swap_in_utility = yes; then
+  AC_DEFINE(HAVE_SWAP_IN_UTILITY, 1,
+  [Define if <utility> defines std::swap.])
+fi
+
 # Check whether compiler is affected by placement new aliasing bug (PR 29286).
 # If the host compiler is affected by the bug, and we build with optimization
 # enabled (which happens e.g. when cross-compiling), the pool allocator may
Index: system.h
===
--- system.h	(revision 232736)
+++ system.h	(working copy)
@@ -217,7 +217,7 @@ extern int errno;
 #endif
 
 #ifdef __cplusplus
-#ifdef INCLUDE_ALGORITHM
+#if defined (INCLUDE_ALGORITHM) || !defined (HAVE_SWAP_IN_UTILITY)
 # include <algorithm>
 #endif
 # include 
Index: configure
===
--- configure	(revision 232675)
+++ configure	(working copy)
@@ -6534,6 +6534,40 @@ fi
 rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
 fi
 
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for std::swap in <utility>" >&5
+$as_echo_n "checking for std::swap in <utility>... " >&6; }
+if test "${ac_cv_std_swap_in_utility+set}" = set; then :
+  $as_echo_n "(cached) " >&6
+else
+
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+#include <utility>
+
+int
+main ()
+{
+int a, b; std::swap(a,b);
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_cxx_try_compile "$LINENO"; then :
+  ac_cv_std_swap_in_utility=yes
+else
+  ac_cv_std_swap_in_utility=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_std_swap_in_utility" >&5
+$as_echo "$ac_cv_std_swap_in_utility" >&6; }
+if test $ac_cv_std_swap_in_utility = yes; then
+
+$as_echo "#define HAVE_SWAP_IN_UTILITY 1" >>confdefs.h
+
+fi
+
 # Check whether compiler is affected by placement new aliasing bug (PR 29286).
 # If the host compiler is affected by the bug, and we build with optimization
 # enabled (which happens e.g. when cross-compiling), the pool allocator may
@@ -18419,7 +18453,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 18422 "configure"
+#line 18456 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -18525,7 +18559,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 18528 "configure"
+#line 18562 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
Index: config.in
===
--- config.in	(revision 232675)
+++ config.in	(working copy)
@@ -1705,6 +1705,12 @@
 #endif
 
 
+/* Define if <utility> defines std::swap. */
+#ifndef USED_FOR_TARGET
+#undef HAVE_SWAP_IN_UTILITY
+#endif
+
+
 /* Define to 1 if you have the `sysconf' function. */
 #ifndef USED_FOR_TARGET
 #undef HAVE_SYSCONF
@@ -1865,7 +1871,8 @@
 #endif
 
 
-/* Define if your assembler supports .dwsect 0xB */
+/* Define if your assembler supports AIX debug frame section label reference.
+   */
 #ifndef USED_FOR_TARGET
 #undef HAVE_XCOFF_DWARF_EXTRAS
 #endif

Re: [PATCH 0/2] [ARC] Small fixes

2016-01-25 Thread Joern Wolfgang Rennecke




On 25/01/16 13:33, Claudiu Zissulescu wrote:

Please find attached two small patches which are fixing two issues within the
ARC backend:

1. The first one fixes predicates used by arcset* patterns.
2. The second one rejects constant-constant comparisons. This situation may
   happen during the CSE step.

These are OK.

FWIW, there's probably a missed optimization here - these constant -
constant comparisons could be folded down further.

Re: [PATCH, PR69421] Check vector types of COND_EXPR operands are compatible when vectorizing it

2016-01-25 Thread Richard Biener

On Mon, Jan 25, 2016 at 11:16 AM, Ilya Enkovich  wrote:
> Hi,
>
> This patch covers one more case when boolean operands get different
> vectypes and we don't detect it.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu.  OK for trunk?

Ok.

Richard.

> Thanks,
> Ilya
> --
> gcc/
>
> 2016-01-25  Ilya Enkovich  
>
> PR target/69421
> * tree-vect-stmts.c (vectorizable_condition): Check vectype
> of operands is compatible with a statement vectype.
>
> gcc/testsuite/
>
> 2016-01-25  Ilya Enkovich  
>
> PR target/69421
> * gcc.dg/pr69421.c: New test.
>
>
> diff --git a/gcc/testsuite/gcc.dg/pr69421.c b/gcc/testsuite/gcc.dg/pr69421.c
> new file mode 100644
> index 000..252e22c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr69421.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3" } */
> +
> +struct A { double a; };
> +double a;
> +
> +void
> +foo (_Bool *x)
> +{
> +  long i;
> +  for (i = 0; i < 64; i++)
> +{
> +  struct A c;
> +  x[i] = c.a || a;
> +}
> +}
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index 1d2246d..ed2ce07 100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -7528,6 +7528,7 @@ vectorizable_condition (gimple *stmt, 
> gimple_stmt_iterator *gsi,
>
>tree vectype = STMT_VINFO_VECTYPE (stmt_info);
>int nunits = TYPE_VECTOR_SUBPARTS (vectype);
> +  tree vectype1 = NULL_TREE, vectype2 = NULL_TREE;
>
>if (slp_node || PURE_SLP_STMT (stmt_info))
>  ncopies = 1;
> @@ -7547,9 +7548,17 @@ vectorizable_condition (gimple *stmt, 
> gimple_stmt_iterator *gsi,
>  return false;
>
>gimple *def_stmt;
> -  if (!vect_is_simple_use (then_clause, stmt_info->vinfo, &def_stmt, &dt))
> +  if (!vect_is_simple_use (then_clause, stmt_info->vinfo, &def_stmt, &dt,
> +  &vectype1))
> +return false;
> +  if (!vect_is_simple_use (else_clause, stmt_info->vinfo, &def_stmt, &dt,
> +  &vectype2))
>  return false;
> -  if (!vect_is_simple_use (else_clause, stmt_info->vinfo, &def_stmt, &dt))
> +
> +  if (vectype1 && !useless_type_conversion_p (vectype, vectype1))
> +return false;
> +
> +  if (vectype2 && !useless_type_conversion_p (vectype, vectype2))
>  return false;
>
>masked = !COMPARISON_CLASS_P (cond_expr);

Re: [PATCH] Fix aarch64 bootstrap (pr69416)

2016-01-25 Thread Christophe Lyon

On 22 January 2016 at 18:07, Richard Henderson  wrote:
> The bare CONST_INT inside the CCmode IF_THEN_ELSE is causing combine to make
> incorrect simplifications.  At this stage it feels safer to wrap the
> CONST_INT inside of an UNSPEC than make more generic changes to combine.
>
> But we should definitely investigate combine's CCmode issues for gcc7.
>
>
> Ok?
>
Hi,

After this, I'm seeing this test now FAILs:
gcc.target/aarch64/ccmp_1.c scan-assembler adds\t

Christophe


>
> r~

Re: [PING][PATCH] Mark symbols in offload tables with force_output in read_offload_tables

2016-01-25 Thread Ilya Verbin

Hi!

On Tue, Jan 05, 2016 at 15:56:15 +0100, Tom de Vries wrote:
> >diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c
> >index 62e5454..cdaee41 100644
> >--- a/gcc/lto-cgraph.c
> >+++ b/gcc/lto-cgraph.c
> >@@ -1911,6 +1911,11 @@ input_offload_tables (void)
> >   tree fn_decl
> > = lto_file_decl_data_get_fn_decl (file_data, decl_index);
> >   vec_safe_push (offload_funcs, fn_decl);
> >+
> >+  /* Prevent IPA from removing fn_decl as unreachable, since there
> >+ may be no refs from the parent function to child_fn in offload
> >+ LTO mode.  */
> >+  cgraph_node::get (fn_decl)->mark_force_output ();
> > }
> >   else if (tag == LTO_symtab_variable)
> > {
> >@@ -1918,6 +1923,10 @@ input_offload_tables (void)
> >   tree var_decl
> > = lto_file_decl_data_get_var_decl (file_data, decl_index);
> >   vec_safe_push (offload_vars, var_decl);
> >+
> >+  /* Prevent IPA from removing var_decl as unused, since there
> >+ may be no refs to var_decl in offload LTO mode.  */
> >+  varpool_node::get (var_decl)->force_output = 1;
> > }

This doesn't work when there is more than one LTO partition, because only first
partition contains full offload table to maintain correct order, but cgraph and
varpool nodes aren't necessarily created for the first partition.  To reproduce:

$ make check-target-libgomp RUNTESTFLAGS="c.exp=for-* --target_board=unix/-flto"
FAIL: libgomp.c/for-3.c (internal compiler error)
FAIL: libgomp.c/for-5.c (internal compiler error)
FAIL: libgomp.c/for-6.c (internal compiler error)
$ make check-target-libgomp RUNTESTFLAGS="c++.exp=for-* --target_board=unix/-flto"
FAIL: libgomp.c++/for-11.C (internal compiler error)
FAIL: libgomp.c++/for-13.C (internal compiler error)
FAIL: libgomp.c++/for-14.C (internal compiler error)

  -- Ilya

Re: [hsa merge 07/10] IPA-HSA pass

2016-01-25 Thread Martin Liška

On 01/16/2016 11:00 AM, Jan Hubicka wrote:
> Can't it be represented via explicit REF_ADDR or something like that?
> 
> Honza

Hi.

Sure, I've just done a patch that can do that. However, as we're currently in
stage4, that change would probably require explicit permission of a release
manager?

Thanks,
Martin
>From 9639fff94d043c55b55bfb12bb086032db565f0a Mon Sep 17 00:00:00 2001
From: marxin 
Date: Mon, 25 Jan 2016 16:11:00 +0100
Subject: [PATCH] HSA: simplify partitioning of HSA kernels and host impls.

gcc/lto/ChangeLog:

2016-01-25  Martin Liska  

	* lto-partition.c (add_symbol_to_partition_1): Remove usage
	of hsa_summaries.

gcc/ChangeLog:

2016-01-25  Martin Liska  

	* hsa.c (hsa_summary_t::link_functions): Create IPA_REF_ADDR
	reference for an HSA kernel and its host function.
---
 gcc/hsa.c   |  5 +
 gcc/lto/lto-partition.c | 19 ---
 2 files changed, 5 insertions(+), 19 deletions(-)

diff --git a/gcc/hsa.c b/gcc/hsa.c
index ec23f81..f0b3205 100644
--- a/gcc/hsa.c
+++ b/gcc/hsa.c
@@ -781,6 +781,11 @@ hsa_summary_t::link_functions (cgraph_node *gpu, cgraph_node *host,
   TREE_OPTIMIZATION (fn_opts)->x_flag_tree_loop_vectorize = false;
   TREE_OPTIMIZATION (fn_opts)->x_flag_tree_slp_vectorize = false;
   DECL_FUNCTION_SPECIFIC_OPTIMIZATION (gdecl) = fn_opts;
+
+  /* Create reference between a kernel and a corresponding host implementation
+ to guarantee LTO streaming to the same LTRANS.  */
+  if (kind == HSA_KERNEL)
+gpu->create_reference (host, IPA_REF_ADDR);
 }
 
 /* Add a HOST function to HSA summaries.  */
diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
index eb28fed..9eb63c2 100644
--- a/gcc/lto/lto-partition.c
+++ b/gcc/lto/lto-partition.c
@@ -34,7 +34,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "ipa-prop.h"
 #include "ipa-inline.h"
 #include "lto-partition.h"
-#include "hsa.h"
 
 vec ltrans_partitions;
 
@@ -171,24 +170,6 @@ add_symbol_to_partition_1 (ltrans_partition part, symtab_node *node)
 	 Therefore put it into the same partition.  */
   if (cnode->instrumented_version)
 	add_symbol_to_partition_1 (part, cnode->instrumented_version);
-
-  /* Add an HSA associated with the symbol.  */
-  if (hsa_summaries != NULL)
-	{
-	  hsa_function_summary *s = hsa_summaries->get (cnode);
-	  if (s->m_kind == HSA_KERNEL)
-	{
-	  /* Add binded function.  */
-	  bool added = add_symbol_to_partition_1 (part,
-		  s->m_binded_function);
-	  gcc_assert (added);
-	  if (symtab->dump_file)
-		fprintf (symtab->dump_file,
-			 "adding an HSA function (host/gpu) to the "
-			 "partition: %s\n",
-			 s->m_binded_function->name ());
-	}
-	}
 }
 
   add_references_to_partition (part, node);
-- 
2.7.0

Re: [PATCH] PR c++/69399: Add HAVE_WORKING_CXX_BUILTIN_CONSTANT_P

2016-01-25 Thread Richard Biener

On Fri, Jan 22, 2016 at 7:55 PM, H.J. Lu  wrote:
> Without the fix for PR 65656, g++ miscompiles __builtin_constant_p in
> wi::lrshift in wide-int.h.  Add a check with PR 65656 testcase to verify
> that C++ __builtin_constant_p works properly.
>
> Tested on x86-64 with working GCC:
>
> gcc/auto-host.h:/* #undef HAVE_WORKING_CXX_BUILTIN_CONSTANT_P */
> prev-gcc/auto-host.h:/* #undef HAVE_WORKING_CXX_BUILTIN_CONSTANT_P */
> stage1-gcc/auto-host.h:#define HAVE_WORKING_CXX_BUILTIN_CONSTANT_P 1
>
> and broken GCC:
>
> gcc/auto-host.h:/* #undef HAVE_WORKING_CXX_BUILTIN_CONSTANT_P */
> prev-gcc/auto-host.h:/* #undef HAVE_WORKING_CXX_BUILTIN_CONSTANT_P */
> stage1-gcc/auto-host.h:/* #undef HAVE_WORKING_CXX_BUILTIN_CONSTANT_P */
>
> Ok for trunk?

I have a hard time seeing how we are "miscompiling"

  if (STATIC_CONSTANT_P (xi.precision > HOST_BITS_PER_WIDE_INT)
  ? xi.len == 1 && xi.val[0] >= 0
  : xi.precision <= HOST_BITS_PER_WIDE_INT)

Anything that relies on __builtin_constant_p () for semantics is fishy, so why
is this not simply an lrshift implementation bug?
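
To make the concern concrete, here is a small sketch of what a safe use looks
like.  It is not the wide-int.h code: demo_fits_p is a hypothetical helper,
and __builtin_constant_p is the GCC (and Clang) builtin.  The builtin only
selects a shortcut, and both paths compute the same answer, so the result
does not depend on what the compiler happens to know.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: does VALUE fit in PRECISION bits (unsigned)?
inline bool
demo_fits_p (unsigned long long value, unsigned precision)
{
  const unsigned word_bits = sizeof (unsigned long long) * CHAR_BIT;

  // Shortcut, usable only when the compiler can evaluate the test at
  // compile time.  It must agree with the general path below.
  if (__builtin_constant_p (precision >= word_bits)
      && precision >= word_bits)
    return true;

  // General path: gives the same answer whether or not the shortcut fired.
  if (precision >= word_bits)
    return true;
  return (value >> precision) == 0;
}
```

If removing the builtin from such a function changed its result, the
function would be relying on the builtin for semantics, which is the
situation questioned above.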

Richard.

> Thanks.
>
> H.J.
> ---
> gcc/
>
> PR c++/69399
> * configure.ac: Check if C++ __builtin_constant_p works
> properly.
> (HAVE_WORKING_CXX_BUILTIN_CONSTANT_P): AC_DEFINE.
> * system.h (STATIC_CONSTANT_P): Use __builtin_constant_p only
> if HAVE_WORKING_CXX_BUILTIN_CONSTANT_P is defined.
> * config.in: Regenerated.
> * configure: Likewise.
>
> gcc/testsuite/
>
> PR c++/69399
> * gcc.dg/torture/pr69399.c: New test.
> ---
>  gcc/config.in  | 10 -
>  gcc/configure  | 41 
> --
>  gcc/configure.ac   | 27 ++
>  gcc/system.h   |  2 +-
>  gcc/testsuite/gcc.dg/torture/pr69399.c | 21 +
>  5 files changed, 97 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/torture/pr69399.c
>
> diff --git a/gcc/config.in b/gcc/config.in
> index 1796e1d..11ebf5c 100644
> --- a/gcc/config.in
> +++ b/gcc/config.in
> @@ -1846,6 +1846,13 @@
>  #endif
>
>
> +/* Define this macro if C++ __builtin_constant_p with constexpr does not 
> crash
> +   with a variable. */
> +#ifndef USED_FOR_TARGET
> +#undef HAVE_WORKING_CXX_BUILTIN_CONSTANT_P
> +#endif
> +
> +
>  /* Define to 1 if `fork' works. */
>  #ifndef USED_FOR_TARGET
>  #undef HAVE_WORKING_FORK
> @@ -1865,7 +1872,8 @@
>  #endif
>
>
> -/* Define if your assembler supports .dwsect 0xB */
> +/* Define if your assembler supports AIX debug frame section label reference.
> +   */
>  #ifndef USED_FOR_TARGET
>  #undef HAVE_XCOFF_DWARF_EXTRAS
>  #endif
> diff --git a/gcc/configure b/gcc/configure
> index ff646e8..2798231 100755
> --- a/gcc/configure
> +++ b/gcc/configure
> @@ -6534,6 +6534,43 @@ fi
>  rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
>  fi
>
> +# Check if C++ __builtin_constant_p works properly.  Without the fix
> +# for PR 65656, g++ miscompiles __builtin_constant_p in wi::lrshift in
> +# wide-int.h.  Add a check with PR 65656 testcase to verify that C++
> +# __builtin_constant_p works properly.
> +if test "$GCC" = yes; then
> +  saved_CFLAGS="$CFLAGS"
> +  saved_CXXFLAGS="$CXXFLAGS"
> +  CFLAGS="$CFLAGS -O -x c++ -std=c++11"
> +  CXXFLAGS="$CXXFLAGS -O -x c++ -std=c++11"
> +  { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether $CXX 
> __builtin_constant_p works with constexpr" >&5
> +$as_echo_n "checking whether $CXX __builtin_constant_p works with 
> constexpr... " >&6; }
> +  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
> +/* end confdefs.h.  */
> +
> +int
> +foo (int argc)
> +{
> +  constexpr bool x = __builtin_constant_p(argc);
> +  return x ? 1 : 0;
> +}
> +
> +_ACEOF
> +if ac_fn_cxx_try_compile "$LINENO"; then :
> +  { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5
> +$as_echo "yes" >&6; }
> +
> +$as_echo "#define HAVE_WORKING_CXX_BUILTIN_CONSTANT_P 1" >>confdefs.h
> +
> +else
> +  { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
> +$as_echo "no" >&6; }
> +fi
> +rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
> +  CFLAGS="$saved_CFLAGS"
> +  CXXFLAGS="$saved_CXXFLAGS"
> +fi
> +
>  # Check whether compiler is affected by placement new aliasing bug (PR 
> 29286).
>  # If the host compiler is affected by the bug, and we build with optimization
>  # enabled (which happens e.g. when cross-compiling), the pool allocator may
> @@ -18419,7 +18456,7 @@ else
>lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
>lt_status=$lt_dlunknown
>cat > conftest.$ac_ext <<_LT_EOF
> -#line 18422 "configure"
> +#line 18459 "configure"
>  #include "confdefs.h"
>
>  #if HAVE_DLFCN_H
> @@ -18525,7 +18562,7 @@ else
>lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
>lt_status=$lt_dlunknown
>cat > conftest.$ac_ext <<_LT_EOF
> -#line 18528 "configure"
>

Re: [PATCH, rs6000] Fix PR63354

2016-01-25 Thread David Edelsohn

On Sun, Jan 24, 2016 at 9:17 PM, Bill Schmidt
 wrote:

> Hi Jan, thanks for the report!  Patch below that should fix the problem.
> Bootstrapped and tested on powerpc64le-unknown-linux-gnu, no
> regressions.  David, is this ok for trunk?
>
> Thanks,
> Bill
>
>
> 2016-01-24  Bill Schmidt  
>
> * config/rs6000/rs6000.c (rs6000_keep_leaf_when_profiled):  Add
> decl with __attribute__ ((unused)) annotation.

Okay.

Thanks, David

Re: [PATCH] ARM PR68620 (ICE with FP16 on armeb)

2016-01-25 Thread Christophe Lyon

On 22 January 2016 at 18:06, Alan Lawrence  wrote:
> On 20/01/16 21:10, Christophe Lyon wrote:
>>
>> On 19 January 2016 at 15:51, Alan Lawrence 
>> wrote:
>>>
>>> On 19/01/16 11:15, Christophe Lyon wrote:
>>>
>> For neon_vdupn, I chose to implement neon_vdup_nv4hf and
>> neon_vdup_nv8hf instead of updating the VX iterator because I thought
>> it was not desirable to impact neon_vrev32.
>
>
>
> Well, the same instruction will suffice for vrev32'ing vectors of HF
> just
> as
> well as vectors of HI, so I think I'd argue that's harmless enough. To
> gain the
> benefit, we'd need to update arm_evpc_neon_vrev with a few new cases,
> though.
>
 Since this is more intrusive, I'd rather leave that part for later. OK?
>>>
>>>
>>>
>>> Sure.
>>>
>> +#ifdef __ARM_BIG_ENDIAN
>> +  /* Here, 3 is (4-1) where 4 is the number of lanes. This is also
>> the
>> + right value for vectors with 8 lanes.  */
>> +#define __arm_lane(__vec, __idx) (__idx ^ 3)
>> +#else
>> +#define __arm_lane(__vec, __idx) __idx
>> +#endif
>> +
>
>
>
> Looks right, but sounds... my concern here is that I'm hoping at some
> point we
> will move the *other* vget/set_lane intrinsics to use GCC vector
> extensions
> too. At which time (unlike __aarch64_lane which can be used everywhere)
> this
> will be the wrong formula. Can we name (and/or comment) it to avoid
> misleading
> anyone? The key characteristic seems to be that it is for vectors of
> 16-bit
> elements only.
>
 I'm not sure I follow here. Looking at the patterns for
 neon_vget_lane_*internal in neon.md,
 I can see 2 flavours: one for VD, one for VQ2. The latter uses
 "halfelts".

 Do you prefer that I create 2 macros (say __arm_lane and __arm_laneq),
 that would be similar to the aarch64 ones (by computing the number of
 lanes of the input vector), but the "q" one would use half the total
 number of lanes instead?
>>>
>>>
>>>
>>> That works for me! Sthg like:
>>>
>>> #define __arm_lane(__vec, __idx) NUM_LANES(__vec) - __idx
>>> #define __arm_laneq(__vec, __idx) (__idx & (NUM_LANES(__vec)/2)) +
>>> (NUM_LANES(__vec)/2 - __idx)
>>> //or similarly
>>> #define __arm_laneq(__vec, __idx) (__idx ^ (NUM_LANES(__vec)/2 - 1))
>>>
>>> Alternatively I'd been thinking
>>>
>>> #define __arm_lane_32xN(__idx) __idx ^ 1
>>> #define __arm_lane_16xN(__idx) __idx ^ 3
>>> #define __arm_lane_8xN(__idx) __idx ^ 7
>>>
>>> Bear in mind PR64893 that we had on AArch64 :-(
>>>
>>
>> Here is a new version, based on the comments above.
>> I've also removed the addition of arm_fp_ok effective target since I
>> added that in my other testsuite patch.
>>
>> OK now?
>
>
> Yes. I realize my worry about PR64893 doesn't apply here since we pass
> constant lane numbers / vector bounds into __builtin_arm_lane_check. Thanks!
>

Thanks, I guess I still have to wait for Kyrill/Ramana 's approval.

> --Alan
>
>>
>> Thanks,
>>
>> Christophe
>>
>>> Cheers, Alan
>
>
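
The lane formulas discussed in this thread can be checked with a small
sketch.  The function names are hypothetical and this is not the arm_neon.h
code.  Note that the subtraction form below uses "num_lanes - 1 - idx" so
that the result stays in range, and that the XOR form matches it only when
the lane count is a power of two.

```cpp
#include <cassert>

// Reverse the lane order of a vector with NUM_LANES lanes.
inline unsigned
demo_lane_reversed (unsigned idx, unsigned num_lanes)
{
  assert (idx < num_lanes);
  return num_lanes - 1 - idx;
}

// XOR form of the same mapping; valid when NUM_LANES is a power of two,
// because NUM_LANES - 1 is then a mask of all ones.
inline unsigned
demo_lane_xor (unsigned idx, unsigned num_lanes)
{
  assert (idx < num_lanes);
  return idx ^ (num_lanes - 1);
}

// "q" variant: reverse the lanes within each half of the vector.
inline unsigned
demo_laneq_xor (unsigned idx, unsigned num_lanes)
{
  assert (idx < num_lanes);
  return idx ^ (num_lanes / 2 - 1);
}
```

For 8 lanes the "q" variant maps 0..3 to 3..0 and 4..7 to 7..4, which is the
same result as the "__idx ^ 3" formula from the original patch.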

Re: Mark oacc kernels fns

2016-01-25 Thread Jakub Jelinek

On Mon, Jan 25, 2016 at 10:06:50AM -0500, Nathan Sidwell wrote:
> On 01/04/16 10:39, Nathan Sidwell wrote:
> >There's currently no robust predicate to determine whether an oacc offload
> >function is for a kernels region (as opposed to a parallel region).  The 
> >test in
> >tree-ssa-loop.c uses the heuristic of seeing if all the dimensions are 
> >defaulted
> >  (which can easily be true for parallel offloads at that point).
> >
> >This patch marks TREE_PUBLIC on the offload attribute values, to note kernels
> >regions,  and adds a predicate to check that.  I also broke out the function
> >level determination from oacc_validate_dims, as there it was only laziness 
> >on my
> >part to have not done that earlier.
> >
> >Using these predicates improves the dump output of the openacc device 
> >lowering
> >pass too.
> >
> >ok?
> 
> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00092.html
> ping?

Ok, thanks.

Jakub

[PATCH 0/2] [ARC] Small fixes

2016-01-25 Thread Claudiu Zissulescu

Please find attached two small patches which are fixing two issues within the
ARC backend:

1. The first one fixes predicates used by arcset* patterns.
2. The second one rejects constant-constant comparisons. This situation may
   happen during the CSE step.

Ok to apply?
Claudiu

Claudiu Zissulescu (2):
  [ARC] Fix arcset* pattern's predicate.
  [ARC] Reject constant-constant comparison.

 gcc/config/arc/arc.md| 18 +++---
 gcc/config/arc/predicates.md |  2 ++
 2 files changed, 13 insertions(+), 7 deletions(-)

-- 
1.9.1

[PATCH 1/2] [ARC] Fix arcset* pattern's predicate.

2016-01-25 Thread Claudiu Zissulescu

gcc/
2016-01-25  Claudiu Zissulescu  

* config/arc/arc.md (cstoresi4): Force operand into register.
(arcset): Fix predicate.
(arcsetltu): Likewise.
(arcsetgeu): Likewise.
(arcsethi): Likewise.
(arcsetls): Likewise.
---
 gcc/config/arc/arc.md | 18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
index 222a468..602cf0b 100644
--- a/gcc/config/arc/arc.md
+++ b/gcc/config/arc/arc.md
@@ -3346,8 +3346,9 @@
 
 (define_expand "cstoresi4"
   [(set (match_operand:SI 0 "dest_reg_operand" "")
-   (match_operator:SI 1 "ordered_comparison_operator" [(match_operand:SI 2 "nonmemory_operand" "")
-   (match_operand:SI 3 "nonmemory_operand" "")]))]
+   (match_operator:SI 1 "ordered_comparison_operator"
+  [(match_operand:SI 2 "nonmemory_operand" "")
+   (match_operand:SI 3 "nonmemory_operand" "")]))]
   ""
 {
   if (!TARGET_CODE_DENSITY)
@@ -3358,6 +3359,9 @@
emit_insn (gen_scc_insn (operands[0], operands[1]));
DONE;
   }
+  if (!register_operand (operands[2], SImode))
+operands[2] = force_reg (SImode, operands[2]);
+
 })
 
 (define_mode_iterator SDF [SF DF])
@@ -5414,7 +5418,7 @@
 
 (define_insn "arcset"
   [(set (match_operand:SI 0 "register_operand""=r,r,r,r,r,r,r")
-   (arcCC_cond:SI (match_operand:SI 1 "nonmemory_operand" "0,r,0,r,0,0,r")
+   (arcCC_cond:SI (match_operand:SI 1 "register_operand"  "0,r,0,r,0,0,r")
   (match_operand:SI 2 "nonmemory_operand" "r,r,L,L,I,n,n")))]
   "TARGET_V2 && TARGET_CODE_DENSITY"
   "set%? %0, %1, %2"
@@ -5427,7 +5431,7 @@
 
 (define_insn "arcsetltu"
   [(set (match_operand:SI 0 "register_operand" "=r,r,r,r,r,  r,  r")
-   (ltu:SI (match_operand:SI 1 "nonmemory_operand" "0,r,0,r,0,  0,  r")
+   (ltu:SI (match_operand:SI 1 "register_operand"  "0,r,0,r,0,  0,  r")
(match_operand:SI 2 "nonmemory_operand" "r,r,L,L,I,  n,  n")))]
   "TARGET_V2 && TARGET_CODE_DENSITY"
   "setlo%? %0, %1, %2"
@@ -5440,7 +5444,7 @@
 
 (define_insn "arcsetgeu"
   [(set (match_operand:SI 0 "register_operand" "=r,r,r,r,r,  r,  r")
-   (geu:SI (match_operand:SI 1 "nonmemory_operand" "0,r,0,r,0,  0,  r")
+   (geu:SI (match_operand:SI 1 "register_operand"  "0,r,0,r,0,  0,  r")
(match_operand:SI 2 "nonmemory_operand" "r,r,L,L,I,  n,  n")))]
   "TARGET_V2 && TARGET_CODE_DENSITY"
   "seths%? %0, %1, %2"
@@ -5454,7 +5458,7 @@
 ;; Special cases of SETCC
 (define_insn_and_split "arcsethi"
   [(set (match_operand:SI 0 "register_operand" "=r,r,  r,r")
-   (gtu:SI (match_operand:SI 1 "nonmemory_operand" "r,r,  r,r")
+   (gtu:SI (match_operand:SI 1 "register_operand"  "r,r,  r,r")
(match_operand:SI 2 "nonmemory_operand" "0,r,C62,n")))]
   "TARGET_V2 && TARGET_CODE_DENSITY"
   "setlo%? %0, %2, %1"
@@ -5477,7 +5481,7 @@
 
 (define_insn_and_split "arcsetls"
   [(set (match_operand:SI 0 "register_operand" "=r,r,  r,r")
-   (leu:SI (match_operand:SI 1 "nonmemory_operand" "r,r,  r,r")
+   (leu:SI (match_operand:SI 1 "register_operand"  "r,r,  r,r")
(match_operand:SI 2 "nonmemory_operand" "0,r,C62,n")))]
   "TARGET_V2 && TARGET_CODE_DENSITY"
   "seths%? %0, %2, %1"
-- 
1.9.1

[PATCH 2/2] [ARC] Reject constant-constant comparison.

2016-01-25 Thread Claudiu Zissulescu

gcc/
2016-01-25  Claudiu Zissulescu  

* config/arc/predicates.md (proper_comparison_operator): Reject
constant-constant comparison.
---
 gcc/config/arc/predicates.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/config/arc/predicates.md b/gcc/config/arc/predicates.md
index 52ac2ac..d384d70 100644
--- a/gcc/config/arc/predicates.md
+++ b/gcc/config/arc/predicates.md
@@ -510,6 +510,8 @@
 /* From combiner.  */
 case QImode: case HImode: case DImode: case SFmode: case DFmode:
   return 0;
+case VOIDmode:
+  return 0;
 default:
   gcc_unreachable ();
   }
-- 
1.9.1

[PATCH] Fix PR69380

2016-01-25 Thread Richard Biener


Tested on x86_64-linux, applied.

Richard.

2016-01-25  Richard Biener  

PR testsuite/69380
* g++.dg/tree-ssa/pr69336.C: Restrict to x86_64 and i?86.

Index: gcc/testsuite/g++.dg/tree-ssa/pr69336.C
===
--- gcc/testsuite/g++.dg/tree-ssa/pr69336.C (revision 232792)
+++ gcc/testsuite/g++.dg/tree-ssa/pr69336.C (working copy)
@@ -83,4 +83,4 @@ int main(void)
   return 0;
 }
 
-// { dg-final { scan-tree-dump-not "cmap" "optimized" } }
+// { dg-final { scan-tree-dump-not "cmap" "optimized" { target x86_64-*-* i?86-*-* } } }

RE: [PATCH 0/2] [ARC] Small fixes

2016-01-25 Thread Claudiu Zissulescu

 
> FWIW, there's probably a missed optimization here - these constant -
> constant comparisons could be folded down further.

They are. The issue is that when CSE runs, it wants to validate a new
instruction containing the propagated constant. Validation checks
proper_comparison_operator, which hits the assert at the end of the
predicate. By returning zero instead, the predicate invalidates the
instruction change, and the constant comparison is handled later on by
other steps.
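
The "reject instead of assert" behaviour described above can be sketched as follows. This is only a loose model: the Mode enum, the set of accepted modes and the name comparison_mode_ok are assumptions made for illustration, not the actual contents of predicates.md.

```cpp
#include <cstdlib>

// Hypothetical stand-ins for machine modes; not the real GCC enum.
enum class Mode { Void, QI, HI, SI, DI, CC };

// Loose model of a predicate in the spirit of proper_comparison_operator.
// Which modes are accepted here is an assumption for illustration only.
bool comparison_mode_ok(Mode m)
{
  switch (m)
    {
    case Mode::SI:
    case Mode::CC:
      return true;    // assumed to be handled by the comparison patterns
    case Mode::QI:
    case Mode::HI:
    case Mode::DI:
      return false;   // may be offered by the combiner; rejected quietly
    case Mode::Void:
      return false;   // constant-constant comparison: reject, do not assert
    }
  std::abort();       // a mode outside the enum would be a real bug
}
```

Returning false lets the caller drop the proposed change and carry on, whereas reaching the abort would take the whole compilation down.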

//Claudiu

Re: [PATCH] Fix the remaining PR c++/24666 blockers (arrays decay to pointers too early)

2016-01-25 Thread Jason Merrill


On 01/22/2016 05:30 PM, Patrick Palka wrote:

On Fri, 22 Jan 2016, Jason Merrill wrote:

On 01/22/2016 11:17 AM, Patrick Palka wrote:

On Thu, 21 Jan 2016, Patrick Palka wrote:

On Thu, 21 Jan 2016, Jason Merrill wrote:


On 01/19/2016 10:30 PM, Patrick Palka wrote:

 * g++.dg/template/unify17.C: XFAIL.


Hmm, I'm not comfortable with deliberately regressing this testcase.


  template 
-void bar (void (T[5])); // { dg-error "array of 'void'" }
+void bar (void (T[5])); // { dg-error "array of 'void'" "" { xfail *-*-* } }


Can we work it so that T[5] also is un-decayed in the DECL_ARGUMENTS
of bar, but decayed in the TYPE_ARG_TYPES?


I think so, I'll try it.


Well, I tried it and the result is really ugly and it only "somewhat"
works.  (Maybe I'm just missing something obvious though.)  The ugliness
comes from the fact that decaying an array parameter type of a function
type embedded deep within some tree structure requires rebuilding all
the tree structures in between to avoid issues due to tree sharing.


Yes, that does complicate things.


This approach only "somewhat" works because it currently looks through
function, pointer, reference and array types.


Right, you would need to handle template arguments as well.


And I just noticed that
this approach does not work at all for USING_DECLs because no PARM_DECL
is ever retained anyway in that case.


I don't understand what you mean about USING_DECLs.


I just meant that we fail and would continue to fail to diagnose an
"array of void" error in the following test case:

template 
using X = void (T[5]);

void foo (X);


True.  I think here we want to get the error when instantiating X.


I think a better, complete fix for this issue would be to, one way or
another, be able to get at the PARM_DECLs that correspond to a given
FUNCTION_TYPE.  Say, if, the TREE_CHAIN of a FUNCTION_TYPE optionally
pointed to its PARM_DECLs, or something.  What do you think?


Hmm.  So void(int[5]) and void(int*) would be distinct types, but they
would share TYPE_CANONICAL, as though one is a typedef of the other?
Interesting, but I'm not sure how that would interact with template
argument canonicalization.  Well, that can probably be made to work by
treating dependent template arguments as distinct more frequently.
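
As a small illustration of the language rule behind this discussion (not of the GCC internals), the following sketch checks at compile time that an array parameter is adjusted to a pointer in the function type. The names fn_of_array and array_param_decays are made up for the example.

```cpp
#include <type_traits>

// A parameter declared as "array of 5 int" is adjusted to "pointer to int",
// so both spellings name the same function type.
static_assert(std::is_same<void(int[5]), void(int*)>::value,
              "array parameters decay to pointers in the function type");

// The same adjustment is visible after substituting into an alias template.
template <typename T>
using fn_of_array = void(T[5]);

static_assert(std::is_same<fn_of_array<int>, void(int*)>::value,
              "fn_of_array<int> is void(int*)");

// Runtime-visible form of the first check.
bool array_param_decays()
{
  return std::is_same<void(int[5]), void(int*)>::value;
}
```

Instantiating fn_of_array with void would form an array of void, which is the error the thread wants diagnosed. If the array type has already been replaced by a pointer before substitution, that information is no longer there to check.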

Another thought: What if we keep a list of arrays we need to
substitute into for a particular function template?


That approach definitely seems easier to reason about.  And it could
properly handle "using" templates as well as variable templates -- any
TEMPLATE_DECL, I think.


Agreed.

Jason

Re: Mark oacc kernels fns

2016-01-25 Thread Nathan Sidwell


On 01/04/16 10:39, Nathan Sidwell wrote:

There's currently no robust predicate to determine whether an oacc offload
function is for a kernels region (as opposed to a parallel region).  The test in
tree-ssa-loop.c uses the heuristic of seeing if all the dimensions are defaulted
  (which can easily be true for parallel offloads at that point).

This patch marks TREE_PUBLIC on the offload attribute values, to note kernels
regions,  and adds a predicate to check that.  I also broke out the function
level determination from oacc_validate_dims, as there it was only laziness on my
part to have not done that earlier.

Using these predicates improves the dump output of the openacc device lowering
pass too.

ok?


https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00092.html
ping?

Re: Speedup configure and build with system.h

2016-01-25 Thread Michael Matz

Hi,

On Fri, 22 Jan 2016, Jakub Jelinek wrote:

> > > This may have caused:
> > > 
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69434
> > 
> > Guess we need:
> > 
> > 2016-01-22  Jakub Jelinek  
> > 
> > PR bootstrap/69434
> > * genrecog.c: Define INCLUDE_ALGORITHM before including system.h,
> > remove <algorithm> include.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Thanks for the fixup, the problem didn't happen on my system.  This usage 
of sse3 intrinsics inside installed STL headers seems a bit unfortunate 
(not to mention the dubiousness of placing a 180 line function containing 
11 loops nested to level 4 for arcane functionality into a header; but I 
guess this battle is lost with STL).

Ciao,
Michael.

Re: Speedup configure and build with system.h

2016-01-25 Thread Richard Biener

On Mon, 25 Jan 2016, Michael Matz wrote:

> Hi,
> 
> On Mon, 25 Jan 2016, Uros Bizjak wrote:
> 
> > This patch caused bootstrap failure on non-c++11 bootstrap compiler
> > [1], e.g. CentOS 5.11.
> > 
> > The problem is with std::swap, which was defined in header <algorithm>
> > until c++11 [2].
> > 
> > [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69464
> > [2] http://en.cppreference.com/w/cpp/algorithm/swap
> 
> Meh.  Can you try the attached patch with a configure test (it includes 
> the generated files)?  It works for me with 4.3.4, and should make your 
> build include <algorithm> always.

Ok with a proper changelog.

Thanks,
Richard.
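
For readers unfamiliar with the header move behind this bootstrap failure: std::swap is declared in <algorithm> before C++11 and in <utility> from C++11 on. A minimal sketch of code that builds with either kind of bootstrap compiler simply includes both. The function name swapped_first is invented for the example and is unrelated to the configure test in the patch.

```cpp
// std::swap lives in <algorithm> before C++11 and in <utility> from C++11 on,
// so including both keeps the call portable across old and new compilers.
#include <algorithm>
#include <utility>

// Swap the two arguments and return what ends up in the first one.
int swapped_first(int a, int b)
{
  std::swap(a, b);
  return a;
}
```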

Re: [PATCH, rs6000] Fix PR63354

2016-01-25 Thread Bill Schmidt

Thanks, committed as r232793.

Bill

On Mon, 2016-01-25 at 08:54 -0500, David Edelsohn wrote:
> On Sun, Jan 24, 2016 at 9:17 PM, Bill Schmidt
>  wrote:
> 
> > Hi Jan, thanks for the report!  Patch below that should fix the problem.
> > Bootstrapped and tested on powerpc64le-unknown-linux-gnu, no
> > regressions.  David, is this ok for trunk?
> >
> > Thanks,
> > Bill
> >
> >
> > 2016-01-24  Bill Schmidt  
> >
> > * config/rs6000/rs6000.c (rs6000_keep_leaf_when_profiled):  Add
> > decl with __attribute__ ((unused)) annotation.
> 
> Okay.
> 
> Thanks, David
>

Re: [PATCH] Fix PR64091

2016-01-25 Thread Richard Biener

On Mon, 25 Jan 2016, Tom de Vries wrote:

> On 27/11/14 15:13, Richard Biener wrote:
> > 
> > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
> > 
> > Richard.
> > 
> > 2014-11-27  Richard Biener  
> > 
> > PR tree-optimization/64088
> > * tree-ssa-tail-merge.c (update_debug_stmt): After resetting
> > the stmt break from the loop over use operands.
> > 
> > * gcc.dg/torture/pr64091.c: New testcase.
> > 
> > Index: gcc/tree-ssa-tail-merge.c
> > ===
> > --- gcc/tree-ssa-tail-merge.c   (revision 218117)
> > +++ gcc/tree-ssa-tail-merge.c   (working copy)
> > @@ -1606,9 +1613,7 @@ update_debug_stmt (gimple stmt)
> >   {
> > use_operand_p use_p;
> > ssa_op_iter oi;
> > -  basic_block bbdef, bbuse;
> > -  gimple def_stmt;
> > -  tree name;
> > +  basic_block bbuse;
> > 
> > if (!gimple_debug_bind_p (stmt))
> >   return;
> > @@ -1616,19 +1621,16 @@ update_debug_stmt (gimple stmt)
> > bbuse = gimple_bb (stmt);
> > FOR_EACH_PHI_OR_STMT_USE (use_p, stmt, oi, SSA_OP_USE)
> >   {
> > -  name = USE_FROM_PTR (use_p);
> > -  gcc_assert (TREE_CODE (name) == SSA_NAME);
> > -
> > -  def_stmt = SSA_NAME_DEF_STMT (name);
> > -  gcc_assert (def_stmt != NULL);
> > -
> > -  bbdef = gimple_bb (def_stmt);
> > +  tree name = USE_FROM_PTR (use_p);
> > +  gimple def_stmt = SSA_NAME_DEF_STMT (name);
> > +  basic_block bbdef = gimple_bb (def_stmt);
> > if (bbdef == NULL || bbuse == bbdef
> >   || dominated_by_p (CDI_DOMINATORS, bbuse, bbdef))
> > continue;
> > 
> > gimple_debug_bind_reset_value (stmt);
> > update_stmt (stmt);
> > +  break;
> >   }
> >   }
> > 
> > Index: gcc/testsuite/gcc.dg/torture/pr64091.c
> > ===
> > --- gcc/testsuite/gcc.dg/torture/pr64091.c  (revision 0)
> > +++ gcc/testsuite/gcc.dg/torture/pr64091.c  (working copy)
> > @@ -0,0 +1,28 @@
> > +/* { dg-do compile } */
> > +/* { dg-additional-options "-g" } */
> > +
> > +extern int foo(void);
> > +
> > +int main(void)
> > +{
> > +  int i, a, b;
> > +
> > +  if (foo())
> > +return 0;
> > +
> > +  for (i = 0, a = 0, b = 0; i < 3; i++, a++)
> > +  {
> > +if (foo())
> > +  break;
> > +
> > +if (b += a)
> > +  a = 0;
> > +  }
> > +
> > +  if (!a)
> > +return 2;
> > +
> > +  b += a;
> > +
> > +  return 0;
> > +}
> > 
> 
> Hi,
> 
> the ICE that the patch above fixes does not reproduce on 4.9.
> 
> One reason is that an edge_flag EDGE_EXECUTABLE happens to be set, which
> prevents tail-merge from doing a merge.
> 
> Another reason is that the use that is added to the free_uses freelist during
> update_stmt happens to be not immediately reused, so the contents remains the
> same.
> 
> Using first attached patch, which:
> - clears EDGE_EXECUTABLE in tail-merge, and

this shows a latent issue in tail-merging that it doesn't ignore
edge flags that are "private" (that is, they have random state upon
pass entry).

> - clears the contents of a use when adding it to the freelist
> we manage to trigger the same problem with 4.9.
>
> It seems possible to me that the same problem could trigger on 4.9 for a
> different example without the trigger patch.
> 
> The second attached patch is a minimal version of the above fix.
> 
> OK for 4.9?

Ok (the minimal fix).

Thanks,
Richard.
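
The remark about "private" edge flags can be pictured with a sketch like the one below. The flag names, their bit values and the helper edge_flags_match are hypothetical. The sketch only shows the idea of masking out flags whose state is random on pass entry before comparing two edges, and is not how tail-merge is actually written.

```cpp
// Hypothetical flag bits; the real EDGE_* values in GCC differ.
enum : unsigned {
  FLAG_FALLTHRU   = 1u << 0,
  FLAG_TRUE       = 1u << 1,
  FLAG_EXECUTABLE = 1u << 2   // "private": may hold stale state on pass entry
};

// Flags a pass should ignore when deciding whether two edges are equivalent.
const unsigned PRIVATE_FLAGS = FLAG_EXECUTABLE;

// Compare edge flags while ignoring the private ones.
bool edge_flags_match(unsigned a, unsigned b)
{
  return (a & ~PRIVATE_FLAGS) == (b & ~PRIVATE_FLAGS);
}
```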

> Thanks,
> - Tom
> 
> 
> 
> 
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: [hsa merge 07/10] IPA-HSA pass

2016-01-25 Thread Jakub Jelinek

On Mon, Jan 25, 2016 at 04:21:50PM +0100, Martin Liška wrote:
> On 01/16/2016 11:00 AM, Jan Hubicka wrote:
> > Can't it be represented via explicit REF_ADDR or something like that?
> > 
> > Honza
> 
> Hi.
> 
> Sure, I've just done a patch that can do that. However, as we're currently in 
> stage4,
> that change would probably require explicit permission of a release manager?

If Honza is fine with it and you've tested it, this is ok for trunk.

> From 9639fff94d043c55b55bfb12bb086032db565f0a Mon Sep 17 00:00:00 2001
> From: marxin 
> Date: Mon, 25 Jan 2016 16:11:00 +0100
> Subject: [PATCH] HSA: simplify partitioning of HSA kernels and host impls.
> 
> gcc/lto/ChangeLog:
> 
> 2016-01-25  Martin Liska  
> 
>   * lto-partition.c (add_symbol_to_partition_1): Remove usage
>   of hsa_summaries.
> 
> gcc/ChangeLog:
> 
> 2016-01-25  Martin Liska  
> 
>   * hsa.c (hsa_summary_t::link_functions): Create IPA_REF_ADDR
>   reference for an HSA kernel and its host function.
> ---
>  gcc/hsa.c   |  5 +
>  gcc/lto/lto-partition.c | 19 ---
>  2 files changed, 5 insertions(+), 19 deletions(-)
> 
> diff --git a/gcc/hsa.c b/gcc/hsa.c
> index ec23f81..f0b3205 100644
> --- a/gcc/hsa.c
> +++ b/gcc/hsa.c
> @@ -781,6 +781,11 @@ hsa_summary_t::link_functions (cgraph_node *gpu, cgraph_node *host,
>TREE_OPTIMIZATION (fn_opts)->x_flag_tree_loop_vectorize = false;
>TREE_OPTIMIZATION (fn_opts)->x_flag_tree_slp_vectorize = false;
>DECL_FUNCTION_SPECIFIC_OPTIMIZATION (gdecl) = fn_opts;
> +
> +  /* Create reference between a kernel and a corresponding host implementation
> + to guarantee LTO streaming to a same LTRANS.  */
> +  if (kind == HSA_KERNEL)
> +gpu->create_reference (host, IPA_REF_ADDR);
>  }
>  
>  /* Add a HOST function to HSA summaries.  */
> diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
> index eb28fed..9eb63c2 100644
> --- a/gcc/lto/lto-partition.c
> +++ b/gcc/lto/lto-partition.c
> @@ -34,7 +34,6 @@ along with GCC; see the file COPYING3.  If not see
>  #include "ipa-prop.h"
>  #include "ipa-inline.h"
>  #include "lto-partition.h"
> -#include "hsa.h"
>  
>  vec ltrans_partitions;
>  
> @@ -171,24 +170,6 @@ add_symbol_to_partition_1 (ltrans_partition part, symtab_node *node)
>Therefore put it into the same partition.  */
>if (cnode->instrumented_version)
>   add_symbol_to_partition_1 (part, cnode->instrumented_version);
> -
> -  /* Add an HSA associated with the symbol.  */
> -  if (hsa_summaries != NULL)
> - {
> -   hsa_function_summary *s = hsa_summaries->get (cnode);
> -   if (s->m_kind == HSA_KERNEL)
> - {
> -   /* Add binded function.  */
> -   bool added = add_symbol_to_partition_1 (part,
> -   s->m_binded_function);
> -   gcc_assert (added);
> -   if (symtab->dump_file)
> - fprintf (symtab->dump_file,
> -  "adding an HSA function (host/gpu) to the "
> -  "partition: %s\n",
> -  s->m_binded_function->name ());
> - }
> - }
>  }
>  
>add_references_to_partition (part, node);
> -- 
> 2.7.0
> 


Jakub

Re: [PATCH] Fix PR64091

2016-01-25 Thread Tom de Vries


On 27/11/14 15:13, Richard Biener wrote:


Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2014-11-27  Richard Biener  

PR tree-optimization/64088
* tree-ssa-tail-merge.c (update_debug_stmt): After resetting
the stmt break from the loop over use operands.

* gcc.dg/torture/pr64091.c: New testcase.

Index: gcc/tree-ssa-tail-merge.c
===
--- gcc/tree-ssa-tail-merge.c   (revision 218117)
+++ gcc/tree-ssa-tail-merge.c   (working copy)
@@ -1606,9 +1613,7 @@ update_debug_stmt (gimple stmt)
  {
use_operand_p use_p;
ssa_op_iter oi;
-  basic_block bbdef, bbuse;
-  gimple def_stmt;
-  tree name;
+  basic_block bbuse;

if (!gimple_debug_bind_p (stmt))
  return;
@@ -1616,19 +1621,16 @@ update_debug_stmt (gimple stmt)
bbuse = gimple_bb (stmt);
FOR_EACH_PHI_OR_STMT_USE (use_p, stmt, oi, SSA_OP_USE)
  {
-  name = USE_FROM_PTR (use_p);
-  gcc_assert (TREE_CODE (name) == SSA_NAME);
-
-  def_stmt = SSA_NAME_DEF_STMT (name);
-  gcc_assert (def_stmt != NULL);
-
-  bbdef = gimple_bb (def_stmt);
+  tree name = USE_FROM_PTR (use_p);
+  gimple def_stmt = SSA_NAME_DEF_STMT (name);
+  basic_block bbdef = gimple_bb (def_stmt);
if (bbdef == NULL || bbuse == bbdef
  || dominated_by_p (CDI_DOMINATORS, bbuse, bbdef))
continue;

gimple_debug_bind_reset_value (stmt);
update_stmt (stmt);
+  break;
  }
  }

Index: gcc/testsuite/gcc.dg/torture/pr64091.c
===
--- gcc/testsuite/gcc.dg/torture/pr64091.c  (revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr64091.c  (working copy)
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-g" } */
+
+extern int foo(void);
+
+int main(void)
+{
+  int i, a, b;
+
+  if (foo())
+return 0;
+
+  for (i = 0, a = 0, b = 0; i < 3; i++, a++)
+  {
+if (foo())
+  break;
+
+if (b += a)
+  a = 0;
+  }
+
+  if (!a)
+return 2;
+
+  b += a;
+
+  return 0;
+}



Hi,

the ICE that the patch above fixes does not reproduce on 4.9.

One reason is that an edge_flag EDGE_EXECUTABLE happens to be set, which 
prevents tail-merge from doing a merge.


Another reason is that the use that is added to the free_uses freelist 
during update_stmt happens to be not immediately reused, so the contents 
remains the same.


Using first attached patch, which:
- clears EDGE_EXECUTABLE in tail-merge, and
- clears the contents of a use when adding it to the freelist
we manage to trigger the same problem with 4.9.

It seems possible to me that the same problem could trigger on 4.9 for a 
different example without the trigger patch.
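
The fix being backported ("after resetting the stmt break from the loop over use operands") can be modelled outside GCC as below. Everything here is a simplified stand-in: DebugStmt, reset_value and the dominance callback are hypothetical, and a std::vector plays the role of the use-operand list that update_stmt recycles.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of a debug statement: just a list of operand ids.
struct DebugStmt {
  std::vector<int> uses;
  bool reset = false;
};

// Resetting drops every operand, which invalidates any walk over `uses`.
void reset_value(DebugStmt &s)
{
  s.uses.clear();
  s.reset = true;
}

// Reset the statement as soon as one operand fails the placeholder
// dominance check.  Returns the number of operands inspected.
std::size_t update_debug_stmt(DebugStmt &s, bool (*dominates)(int))
{
  std::size_t inspected = 0;
  for (int use : s.uses)
    {
      ++inspected;
      if (dominates(use))
        continue;
      reset_value(s);
      break;   // the operand list is gone; iterating further would read
               // storage that has already been released
    }
  return inspected;
}

// Placeholder predicate used in place of a real dominance query.
bool is_even(int x) { return x % 2 == 0; }
```

Without the break, the loop would advance an iterator into storage that was just cleared. Under this model's assumptions, that is the analogue of walking use operands that have been put back on the freelist.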


The second attached patch is a minimal version of the above fix.

OK for 4.9?

Thanks,
- Tom





trigger

---
 gcc/tree-ssa-operands.c   | 6 +-
 gcc/tree-ssa-tail-merge.c | 1 +
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-ssa-operands.c b/gcc/tree-ssa-operands.c
index 76d04630..d249354 100644
--- a/gcc/tree-ssa-operands.c
+++ b/gcc/tree-ssa-operands.c
@@ -402,7 +402,11 @@ finalize_ssa_uses (struct function *fn, gimple stmt)
   if (old_ops)
 {
   for (ptr = old_ops; ptr; ptr = ptr->next)
-	delink_imm_use (USE_OP_PTR (ptr));
+	{
+	  delink_imm_use (USE_OP_PTR (ptr));
+	  USE_OP_PTR (ptr)->use = NULL;
+	  USE_OP_PTR (ptr)->loc.stmt = NULL;
+	}
   old_ops->next = gimple_ssa_operands (fn)->free_uses;
   gimple_ssa_operands (fn)->free_uses = old_ops;
 }
diff --git a/gcc/tree-ssa-tail-merge.c b/gcc/tree-ssa-tail-merge.c
index b5165d5..af2641c 100644
--- a/gcc/tree-ssa-tail-merge.c
+++ b/gcc/tree-ssa-tail-merge.c
@@ -725,6 +725,7 @@ find_same_succ_bb (basic_block bb, same_succ *same_p)
 {
   int index = e->dest->index;
   bitmap_set_bit (same->succs, index);
+  e->flags &= ~EDGE_EXECUTABLE;
   same_succ_edge_flags[index] = e->flags;
 }
   EXECUTE_IF_SET_IN_BITMAP (same->succs, 0, j, bj)
Backport "Fix PR64091"

2016-01-25  Tom de Vries  

	backport from trunk:
	2014-11-27  Richard Biener  

	PR tree-optimization/PR64091
	* tree-ssa-tail-merge.c (update_debug_stmt): After resetting
	the stmt break from the loop over use operands.

	* gcc.dg/torture/pr64091.c: New testcase.

---
 gcc/testsuite/gcc.dg/torture/pr64091.c | 28 
 gcc/tree-ssa-tail-merge.c  |  1 +
 2 files changed, 29 insertions(+)

diff --git a/gcc/testsuite/gcc.dg/torture/pr64091.c b/gcc/testsuite/gcc.dg/torture/pr64091.c
new file mode 100644
index 000..0cd994a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr64091.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-g" } */
+
+extern int foo(void);
+
+int main(void)
+{
+  int i, a, b;
+
+  if (foo())
+return 0;
+
+  for (i = 0, a = 0, b = 0; i < 3; i++, a++)
+  {
+

Re: [PATCH] fix gimplification of call parameters (PR cilkplus/69267)

2016-01-25 Thread Ryan Burn

ping

On Tue, Jan 19, 2016 at 9:28 AM, Ryan Burn  wrote:
> Does this look ok?
>
>> On Jan 15, 2016, at 5:41 PM, Ryan Burn  wrote:
>>
>> This patch changes the function cilk_gimplify_call_params_in_spawned_fn to 
>> use gimplify_arg instead of gimplify_expr. It fixes an ICE when calling a 
>> function with a constructed empty class as the argument.
>>
>> Bootstrapped and regression tested on x86_64-linux.
>>
>> 2016-01-15  Ryan Burn  
>>
>>PR cilkplus/69267
>>* cilk.c (cilk_gimplify_call_params_in_spawned_fn): Change to use
>>gimplify_arg. Removed superfluous post_p argument.
>>* c-family.h (cilk_gimplify_call_params_in_spawned_fn): Removed
>>superfluous post_p argument.
>>* c-gimplify.c (c_gimplify_expr): Likewise.
>>
>> gcc/cp/ChangeLog:
>>
>> 2016-01-15  Ryan Burn  
>>
>>PR cilkplus/69267
>>* cp-gimplify.c (cilk_cp_gimplify_call_params_in_spawned_fn): Removed
>>superfluous post_p argument in call to
>>cilk_gimplify_call_params_in_spawned_fn.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2016-01-15  Ryan Burn  
>>
>> PR cilkplus/69267
>> * g++.dg/cilk-plus/CK/pr69267.cc: New test.
>>
>>
>>
>> 
>

Re: C++ PATCH for c++/69379 (ICE with PTRMEM_CST wrapped in NOP_EXPR)

2016-01-25 Thread Jason Merrill


On 01/22/2016 05:07 PM, Marek Polacek wrote:

On Fri, Jan 22, 2016 at 03:38:26PM -0500, Jason Merrill wrote:

If we have a NOP_EXPR to the same type, we should strip it here.


This helps for the unreduced testcases in the PR, but not for the reduced one,
because for the reduced one, the types are not the same.  One type is
struct
{
   void Dict:: (struct Dict *, T) * __pfn;
   long int __delta;
}
and the second one
struct
{
   void Dict:: (struct Dict *) * __pfn;
   long int __delta;
}

The NOP_EXPR in this case originated in build_reinterpret_cast_1:
7070   else if ((TYPE_PTRFN_P (type) && TYPE_PTRFN_P (intype))
7071|| (TYPE_PTRMEMFUNC_P (type) && TYPE_PTRMEMFUNC_P (intype)))
7072 return build_nop (type, expr);


Well, a reinterpret_cast makes the expression non-constant, so we can 
recognize that case (when the types are unrelated) and bail out.  After 
that we probably still need to deal with the case of conversion to a 
pointer-to-member-of-base type; for functions it looks like we can just 
copy the PTRMEM_CST and give it a different type, but for data members I 
think we'll need to add support for the type not matching the member in 
expand_ptrmem_cst.


Jason

[PATCH] Fix a typo in ppc libgcc (PR target/69444)

2016-01-25 Thread Jakub Jelinek

Hi!

The soft-fp multilib of powerpc libgcc doesn't build because of a typo
in the conditional - the guarded code uses inline asm that assumes hard
float.
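
Why a one-character typo in the guard matters can be shown with a self-contained sketch. The macro names below are invented for the demonstration (DEMO_NO_FPRS stands in for the real macro, DEMO_NO_FPRSS for the misspelling): a misspelled name is never defined, so `#ifndef` on it is always true and the guarded code is compiled even where it must not be.

```cpp
// Pretend we are building the variant that must skip the guarded code.
#define DEMO_NO_FPRS 1

// Correct spelling: the macro is defined, so the guarded branch is skipped.
#ifndef DEMO_NO_FPRS
#define GUARDED_CORRECT 1
#else
#define GUARDED_CORRECT 0
#endif

// Misspelled name (stands in for the extra underscore): never defined,
// so this #ifndef is always true and the guarded branch is always taken.
#ifndef DEMO_NO_FPRSS
#define GUARDED_TYPO 1
#else
#define GUARDED_TYPO 0
#endif

// 1 if the guarded code was compiled in, 0 otherwise.
int guarded_with_correct_name() { return GUARDED_CORRECT; }
int guarded_with_typo() { return GUARDED_TYPO; }
```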

Ok for trunk?

2016-01-25  Jakub Jelinek  

PR target/69444
* config/rs6000/sfp-machine.h: Fix a typo in #ifndef - __NO_FPRS__
instead of ___NO_FPRS__.

--- libgcc/config/rs6000/sfp-machine.h.jj   2016-01-21 21:27:57.0 +0100
+++ libgcc/config/rs6000/sfp-machine.h  2016-01-25 11:45:48.093285428 +0100
@@ -110,7 +110,7 @@ typedef int __gcc_CMPtype __attribute__
floating point on pre-ISA 3.0 machines without the IEEE 128-bit floating
point support.  */
 
-#ifndef ___NO_FPRS__
+#ifndef __NO_FPRS__
 #define ISA_BIT(x) (1LL << (63 - x))
 
 /* Use the same bits of the FPSCR.  */

Jakub

[PATCH] [PR tree-optimization/69196] [PR tree-optimization/68398] Reorganize profitibility testing for FSM jump threading

2016-01-25 Thread Jeff Law



This is the first of a few patches to address 69196 (code size 
regression with jump threading) and 68398 (coremark regression due to 
FSM changes).


While these are caused by distinct issues, they hit the same hunk of 
code.  I could address them independently, but I believe in the end 
it'll just make the whole process more difficult.


This hunk of work is a bit of reorganization.  Right now 
valid_jump_thread_path searches the path for particular nuggets of 
information (did we thread through a multiway branch, did we thread a 
multiway branch, did we thread through the loop latch, did we thread the 
loop latch, etc).


We really want that code in the FSM detection side so that we can make 
better decisions about whether or not a particular FSM path is likely to 
be profitable to thread.  So this patch moves that code into the 
detection side.


Second, the limiters for the FSM code had some minor bugs.  For example, 
it didn't count one of the blocks when determining how many blocks were 
in the FSM path.  It didn't account for PHIs when determining how many 
statements we'd copy, etc.  Before the limiters are re-tuned, the basic 
accounting needs to be more accurate.
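
The accounting rules listed in the ChangeLog below (count every block on the path, count PHIs for real operands, do not count ASSERT_EXPRs) might be modelled roughly like this. The types and the function estimate_path_cost are hypothetical simplifications, not the code in tree-ssa-threadbackward.c. Treating virtual-operand PHIs as free is an inference from the "PHIs for real operands" wording.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified statement kinds; not GCC's gimple codes.
enum class Kind { Assign, RealPhi, VirtualPhi, Assert };

struct Block { std::vector<Kind> stmts; };

struct PathCost {
  std::size_t blocks;          // number of blocks on the path
  std::size_t stmts_to_copy;   // statements that threading would duplicate
};

// Estimate the cost of duplicating a thread path.
PathCost estimate_path_cost(const std::vector<Block> &path)
{
  PathCost cost = {path.size(), 0};   // every block on the path counts
  for (const Block &bb : path)
    for (Kind k : bb.stmts)
      if (k == Kind::Assign || k == Kind::RealPhi)
        ++cost.stmts_to_copy;         // asserts and virtual PHIs are skipped
  return cost;
}
```

A limiter can then compare both fields against its thresholds. That comparison is only meaningful once the counts themselves are right, which is the point of this patch.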


This turns out to have a tiny positive impact on 69196, but the primary 
purpose of this patch is putting bits in the right place and fixing dumb 
accounting errors.  The real work to address 69196 and 68398 will come 
in follow-up patches.


The testsuite changes are totally an artifact of changing how we detect 
the actions of the jump threader.


Bootstrapped & regression tested on x86_64.  Installed on the trunk.

Now to actually fix the regressions :-)

Jeff
commit e25c808d9975556443d1bf90f968f0fd567a5de6
Author: law 
Date:   Mon Jan 25 19:19:09 2016 +

PR tree-optimization/69196
PR tree-optimization/68398
* tree-ssa-threadupdate.h (enum bb_dom_status): Moved here from
tree-ssa-threadupdate.c.
(determine_bb_domination_status): Prototype
* tree-ssa-threadupdate.c (enum bb_dom_status): Remove
(determine_bb_domination_status): No longer static.
(valid_jump_thread_path): Remove code to detect characteristics
of the jump thread path not associated with correctness.
* tree-ssa-threadbackward.c (fsm_find_control_statment_thread_paths):
Correct test for thread path length.  Count PHIs for real operands as
statements that need to be copied.  Do not count ASSERT_EXPRs.
Look at all the blocks in the thread path.  Compute and selectively
filter thread paths based on threading through the latch, threading
a multiway branch or crossing a multiway branch.

PR tree-optimization/69196
PR tree-optimization/68398
* gcc.dg/tree-ssa/pr66752-3.c: Update expected output
* gcc.dg/tree-ssa/pr68198.c: Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@232802 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 6d51578..d9d59d7 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,21 @@
+2016-01-25  Jeff Law  
+
+   PR tree-optimization/69196
+   PR tree-optimization/68398
+   * tree-ssa-threadupdate.h (enum bb_dom_status): Moved here from
+   tree-ssa-threadupdate.c.
+   (determine_bb_domination_status): Prototype
+   * tree-ssa-threadupdate.c (enum bb_dom_status): Remove
+   (determine_bb_domination_status): No longer static.
+   (valid_jump_thread_path): Remove code to detect characteristics
+   of the jump thread path not associated with correctness.
+   * tree-ssa-threadbackward.c (fsm_find_control_statment_thread_paths):
+   Correct test for thread path length.  Count PHIs for real operands as
+   statements that need to be copied.  Do not count ASSERT_EXPRs.
+   Look at all the blocks in the thread path.  Compute and selectively
+   filter thread paths based on threading through the latch, threading
+   a multiway branch or crossing a multiway branch.
+
 2016-01-25  Bill Schmidt  
 
* config/rs6000/rs6000.c (rs6000_keep_leaf_when_profiled):  Add
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 763ceac..7e5daa9 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,10 @@
+2016-01-25  Jeff Law  
+
+   PR tree-optimization/69196
+   PR tree-optimization/68398
+   * gcc.dg/tree-ssa/pr66752-3.c: Update expected output
+   * gcc.dg/tree-ssa/pr68198.c: Likewise.
+
 2016-01-25  David Edelsohn  
 
PR target/69469
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr66752-3.c b/gcc/testsuite/gcc.dg/tree-ssa/pr66752-3.c
index 1f27b1a..6eeaca5 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr66752-3.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr66752-3.c
@@ -33,9 +33,9 @@ foo (int N, int c, int b, int *a)
 }

Re: [PATCH] PR c++/69399: Add HAVE_WORKING_CXX_BUILTIN_CONSTANT_P

2016-01-25 Thread H.J. Lu

On Mon, Jan 25, 2016 at 4:40 AM, Richard Biener
 wrote:
> On Fri, Jan 22, 2016 at 7:55 PM, H.J. Lu  wrote:
>> Without the fix for PR 65656, g++ miscompiles __builtin_constant_p in
>> wi::lrshift in wide-int.h.  Add a check with PR 65656 testcase to verify
>> that C++ __builtin_constant_p works properly.
>>
>> Tested on x86-64 with working GCC:
>>
>> gcc/auto-host.h:/* #undef HAVE_WORKING_CXX_BUILTIN_CONSTANT_P */
>> prev-gcc/auto-host.h:/* #undef HAVE_WORKING_CXX_BUILTIN_CONSTANT_P */
>> stage1-gcc/auto-host.h:#define HAVE_WORKING_CXX_BUILTIN_CONSTANT_P 1
>>
>> and broken GCC:
>>
>> gcc/auto-host.h:/* #undef HAVE_WORKING_CXX_BUILTIN_CONSTANT_P */
>> prev-gcc/auto-host.h:/* #undef HAVE_WORKING_CXX_BUILTIN_CONSTANT_P */
>> stage1-gcc/auto-host.h:/* #undef HAVE_WORKING_CXX_BUILTIN_CONSTANT_P */
>>
>> Ok for trunk?
>
> I have a hard time seeing how we are "miscompiling"
>
>   if (STATIC_CONSTANT_P (xi.precision > HOST_BITS_PER_WIDE_INT)
>   ? xi.len == 1 && xi.val[0] >= 0
>   : xi.precision <= HOST_BITS_PER_WIDE_INT)
>
> anything that relies on __builtin_constant_p () for semantics is fishy so why
> is this not simply an lrshift implementation bug?
>


We hit this via:

Breakpoint 1, wi::lrshift
>, generic_wide_int > > (x=..., y=...)
at /export/gnu/import/git/sources/gcc-release/gcc/wide-int.h:2898
2898  val[0] = xi.to_uhwi () >> shift;
(gdb) bt
#0  wi::lrshift >,
generic_wide_int > > (x=..., y=...)
at /export/gnu/import/git/sources/gcc-release/gcc/wide-int.h:2898
#1  0x009e7bbe in
wi::rshift >,
generic_wide_int > > (sgn=,
y=..., x=...)
at /export/gnu/import/git/sources/gcc-release/gcc/wide-int.h:2947
#2  bit_value_binop_1 (code=code@entry=RSHIFT_EXPR,
type=type@entry=0x7fffefe82dc8, val=val@entry=0x7fffd7c0,
mask=mask@entry=0x7fffd790, r1type=0x7fffefe82dc8, r1val=...,
r1mask=..., r2type=0x7fffefd6b690, r2val=..., r2mask=...)
at /export/gnu/import/git/sources/gcc-release/gcc/tree-ssa-ccp.c:1348
#3  0x009e9e7b in bit_value_binop (code=code@entry=RSHIFT_EXPR,
type=0x7fffefe82dc8, rhs1=rhs1@entry=0x7fffefd71708, rhs2=)
at /export/gnu/import/git/sources/gcc-release/gcc/tree-ssa-ccp.c:1549
#4  0x009eb520 in evaluate_stmt (stmt=stmt@entry=0x7fffefe9a160)
at /export/gnu/import/git/sources/gcc-release/gcc/tree-ssa-ccp.c:1785
#5  0x009ec8d2 in visit_assignment (stmt=stmt@entry=0x7fffefe9a160,
output_p=output_p@entry=0x7fffdba0)
at /export/gnu/import/git/sources/gcc-release/gcc/tree-ssa-ccp.c:2258
#6  0x009ec9c2 in ccp_visit_stmt (stmt=0x7fffefe9a160,
taken_edge_p=0x7fffdba8, output_p=0x7fffdba0)
at /export/gnu/import/git/sources/gcc-release/gcc/tree-ssa-ccp.c:2336
#7  0x00a4efcf in simulate_stmt (stmt=stmt@entry=0x7fffefe9a160)
at /export/gnu/import/git/sources/gcc-release/gcc/tree-ssa-propagate.c:348
#8  0x00a50f79 in simulate_block (block=)
at /export/gnu/import/git/sources/gcc-release/gcc/tree-ssa-propagate.c:471
#9  ssa_propagate (
visit_stmt=visit_stmt@entry=0x9ec937 , visit_phi=visit_phi@entry=0x9e6aa5
)
at /export/gnu/import/git/sources/gcc-release/gcc/tree-ssa-propagate.c:888
#10 0x009e6295 in do_ssa_ccp ()
at /export/gnu/import/git/sources/gcc-release/gcc/tree-ssa-ccp.c:2382
#11 (anonymous namespace)::pass_ccp::execute (this=)
at /export/gnu/import/git/sources/gcc-release/gcc/tree-ssa-ccp.c:2415
#12 0x0089ca0c in execute_one_pass (pass=pass@entry=0x19b4bf0)
at /export/gnu/import/git/sources/gcc-release/gcc/passes.c:2330
#13 0x0089cd62 in execute_pass_list_1 (pass=0x19b4bf0)
at /export/gnu/import/git/sources/gcc-release/gcc/passes.c:2382
#14 0x0089cd7f in execute_pass_list_1 (pass=0x19b4a70,
pass@entry=0x19b48f0)
at /export/gnu/import/git/sources/gcc-release/gcc/passes.c:2383
#15 0x0089cd9c in execute_pass_list (fn=0x7fffefe98000, pass=0x19b48f0)
at /export/gnu/import/git/sources/gcc-release/gcc/passes.c:2393
#16 0x0089ba57 in do_per_function_toporder (
callback=callback@entry=0x89cd83 ,
data=0x19b48f0) at /export/gnu/import/git/sources/gcc-release/gcc/passes.c:1728
#17 0x0089d3e3 in execute_ipa_pass_list (pass=0x19b4890)
at /export/gnu/import/git/sources/gcc-release/gcc/passes.c:2736
#18 0x0066f3ac in ipa_passes ()
at /export/gnu/import/git/sources/gcc-release/gcc/cgraphunit.c:2172
#19 symbol_table::compile (this=this@entry=0x7fffefd6b000)
at

Re: [PATCH][ARM,AARCH64] target/PR68674: relayout vector_types in expand_expr

2016-01-25 Thread Christophe Lyon

On 22 January 2016 at 12:56, Richard Biener  wrote:
> On Fri, Jan 22, 2016 at 12:41 PM, Christian Bruel
>  wrote:
>>
>>
>> On 01/19/2016 04:18 PM, Richard Biener wrote:
>>>
>>> maybe just if (currently_expanding_to_rtl)?
>>>
>>> But yes, this looks like a safe variant of the fix.
>>>
>>> Richard.
>>>
>> thanks, currently_expanding_to_rtl works perfectly. So the final version.
>> I added a test for each target.
>
> Ok.
>

Hi,

This small patch is needed to make the new test pass on arm hard-float
targets (eg. arm-none-linux-gnueabihf).

I'm not sure it counts as obvious, so here it is.
OK?

Christophe.

DATE  Christophe Lyon  

* gcc.target/arm/pr68674.c: Check and use arm_fp effective target.


> Thanks,
> Richard.
>
>> bootstrapped / tested for :
>> unix/-m32/-march=i586
>> unix
>>
>> arm-qemu/
>> arm-qemu//-mfpu=neon
>> arm-qemu//-mfpu=neon-fp-armv8
>>
>> aarch64-qemu
>>
>>
>>
>>
>>
>>
>>
diff --git a/gcc/testsuite/gcc.target/arm/pr68674.c b/gcc/testsuite/gcc.target/arm/pr68674.c
index a31a88a..0b32374 100644
--- a/gcc/testsuite/gcc.target/arm/pr68674.c
+++ b/gcc/testsuite/gcc.target/arm/pr68674.c
@@ -1,7 +1,9 @@
 /* PR target/68674 */
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_neon_ok } */
-/* { dg-options "-O2 -mfloat-abi=softfp" } */
+/* { dg-require-effective-target arm_fp_ok } */
+/* { dg-options "-O2" } */
+/* { dg-add-options arm_fp } */
 
 #pragma GCC target ("fpu=vfp")

[PATCH] PR other/69006: fix extra newlines after diagnostics (v2)

2016-01-25 Thread David Malcolm

Here's an updated version of the patch.

On Wed, 2016-01-13 at 18:32 +0100, Bernd Schmidt wrote:
> On 01/13/2016 01:57 AM, David Malcolm wrote:
> > There are five places in trunk that can call diagnostic_show_locus.
>
> I'd kind of like to see before/after example output for all of these, to
> make sure that we are indeed removing only unnecessary newlines.

Here's an attempt to show all of the cases, for the 4 out of 5 meaningful
sites.  It's rather long, so by way of summary it's as if I'd hand-unrolled
these nested loops:
  for each of the 4 usage sites in the source code:
for each of "before the patch" vs "after the patch":
   for each of with, then without -fno-diagnostics-show-caret
   (i.e. first without the quoted source text, then with it).
giving 16 examples, using "VVV" and "^^^" to mark the bounds of
what I'm quoting (to make it easier to see trailing newlines).
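As a rough illustration of that hand-unrolling, the nested loops can be sketched as follows. The function name and label wording are hypothetical; only the loop structure (4 sites, before/after, with/without the flag) comes from the description above.

```cpp
#include <initializer_list>
#include <string>
#include <vector>

// Enumerate the example headings in the order the nested loops describe.
// The name enumerate_examples is made up for this sketch.
std::vector<std::string> enumerate_examples()
{
  std::vector<std::string> out;
  for (int site = 1; site <= 4; ++site)               // 4 usage sites
    for (const char *when : {"Before", "After"})      // before vs after the patch
      for (const char *caret : {"with", "without"})   // with, then without the flag
        out.push_back("USAGE SITE (" + std::to_string(site) + "): "
                      + when + ", " + caret
                      + " -fno-diagnostics-show-caret");
  return out;
}
```

The three loop levels multiply out to 4 x 2 x 2 = 16 entries, matching the count given above.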

USAGE SITE (1): in default_diagnostic_finalizer
As before, the patch updates this to remove a newline
immediately after a call to diagnostic_show_locus.
Example of use: the Go frontend, e.g. go.test/test/assign.go:

Before, with -fno-diagnostics-show-caret:

$ gccgo ../../src/gcc/testsuite/go.test/test/assign.go 
-I../x86_64-pc-linux-gnu/libgo -O  -fno-show-column  -pedantic-errors  -S  -o 
assign.s -fno-diagnostics-show-caret
../../src/gcc/testsuite/go.test/test/assign.go:41: error: assignment of 
unexported field ‘state’ in ‘sync.Mutex’ literal
../../src/gcc/testsuite/go.test/test/assign.go:41: error: assignment of 
unexported field ‘sema’ in ‘sync.Mutex’ literal
../../src/gcc/testsuite/go.test/test/assign.go:45: error: unknown field ‘key’ 
in ‘sync.Mutex’


Before, without -fno-diagnostics-show-caret:

$ gccgo ../../src/gcc/testsuite/go.test/test/assign.go 
-I../x86_64-pc-linux-gnu/libgo -O  -fno-show-column  -pedantic-errors  -S  -o 
assign.s
../../src/gcc/testsuite/go.test/test/assign.go:41: error: assignment of 
unexported field ‘state’ in ‘sync.Mutex’ literal
   x := sync.Mutex{0, 0} // ERROR "assignment.*Mutex"
 ^

../../src/gcc/testsuite/go.test/test/assign.go:41: error: assignment of 
unexported field ‘sema’ in ‘sync.Mutex’ literal
../../src/gcc/testsuite/go.test/test/assign.go:45: error: unknown field ‘key’ 
in ‘sync.Mutex’
   x := sync.Mutex{key: 0} // ERROR "(unknown|assignment).*Mutex"
   ^


(note the erroneous trailing blank lines after the caret lines)

After, with -fno-diagnostics-show-caret:

$ ./gccgo -B. ../../src/gcc/testsuite/go.test/test/assign.go 
-I../x86_64-pc-linux-gnu/libgo -O  -fno-show-column  -pedantic-errors  -S  -o 
assign.s -fno-diagnostics-show-caret
../../src/gcc/testsuite/go.test/test/assign.go:41: error: assignment of 
unexported field ‘state’ in ‘sync.Mutex’ literal
../../src/gcc/testsuite/go.test/test/assign.go:41: error: assignment of 
unexported field ‘sema’ in ‘sync.Mutex’ literal
../../src/gcc/testsuite/go.test/test/assign.go:45: error: unknown field ‘key’ 
in ‘sync.Mutex’

(i.e. unchanged)

After, without -fno-diagnostics-show-caret:

$ ./gccgo -B. ../../src/gcc/testsuite/go.test/test/assign.go 
-I../x86_64-pc-linux-gnu/libgo -O  -fno-show-column  -pedantic-errors  -S  -o 
assign.s
../../src/gcc/testsuite/go.test/test/assign.go:41: error: assignment of 
unexported field ‘state’ in ‘sync.Mutex’ literal
   x := sync.Mutex{0, 0} // ERROR "assignment.*Mutex"
 ^
../../src/gcc/testsuite/go.test/test/assign.go:41: error: assignment of 
unexported field ‘sema’ in ‘sync.Mutex’ literal
../../src/gcc/testsuite/go.test/test/assign.go:45: error: unknown field ‘key’ 
in ‘sync.Mutex’
   x := sync.Mutex{key: 0} // ERROR "(unknown|assignment).*Mutex"
   ^

(i.e. fixing the erroneous trailing blank lines after the caret lines)

USAGE SITE (2): c_diagnostic_finalizer:
Likewise.
Example of use: C frontend, e.g. gcc.dg/2003-1.c

Before, with -fno-diagnostics-show-caret:

$ gcc

RE: [PATCH] Skip re-computing the mips frame info after reload completed

2016-01-25 Thread Matthew Fortune

Bernd Edlinger  writes:
> Matthew Fortune  writes:
> > Has the patch been tested beyond just building GCC? I can do a
> > test run for you if you don't have things set up to do one yourself.
> 
> I built a cross-gcc with all languages and a cross-glibc, but I have
> not set up an emulation environment, so if you could give it a test
> that would be highly welcome.

mipsel-linux-gnu test results are the same before and after this patch.

Please go ahead and commit.

Thanks,
Matthew

Re: C++ PATCH for c++/69379 (ICE with PTRMEM_CST wrapped in NOP_EXPR)

2016-01-25 Thread Marek Polacek

On Mon, Jan 25, 2016 at 10:08:34AM -0500, Jason Merrill wrote:
> On 01/22/2016 05:07 PM, Marek Polacek wrote:
> >On Fri, Jan 22, 2016 at 03:38:26PM -0500, Jason Merrill wrote:
> >>If we have a NOP_EXPR to the same type, we should strip it here.
> >
> >This helps for the unreduced testcases in the PR, but not for the reduced 
> >one,
> >because for the reduced one, the types are not the same.  One type is
> >struct
> >{
> >   void Dict:: (struct Dict *, T) * __pfn;
> >   long int __delta;
> >}
> >and the second one
> >struct
> >{
> >   void Dict:: (struct Dict *) * __pfn;
> >   long int __delta;
> >}
> >
> >The NOP_EXPR in this case originated in build_reinterpret_cast_1:
> >7070   else if ((TYPE_PTRFN_P (type) && TYPE_PTRFN_P (intype))
> >7071|| (TYPE_PTRMEMFUNC_P (type) && TYPE_PTRMEMFUNC_P (intype)))
> >7072 return build_nop (type, expr);
> 
> Well, a reinterpret_cast makes the expression non-constant, so we can
> recognize that case (when the types are unrelated) and bail out.  After that
> we probably still need to deal with the case of conversion to a
> pointer-to-member-of-base type; for functions it looks like we can just copy
> the PTRMEM_CST and give it a different type, but for data members I think
> we'll need to add support for the type not matching the member in
> expand_ptrmem_cst.

It appears that handling the case when the types don't match is sufficient; at
least all the tests pass, so the following should be enough.

If you want me to take care of the rest then please let me know, though without
a testcase it might be harder to get right.
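For readers following along, here is a minimal stand-alone sketch of the two situations discussed above. The typedef names Plain and WithArg and the helper round_trips are made up for illustration and are not part of the patch or the testcase.

```cpp
struct Dict {
  void m_fn2() {}
};

typedef int T;
typedef void (Dict::*Plain)();     // the member function's real type
typedef void (Dict::*WithArg)(T);  // unrelated pointer-to-member-function type

// Same type: the address of the member is usable in a constant expression.
constexpr Plain same_type = &Dict::m_fn2;

// Different type: the conversion needs reinterpret_cast, which is not allowed
// in a constant expression, so it is done at run time here.  Declaring
//   constexpr WithArg bad = reinterpret_cast<WithArg>(&Dict::m_fn2);
// is expected to be rejected.
inline bool round_trips()
{
  WithArg other = reinterpret_cast<WithArg>(&Dict::m_fn2);
  // Casting back to the original type yields the original value.
  return reinterpret_cast<Plain>(other) == same_type;
}
```

The first case corresponds to the NOP_EXPR that can simply be stripped; the second corresponds to the case where the evaluation is marked non-constant.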

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2016-01-25  Marek Polacek  

PR c++/69379
* constexpr.c (cxx_eval_constant_expression): Handle PTRMEM_CSTs
wrapped in NOP_EXPRs.

* g++.dg/pr69379.C: New test.

diff --git gcc/cp/constexpr.c gcc/cp/constexpr.c
index 6b0e5a8..4b952d1 100644
--- gcc/cp/constexpr.c
+++ gcc/cp/constexpr.c
@@ -3619,6 +3619,20 @@ cxx_eval_constant_expression (const constexpr_ctx *ctx, 
tree t,
if (TREE_CODE (op) == PTRMEM_CST
&& !TYPE_PTRMEM_P (type))
  op = cplus_expand_constant (op);
+   if (TREE_CODE (op) == PTRMEM_CST && tcode == NOP_EXPR)
+ {
+   if (same_type_ignoring_top_level_qualifiers_p (type,
+  TREE_TYPE (op)))
+ STRIP_NOPS (t);
+   else
+ {
+   if (!ctx->quiet)
+ error_at (EXPR_LOC_OR_LOC (t, input_location),
+   "reinterpret_cast has different types");
+   *non_constant_p = true;
+   return t;
+ }
+ }
if (POINTER_TYPE_P (type)
&& TREE_CODE (op) == INTEGER_CST
&& !integer_zerop (op))
diff --git gcc/testsuite/g++.dg/pr69379.C gcc/testsuite/g++.dg/pr69379.C
index e69de29..249ad00 100644
--- gcc/testsuite/g++.dg/pr69379.C
+++ gcc/testsuite/g++.dg/pr69379.C
@@ -0,0 +1,20 @@
+// PR c++/69379
+// { dg-do compile }
+// { dg-options "-Wformat" }
+
+typedef int T;
+class A {
+public:
+  template <typename D> A(const char *, D);
+  template <typename Fn, typename A1, typename A2>
+  void m_fn1(const char *, Fn, A1 const &, A2);
+};
+struct Dict {
+  void m_fn2();
+};
+void fn1() {
+  A a("", "");
+  typedef void *Get;
+  typedef void (Dict::*d)(T);
+  a.m_fn1("", Get(), d(&Dict::m_fn2), "");
+}


Marek

Re: [PATCH 4/4][AArch64] Cost CCMP instruction sequences to choose better expand order

2016-01-25 Thread Wilco Dijkstra

Andreas Schwab  wrote:

> FAIL: gcc.target/aarch64/ccmp_1.c scan-assembler-times \tcmp\tw[0-9]+, 0 4
> FAIL: gcc.target/aarch64/ccmp_1.c scan-assembler adds\t
> FAIL: gcc.target/aarch64/ccmp_1.c scan-assembler-times fccmpe\t.*0\\.0 1

Yes I noticed those too, and here is the fix. Richard's recent change added 
UNSPEC to the CCMP patterns to stop combine optimizing the CCMP CCmode 
immediate in a rare case. This requires a change to the CCMP cost calculation 
as the CCMP instruction with unspec is no longer recognized.

Fix the ccmp_1.c test to allow both '0' and 'wzr' on cmp - BTW is there a 
regular expression that correctly implements (0|xzr)? If I use that the test 
still fails somehow but \[0wzr\]+ works fine... Is the correct syntax 
documented somewhere?
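As a side note on the two patterns, they do not accept the same strings. The sketch below uses std::regex (ECMAScript syntax), not the Tcl regexp engine that DejaGnu uses, so it only illustrates the difference in meaning and says nothing about why the alternation failed in the testsuite. The helper names are made up. In a dg-final directive the brackets additionally need backslashes, since Tcl would otherwise treat them as command substitution.

```cpp
#include <regex>
#include <string>

// True if the whole operand is exactly "0" or "wzr" (non-capturing alternation).
inline bool is_zero_operand(const std::string &s)
{
  static const std::regex alt("(?:0|wzr)");
  return std::regex_match(s, alt);
}

// True if the whole operand is one or more of the characters 0, w, z, r,
// in any order; this is the looser form used in the patch.
inline bool matches_char_class(const std::string &s)
{
  static const std::regex cls("[0wzr]+");
  return std::regex_match(s, cls);
}
```

The bracket form also accepts strings such as "w0" or "zzz", which is harmless here because the assembler output only ever contains "0" or "wzr" in that position.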

Finally to ensure FCCMPE is emitted on relational compares, add 
-ffinite-math-only.

ChangeLog:
2016-01-25  Wilco Dijkstra  

gcc/
* config/aarch64/aarch64.c (aarch64_if_then_else_costs):
Remove CONST_INT_P check in CCMP cost calculation.

gcc/testsuite/
* gcc.target/aarch64/ccmp_1.c: Fix test issues.

---
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
6c570c7db1cfbd0415e73fb110ce5d70aa09b540..7f304b78a3e48862bf5aaf855e307fe90969dd8c
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -6014,7 +6014,7 @@ aarch64_if_then_else_costs (rtx op0, rtx op1, rtx op2, 
int *cost, bool speed)
   else if (GET_MODE_CLASS (GET_MODE (inner)) == MODE_CC)
 {
   /* CCMP.  */
-  if ((GET_CODE (op1) == COMPARE) && CONST_INT_P (op2))
+  if (GET_CODE (op1) == COMPARE)
{
  /* Increase cost of CCMP reg, 0, imm, CC to prefer CMP reg, 0.  */
  if (XEXP (op1, 1) == const0_rtx)
diff --git a/gcc/testsuite/gcc.target/aarch64/ccmp_1.c 
b/gcc/testsuite/gcc.target/aarch64/ccmp_1.c
index 
7c39b61a585a1d4d662b0736e1c80e06bdc6b4ce..8e3f8629f802eec64c95080a23f320712333471b
 100644
--- a/gcc/testsuite/gcc.target/aarch64/ccmp_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/ccmp_1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -ffinite-math-only" } */
 
 int
 f1 (int a)
@@ -85,7 +85,7 @@ f13 (int a, int b)
 /* { dg-final { scan-assembler "cmp\t(.)+34" } } */
 /* { dg-final { scan-assembler "cmp\t(.)+35" } } */
 
-/* { dg-final { scan-assembler-times "\tcmp\tw\[0-9\]+, 0" 4 } } */
+/* { dg-final { scan-assembler-times "\tcmp\tw\[0-9\]+, \[0wzr\]+" 4 } } */
 /* { dg-final { scan-assembler-times "fcmpe\t(.)+0\\.0" 2 } } */
 /* { dg-final { scan-assembler-times "fcmp\t(.)+0\\.0" 2 } } */

[PATCH] Handle -fsanitize=* in lto-wrapper (PR lto/69254)

2016-01-25 Thread Jakub Jelinek

Hi!

Here is an attempt to handle -f{,no-}sanitize= options in LTO wrapper.
In addition to that I've noticed ICEs e.g. if some OpenMP code is compiled
with -c -flto -fopenmp, but final link is -fno-openmp, similarly for
openacc, -fcilkplus is similar but used to be handled even less.

The intended behavior for -f{,no-}sanitize= is that for the ubsan
sanitizers which are typically lowered before IPA, but are often using
builtins that need initialization even at the LTO level, we collect
from each TU info on whether any ubsan sanitizers have been enabled
(note, this needs parsing of the options, because we can e.g. have
-fsanitize=shift,return -fno-sanitize=undefined
-fsanitize=integer-divide-by-zero) and turn that into -fsanitize=shift from
all the TUs if any of them needed any (randomly chosen sanitizer that is
handled by FEs only).
For address or thread sanitizers, which are handled solely post IPA,
the choice whether to sanitize is left to the linker command line.
And finally we need to ensure that e.g. -fno-sanitize=address,shift
doesn't turn off the ubsan sanitizers.
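To make the option-merging rule concrete, here is a small sketch of parsing a sequence of -fsanitize=/-fno-sanitize= lists in order and asking whether any ubsan check is still enabled at the end. The bit values, the reduced set of names, and the function names are assumptions for illustration; GCC's real SANITIZE_* flags and parse_sanitizer_options differ.

```cpp
#include <string>

// Hypothetical bit assignments; not GCC's real SANITIZE_* values.
enum : unsigned {
  SAN_ADDRESS = 1u << 0,
  SAN_SHIFT   = 1u << 1,
  SAN_DIVIDE  = 1u << 2,
  SAN_RETURN  = 1u << 3,
  // "undefined" is modelled as the group of ubsan checks in this sketch.
  SAN_UNDEFINED = SAN_SHIFT | SAN_DIVIDE | SAN_RETURN
};

inline unsigned flag_for(const std::string &name)
{
  if (name == "address") return SAN_ADDRESS;
  if (name == "shift") return SAN_SHIFT;
  if (name == "integer-divide-by-zero") return SAN_DIVIDE;
  if (name == "return") return SAN_RETURN;
  if (name == "undefined") return SAN_UNDEFINED;
  return 0;  // unknown names are silently ignored in this sketch
}

// Apply one -fsanitize=LIST (enable == true) or -fno-sanitize=LIST
// (enable == false) to the flags accumulated so far.
inline unsigned apply_option(unsigned flags, const std::string &list, bool enable)
{
  std::string::size_type start = 0;
  while (start <= list.size()) {
    std::string::size_type comma = list.find(',', start);
    if (comma == std::string::npos)
      comma = list.size();
    unsigned bit = flag_for(list.substr(start, comma - start));
    flags = enable ? (flags | bit) : (flags & ~bit);
    start = comma + 1;
  }
  return flags;
}

// True if at least one ubsan check is still enabled.
inline bool any_ubsan(unsigned flags)
{
  return (flags & SAN_UNDEFINED) != 0;
}
```

With the example from the text, applying "shift,return" (enable), then "undefined" (disable), then "integer-divide-by-zero" (enable) leaves only the divide check set, so that TU would still count as needing ubsan.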

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-01-25  Jakub Jelinek  

PR lto/69254
* opts.h (parse_sanitizer_options): New prototype.
* opts.c (sanitizer_opts): New array.
(parse_sanitizer_options): New function.
(common_handle_option): Use parse_sanitizer_options.
* lto-opts.c (lto_write_options): Write also -f{,no-}sanitize=
options.
* lto-wrapper.c (sanitize_shift_decoded_opt): New function.
(merge_and_complain): Determine if any -fsanitize= options
enabled at the end any undefined behavior sanitizers, and
append -fsanitize=shift if needed.  Handle -fcilkplus.
(append_compiler_options): Handle -fcilkplus and -fsanitize=.
(append_linker_options): Ignore -fno-{openmp,openacc,cilkplus}.
(find_and_merge_options): Canonicalize -fsanitize= options.
(run_gcc): Append -fsanitize=shift if compiler options set it
and linker options might override it.

--- gcc/opts.h.jj   2016-01-23 00:13:00.714017906 +0100
+++ gcc/opts.h  2016-01-25 14:06:31.833127411 +0100
@@ -372,6 +372,8 @@ extern void control_warning_option (unsi
 extern char *write_langs (unsigned int mask);
 extern void print_ignored_options (void);
 extern void handle_common_deferred_options (void);
+unsigned int parse_sanitizer_options (const char *, location_t, int,
+ unsigned int, int, bool);
 extern bool common_handle_option (struct gcc_options *opts,
  struct gcc_options *opts_set,
  const struct cl_decoded_option *decoded,
--- gcc/opts.c.jj   2016-01-23 00:13:00.662018617 +0100
+++ gcc/opts.c  2016-01-25 14:06:31.834127398 +0100
@@ -1433,6 +1433,104 @@ enable_fdo_optimizations (struct gcc_opt
 opts->x_flag_tree_loop_distribute_patterns = value;
 }
 
+/* -f{,no-}sanitize{,-recover}= suboptions.  */
+static const struct sanitizer_opts_s
+{
+  const char *const name;
+  unsigned int flag;
+  size_t len;
+} sanitizer_opts[] =
+{
+#define SANITIZER_OPT(name, flags) { #name, flags, sizeof #name - 1 }
+  SANITIZER_OPT (address, SANITIZE_ADDRESS | SANITIZE_USER_ADDRESS),
+  SANITIZER_OPT (kernel-address, SANITIZE_ADDRESS | SANITIZE_KERNEL_ADDRESS),
+  SANITIZER_OPT (thread, SANITIZE_THREAD),
+  SANITIZER_OPT (leak, SANITIZE_LEAK),
+  SANITIZER_OPT (shift, SANITIZE_SHIFT),
+  SANITIZER_OPT (integer-divide-by-zero, SANITIZE_DIVIDE),
+  SANITIZER_OPT (undefined, SANITIZE_UNDEFINED),
+  SANITIZER_OPT (unreachable, SANITIZE_UNREACHABLE),
+  SANITIZER_OPT (vla-bound, SANITIZE_VLA),
+  SANITIZER_OPT (return, SANITIZE_RETURN),
+  SANITIZER_OPT (null, SANITIZE_NULL),
+  SANITIZER_OPT (signed-integer-overflow, SANITIZE_SI_OVERFLOW),
+  SANITIZER_OPT (bool, SANITIZE_BOOL),
+  SANITIZER_OPT (enum, SANITIZE_ENUM),
+  SANITIZER_OPT (float-divide-by-zero, SANITIZE_FLOAT_DIVIDE),
+  SANITIZER_OPT (float-cast-overflow, SANITIZE_FLOAT_CAST),
+  SANITIZER_OPT (bounds, SANITIZE_BOUNDS),
+  SANITIZER_OPT (bounds-strict, SANITIZE_BOUNDS | SANITIZE_BOUNDS_STRICT),
+  SANITIZER_OPT (alignment, SANITIZE_ALIGNMENT),
+  SANITIZER_OPT (nonnull-attribute, SANITIZE_NONNULL_ATTRIBUTE),
+  SANITIZER_OPT (returns-nonnull-attribute, 
SANITIZE_RETURNS_NONNULL_ATTRIBUTE),
+  SANITIZER_OPT (object-size, SANITIZE_OBJECT_SIZE),
+  SANITIZER_OPT (vptr, SANITIZE_VPTR),
+  SANITIZER_OPT (all, ~0),
+#undef SANITIZER_OPT
+  { NULL, 0, 0 }
+};
+
+/* Parse comma separated sanitizer suboptions from P for option SCODE,
+   adjust previous FLAGS and return new ones.  If COMPLAIN is false,
+   don't issue diagnostics.  */
+
+unsigned int
+parse_sanitizer_options (const char *p, location_t loc, int scode,
+unsigned int flags, int value, bool complain)
+{
+  enum opt_code code = (enum opt_code) scode;
+  while (*p != 0)
+{
+

Re: [PATCH][ARM] Fix PR target/69245 Rewrite arm_set_current_function

2016-01-25 Thread Kyrill Tkachov



On 22/01/16 14:51, Christian Bruel wrote:

Hi Kyrill,

On 01/22/2016 03:17 PM, Kyrill Tkachov wrote:

Hi Christian,

On 22/01/16 14:07, Christian Bruel wrote:

Hi Kyrill,

On 01/21/2016 01:22 PM, Kyrill Tkachov wrote:

Hi Christian,

On 21/01/16 10:36, Christian Bruel wrote:
The current arm_set_current_function was both awkward and buggy. For instance,
it used a partially set TARGET_OPTION from pragma_parse, while neither
restore_target_globals nor arm_option_params_internal was reset. Another issue
is that in some paths target_reinit was not called because of an old cached
target_option_current_node value. For instance, with

foo{}
#pragma GCC target ...

foo was called with global_options set from the old GCC target (which was wrong)
and correct rtl values.

This is a reimplementation of the function, hopefully easier to read (and
maintain). It solves the issues seen so far.

regtested for arm-linux-gnueabi -mfpu=vfp, -mfpu=neon,-mfpu=neon-fp-armv8


Thanks for the patch, I'll try it out.
In the meantime there's a couple of style and typo nits...

+  /* Make sure that target_reinit is called for next function, since
+ TREE_TARGET_OPTION might change with the #pragma even if there are
+ no target attribute attached to the function.  */

s/attribute/attributes

-  arm_previous_fndecl = fndecl;
+  /* if no attribute, use the mode set by the current pragma target.  */
+  if (! new_tree)
+new_tree = target_option_current_node;
+

s/if/If/

+  /* now target_reinit can save the state for later. */
+  TREE_TARGET_GLOBALS (new_tree)
+= save_target_globals_default_opts ();

s/now/Now/


While playing on my side, I realized that we could simplify the patch further
by removing the need to set and use target_option_current_node, since this is
redundant with what handle_pragma_push/pop_options does.
Also, since the functions inside a pragma GCC target region will have
DECL_FUNCTION_SPECIFIC_TARGET set already, we don't seem to need a special case
for those.

With this V2, arm_set_current_function becomes more minimalist and still
fixes the current issues. Could you test this version instead?


Thanks, I'll check this out instead.
I've played a bit with your previous version and the effect on the testcases 
looked ok, but I have a couple of
comments on the testcase in the meantime

Index: gcc/testsuite/gcc.target/arm/pr69245.c
===
--- gcc/testsuite/gcc.target/arm/pr69245.c(revision 0)
+++ gcc/testsuite/gcc.target/arm/pr69245.c(working copy)
@@ -0,0 +1,24 @@
+/* PR target/69245 */
+/* Test that pop_options restores the vfp fpu mode.  */
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_fp_ok } */
+/* { dg-add-options arm_fp } */
+
+#pragma GCC target "fpu=vfp"
+
+#pragma GCC push_options
+#pragma GCC target "fpu=neon"
+int a, c, d;
+float b;
+static int fn1 ()
+{
+  return 0;
+}
+#pragma GCC pop_options
+
+void fn2 ()
+{
+  d = b * c + a;
+}
+
+/* { dg-final { scan-assembler-times "\.fpu vfp" 1 } } */


PR 69245 is an ICE, whereas your testcase doesn't ICE without the patch; it just
fails the scan-assembler check. I'd like to have the testcase trigger the ICE
without your patch.
For that we need -O2 in dg-options.
Also, the "fpu=vfp" pragma you put at the beginning doesn't allow the ICE to
trigger, presumably because it triggers a different path through the pragma
option popping code.
Removing that pragma and instead changing the dg-add-options from arm_fp to
arm_vfp3 (which is floating-point without the vfma instruction) triggers the
ICE for me.
The "fpu=neon" pragma should also be changed to "fpu=neon-vfpv4", because that
setting allows the vfma instruction which is being wrongly considered in fn2().
I suppose you'll then want to change the scan-assembler directive to look for
\.fpu vfp3.


ah yes ! OK for -O2, I thought I had it, must have been deleted somewhere :-(

I added the #pragma GCC target "fpu=vfp" to have some kind of deterministic
check to guard against the option permutations that Christophe stresses during
his validations. Otherwise, for instance, the ".fpu" scan-assembler would change
depending on the default options...


so the following test should ICE with all configurations
(!-mfloat-abi=soft) at -O2

#pragma GCC target "fpu=vfp"

#pragma GCC push_options
#pragma GCC target "fpu=neon-vfpv4"
int a, c, d;
float b;
static int fn1 ()
{
  return 0;
}

#pragma GCC pop_options
void fn2 ()
{
  d = b * c + a;
}




Ah ok, I needed to update my tree to include your other midend fixes in this 
area.
I played around with the patch and gave it a bootstrap as well.
I wanted to make a sanity check on compile-time performance for files using 
arm_neon.h
and I didn't spot any measurable regressions.

So this is ok for trunk with the testcase changed as discussed above and using 
-O2
optimisation level and with a couple comment fixes below.

-  arm_previous_fndecl =

Re: [PATCH, committed][gcc-5-branch] Fix broken test case derived_constructor_comps_6.f90

2016-01-25 Thread Peter Bergner

On Mon, 2016-01-25 at 18:51 +0100, Paul Richard Thomas wrote:
> On 25 January 2016 at 18:33, Peter Bergner  wrote:
>> I'll leave it to you or someone else to fix the -m32 bug, since
>> I'm not seeing it on my system.
>
> Neither am I :-(

I have a powerpc64-linux (ie, BE) system I can jump on that does
support -m32.  I'll see if I can't recreate the -m32 segv Dominique
was seeing.

Peter

Re: [PING^2][PATCHv2, ARM, libgcc] New aeabi_idiv function for armv6-m

2016-01-25 Thread Andre Vieira (lists)


Ping.

On 27/10/15 17:03, Andre Vieira wrote:

Ping.

BR,
Andre

On 13/10/15 18:01, Andre Vieira wrote:

This patch ports the aeabi_idiv routine from Linaro Cortex-Strings
(https://git.linaro.org/toolchain/cortex-strings.git), which was
contributed by ARM under Free BSD license.

The new aeabi_idiv routine is used to replace the one in
libgcc/config/arm/lib1funcs.S. This replacement happens within the
Thumb1 wrapper. The new routine is under LGPLv3 license.

The main advantage of this version is that it can improve the
performance of the aeabi_idiv function for Thumb1. This solution will
also increase the code size. So it will only be used if
__OPTIMIZE_SIZE__ is not defined.
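For context, armv6-m has no hardware divide instruction, so integer division is done in software. The sketch below is a plain shift-and-subtract division, shown only to illustrate what such a routine has to compute. It is not the Linaro Cortex-Strings routine and not the existing lib1funcs.S code, and the function names are made up. GCC predefines __OPTIMIZE_SIZE__ under -Os, which is the macro the paragraph above refers to.

```cpp
#include <cstdint>

// Restoring (shift-and-subtract) unsigned division.  The divisor must be
// non-zero; division by zero is not handled in this sketch.
inline std::uint32_t udiv_shift_subtract(std::uint32_t n, std::uint32_t d)
{
  std::uint32_t q = 0, r = 0;
  for (int bit = 31; bit >= 0; --bit) {
    r = (r << 1) | ((n >> bit) & 1u);  // bring down the next dividend bit
    if (r >= d) {                      // divisor fits: subtract, set quotient bit
      r -= d;
      q |= (1u << bit);
    }
  }
  return q;
}

// Signed division truncating toward zero, built on the unsigned helper.
inline std::int32_t idiv_sketch(std::int32_t n, std::int32_t d)
{
  std::uint32_t un = n < 0 ? 0u - static_cast<std::uint32_t>(n)
                           : static_cast<std::uint32_t>(n);
  std::uint32_t ud = d < 0 ? 0u - static_cast<std::uint32_t>(d)
                           : static_cast<std::uint32_t>(d);
  std::uint32_t uq = udiv_shift_subtract(un, ud);
  bool negative = (n < 0) != (d < 0);
  // Negate in unsigned arithmetic to avoid signed overflow.
  return static_cast<std::int32_t>(negative ? 0u - uq : uq);
}
```

A loop like this is compact but takes 32 iterations regardless of the operands; faster routines typically trade code size for speed, which is the trade-off described above.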

Make check passed for armv6-m.

libgcc/ChangeLog:
2015-08-10  Hale Wang  
  Andre Vieira  

* config/arm/lib1funcs.S: Add new wrapper.

Re: [aarch64] Improve TImode constant moves

2016-01-25 Thread Richard Henderson

On 01/25/2016 01:32 AM, Kyrill Tkachov wrote:
> +case CONST_WIDE_INT:
> +  *cost = 0;
> +  for (unsigned int n = CONST_WIDE_INT_NUNITS(x), i = 0; i < n; ++i)
> +{
> +  unsigned HOST_WIDE_INT e = CONST_WIDE_INT_ELT(x, i);
> +  if (e != 0)
> +*cost += COSTS_N_INSNS (aarch64_internal_mov_immediate
> +(NULL_RTX, GEN_INT (e), false, DImode));
> +}
> +  return true;
> +
> 
> We usually avoid creating intermediate rtxes in the cost function because
> it can potentially be called many times during compilation and we want to 
> avoid
> creating too many short-lived objects, though I suppose there's no way getting
> around this one (the GEN_INT call).

Well, it's only aarch64_internal_mov_immediate -- we could change the interface
to provide the HOST_WIDE_INT value directly.

But that was more than I wanted to do for enabling splittable TImode constants.
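A rough sketch of what such an interface could look like when the cost helper takes the 64-bit value directly, so no intermediate rtx is created. The per-element cost used here (one instruction per non-zero 16-bit chunk) is a deliberate simplification and a stand-in for aarch64_internal_mov_immediate, which handles more cases; both function names are hypothetical, and the result is an instruction count rather than COSTS_N_INSNS units.

```cpp
#include <cstdint>
#include <vector>

// Simplified stand-in for the immediate-move instruction count: one
// MOVZ/MOVK per non-zero 16-bit chunk, and at least one instruction.
inline int mov_insns_for(std::uint64_t value)
{
  int n = 0;
  for (int shift = 0; shift < 64; shift += 16)
    if ((value >> shift) & 0xffffu)
      ++n;
  return n == 0 ? 1 : n;
}

// Sum the cost over the 64-bit elements of a wide constant, skipping zero
// elements as the quoted loop does.
inline int wide_const_cost(const std::vector<std::uint64_t> &elts)
{
  int cost = 0;
  for (std::uint64_t e : elts)
    if (e != 0)
      cost += mov_insns_for(e);
  return cost;
}
```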


r~

[PATCH, gcc7, aarch64] Add arithmetic overflow patterns

2016-01-25 Thread Richard Henderson

After having just spent a few days looking through dumps of
builtin-overflow-*.c for regressions while testing the patch for the TImode
arithmetic PR, I thought I'd go ahead and post a patch to make use of the
overflow bit on aarch64.

Consider this queued for stage1.
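For reference, the user-visible construct these patterns serve is the checked-arithmetic builtins. A minimal sketch of the kind of source that exercises them follows; the wrapper names are made up, and __builtin_add_overflow is the existing GCC builtin.

```cpp
#include <cstdint>

// Returns true if a + b overflows int64_t; *sum receives the wrapped result.
inline bool add_overflows(std::int64_t a, std::int64_t b, std::int64_t *sum)
{
  return __builtin_add_overflow(a, b, sum);
}

// Unsigned variant: overflow here means a carry out of bit 63.
inline bool uadd_overflows(std::uint64_t a, std::uint64_t b, std::uint64_t *sum)
{
  return __builtin_add_overflow(a, b, sum);
}
```

With overflow patterns available, the signed case can branch on the V flag and the unsigned case on the C flag directly after the add, instead of recomputing the condition with extra instructions.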



r~
* config/aarch64/aarch64-modes.def (CC_V): New.
* config/aarch64/aarch64.c (aarch64_zero_extend_const_eq): New.
(aarch64_select_cc_mode): Test for signed overflow using CC_Vmode.
(aarch64_get_condition_code_1): Handle CC_Vmode.
* config/aarch64/aarch64-protos.h: Update.
* config/aarch64/aarch64.md (addv4, uaddv4): New.
(addti3): Create simpler code if low part is already known to be 0.
(addvti4, uaddvti4): New.
(*add3_compareC_cconly_imm): New.
(*add3_compareC_cconly): New.
(*add3_compareC_imm): New.
(*add3_compareC): Rename from add3_compare1; do not
handle constants within this pattern.
(*add3_compareV_cconly_imm): New.
(*add3_compareV_cconly): New.
(*add3_compareV_imm): New.
(add3_compareV): New.
(add3_carryinC, add3_carryinV): New.
(*add3_carryinC_zero, *add3_carryinV_zero): New.
(*add3_carryinC, *add3_carryinV): New.
(subv4, usubv4): New.
(subti): Handle op1 zero.
(subvti4, usub4ti4): New.
(*sub3_compare1_imm): New.
(sub3_carryinCV): New.
(*sub3_carryinCV_z1_z2, *sub3_carryinCV_z1): New.
(*sub3_carryinCV_z2, *sub3_carryinCV): New.


diff --git a/gcc/config/aarch64/aarch64-modes.def 
b/gcc/config/aarch64/aarch64-modes.def
index 7de0b3f..f34345a 100644
--- a/gcc/config/aarch64/aarch64-modes.def
+++ b/gcc/config/aarch64/aarch64-modes.def
@@ -26,6 +26,7 @@ CC_MODE (CC_SESWP); /* sign-extend LHS (but swap to make it 
RHS).  */
 CC_MODE (CC_NZ);/* Only N and Z bits of condition flags are valid.  */
 CC_MODE (CC_Z); /* Only Z bit of condition flags is valid.  */
 CC_MODE (CC_C); /* Only C bit of condition flags is valid.  */
+CC_MODE (CC_V); /* Only V bit of condition flags is valid.  */
 
 /* Half-precision floating point for __fp16.  */
 FLOAT_MODE (HF, 2, 0);
diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index 15fc37d..32cf245 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -289,6 +289,7 @@ void aarch64_declare_function_name (FILE *, const char*, 
tree);
 bool aarch64_legitimate_pic_operand_p (rtx);
 bool aarch64_modes_tieable_p (machine_mode mode1,
  machine_mode mode2);
+bool aarch64_zero_extend_const_eq (machine_mode, rtx, machine_mode, rtx);
 bool aarch64_move_imm (HOST_WIDE_INT, machine_mode);
 bool aarch64_mov_operand_p (rtx, machine_mode);
 int aarch64_simd_attr_length_rglist (enum machine_mode);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 0c18ab2..191d081 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1489,6 +1489,16 @@ aarch64_split_simd_move (rtx dst, rtx src)
 }
 }
 
+bool
+aarch64_zero_extend_const_eq (machine_mode xmode, rtx x,
+ machine_mode ymode, rtx y)
+{
+  rtx r = simplify_const_unary_operation (ZERO_EXTEND, xmode, y, ymode);
+  gcc_assert (r != NULL);
+  return rtx_equal_p (x, r);
+}
+ 
+
 static rtx
 aarch64_force_temporary (machine_mode mode, rtx x, rtx value)
 {
@@ -4192,6 +4202,13 @@ aarch64_select_cc_mode (RTX_CODE code, rtx x, rtx y)
   && GET_CODE (y) == ZERO_EXTEND)
 return CC_Cmode;
 
+  /* A test for signed overflow.  */
+  if ((GET_MODE (x) == DImode || GET_MODE (x) == TImode)
+  && code == NE
+  && GET_CODE (x) == PLUS
+  && GET_CODE (y) == SIGN_EXTEND)
+return CC_Vmode;
+
   /* For everything else, return CCmode.  */
   return CCmode;
 }
@@ -4300,6 +4317,15 @@ aarch64_get_condition_code_1 (enum machine_mode mode, 
enum rtx_code comp_code)
}
   break;
 
+case CC_Vmode:
+  switch (comp_code)
+   {
+   case NE: return AARCH64_VS;
+   case EQ: return AARCH64_VC;
+   default: return -1;
+   }
+  break;
+
 default:
   return -1;
   break;
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 363785e..46056f2 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1703,22 +1703,150 @@
   }
 )
 
+(define_expand "addv4"
+  [(match_operand:GPI 0 "register_operand")
+   (match_operand:GPI 1 "register_operand")
+   (match_operand:GPI 2 "register_operand")
+   (match_operand 3 "")]
+  ""
+{
+  emit_insn (gen_add3_compareV (operands[0], operands[1], operands[2]));
+
+  rtx x;
+  x = gen_rtx_NE (VOIDmode, gen_rtx_REG (CC_Vmode, CC_REGNUM), const0_rtx);
+  x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
+   gen_rtx_LABEL_REF (VOIDmode, operands[3]),
+   pc_rtx);
+  emit_jump_insn

Re: [PATCH] fix #69317 - [6 regression] wrong ABI version in -Wabi warnings

2016-01-25 Thread Martin Sebor


Ping: I'm looking for a review/approval of the almost trivial patch
below:

   https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01206.html

On 01/16/2016 05:42 PM, Martin Sebor wrote:

While adding an ABI warning in the patch for bug 69277 I noticed
that the ABI version printed by GCC 6.0 in some -Wabi diagnostics
is incorrect:  while 5.1.0 prints the versions of the ABI given
by the -Wabi=X and -fabi-version=Y options (i.e., it mentions
both X and Y), 6.0 prints the same version twice (just Y).

The attached patch fixes this and adds tests to verify that the
referenced versions are as expected (it uses ABIs 2 and 3 but
tests exercising the other ABI changes should be added as well).

Martin

Re: [PATCH] Fix aarch64 bootstrap (pr69416)

2016-01-25 Thread Richard Henderson

On 01/25/2016 05:28 AM, Christophe Lyon wrote:
> After this, I'm seeing this test now FAILs:
> gcc.target/aarch64/ccmp_1.c scan-assembler adds\t

That test case is badly written.  In addition to that one, several of the other
failures that I see within that file are simply equally optimal alternative
choices for the compiler.  The file needs to be split up and simpler more
directed tests written.

r~

[patch] bootstrap/69464 Avoid including all of <random> in <algorithm>

2016-01-25 Thread Jonathan Wakely


In C++11 mode <algorithm> defines std::shuffle which uses
std::uniform_int_distribution. It doesn't need the rest of <random>,
which is huge, especially on x86 with SSE3 support, where pmmintrin.h
is pulled in.

This moves the definition of std::uniform_int_distribution to a new
header, and makes <algorithm> include that instead of <random>.  That
removes 23kloc from <algorithm>, making it much less of a problem for
the rest of the compiler to use <algorithm> during bootstrap.

The change revealed a few testsuite bugs where tests (incorrectly)
relied on <algorithm> pulling in <memory> or <vector> indirectly.
It's likely that some programs will stop compiling because of this
change, the fix will be to add the necessary headers.
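As an illustration of that fix in user code, a translation unit that uses std::shuffle together with a random engine and std::vector should name each header itself rather than rely on <algorithm> to drag them in. A minimal sketch, with a made-up function name:

```cpp
#include <algorithm>  // std::shuffle
#include <random>     // std::mt19937; no longer implied by <algorithm>
#include <vector>     // std::vector; include it explicitly as well

// Return a shuffled copy of the input, using a fixed seed for repeatability.
inline std::vector<int> shuffled_copy(const std::vector<int> &in, unsigned seed)
{
  std::vector<int> out(in);
  std::mt19937 gen(seed);
  std::shuffle(out.begin(), out.end(), gen);
  return out;
}
```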

Tested x86_64-linux and powerpc64-linux, committed to trunk.

commit b8cab1de33b9d3fbfbe984b6b7e9d3aa41b8e80e
Author: Jonathan Wakely 
Date:   Mon Jan 25 14:15:24 2016 +

    Avoid including all of <random> in <algorithm>

	PR libstdc++/69464
	* include/Makefile.am: Add new header.
	* include/Makefile.in: Regenerate.
	* include/bits/random.h (uniform_int_distribution): Move to
	bits/uniform_int_dist.h.
	* include/bits/random.tcc (uniform_int_distribution::operator(),
	uniform_int_distribution::__generate_impl): Likewise.
	* include/bits/uniform_int_dist.h: New header.
	* include/bits/stl_algo.h [__cplusplus >= 201103L]: Include
	<bits/uniform_int_dist.h> instead of <random>.
	* testsuite/20_util/specialized_algorithms/uninitialized_copy/
	move_iterators/1.cc: Include correct header for uninitialized_copy.
	* testsuite/20_util/specialized_algorithms/uninitialized_copy_n/
	move_iterators/1.cc: Likewise.
	* testsuite/25_algorithms/nth_element/58800.cc: Include correct
	header for vector.
	* testsuite/26_numerics/random/pr60037-neg.cc: Adjust dg-error lines.

diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index 573f057..0b34c3c 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Makefile.am
@@ -180,6 +180,7 @@ bits_headers = \
 	${bits_srcdir}/stl_vector.h \
 	${bits_srcdir}/streambuf.tcc \
 	${bits_srcdir}/stringfwd.h \
+	${bits_srcdir}/uniform_int_dist.h \
 	${bits_srcdir}/unique_ptr.h \
 	${bits_srcdir}/unordered_map.h \
 	${bits_srcdir}/unordered_set.h \
diff --git a/libstdc++-v3/include/bits/random.h b/libstdc++-v3/include/bits/random.h
index 63f57d5..1babe80 100644
--- a/libstdc++-v3/include/bits/random.h
+++ b/libstdc++-v3/include/bits/random.h
@@ -32,6 +32,7 @@
 #define _RANDOM_H 1
 
 #include 
+#include 
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
@@ -149,14 +150,6 @@ _GLIBCXX_END_NAMESPACE_VERSION
   __mod(_Tp __x)
   { return _Mod<_Tp, __m, __a, __c>::__calc(__x); }
 
-/* Determine whether number is a power of 2.  */
-template
-  inline bool
-  _Power_of_2(_Tp __x)
-  {
-	return ((__x - 1) & __x) == 0;
-  };
-
 /*
  * An adaptor class for converting the output of any Generator into
  * the input for a specific Distribution.
@@ -1656,164 +1649,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
* @{
*/
 
-  /**
-   * @brief Uniform discrete distribution for random numbers.
-   * A discrete random distribution on the range @f$[min, max]@f$ with equal
-   * probability throughout the range.
-   */
-  template
-class uniform_int_distribution
-{
-  static_assert(std::is_integral<_IntType>::value,
-		"template argument not an integral type");
-
-public:
-  /** The type of the range of the distribution. */
-  typedef _IntType result_type;
-  /** Parameter type. */
-  struct param_type
-  {
-	typedef uniform_int_distribution<_IntType> distribution_type;
-
-	explicit
-	param_type(_IntType __a = 0,
-		   _IntType __b = std::numeric_limits<_IntType>::max())
-	: _M_a(__a), _M_b(__b)
-	{
-	  __glibcxx_assert(_M_a <= _M_b);
-	}
-
-	result_type
-	a() const
-	{ return _M_a; }
-
-	result_type
-	b() const
-	{ return _M_b; }
-
-	friend bool
-	operator==(const param_type& __p1, const param_type& __p2)
-	{ return __p1._M_a == __p2._M_a && __p1._M_b == __p2._M_b; }
-
-  private:
-	_IntType _M_a;
-	_IntType _M_b;
-  };
-
-public:
-  /**
-   * @brief Constructs a uniform distribution object.
-   */
-  explicit
-  uniform_int_distribution(_IntType __a = 0,
-			   _IntType __b = std::numeric_limits<_IntType>::max())
-  : _M_param(__a, __b)
-  { }
-
-  explicit
-  uniform_int_distribution(const param_type& __p)
-  : _M_param(__p)
-  { }
-
-  /**
-   * @brief Resets the distribution state.
-   *
-   * Does nothing for the uniform integer distribution.
-   */
-  void
-  reset() { }
-
-  result_type
-  a() const
-  { return _M_param.a(); }
-
-  result_type
-  b() const
-  { return _M_param.b(); }
-
-  /**
-   * @brief Returns the parameter set of the distribution.
-   */
-  param_type
-  param() const
-  { return _M_param; }
-
-  /**
-   * @brief Sets the parameter set of

[PATCH, committed][gcc-5-branch] Fix broken test case derived_constructor_comps_6.f90

2016-01-25 Thread Peter Bergner

When the test case derived_constructor_comps_6.f90 was backported to
the FSF 5 branch, a '}' on the dg-additional-options was dropped
causing the test case to fail.  I have added it back and committed
it as obvious.

Peter

PR fortran/61831
* gfortran.dg/derived_constructor_comps_6.f90: Add missing } to fix
up dg-additional-options.


Index: gcc/testsuite/gfortran.dg/derived_constructor_comps_6.f90
===
--- gcc/testsuite/gfortran.dg/derived_constructor_comps_6.f90   (revision 232798)
+++ gcc/testsuite/gfortran.dg/derived_constructor_comps_6.f90   (working copy)
@@ -1,5 +1,5 @@
 ! { dg-do run }
-! { dg-additional-options "-fdump-tree-original"
+! { dg-additional-options "-fdump-tree-original" }
 !
 ! PR fortran/61831
 ! The deallocation of components of array constructor elements

Re: Patch RFA: Add option -fcollectible-pointers, use it in ivopts

2016-01-25 Thread Ian Lance Taylor

On Mon, Jan 25, 2016 at 3:39 AM, Bernd Schmidt  wrote:
> On 01/23/2016 12:52 AM, Ian Lance Taylor wrote:
>
>> 2016-01-22  Ian Lance Taylor  
>>
>> * common.opt (fkeep-gc-roots-live): New option.
>> * tree-ssa-loop-ivopts.c (add_candidate_1): If
>> -fkeep-gc-roots-live, skip pointers.
>> (add_iv_candidate_for_biv): Handle add_candidate_1 returning
>> NULL.
>> * doc/invoke.texi (Optimize Options): Document
>> -fkeep-gc-roots-live.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2016-01-22  Ian Lance Taylor  
>>
>> * gcc.dg/tree-ssa/ivopt_5.c: New test.
>
>
> Patch not attached?

The patch is there in the mailing list.  See the attachment on
https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01781.html .

Ian

[PATCH] pr 65702 - error out for invalid register asms earlier

2016-01-25 Thread tbsaunde+gcc

From: Trevor Saunders 

Hi,

$subject.  To avoid regressions I kept the checks when generating rtl, but I
believe it's impossible for those to trigger now and we can remove the checks.

bootstrapped + regtested on x86_64-linux-gnu, ok?

Trev

gcc/c/ChangeLog:

2016-01-25  Trevor Saunders  

* c-decl.c (finish_decl): Check if asm register is valid.

gcc/ChangeLog:

2016-01-25  Trevor Saunders  

* varasm.c (register_asmspec_ok_p): New function.
(make_decl_rtl): Adjust.
* varasm.h (register_asmspec_ok_p): New prototype.

gcc/cp/ChangeLog:

2016-01-25  Trevor Saunders  

* decl.c (make_rtl_for_nonlocal_decl): Check if register asm is
valid.
---
 gcc/c/c-decl.c                                |   8 +-
 gcc/cp/decl.c                                 |   4 +-
 gcc/testsuite/g++.dg/torture/register-asm-1.C |  14 +++
 gcc/testsuite/gcc.dg/reg-vol-struct-1.c       |   2 +-
 gcc/testsuite/gcc.dg/torture/register-asm-1.c |  12 +++
 gcc/varasm.c                                  | 150 +++---
 gcc/varasm.h                                  |   3 +
 7 files changed, 129 insertions(+), 64 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/torture/register-asm-1.C
 create mode 100644 gcc/testsuite/gcc.dg/torture/register-asm-1.c

diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index 1ec6042..9257f35 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -4867,7 +4867,9 @@ finish_decl (tree decl, location_t init_loc, tree init,
   when a tentative file-scope definition is seen.
   But at end of compilation, do output code for them.  */
DECL_DEFER_OUTPUT (decl) = 1;
- if (asmspec && C_DECL_REGISTER (decl))
+ if (asmspec
+ && C_DECL_REGISTER (decl)
+ && register_asmspec_ok_p (decl, asmspec, DECL_MODE (decl)))
DECL_HARD_REGISTER (decl) = 1;
  rest_of_decl_compilation (decl, true, 0);
}
@@ -4878,7 +4880,9 @@ finish_decl (tree decl, location_t init_loc, tree init,
 in a particular register.  */
  if (asmspec && C_DECL_REGISTER (decl))
{
- DECL_HARD_REGISTER (decl) = 1;
+ if (register_asmspec_ok_p (decl, asmspec, DECL_MODE (decl)))
+   DECL_HARD_REGISTER (decl) = 1;
+
  /* This cannot be done for a structure with volatile
 fields, on which DECL_REGISTER will have been
 reset.  */
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index f4604b6..6d130bd 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -6201,7 +6201,9 @@ make_rtl_for_nonlocal_decl (tree decl, tree init, const char* asmspec)
   /* The `register' keyword, when used together with an
 asm-specification, indicates that the variable should be
 placed in a particular register.  */
-  if (VAR_P (decl) && DECL_REGISTER (decl))
+  if (VAR_P (decl)
+ && DECL_REGISTER (decl)
+ && register_asmspec_ok_p (decl, asmspec, DECL_MODE (decl)))
{
  set_user_assembler_name (decl, asmspec);
  DECL_HARD_REGISTER (decl) = 1;
diff --git a/gcc/testsuite/g++.dg/torture/register-asm-1.C b/gcc/testsuite/g++.dg/torture/register-asm-1.C
new file mode 100644
index 000..b5cfc84
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/register-asm-1.C
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+
+class A {
+ int m_fn1() const;
+};
+int a[1];
+int b;
+int A::m_fn1() const {
+   register int c asm(""); // { dg-error "invalid register name for 'c'" }
+   while (b)
+   if (a[5])
+   c = b;
+   return c;
+}
diff --git a/gcc/testsuite/gcc.dg/reg-vol-struct-1.c b/gcc/testsuite/gcc.dg/reg-vol-struct-1.c
index b885f91..e67c7a2 100644
--- a/gcc/testsuite/gcc.dg/reg-vol-struct-1.c
+++ b/gcc/testsuite/gcc.dg/reg-vol-struct-1.c
@@ -12,7 +12,7 @@ f (void)
 {
   register struct S a;
   register struct S b[2];
-  register struct S c __asm__("nosuchreg"); /* { dg-error "object with volatile field" "explicit reg name" } */
+  register struct S c __asm__("nosuchreg"); /* { dg-error "invalid register name for 'c'|cannot put object with volatile field into register" } */
/* { dg-error "address of register" "explicit address" } */
   b; /* { dg-error "address of register" "implicit address" } */
 }
diff --git a/gcc/testsuite/gcc.dg/torture/register-asm-1.c b/gcc/testsuite/gcc.dg/torture/register-asm-1.c
new file mode 100644
index 000..1949f62
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/register-asm-1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+
+   int a[1], b;
+int
+foo ()
+{
+register int c asm (""); /* { dg-error "invalid register name for 'c'" } */
+  while (b)
+   if (a[5])
+ c = b;
+return c;
+}
diff --git a/gcc/varasm.c b/gcc/varasm.c
index 3a3573e..7e3aebc9 100644
---

Re: [PATCH] pr 65702 - error out for invalid register asms earlier

2016-01-25 Thread Bernd Schmidt


On 01/25/2016 04:36 PM, tbsaunde+...@tbsaunde.org wrote:

> $subject.  To avoid regressions I kept the checks when generating rtl, but I
> believe it's impossible for those to trigger now and we can remove the checks.
>
> bootstrapped + regtested on x86_64-linux-gnu, ok?

Is this still an issue? I committed a fix for a similar PR a few weeks
ago, and I can't make the testcase from 65702 ICE.



Bernd

Re: [PATCH] pr 65702 - error out for invalid register asms earlier

2016-01-25 Thread Trevor Saunders

On Mon, Jan 25, 2016 at 04:42:58PM +0100, Bernd Schmidt wrote:
> On 01/25/2016 04:36 PM, tbsaunde+...@tbsaunde.org wrote:
> >$subject.  To avoid regressions I kept the checks when generating rtl, but I
> >believe it's impossible for those to trigger now and we can remove the checks.
> >
> >bootstrapped + regtested on x86_64-linux-gnu, ok?
> 
> Is this still an issue? I committed a fix for a similar PR a few weeks ago,
> and I can't make the testcase from 65702 ICE.

 hrm, I guess my tree was more out of date than I thought, it doesn't
 ICE for me at r232662.

 Never mind then ;)

 Trev

> 
> 
> Bernd

Re: [PATCH] fix #69251 - [6 Regression] ICE in unify_array_domain on a flexible array member

2016-01-25 Thread Martin Sebor


On 01/21/2016 04:32 PM, Martin Sebor wrote:
> On 01/21/2016 04:19 PM, Jason Merrill wrote:
>> Can we reconsider the representation of flexible arrays?  Following the
>> example of the C front-end is causing a lot of trouble, and using a null
>> TYPE_DOMAIN seems more intuitive.
>
> I remember running into at least one ICE in the middle end with
> the alternate representation (null TYPE_DOMAIN).  At this late
> stage I would worry about the fallout from that. It seems that
> outside of 69251 and 69277 the problems are mostly triggered by
> ill-formed code that wasn't being tested and I'm hoping that
> the problems in the well-formed cases have been reported (and
> with the patches I've sent fixed).
>
> At the same time, based on some debugging I had to do for 69251
> (ICE in unify_array_domain on a flexible array member) it seems
> that it might make handling them in templates easier.


In a discussion with Jason in IRC I agreed to submit a patch
changing the representation of flexible array members in the C++
front end to use a null domain rather than a domain with a null
upper bound.  Attached is a patch making the requested change.
It fixes the following bugs:

c++/69251 - [6 Regression] ICE in unify_array_domain on a flexible
  array member
  (the bug in the Subject)
c++/69253 - [6 Regression] ICE in cxx_incomplete_type_diagnostic
  initializing a flexible array member with empty string
  with the original patch here:
  https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01325.htm
and
c++/69290 - [6 Regression] ICE on invalid initialization
  of a flexible array member
  with the original patch here:
  https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01685.html
as well as
c++/69277 - [6 Regression] ICE mangling a flexible array member
  with its final patch posted here
  https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01233.html

The downside of this approach is that it prevents everything but
the front end from distinguishing flexible array members from
arrays of unspecified or unknown bounds.  The immediate impact
is that it prevents us from maintaining ABI compatibility with GCC
5 (with -fabi-version=9) and from diagnosing the mangling change.
This means that, should we decide to adopt this approach, the final
version of the patch for c++/69277 mentioned above that's still
pending approval will need to be tweaked to have the ABI checks
removed.
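
For readers unfamiliar with the construct under discussion, here is a minimal sketch of a flexible array member as standard C (C99) defines it; G++ accepts the same syntax as an extension. The struct name `packet` and the helper `packet_new` are hypothetical and are not taken from the patch or the GCC testsuite.

```c
#include <stdlib.h>
#include <string.h>

/* The trailing member has no bound: this is the "flexible array member"
   whose type representation (null domain versus a domain with a null
   upper bound) the patch above changes inside the C++ front end.  */
struct packet {
    size_t len;
    unsigned char data[];
};

/* Hypothetical helper: allocate the header plus LEN payload bytes.
   Returns NULL if the allocation fails.  */
struct packet *packet_new(const unsigned char *src, size_t len)
{
    struct packet *p = malloc(sizeof *p + len);
    if (p == NULL)
        return NULL;
    p->len = len;
    memcpy(p->data, src, len);
    return p;
}
```

The array contributes no elements to `sizeof (struct packet)`, so the caller decides how much storage follows the header.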

I successfully tested the new patch on x86_64.

Martin

PR c++/69251 - [6 Regression] ICE in unify_array_domain on a flexible array
	member
PR c++/69253 - [6 Regression] ICE in cxx_incomplete_type_diagnostic
	initializing a flexible array member with empty string
PR c++/69290 - [6 Regression] ICE on invalid initialization of a flexible
	array member

gcc/testsuite/ChangeLog:
2016-01-25  Martin Sebor  

	PR c++/69253
	PR c++/69251
	PR c++/69290
	* g++.dg/ext/flexarray-subst.C: New test.
	* g++.dg/ext/flexary11.C: New test.
	* g++.dg/ext/flexary12.C: New test.
	* g++.dg/ext/flexary13.C: New test.
	* g++.dg/ext/flexary14.C: New test.
	* g++.dg/other/dump-ada-spec-2.C: Adjust.

gcc/cp/ChangeLog:
2016-01-25  Martin Sebor  

	PR c++/69253
	PR c++/69251
	PR c++/69290
	* decl.c (compute_array_index_type): Return null for flexible array
	members.
	* tree.c (array_of_runtime_bound_p): Handle gracefully array types
	with null TYPE_MAX_VALUE.
	(build_ctor_subob_ref): Loosen debug checking to handle flexible
	array members.

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index ceeef60..beb7c58 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -8638,8 +8638,9 @@ compute_array_index_type (tree name, tree size, tsubst_flags_t complain)
   tree itype;
   tree osize = size;
 
+  /* Flexible array members have no domain.  */
   if (size == NULL_TREE)
-return build_index_type (NULL_TREE);
+return NULL_TREE;
 
   if (error_operand_p (size))
 return error_mark_node;
diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
index e2123ac..779652c 100644
--- a/gcc/cp/tree.c
+++ b/gcc/cp/tree.c
@@ -937,9 +937,10 @@ array_of_runtime_bound_p (tree t)
   tree dom = TYPE_DOMAIN (t);
   if (!dom)
 return false;
-  tree max = TYPE_MAX_VALUE (dom);
-  return (!potential_rvalue_constant_expression (max)
-	  || (!value_dependent_expression_p (max) && !TREE_CONSTANT (max)));
+  if (tree max = TYPE_MAX_VALUE (dom))
+return (!potential_rvalue_constant_expression (max)
+	|| (!value_dependent_expression_p (max) && !TREE_CONSTANT (max)));
+  return false;
 }
 
 /* Return a reference type node referring to TO_TYPE.  If RVAL is
@@ -2556,8 +2557,21 @@ build_ctor_subob_ref (tree index, tree type, tree obj)
 obj = build_class_member_access_expr (obj, index, NULL_TREE,
 	  /*reference*/false, tf_none);
   if (obj)
-gcc_assert (same_type_ignoring_top_level_qualifiers_p (type,
-			   TREE_TYPE (obj)));
+{
+  tree objtype = TREE_TYPE (obj);
+  if (TREE_CODE (objtype) == ARRAY_TYPE
+	  && (!TYPE_DOMAIN (objtype)
+	  || !TYPE_MAX_VALUE (TYPE_DOMAIN (objtype
+	{
+	  /* When the

Re: [PATCH, committed][gcc-5-branch] Fix broken test case derived_constructor_comps_6.f90

2016-01-25 Thread Paul Richard Thomas

Dear Peter,

Many thanks! I have been away, got back last night and had intended to
deal with it tonight. You should note that Dominique has flagged up
that it fails with -m32.

Tshuess

Paul

On 25 January 2016 at 18:09, Peter Bergner  wrote:
> When the test case derived_constructor_comps_6.f90 was backported to
> the FSF 5 branch, a '}' on the dg-additional-options was dropped
> causing the test case to fail.  I have added it back and committed
> it as obvious.
>
> Peter
>
> PR fortran/61831
> * gfortran.dg/derived_constructor_comps_6.f90: Add missing } to fix
> up dg-additional-options.
>
>
> Index: gcc/testsuite/gfortran.dg/derived_constructor_comps_6.f90
> ===
> --- gcc/testsuite/gfortran.dg/derived_constructor_comps_6.f90   (revision 232798)
> +++ gcc/testsuite/gfortran.dg/derived_constructor_comps_6.f90   (working copy)
> @@ -1,5 +1,5 @@
>  ! { dg-do run }
> -! { dg-additional-options "-fdump-tree-original"
> +! { dg-additional-options "-fdump-tree-original" }
>  !
>  ! PR fortran/61831
>  ! The deallocation of components of array constructor elements
>



-- 
The difference between genius and stupidity is; genius has its limits.

Albert Einstein

Re: [patch,ira]: Improve on updated memory cost in coloring pass of integrated register allocator.

2016-01-25 Thread Vladimir Makarov


On 01/23/2016 06:09 AM, Ajit Kumar Agarwal wrote:

> This patch improves the updated memory cost in the coloring pass of the
> integrated register allocator. Only enter_freq of the loop is considered in
> the updated memory cost in the coloring pass. Consideration of only
> enter_freq is based on the concept that what is live-out of the entry or
> header of the loop is live-in and live-out throughout the loop. Exit freq
> is ignored in the updated memory cost in the coloring pass.

As we put stores for spilled pseudos on loop entry and loads on the loop
exits, ignoring loop exits means for me that we basically ignore the
cost of the loads which is probably wrong in a general case.

> This increases the updated memory cost and gives more chances of reducing
> the spills and fetches and of a better assignment.
>
> The concept that what is live-out of the header of the loop is live-in and
> live-out throughout the loop is based on the following.
>
> If a variable v is live-out at the header of the loop then the variable is
> live-in at every node in the loop. To prove this, consider a loop L with
> header h such that the variable v defined at d is live-in at h. Since v is
> live at h, d is not part of L. This follows from the dominance property,
> i.e. h is strictly dominated by d. Furthermore, there exists a path from h
> to a use of v which does not go through d. For every node p in the loop,
> since the loop is a strongly connected component of the CFG, there exists
> a path, consisting only of nodes of L, from p to h. Concatenating these
> two paths proves that v is live-in and live-out of p.

> Bootstrapped on X86_64.
>
> Performance run is done on SPEC CPU2000 benchmarks and following are the
> results.
>
> SPEC INT benchmarks
> (Mean Score with this patch vs Mean score without this patch = 3729.777 vs
> 3717.083).
>
> Benchmarks     Gains
> 186.crafty     2.78%
> 176.gcc        0.7%
> 253.perlbmk    0.75%
> 255.vortex     0.82%
>
> SPEC FP benchmarks
> (Mean Score with this patch vs Mean score without this patch = 4774.65 vs
> 4751.838).
>
> Benchmarks     Gains
> 168.wupwise    0.77%
> 171.swim       1.5%
> 177.mesa       1.2%
> 200.sixtrack   1.2%
> 178.galgel     0.6%
> 179.art        0.6%
> 183.equake     0.5%
> 187.facerec    0.7%

Thanks for trying to improve GCC performance, Ajit.  Unfortunately, I 
got different numbers on SPEC2000 with your patch.  The different 
results might be a consequence of different test setup.


I got the following numbers using 4.2GHz i7-4790K (Haswell) using -Ofast 
-mtune=corei7.  Using the tune option is important as RA will try to 
improve code for Haswell architecture.


64-bit:
Int 5123 5126
FP 6886 6897

32-bit:
Int 4754 4763
FP 6363 6346

Here the first column is GCC with your patch and the second one is 
without your patch.  Only 32-bit FP score was improved by your patch.
These days practically nobody uses 32-bit code for FP benchmarks.
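
To put the four score pairs above on a common scale, the relative change can be computed as below. This is only an illustrative sketch: the function name `percent_change` is made up here, and the only inputs are the scores quoted in this message (first column with the patch, second without).

```c
/* Relative change, in percent, of the score obtained with the patch
   compared to the score obtained without it.  Positive means the patch
   improved the score.  */
double percent_change(double with_patch, double without_patch)
{
    return (with_patch - without_patch) / without_patch * 100.0;
}
```

With the numbers above, `percent_change(6363, 6346)` is positive (roughly a quarter of a percent) for 32-bit FP, while the other three pairs (5123 vs 5126, 6886 vs 6897, 4754 vs 4763) all come out slightly negative.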


So unfortunately I can not approve the patch.  Sorry.

Re: [PATCH, committed][gcc-5-branch] Fix broken test case derived_constructor_comps_6.f90

2016-01-25 Thread Peter Bergner

On Mon, 2016-01-25 at 18:17 +0100, Paul Richard Thomas wrote:
> Many thanks! I have been away, got back last night and had intended 
> to deal with it tonight.

No problem.

> You should note that Dominique has flagged up that it fails with 
> -m32.

I was building on powerpc64le-linux (which doesn't support -m32)
when I encountered the missing '}', so I only fixed the bug I saw.
I'll leave it to you or someone else to fix the -m32 bug, since
I'm not seeing it on my system.

Peter

Re: [PATCH, committed][gcc-5-branch] Fix broken test case derived_constructor_comps_6.f90

2016-01-25 Thread Paul Richard Thomas

Neither am I :-(

Paul

On 25 January 2016 at 18:33, Peter Bergner  wrote:
> On Mon, 2016-01-25 at 18:17 +0100, Paul Richard Thomas wrote:
>> Many thanks! I have been away, got back last night and had intended
>> to deal with it tonight.
>
> No problem.
>
>
>> You should note that Dominique has flagged up that it fails with
>> -m32.
>
> I was building on powerpc64le-linux (which doesn't support -m32)
> when I encountered the missing '}', so I only fixed the bug I saw.
> I'll leave it to you or someone else to fix the -m32 bug, since
> I'm not seeing it on my system.
>
> Peter
>
>



-- 
The difference between genius and stupidity is; genius has its limits.

Albert Einstein

Re: gomp_target_fini

2016-01-25 Thread Mike Stump

On Jan 22, 2016, at 2:16 AM, Jakub Jelinek  wrote:
> On Thu, Jan 21, 2016 at 04:24:46PM +0100, Bernd Schmidt wrote:
>> Thomas, I've mentioned this issue before - there is sometimes just too much
>> irrelevant stuff to wade through in your patch submissions, and it
>> discourages review. The discussion of the actual problem begins more than
>> halfway through your multi-page mail. Please try to be more concise.
>> 
>> On 12/16/2015 01:30 PM, Thomas Schwinge wrote:
>>> Now, with the above change installed, GOMP_PLUGIN_fatal will trigger the
>>> atexit handler, gomp_target_fini, which, with the device lock held, will
>>> call back into the plugin, GOMP_OFFLOAD_fini_device, which will try to
>>> clean up.
>>> 
>>> Because of the earlier CUDA_ERROR_LAUNCH_FAILED, the associated CUDA
>>> context is now in an inconsistent state
>> 
>>> Thus, any cuMemFreeHost invocations that are run during clean-up will now
>>> also/still return CUDA_ERROR_LAUNCH_FAILED, due to which we'll again call
>>> GOMP_PLUGIN_fatal, which again will trigger the same or another
>>> (GOMP_offload_unregister_ver) atexit handler, which will then deadlock
>>> trying to lock the device again, which is still locked.
>> 
>>> libgomp/
>>> * error.c (gomp_vfatal): Call _exit instead of exit.
>> 
>> It seems unfortunate to disable the atexit handlers for everything for what
>> seems purely an nvptx problem.
>> 
>> What exactly happens if you don't register the cleanups with atexit in the
>> first place? Or maybe you can query for CUDA_ERROR_LAUNCH_FAILED in the
>> cleanup functions?
> 
> I agree, _exit is just wrong, there could be important atexit hooks from the
> application.  You can set some flag that the libgomp or nvptx plugin atexit
> hooks should not do anything, or should do things differently.  But
> bypassing all atexit handlers is risky.

I’d use the phrase, is wrong.

Just create a semaphore that says that init was fully done: set it at the end
of init, test it at the beginning of the cleanup, and reset it anytime you want
to cancel the cleanup.  Think of it as an is_valid predicate.  Any operation
that needs it to be valid can query it first, and fail otherwise.
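
A minimal sketch of that idea in C11 follows. It is not libgomp code: the names `device_valid`, `device_init`, `device_invalidate` and `device_cleanup` are hypothetical, and the test-and-clear in the cleanup routine is one possible choice (assumed here) so that a second cleanup attempt becomes a no-op instead of re-entering the plugin.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical "is_valid" flag: true only between a completed init
   and the first cleanup or invalidation.  */
static atomic_bool device_valid = false;

void device_init(void)
{
    /* ... acquire device resources here ... */
    atomic_store(&device_valid, true);   /* set only once init is fully done */
}

/* Call this from an error path to cancel any later cleanup.  */
void device_invalidate(void)
{
    atomic_store(&device_valid, false);
}

/* Returns true if cleanup actually ran, false if it was skipped.  */
bool device_cleanup(void)
{
    /* Atomically test and clear, so cleanup runs at most once.  */
    if (!atomic_exchange(&device_valid, false))
        return false;
    /* ... release device resources here ... */
    return true;
}
```

An atexit handler built this way can remain registered: after a fatal error has invalidated the device it simply returns without touching the device again.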

Re: [PATCH] Fix aarch64 bootstrap (pr69416)

2016-01-25 Thread Wilco Dijkstra

Richard Henderson wrote:
> On 01/25/2016 05:28 AM, Christophe Lyon wrote:
> > After this, I'm seeing this test now FAILs:
> > gcc.target/aarch64/ccmp_1.c scan-assembler adds\t
>
> That test case is badly written.  In addition to that one, several of the
> other failures that I see within that file are simply equally optimal
> alternative choices for the compiler.  The file needs to be split up and
> simpler more directed tests written.

The test case was written specifically to emit 'adds' as that is the optimal
sequence. It is a regression caused by wrapping the immediate in an unspec which
disables costing of all CCMPs... I have a patch for this.

The zero issue is due to the testcase assuming GCC emits '0' and 'wzr'
correctly - it was based on a very old patch that emits the correct zero for
compares that hasn't been OK'd yet. And the failure to emit an fccmp is due to
a recent fix to NaN handling in compares, so that testcase now needs
-ffinite-math-only.

Wilco

Re: [PATCH, AArch64] Fix for PR67896 (C++ FE cannot distinguish __Poly{8,16,64,128}_t types)

2016-01-25 Thread Mike Stump

On Jan 25, 2016, at 4:15 AM, James Greenhalgh  wrote:
> > P.S.: I haven't signed the copyright assignment to the FSF. The change
> > is really small but I can do the paperwork if required.
> 
> I can't commit it on your behalf until we've heard back regarding whether
> this needs a copyright assignment to the FSF, but once I've heard I'd
> be happy to commit this for you.

This is fine for the tree without paper work.  Though, if you work on gcc on a 
regular basis and are likely to contribute more work in the future, it is nice 
to get the paper work out of the way for next time.

[C PATCH] Fix -Wunused-function (PR debug/66869)

2016-01-25 Thread Jakub Jelinek

Hi!

The early-debug changes moved warnings about unused functions into cgraph.
The problem is that if we have just unused declarations, they aren't
sometimes even registered with cgraph and therefore we no longer warn.

Here is an attempt to register those with cgraph anyway to get the warning,
for C FE only (no idea where to do that in C++ FE).  Or anyone has better
suggestions what to do?

Bootstrapped/regtested on x86_64-linux and i686-linux.

2016-01-25  Jakub Jelinek  

PR debug/66869
* c-decl.c (c_write_global_declarations_1): For warn_unused_function,
ensure creation of cgraph node even if there is no definition.

* gcc.dg/pr66869.c: New test.

--- gcc/c/c-decl.c.jj   2016-01-21 00:41:47.0 +0100
+++ gcc/c/c-decl.c  2016-01-25 16:36:31.973504082 +0100
@@ -10741,11 +10741,19 @@ c_write_global_declarations_1 (tree glob
   if (TREE_CODE (decl) == FUNCTION_DECL
  && DECL_INITIAL (decl) == 0
  && DECL_EXTERNAL (decl)
- && !TREE_PUBLIC (decl)
- && C_DECL_USED (decl))
+ && !TREE_PUBLIC (decl))
{
- pedwarn (input_location, 0, "%q+F used but never defined", decl);
- TREE_NO_WARNING (decl) = 1;
+ if (C_DECL_USED (decl))
+   {
+ pedwarn (input_location, 0, "%q+F used but never defined", decl);
+ TREE_NO_WARNING (decl) = 1;
+   }
+ /* For -Wunused-function push the unused statics into cgraph,
+so that check_global_declaration emits the warning.  */
+ else if (warn_unused_function
+  && ! DECL_ARTIFICIAL (decl)
+  && ! TREE_NO_WARNING (decl))
+   cgraph_node::get_create (decl);
}
 
   wrapup_global_declaration_1 (decl);
--- gcc/testsuite/gcc.dg/pr66869.c.jj   2016-01-25 16:38:39.037758657 +0100
+++ gcc/testsuite/gcc.dg/pr66869.c  2016-01-25 16:39:42.346888954 +0100
@@ -0,0 +1,6 @@
+/* PR debug/66869 */
+/* { dg-do compile } */
+/* { dg-options "-Wunused-function" } */
+
+static void test (void); /* { dg-warning "'test' declared 'static' but never defined" } */
+int i;

Jakub

Re: [PATCH 4/4][AArch64] Cost CCMP instruction sequences to choose better expand order

2016-01-25 Thread Richard Henderson

On 01/25/2016 12:09 PM, Wilco Dijkstra wrote:
> BTW is there a regular expression that correctly implements (0|xzr)? 

(0|wzr) works fine for me; I've got exactly that fix in one of my trees.


r~

[Patch, fortran] PR69385 - [6 regression] ICE on valid with -fcheck=mem

2016-01-25 Thread Paul Richard Thomas

Dear All,

The initial report concerns initialization assignments that should be
excluded from the check for assignment of scalars to unallocated
arrays. This part is so trivial that it does not require a test. On
the other hand, the block that implemented the check was plain and
simple wrong and the rest of the patch corrects this. It is commented
such as to be fully comprehensible.

Bootstrapped and regtested on FC21/x86_64 - OK for trunk and for
5-branch when all the wrinkles (PR69422 and 69423) are sorted out?

Cheers

Paul

2016-01-25  Paul Thomas  

PR fortran/69385
* trans-expr.c (gfc_trans_assignment_1): Exclude initialization
assignments from check on assignment of scalars to unassigned
arrays and correct wrong code within the corresponding block.

2016-01-25  Paul Thomas  

PR fortran/69385
* gfortran.dg/allocate_error_6.f90: New test.
Index: gcc/fortran/trans-expr.c
===
*** gcc/fortran/trans-expr.c(revision 232800)
--- gcc/fortran/trans-expr.c(working copy)
*** gfc_trans_assignment_1 (gfc_expr * expr1
*** 9286,9291 
--- 9286,9292 
  {
gfc_conv_expr (&lse, expr1);
if (gfc_option.rtcheck & GFC_RTCHECK_MEM
+ && !init_flag
  && gfc_expr_attr (expr1).allocatable
  && expr1->rank
  && !expr2->rank)
*** gfc_trans_assignment_1 (gfc_expr * expr1
*** 9293,9306 
  tree cond;
  const char* msg;
  
! tmp = expr1->symtree->n.sym->backend_decl;
! if (POINTER_TYPE_P (TREE_TYPE (tmp)))
!   tmp = build_fold_indirect_ref_loc (input_location, tmp);
  
! if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (tmp)))
!   tmp = gfc_conv_descriptor_data_get (tmp);
! else
!   tmp = TREE_OPERAND (lse.expr, 0);
  
  cond = fold_build2_loc (input_location, EQ_EXPR, boolean_type_node,
  tmp, build_int_cst (TREE_TYPE (tmp), 0));
--- 9294,9310 
  tree cond;
  const char* msg;
  
! /* We should only get array references here.  */
! gcc_assert (TREE_CODE (lse.expr) == POINTER_PLUS_EXPR
! || TREE_CODE (lse.expr) == ARRAY_REF);
  
! /* 'tmp' is either the pointer to the array(POINTER_PLUS_EXPR)
!or the array itself(ARRAY_REF).  */
! tmp = TREE_OPERAND (lse.expr, 0);
! 
! /* Provide the address of the array.  */
! if (TREE_CODE (lse.expr) == ARRAY_REF)
!   tmp = gfc_build_addr_expr (NULL_TREE, tmp);
  
  cond = fold_build2_loc (input_location, EQ_EXPR, boolean_type_node,
  tmp, build_int_cst (TREE_TYPE (tmp), 0));
Index: gcc/testsuite/gfortran.dg/allocate_error_6.f90
===
*** gcc/testsuite/gfortran.dg/allocate_error_6.f90  (revision 0)
--- gcc/testsuite/gfortran.dg/allocate_error_6.f90  (working copy)
***
*** 0 
--- 1,40 
+ ! { dg-do run }
+ ! { dg-additional-options "-fcheck=mem" }
+ ! { dg-shouldfail "Fortran runtime error: Assignment of scalar to unallocated array" }
+ !
+ ! This omission was encountered in the course of fixing PR54070. Whilst this is a
+ ! very specific case, others such as allocatable components have been tested.
+ !
+ ! Contributed by Tobias Burnus  
+ !
+ function g(a) result (res)
+   real :: a
+   real,allocatable :: res(:)
+   res = a  ! Since 'res' is not allocated, a runtime error should occur.
+ end function
+ 
+   interface
+ function g(a) result(res)
+   real :: a
+   real,allocatable :: res(:)
+ end function
+   end interface
+ !  print *, g(2.0)
+ !  call foo
+   call foofoo
+ contains
+   subroutine foo
+ type bar
+   real, allocatable, dimension(:) :: r
+ end type
+ type (bar) :: foobar
+ foobar%r = 1.0
+   end subroutine
+   subroutine foofoo
+ type barfoo
+   character(:), allocatable, dimension(:) :: c
+ end type
+ type (barfoo) :: foobarfoo
+ foobarfoo%c = "1.0"
+   end subroutine
+ end

Re: [Patch, fortran] PR69385 - [6 regression] ICE on valid with -fcheck=mem

2016-01-25 Thread Janus Weil

Hi Paul,

seems we were pretty well-synchronized in posting this (in the PR it
sounded as if you wanted me to submit it ...)

In any case, the patch is ok for my taste.

Thanks!

Cheers,
Janus



2016-01-25 22:02 GMT+01:00 Paul Richard Thomas :
> Dear All,
>
> The initial report concerns initialization assignments that should be
> excluded from the check for assignment of scalars to unallocated
> arrays. This part is so trivial that it does not require a test. On
> the other hand, the block that implemented the check was plain and
> simple wrong and the rest of the patch corrects this. It is commented
> such as to be fully comprehensible.
>
> Bootstrapped and regtested on FC21/x86_64 - OK for trunk and for
> 5-branch when all the wrinkles (PR69422 and 69423) are sorted out?
>
> Cheers
>
> Paul
>
> 2016-01-25  Paul Thomas  
>
> PR fortran/69385
> * trans-expr.c (gfc_trans_assignment_1): Exclude initialization
> assignments from check on assignment of scalars to unassigned
> arrays and correct wrong code within the corresponding block.
>
> 2016-01-25  Paul Thomas  
>
> PR fortran/69385
> * gfortran.dg/allocate_error_6.f90: New test.

Re: [PATCH] Fix a typo in ppc libgcc (PR target/69444)

2016-01-25 Thread David Edelsohn

On Mon, Jan 25, 2016 at 3:34 PM, Jakub Jelinek  wrote:
> Hi!
>
> The soft-fp multilib of powerpc libgcc doesn't build because of a typo
> in the conditional - the guarded code uses inline asm that assumes hard
> float.
>
> Ok for trunk?
>
> 2016-01-25  Jakub Jelinek  
>
> PR target/69444
> * config/rs6000/sfp-machine.h: Fix a typo in #ifndef - __NO_FPRS__
> instead of ___NO_FPRS__.

Okay.

Thanks, David

Re: Incorrect code due to indirect tail call of varargs function with hard float ABI

2016-01-25 Thread Kugan


This issue also remains in the 4.9 and 5.0 branches. Is this OK to backport
to the release branches?
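
For context, the shape of code this PR concerns is an indirect call, in tail position, to a variadic function that returns a floating-point value. The sketch below is not the PR68390 test case; `sum_doubles` and `call_through_pointer` are hypothetical names chosen for illustration.

```c
#include <stdarg.h>

/* A variadic function returning a floating-point value.  */
double sum_doubles(int count, ...)
{
    va_list ap;
    double total = 0.0;

    va_start(ap, count);
    for (int i = 0; i < count; ++i)
        total += va_arg(ap, double);
    va_end(ap);
    return total;
}

/* The call through FN is in tail position, so the compiler may consider
   turning it into a sibling call.  Because the call is indirect there is
   no decl for the callee, only the function type behind the pointer,
   which is the case the quoted arm_function_ok_for_sibcall change
   handles.  */
double call_through_pointer(double (*fn)(int, ...), double a, double b)
{
    return fn(2, a, b);
}
```

Whether the sibling call is valid depends on caller and callee returning the value in the same place, which is what the `rtx_equal_p (a, b)` comparison in the quoted patch checks.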

Thanks,
Kugan

On 02/12/15 10:00, Kugan wrote:
> 
>>>
>>> gcc/ChangeLog:
>>>
>>> 2015-11-18  Kugan Vivekanandarajah  
>>>
>>> PR target/68390
>>> * config/arm/arm.c (arm_function_ok_for_sibcall): Get function type
>>> for indirect function call.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> 2015-11-18  Kugan Vivekanandarajah  
>>>
>>> PR target/68390
>>> * gcc.target/arm/PR68390.c: New test.
>>>
>>
>> s/PR/pr in the test name and put this in gcc.c-torture/execute instead - 
>> there is nothing ARM specific about the test. Tests in gcc.target/arm should 
>> really only be architecture specific. This isn't.
>>
>>>
>>>
>>>
>>> p.txt
>>>
>>>
>>> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
>>> index a379121..0dae7da 100644
>>> --- a/gcc/config/arm/arm.c
>>> +++ b/gcc/config/arm/arm.c
>>> @@ -6680,8 +6680,13 @@ arm_function_ok_for_sibcall (tree decl, tree exp)
>>>  a VFP register but then need to transfer it to a core
>>>  register.  */
>>>rtx a, b;
>>> +  tree fn_decl = decl;
>>
>> Call it decl_or_type instead - it's really that ... 
>>
>>>  
>>> -  a = arm_function_value (TREE_TYPE (exp), decl, false);
>>> +  /* If it is an indirect function pointer, get the function type.  */
>>> +  if (!decl)
>>> +   fn_decl = TREE_TYPE (TREE_TYPE (CALL_EXPR_FN (exp)));
>>> +
>>
>> This is probably just my mail client - but please watch out for indentation.
>>
>>> +  a = arm_function_value (TREE_TYPE (exp), fn_decl, false);
>>>b = arm_function_value (TREE_TYPE (DECL_RESULT (cfun->decl)),
>>>   cfun->decl, false);
>>>if (!rtx_equal_p (a, b))
>>
>>
>> OK with those changes.
>>
>> Ramana
>>

>

[PATCH 1/3] add missing testcase

2016-01-25 Thread Sebastian Pop

---
 gcc/testsuite/gcc.dg/graphite/pr69292.c | 19 +++
 1 file changed, 19 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/graphite/pr69292.c

diff --git a/gcc/testsuite/gcc.dg/graphite/pr69292.c 
b/gcc/testsuite/gcc.dg/graphite/pr69292.c
new file mode 100644
index 000..b925181
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/graphite/pr69292.c
@@ -0,0 +1,19 @@
+/* { dg-options "-O2 -floop-nest-optimize" } */
+
+int m[1];
+
+void
+foo (double a[20][20], double b[20])
+{
+  int i, j, k;
+
+  for (i = 0; i < m[0]; ++i)
+for (j = 0; j < m[0]; ++j)
+  a[i][j] = a[i][j] + 1;
+
+  for (k = 0; k < 20; ++k)
+for (i = 0; i < m[0]; ++i)
+  for (j = 0; j < m[0]; ++j)
+   b[i] = b[i] + a[i][j];
+}
+
-- 
2.5.0

[PATCH] pr69477 - attribute aligned documentation misleading

2016-01-25 Thread Martin Sebor


The attached patch adjusts the documentation of attribute aligned
and attribute packed so as to prevent misreading the text of the
former attribute as if it had read:

  Specifying attribute aligned for struct and union types is
  equivalent to specifying the packed attribute on each of
  the structure or union members. ...

Martin
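
The behaviour the relocated text describes can be checked with a small sketch. It reuses the struct names from the documentation example; the helper functions outer_padding and packed_offset_of_i are hypothetical and exist only to expose the layout.

```c
#include <stddef.h>

struct my_unpacked_struct
{
  char c;
  int i;
};

struct __attribute__ ((__packed__)) my_packed_struct
{
  char c;
  int i;
  struct my_unpacked_struct s;
};

/* Bytes of padding in the packed outer struct: its size minus the sum
   of its members' sizes.  The inner struct keeps its own layout, so
   its internal padding is counted as part of sizeof the member.  */
size_t outer_padding (void)
{
  return sizeof (struct my_packed_struct)
	 - (sizeof (char) + sizeof (int)
	    + sizeof (struct my_unpacked_struct));
}

/* Offset of the int member directly after the leading char.  */
size_t packed_offset_of_i (void)
{
  return offsetof (struct my_packed_struct, i);
}
```

The outer struct has no padding between its members, while sizeof (struct my_unpacked_struct) still includes whatever padding the target inserts between its c and i members.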
PR other/69477 - attribute aligned documentation misleading

gcc/ChangeLog:
2016-01-25  Martin Sebor  

	PR other/69477
	* doc/extend.texi (Common Type Attributes): Move text that talks about
	attribute packed from attribute aligned to the section discussing
	the former attribute for clarity.

Index: gcc/doc/extend.texi
===
--- gcc/doc/extend.texi	(revision 232765)
+++ gcc/doc/extend.texi	(working copy)
@@ -6307,9 +6307,6 @@ relevant type, and the code that the com
 pointer arithmetic operations is often more efficient for
 efficiently-aligned types than for other types.
 
-The @code{aligned} attribute can only increase the alignment; but you
-can decrease it by specifying @code{packed} as well.  See below.
-
 Note that the effectiveness of @code{aligned} attributes may be limited
 by inherent limitations in your linker.  On many systems, the linker is
 only able to arrange for variables to be aligned up to a certain maximum
@@ -6319,36 +6316,8 @@ up to a maximum of 8-byte alignment, the
 in an @code{__attribute__} still only provides you with 8-byte
 alignment.  See your linker documentation for further information.
 
-@opindex fshort-enums
-Specifying this attribute for @code{struct} and @code{union} types is
-equivalent to specifying the @code{packed} attribute on each of the
-structure or union members.  Specifying the @option{-fshort-enums}
-flag on the line is equivalent to specifying the @code{packed}
-attribute on all @code{enum} definitions.
-
-In the following example @code{struct my_packed_struct}'s members are
-packed closely together, but the internal layout of its @code{s} member
-is not packed---to do that, @code{struct my_unpacked_struct} needs to
-be packed too.
-
-@smallexample
-struct my_unpacked_struct
- @{
-char c;
-int i;
- @};
-
-struct __attribute__ ((__packed__)) my_packed_struct
-  @{
- char c;
- int  i;
- struct my_unpacked_struct s;
-  @};
-@end smallexample
-
-You may only specify this attribute on the definition of an @code{enum},
-@code{struct} or @code{union}, not on a @code{typedef} that does not
-also define the enumerated type, structure or union.
+The @code{aligned} attribute can only increase alignment.  Alignment
+can be decreased by specifying the @code{packed} attribute.  See below.
 
 @item bnd_variable_size
 @cindex @code{bnd_variable_size} type attribute
@@ -6476,6 +6445,37 @@ of the structure or union is placed to m
 attached to an @code{enum} definition, it indicates that the smallest
 integral type should be used.
 
+@opindex fshort-enums
+Specifying the @code{packed} attribute for @code{struct} and @code{union}
+types is equivalent to specifying the @code{packed} attribute on each
+of the structure or union members.  Specifying the @option{-fshort-enums}
+flag on the command line is equivalent to specifying the @code{packed}
+attribute on all @code{enum} definitions.
+
+In the following example @code{struct my_packed_struct}'s members are
+packed closely together, but the internal layout of its @code{s} member
+is not packed---to do that, @code{struct my_unpacked_struct} needs to
+be packed too.
+
+@smallexample
+struct my_unpacked_struct
+ @{
+char c;
+int i;
+ @};
+
+struct __attribute__ ((__packed__)) my_packed_struct
+  @{
+ char c;
+ int  i;
+ struct my_unpacked_struct s;
+  @};
+@end smallexample
+
+You may only specify the @code{packed} attribute on the definition
+of an @code{enum}, @code{struct} or @code{union}, not on a @code{typedef}
+that does not also define the enumerated type, structure or union.
+
 @item scalar_storage_order ("@var{endianness}")
 @cindex @code{scalar_storage_order} type attribute
 When attached to a @code{union} or a @code{struct}, this attribute sets

Re: [hsa merge 07/10] IPA-HSA pass

2016-01-25 Thread Jan Hubicka

> On Mon, Jan 25, 2016 at 04:21:50PM +0100, Martin Liška wrote:
> > On 01/16/2016 11:00 AM, Jan Hubicka wrote:
> > > Can't it be represented via explicit REF_ADDR or something like that?
> > > 
> > > Honza
> > 
> > Hi.
> > 
> > Sure, I've just done a patch that can do that. However, as we're currently 
> > in stage4,
> > that change would probably require explicit permission of a release manager?
> 
> If Honza is fine with it and you've tested it, this is ok for trunk.

It looks fine to me.

Honza

[PATCH 2/3] fix PR68343: disable fuse-*.c tests for isl 0.14 or earlier

2016-01-25 Thread Sebastian Pop

The patch disables all fuse-*.c tests when configuring gcc with isl 0.14 or 
earlier.

ChangeLog:

* Makefile.in: Regenerate.
* Makefile.tpl: Export ISLVER.
* configure: Regenerate.
* config/isl.m4: Detect isl-0.15.

gcc/

* Makefile.in: Set ISLVER in site.exp.
* config.in: Regenerate.
* configure: Regenerate.
* configure.ac: Define HAVE_isl for isl-0.15.

gcc/testsuite/

* gcc.dg/graphite/graphite.exp: Only run the fuse-*.c tests with 
isl-0.15.
---
 Makefile.in|  2 ++
 Makefile.tpl   |  2 ++
 config/isl.m4  | 12 
 configure  | 29 +
 gcc/Makefile.in|  1 +
 gcc/testsuite/gcc.dg/graphite/fuse-2.c |  4 ++--
 gcc/testsuite/gcc.dg/graphite/graphite.exp |  8 +++-
 7 files changed, 55 insertions(+), 3 deletions(-)

diff --git a/Makefile.in b/Makefile.in
index 20d1c90..a519a54 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -222,6 +222,7 @@ HOST_EXPORTS = \
GMPINC="$(HOST_GMPINC)"; export GMPINC; \
ISLLIBS="$(HOST_ISLLIBS)"; export ISLLIBS; \
ISLINC="$(HOST_ISLINC)"; export ISLINC; \
+   ISLVER="$(HOST_ISLVER)"; export ISLVER; \
LIBELFLIBS="$(HOST_LIBELFLIBS)"; export LIBELFLIBS; \
LIBELFINC="$(HOST_LIBELFINC)"; export LIBELFINC; \
XGCC_FLAGS_FOR_TARGET="$(XGCC_FLAGS_FOR_TARGET)"; export 
XGCC_FLAGS_FOR_TARGET; \
@@ -315,6 +316,7 @@ HOST_GMPINC = @gmpinc@
 # Where to find isl
 HOST_ISLLIBS = @isllibs@
 HOST_ISLINC = @islinc@
+HOST_ISLVER = @islver@
 
 # Where to find libelf
 HOST_LIBELFLIBS = @libelflibs@
diff --git a/Makefile.tpl b/Makefile.tpl
index 2567365..829f664 100644
--- a/Makefile.tpl
+++ b/Makefile.tpl
@@ -225,6 +225,7 @@ HOST_EXPORTS = \
GMPINC="$(HOST_GMPINC)"; export GMPINC; \
ISLLIBS="$(HOST_ISLLIBS)"; export ISLLIBS; \
ISLINC="$(HOST_ISLINC)"; export ISLINC; \
+   ISLVER="$(HOST_ISLVER)"; export ISLVER; \
LIBELFLIBS="$(HOST_LIBELFLIBS)"; export LIBELFLIBS; \
LIBELFINC="$(HOST_LIBELFINC)"; export LIBELFINC; \
XGCC_FLAGS_FOR_TARGET="$(XGCC_FLAGS_FOR_TARGET)"; export 
XGCC_FLAGS_FOR_TARGET; \
@@ -318,6 +319,7 @@ HOST_GMPINC = @gmpinc@
 # Where to find isl
 HOST_ISLLIBS = @isllibs@
 HOST_ISLINC = @islinc@
+HOST_ISLVER = @islver@
 
 # Where to find libelf
 HOST_LIBELFLIBS = @libelflibs@
diff --git a/config/isl.m4 b/config/isl.m4
index 86ccb94..0103f1f 100644
--- a/config/isl.m4
+++ b/config/isl.m4
@@ -117,6 +117,18 @@ AC_DEFUN([ISL_CHECK_VERSION],
   AC_MSG_RESULT([recommended isl version is 0.15, minimum required isl 
version 0.14 is deprecated])
 fi
 
+AC_MSG_CHECKING([for isl-0.15])
+AC_TRY_LINK([#include ],
+[isl_options_set_schedule_serialize_sccs (NULL, 0);],
+[ac_has_isl_options_set_schedule_serialize_sccs=yes],
+[ac_has_isl_options_set_schedule_serialize_sccs=no])
+AC_MSG_RESULT($ac_has_isl_options_set_schedule_serialize_sccs)
+
+if test x"$ac_has_isl_options_set_schedule_serialize_sccs" = x"yes"; then
+  islver="0.15"
+  AC_SUBST([islver])
+fi
+
 CFLAGS=$_isl_saved_CFLAGS
 LDFLAGS=$_isl_saved_LDFLAGS
 LIBS=$_isl_saved_LIBS
diff --git a/configure b/configure
index cae3373..b9a4b51 100755
--- a/configure
+++ b/configure
@@ -650,6 +650,7 @@ extra_linker_plugin_flags
 extra_linker_plugin_configure_flags
 islinc
 isllibs
+islver
 poststage1_ldflags
 poststage1_libs
 stage1_ldflags
@@ -6048,6 +6049,34 @@ $as_echo "$gcc_cv_isl" >&6; }
 $as_echo "recommended isl version is 0.15, minimum required isl version 0.14 
is deprecated" >&6; }
 fi
 
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for isl-0.15" >&5
+$as_echo_n "checking for isl-0.15... " >&6; }
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+#include 
+int
+main ()
+{
+isl_options_set_schedule_serialize_sccs (NULL, 0);
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+  ac_has_isl_options_set_schedule_serialize_sccs=yes
+else
+  ac_has_isl_options_set_schedule_serialize_sccs=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+conftest$ac_exeext conftest.$ac_ext
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: 
$ac_has_isl_options_set_schedule_serialize_sccs" >&5
+$as_echo "$ac_has_isl_options_set_schedule_serialize_sccs" >&6; }
+
+if test x"$ac_has_isl_options_set_schedule_serialize_sccs" = x"yes"; then
+  islver="0.15"
+
+fi
+
 CFLAGS=$_isl_saved_CFLAGS
 LDFLAGS=$_isl_saved_LDFLAGS
 LIBS=$_isl_saved_LIBS
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index ab9cbbf..aa3c018 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -3698,6 +3698,7 @@ site.exp: ./config.status Makefile
  echo "set PLUGINCFLAGS \"$(PLUGINCFLAGS)\"" >> ./site.tmp; \
  echo "set GMPINC \"$(GMPINC)\"" >>

[PATCH 3/3] new scop schedule for isl-0.15

2016-01-25 Thread Sebastian Pop

Keep the implementation for isl-0.14 unchanged.

* graphite-poly.c (apply_poly_transforms): Simplify.
(print_isl_set): Use more readable format: ISL_YAML_STYLE_BLOCK.
(print_isl_map): Same.
(print_isl_union_map): Same.
(print_isl_schedule): New.
(debug_isl_schedule): New.
* graphite-dependences.c (scop_get_reads): Do not call
isl_union_map_add_map that is undocumented isl functionality.
(scop_get_must_writes): Same.
(scop_get_may_writes): Same.
(scop_get_original_schedule): Remove.
(scop_get_dependences): Do not call isl_union_map_compute_flow that
is deprecated in isl 0.15.  Instead, use isl_union_access_* interface.
(compute_deps): Remove.
* graphite-isl-ast-to-gimple.c (print_schedule_ast): New.
(debug_schedule_ast): New.
(translate_isl_ast_to_gimple::scop_to_isl_ast): Call 
set_separate_option.
(graphite_regenerate_ast_isl): Add dump.
(translate_isl_ast_to_gimple::scop_to_isl_ast): Generate code
from scop->transformed_schedule.
(graphite_regenerate_ast_isl): Add more dump.
* graphite-optimize-isl.c (optimize_isl): Set
scop->transformed_schedule.  Check whether schedules are equal.
(apply_poly_transforms): Move here.
* graphite-poly.c (apply_poly_transforms): ... from here.
(free_poly_bb): Static.
(free_scop): Static.
(pbb_number_of_iterations_at_time): Remove.
(print_isl_ast): New.
(debug_isl_ast): New.
(debug_scop_pbb): New.
* graphite-scop-detection.c (print_edge): Move.
(print_sese): Move.
* graphite-sese-to-poly.c (build_pbb_scattering_polyhedrons): Remove.
(build_scop_scattering): Remove.
(create_pw_aff_from_tree): Assert instead of bailing out.
(add_condition_to_pbb): Remove unused code, do not fail.
(add_conditions_to_domain): Same.
(add_conditions_to_constraints): Remove.
(build_scop_context): New.
(add_iter_domain_dimension): New.
(build_iteration_domains): Initialize pbb->iterators.
Call add_conditions_to_domain.
(nested_in): New.
(loop_at): New.
(index_outermost_in_loop): New.
(index_pbb_in_loop): New.
(outermost_pbb_in): New.
(add_in_sequence): New.
(add_outer_projection): New.
(outer_projection_mupa): New.
(add_loop_schedule): New.
(build_schedule_pbb): New.
(build_schedule_loop): New.
(embed_in_surrounding_loops): New.
(build_schedule_loop_nest): New.
(build_original_schedule): New.
(build_poly_scop): Call build_original_schedule.
* graphite.h: Declare print_isl_schedule and debug_isl_schedule.
(free_poly_dr): Remove.
(struct poly_bb): Add iterators.  Remove schedule, transformed, saved.
(free_poly_bb): Remove.
(debug_loop_vec): Remove.
(print_isl_ast): Declare.
(debug_isl_ast): Declare.
(scop_do_interchange): Remove.
(scop_do_strip_mine): Remove.
(scop_do_block): Remove.
(flatten_all_loops): Remove.
(optimize_isl): Remove.
(pbb_number_of_iterations_at_time): Remove.
(debug_scop_pbb): Declare.
(print_schedule_ast): Declare.
(debug_schedule_ast): Declare.
(struct scop): Remove schedule.  Add original_schedule,
transformed_schedule.
(free_gimple_poly_bb): Remove.
(print_generated_program): Remove.
(debug_generated_program): Remove.
(unify_scattering_dimensions): Remove.
* sese.c (print_edge): ... here.
(print_sese): ... here.
(debug_edge): ... here.
(debug_sese): ... here.
* sese.h (print_edge): Declare.
(print_sese): Declare.
(dump_edge): Declare.
(dump_sese): Declare.
---
 gcc/graphite-dependences.c | 123 ++-
 gcc/graphite-isl-ast-to-gimple.c   | 203 +++-
 gcc/graphite-optimize-isl.c| 177 +++---
 gcc/graphite-poly.c| 154 +
 gcc/graphite-scop-detection.c  |  15 -
 gcc/graphite-sese-to-poly.c| 365 ++---
 gcc/graphite.h |  48 +--
 gcc/sese.c |  34 ++
 gcc/sese.h |   7 +-
 gcc/testsuite/gcc.dg/graphite/pr35356-1.c  |   2 +-
 .../gfortran.dg/graphite/interchange-3.f90 |   2 +-
 11 files changed, 836 insertions(+), 294 deletions(-)

diff --git a/gcc/graphite-dependences.c b/gcc/graphite-dependences.c
index 0544700..f9d5bc3 100644
--- a/gcc/graphite-dependences.c
+++ b/gcc/graphite-dependences.c
@@ -66,7 +66,7 @@ add_pdr_constraints (poly_dr_p pdr, poly_bb_p pbb)
 /* Returns all the memory reads in

[patch] fix gccjit build failure

2016-01-25 Thread Matthias Klose

gccjit currently fails to build, needing an additional header. Ok to install on 
the trunk?


Matthias


* jit-playback.c: Include .

--- a/gcc/jit/jit-playback.c
+++ b/gcc/jit/jit-playback.c
@@ -43,6 +43,8 @@ along with GCC; see the file COPYING3.
 #include "jit-builtins.h"
 #include "jit-tempdir.h"

+#include 
+

 /* gcc::jit::playback::context::build_cast uses the convert.h API,
which in turn requires the frontend to provide a "convert"

Re: Wonly-top-basic-asm

2016-01-25 Thread Segher Boessenkool

Hi David,

On Sun, Jan 24, 2016 at 02:23:53PM -0800, David Wohlferd wrote:
> - Warn that this could change in future versions of gcc.  To avoid 
> impacts from this change, use extended asm.
> - Implement and document -Wonly-top-basic-asm (disabled by default) as a 
> way to locate affected statements.

In my opinion we should not warn for any asm that means the same both
as basic and as extended asm.  The problem then becomes, what *is* the
meaning of a basic asm, what does it clobber.

Currently the only differences are:

- asms that have a % in the string, or {|} on targets with ASSEMBLER_DIALECT;
- ia64 (for stop bits);
- mep, and this one is easily fixed.
- basic asms do not get TARGET_MD_ASM_ADJUST.

Segher
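
To make the two forms concrete, here is a minimal sketch with empty templates so that it does not depend on any target's mnemonics; the function names are hypothetical. Basic asm has no operand or clobber lists. Extended asm states them explicitly, and in its template a literal % has to be written as %%.

```c
/* Basic asm: a bare template string, no operands, no clobber list.  */
void basic_asm_empty (void)
{
  __asm__ ("");
}

/* Extended asm: "+r" tells the compiler that x is both read and
   written in a register, even though the template is empty.  */
int pass_through_extended_asm (int x)
{
  __asm__ ("" : "+r" (x));
  return x;
}

/* Extended asm with an explicit clobber: the "memory" clobber says
   the statement may read or write arbitrary memory.  */
void extended_asm_memory_clobber (void)
{
  __asm__ __volatile__ ("" : : : "memory");
}
```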

Re: [PATCH] jit: Fix missing references to pthread in jit-playback.c

2016-01-25 Thread David Malcolm

On Sat, 2016-01-23 at 19:08 +0100, Iain Buclaw wrote:
> Hi,
> 
> I noticed when building from 2016-01-17 snapshot that the JIT frontend
> failed to build.
> 
> ---
> jit-playback.c:2075:36: error: ‘PTHREAD_MUTEX_INITIALIZER’ was not
> declared in this scope
> jit-playback.c: In member function ‘void
> gcc::jit::playback::context::acquire_mutex()’:
> jit-playback.c:2086:33: error: ‘pthread_mutex_lock’ was not declared
> in this scope
> jit-playback.c: In member function ‘void
> gcc::jit::playback::context::release_mutex()’:
> jit-playback.c:2100:35: error: ‘pthread_mutex_unlock’ was not declared
> in this scope
> ---
> 
> I'm not sure if this is something environmental on my side, or some
> reorder/removals were done in the gcc headers included by the JIT
> frontend, however this was needed in order to continue.

Thanks.  Doko just reported the same issue, and I now see it (with
r232813) so this isn't just at your end.

OK for trunk.

Dave
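
For reference, the three identifiers in the error output are the ones POSIX declares in <pthread.h>. The sketch below is a standalone illustration of their use, assuming a POSIX system; demo_mutex, demo_counter and locked_increment are hypothetical names and this is not the jit-playback.c code.

```c
#include <pthread.h>

/* Statically initialized mutex, as PTHREAD_MUTEX_INITIALIZER allows.  */
static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;
static int demo_counter;

/* Increment the counter under the mutex and return the new value,
   or -1 if locking or unlocking fails.  */
int locked_increment (void)
{
  int value;

  if (pthread_mutex_lock (&demo_mutex) != 0)
    return -1;
  value = ++demo_counter;
  if (pthread_mutex_unlock (&demo_mutex) != 0)
    return -1;
  return value;
}
```

Without the header, each of these names is undeclared, which matches the three errors quoted above.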

[PATCH] PR target/68986: [5/6 Regression] internal compiler error: Segmentation fault

2016-01-25 Thread H.J. Lu

Stack alignment adjustment for __tls_get_addr should be done in
ix86_update_stack_boundary, not ix86_compute_frame_layout.  Also,
there is no need to over-align the stack for __tls_get_addr, and a
function that calls __tls_get_addr isn't a leaf function.
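
As a side note for readers, the 128 in the patch is a boundary in bits, i.e. 16 bytes, and the realignment that pr68986-2.c scans for (andl $-16, %esp) is a plain mask operation. A minimal sketch of that arithmetic, with the hypothetical helper align_down:

```c
#include <stdint.h>

/* Round ADDR down to a multiple of ALIGN.  ALIGN must be a power of
   two.  With ALIGN == 16 the mask is ~15, which has the same bit
   pattern as -16 in two's complement.  */
uintptr_t align_down (uintptr_t addr, uintptr_t align)
{
  return addr & ~(align - 1);
}
```

The stack grows downwards on x86, so rounding the stack pointer down never overlaps data that is already on the stack.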

Tested on x86-64 with -m32 on testsuite.  OK for trunk?

Thanks.

H.J.
---
gcc/

PR target/68986
* config/i386/i386.c (ix86_compute_frame_layout): Move stack
alignment adjustment to ...
(ix86_update_stack_boundary): Here.  Don't over-align stack for
__tls_get_addr.
(ix86_finalize_stack_realign_flags): Use stack_alignment_needed
if __tls_get_addr is called.

gcc/testsuite/

PR target/68986
* gcc.target/i386/pr68986-1.c: New test.
* gcc.target/i386/pr68986-2.c: Likewise.
* gcc.target/i386/pr68986-3.c: Likewise.
---
 gcc/config/i386/i386.c| 24 +++-
 gcc/testsuite/gcc.target/i386/pr68986-1.c | 11 +++
 gcc/testsuite/gcc.target/i386/pr68986-2.c | 13 +
 gcc/testsuite/gcc.target/i386/pr68986-3.c | 13 +
 4 files changed, 48 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr68986-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr68986-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr68986-3.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 34b57a4..9c27ea9 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -11360,18 +11360,6 @@ ix86_compute_frame_layout (struct ix86_frame *frame)
   crtl->preferred_stack_boundary = 128;
   crtl->stack_alignment_needed = 128;
 }
-  /* preferred_stack_boundary is never updated for call
- expanded from tls descriptor. Update it here. We don't update it in
- expand stage because according to the comments before
- ix86_current_function_calls_tls_descriptor, tls calls may be optimized
- away.  */
-  else if (ix86_current_function_calls_tls_descriptor
-  && crtl->preferred_stack_boundary < PREFERRED_STACK_BOUNDARY)
-{
-  crtl->preferred_stack_boundary = PREFERRED_STACK_BOUNDARY;
-  if (crtl->stack_alignment_needed < PREFERRED_STACK_BOUNDARY)
-   crtl->stack_alignment_needed = PREFERRED_STACK_BOUNDARY;
-}
 
   stack_alignment_needed = crtl->stack_alignment_needed / BITS_PER_UNIT;
   preferred_alignment = crtl->preferred_stack_boundary / BITS_PER_UNIT;
@@ -12043,6 +12031,15 @@ ix86_update_stack_boundary (void)
   && cfun->stdarg
   && crtl->stack_alignment_estimated < 128)
 crtl->stack_alignment_estimated = 128;
+
+  /* __tls_get_addr needs to be called with 16-byte aligned stack.  */
+  if (ix86_tls_descriptor_calls_expanded_in_cfun
+  && crtl->preferred_stack_boundary < 128)
+{
+  crtl->preferred_stack_boundary = 128;
+  if (crtl->stack_alignment_needed < 128)
+   crtl->stack_alignment_needed = 128;
+}
 }
 
 /* Handle the TARGET_GET_DRAP_RTX hook.  Return NULL if no DRAP is
@@ -12506,7 +12503,8 @@ ix86_finalize_stack_realign_flags (void)
 = (crtl->parm_stack_boundary > ix86_incoming_stack_boundary
? crtl->parm_stack_boundary : ix86_incoming_stack_boundary);
   unsigned int stack_realign = (incoming_stack_boundary
-   < (crtl->is_leaf
+   < ((crtl->is_leaf
+   && 
!ix86_current_function_calls_tls_descriptor)
   ? crtl->max_used_stack_slot_alignment
   : crtl->stack_alignment_needed));
 
diff --git a/gcc/testsuite/gcc.target/i386/pr68986-1.c 
b/gcc/testsuite/gcc.target/i386/pr68986-1.c
new file mode 100644
index 000..998f34f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr68986-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target tls_native } */
+/* { dg-require-effective-target fpic } */
+/* { dg-options "-fPIC -mno-accumulate-outgoing-args 
-mpreferred-stack-boundary=5 -mincoming-stack-boundary=4" } */
+
+extern __thread int msgdata;
+int
+foo ()
+{
+  return msgdata;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr68986-2.c 
b/gcc/testsuite/gcc.target/i386/pr68986-2.c
new file mode 100644
index 000..23f9a52
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr68986-2.c
@@ -0,0 +1,13 @@
+/* { dg-do compile { target ia32 } } */
+/* { dg-require-effective-target tls_native } */
+/* { dg-require-effective-target fpic } */
+/* { dg-options "-fPIC -mno-accumulate-outgoing-args 
-mpreferred-stack-boundary=2 -m32" } */
+
+extern __thread int msgdata;
+int
+foo ()
+{
+  return msgdata;
+}
+
+/* { dg-final { scan-assembler "andl\[\\t \]*\\$-16,\[\\t \]*%esp" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr68986-3.c 
b/gcc/testsuite/gcc.target/i386/pr68986-3.c
new file mode 100644
index 000..5744cf2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr68986-3.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* {

Re: [PATCH] OpenACC use_device clause ICE fix

2016-01-25 Thread Chung-Lin Tang

On 2016/1/22 12:32 AM, Jakub Jelinek wrote:
> On Thu, Jan 21, 2016 at 10:22:19PM +0800, Chung-Lin Tang wrote:
>> On 2016/1/20 09:17 PM, Bernd Schmidt wrote:
>>> On 01/05/2016 02:15 PM, Chung-Lin Tang wrote:
 * omp-low.c (scan_sharing_clauses): Call add_local_decl() for
 use_device/use_device_ptr variables.
>>>
>>> It looks vaguely plausible, but if everything is part of the host
>>> function, why make a copy of the decl at all? I.e. what happens if you
>>> just remove the install_var_local call?
>>
>> Because (only) inside the OpenMP context, the variable is supposed to
>> contain the device-side value; a runtime call is used to obtain the
>> value from the device back to host.  So a new variable is created, the
>> remap_decl mechanisms are used to change references inside the omp
>> context, and other references of the original variable are not touched.
> 
> The patch looks wrong to me, the var shouldn't be actually used,
> it is supposed to have DECL_VALUE_EXPR set for it during omp lowering and
> the following gimplification is supposed to replace it.
> 
> I've tried the testcases you've listed and couldn't get an ICE, so, if you
> see some ICE, can you mail the testcase (in patch form)?
> Perhaps there is something wrong with the OpenACC lowering?
> 
>   Jakub
> 

I've attached a small testcase that triggers the ICE under -fopenacc. This still
happens under current trunk.

Thanks,
Chung-Lin


void foo (float *x, float *y)
{
  int n = 1 << 20;
  #pragma acc data create(x[0:n]) copyout(y[0:n])
  {
#pragma acc host_data use_device(x,y)
{
  for (int i = 1 ; i < n; i++)
y[0] += x[i] * y[i];
}
  }
}

[gomp4] Merge trunk r232548 (2016-01-19) into gomp-4_0-branch

2016-01-25 Thread Thomas Schwinge

Hi!

Committed to gomp-4_0-branch in r232784:

commit 9cfa5d5eb5fd3b186124883a76232189b359b3de
Merge: 312e74d 56778b6
Author: tschwinge 
Date:   Mon Jan 25 07:35:18 2016 +

svn merge -r 232189:232548 svn+ssh://gcc.gnu.org/svn/gcc/trunk


git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@232784 
138bc75d-0d04-0410-961f-82ee72b054a4


Grüße
 Thomas

[PATCH] Fix PR69393

2016-01-25 Thread Richard Biener


The following fixes an issue with LTO and debug info of OMP vars.

LTO bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2016-01-25  Richard Biener  

PR lto/69393
* dwarf2out.c (is_naming_typedef_decl): Not when DECL_NAMELESS.
* tree-streamer-out.c (pack_ts_base_value_fields): Stream
DECL_NAMELESS.
* tree-streamer-in.c (unpack_ts_base_value_fields): Likewise.

* testsuite/libgomp.c++/pr69393.C: New testcase.

Index: gcc/dwarf2out.c
===
*** gcc/dwarf2out.c (revision 232717)
--- gcc/dwarf2out.c (working copy)
*** is_naming_typedef_decl (const_tree decl)
*** 22970,22975 
--- 22970,22976 
  {
if (decl == NULL_TREE
|| TREE_CODE (decl) != TYPE_DECL
+   || DECL_NAMELESS (decl)
|| !is_tagged_type (TREE_TYPE (decl))
|| DECL_IS_BUILTIN (decl)
|| is_redundant_typedef (decl)
Index: gcc/tree-streamer-out.c
===
*** gcc/tree-streamer-out.c (revision 232717)
--- gcc/tree-streamer-out.c (working copy)
*** pack_ts_base_value_fields (struct bitpac
*** 87,93 
bp_pack_value (bp, TREE_ADDRESSABLE (expr), 1);
bp_pack_value (bp, TREE_THIS_VOLATILE (expr), 1);
if (DECL_P (expr))
! bp_pack_value (bp, DECL_UNSIGNED (expr), 1);
else if (TYPE_P (expr))
  bp_pack_value (bp, TYPE_UNSIGNED (expr), 1);
else
--- 87,96 
bp_pack_value (bp, TREE_ADDRESSABLE (expr), 1);
bp_pack_value (bp, TREE_THIS_VOLATILE (expr), 1);
if (DECL_P (expr))
! {
!   bp_pack_value (bp, DECL_UNSIGNED (expr), 1);
!   bp_pack_value (bp, DECL_NAMELESS (expr), 1);
! }
else if (TYPE_P (expr))
  bp_pack_value (bp, TYPE_UNSIGNED (expr), 1);
else
Index: gcc/tree-streamer-in.c
===
*** gcc/tree-streamer-in.c  (revision 232717)
--- gcc/tree-streamer-in.c  (working copy)
*** unpack_ts_base_value_fields (struct bitp
*** 116,122 
TREE_ADDRESSABLE (expr) = (unsigned) bp_unpack_value (bp, 1);
TREE_THIS_VOLATILE (expr) = (unsigned) bp_unpack_value (bp, 1);
if (DECL_P (expr))
! DECL_UNSIGNED (expr) = (unsigned) bp_unpack_value (bp, 1);
else if (TYPE_P (expr))
  TYPE_UNSIGNED (expr) = (unsigned) bp_unpack_value (bp, 1);
else
--- 116,125 
TREE_ADDRESSABLE (expr) = (unsigned) bp_unpack_value (bp, 1);
TREE_THIS_VOLATILE (expr) = (unsigned) bp_unpack_value (bp, 1);
if (DECL_P (expr))
! {
!   DECL_UNSIGNED (expr) = (unsigned) bp_unpack_value (bp, 1);
!   DECL_NAMELESS (expr) = (unsigned) bp_unpack_value (bp, 1);
! }
else if (TYPE_P (expr))
  TYPE_UNSIGNED (expr) = (unsigned) bp_unpack_value (bp, 1);
else
Index: libgomp/testsuite/libgomp.c++/pr69393.C
===
*** libgomp/testsuite/libgomp.c++/pr69393.C (revision 0)
--- libgomp/testsuite/libgomp.c++/pr69393.C (working copy)
***
*** 0 
--- 1,16 
+ // { dg-do run }
+ // { dg-require-effective-target lto }
+ // { dg-options "-flto -g -fopenmp" }
+ 
+ int e = 5;
+ 
+ int
+ main ()
+ {
+   int a[e];
+   a[0] = 6;
+ #pragma omp parallel
+   if (a[0] != 6)
+ __builtin_abort ();
+   return 0;
+ }
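
The two tree-streamer hunks have to stay in step: every bit the writer packs must be unpacked by the reader in the same order. The sketch below shows that invariant with a hypothetical miniature packer. The demo_* names are made up and GCC's real bitpack API differs.

```c
#include <stdint.h>

/* Hypothetical miniature bit packer; fields are at most 32 bits.  */
struct demo_bitpack
{
  uint64_t word;
  unsigned pos;
};

static void demo_pack (struct demo_bitpack *bp, unsigned value,
		       unsigned nbits)
{
  uint64_t mask = (UINT64_C (1) << nbits) - 1;
  bp->word |= ((uint64_t) value & mask) << bp->pos;
  bp->pos += nbits;
}

static unsigned demo_unpack (struct demo_bitpack *bp, unsigned nbits)
{
  uint64_t mask = (UINT64_C (1) << nbits) - 1;
  unsigned value = (unsigned) ((bp->word >> bp->pos) & mask);
  bp->pos += nbits;
  return value;
}

/* Stand-in for the flags of a decl.  */
struct demo_decl
{
  unsigned addressable, is_unsigned, nameless;
};

/* Writer: the order here defines the stream layout.  */
uint64_t demo_write (const struct demo_decl *d)
{
  struct demo_bitpack bp = { 0, 0 };
  demo_pack (&bp, d->addressable, 1);
  demo_pack (&bp, d->is_unsigned, 1);
  demo_pack (&bp, d->nameless, 1);	/* the newly streamed flag */
  return bp.word;
}

/* Reader: must consume the fields in exactly the writer's order.  */
struct demo_decl demo_read (uint64_t word)
{
  struct demo_bitpack bp = { word, 0 };
  struct demo_decl d;
  d.addressable = demo_unpack (&bp, 1);
  d.is_unsigned = demo_unpack (&bp, 1);
  d.nameless = demo_unpack (&bp, 1);
  return d;
}
```

If only one side were changed, every field after the new bit would be read from the wrong position.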

Re: [aarch64] Improve TImode constant moves

2016-01-25 Thread Kyrill Tkachov


Hi Richard,

On 24/01/16 10:54, Richard Henderson wrote:

This looks to be an incomplete transition of the aarch64 backend to 
CONST_WIDE_INT.  I haven't checked to see if it's a regression from gcc5, but I 
suspect not, since there should have been similar checks for CONST_DOUBLE.



FWIW, I defined TARGET_SUPPORTS_WIDE_INT for aarch64 on trunk and the GCC 5 
branch in order to fix
PR 68129.


This is probably gcc7 fodder, but it helped me debug another TImode PR.


r~


+case CONST_WIDE_INT:
+  *cost = 0;
+  for (unsigned int n = CONST_WIDE_INT_NUNITS(x), i = 0; i < n; ++i)
+   {
+ unsigned HOST_WIDE_INT e = CONST_WIDE_INT_ELT(x, i);
+ if (e != 0)
+   *cost += COSTS_N_INSNS (aarch64_internal_mov_immediate
+   (NULL_RTX, GEN_INT (e), false, DImode));
+   }
+  return true;
+

We usually avoid creating intermediate rtxes in the cost function because
it can potentially be called many times during compilation and we want to avoid
creating too many short-lived objects, though I suppose there's no way of getting
around this one (the GEN_INT call).
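
To illustrate the shape of the per-element loop without any rtx allocation, here is a sketch on plain integers. The instruction counter is a deliberately simplified, hypothetical stand-in (one instruction per nonzero 16-bit chunk) and not what aarch64_internal_mov_immediate actually computes. The names mov_imm_insns and wide_const_insns are made up.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified assumption: one move instruction per nonzero 16-bit
   chunk of the value, and at least one instruction overall.  */
static unsigned mov_imm_insns (uint64_t v)
{
  unsigned n = 0;

  for (int shift = 0; shift < 64; shift += 16)
    if ((v >> shift) & 0xffff)
      n++;
  return n ? n : 1;
}

/* Sum the per-element counts over the 64-bit elements of a wide
   constant, skipping zero elements as the quoted hunk does.  */
unsigned wide_const_insns (const uint64_t *elts, size_t nelts)
{
  unsigned total = 0;

  for (size_t i = 0; i < nelts; i++)
    if (elts[i] != 0)
      total += mov_imm_insns (elts[i]);
  return total;
}
```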

Thanks,
Kyrill

Re: [PATCH] OpenACC use_device clause ICE fix

2016-01-25 Thread Jakub Jelinek

On Mon, Jan 25, 2016 at 05:52:56PM +0900, Chung-Lin Tang wrote:
> I've attached a small testcase that triggers the ICE under -fopenacc. This
> still happens under current trunk.

Then I think I'd prefer (untested so far):

2016-01-25  Jakub Jelinek  

* omp-low.c (lower_omp_target) : Set
DECL_VALUE_EXPR of new_var even for the non-array case.  Look
through DECL_VALUE_EXPR for expansion.

* c-c++-common/goacc/use_device-1.c: New test.

--- gcc/omp-low.c.jj2016-01-21 00:55:19.0 +0100
+++ gcc/omp-low.c   2016-01-25 10:45:30.995510057 +0100
@@ -15878,6 +15878,14 @@ lower_omp_target (gimple_stmt_iterator *
SET_DECL_VALUE_EXPR (new_var, x);
DECL_HAS_VALUE_EXPR_P (new_var) = 1;
  }
+   else
+ {
+   tree new_var = lookup_decl (var, ctx);
+   x = create_tmp_var_raw (TREE_TYPE (new_var), get_name (new_var));
+   gimple_add_tmp_var (x);
+   SET_DECL_VALUE_EXPR (new_var, x);
+   DECL_HAS_VALUE_EXPR_P (new_var) = 1;
+ }
break;
   }
 
@@ -16493,6 +16501,7 @@ lower_omp_target (gimple_stmt_iterator *
x = build_fold_addr_expr (v);
  }
  }
+   new_var = DECL_VALUE_EXPR (new_var);
x = fold_convert (TREE_TYPE (new_var), x);
gimplify_expr (, _body, NULL, is_gimple_val, fb_rvalue);
gimple_seq_add_stmt (_body,
--- gcc/testsuite/c-c++-common/goacc/use_device-1.c.jj  2016-01-25 
10:56:33.472310437 +0100
+++ gcc/testsuite/c-c++-common/goacc/use_device-1.c 2016-01-25 
10:56:43.128176481 +0100
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+
+void
+foo (float *x, float *y)
+{
+  int n = 1 << 20;
+#pragma acc data create(x[0:n]) copyout(y[0:n])
+  {
+#pragma acc host_data use_device(x,y)
+{
+  for (int i = 1; i < n; i++)
+   y[0] += x[i] * y[i];
+}
+  }
+}

Jakub

[libmpx, committed] Fix verbosity for error messages

2016-01-25 Thread Ilya Enkovich

Hi,

This is an obvious patch fixing the verbosity level used for part of the error messages.
Bootstrapped on x86_64-pc-linux-gnu.  Applied to trunk and gcc-5-branch.

Thanks,
Ilya
--
libmpx/

2016-01-20  Ilya Enkovich  

* mpxrt/mpxrt.c (handler): Fix verbosity for
error message.


diff --git a/libmpx/mpxrt/mpxrt.c b/libmpx/mpxrt/mpxrt.c
index bcdd3a6..b52906b 100644
--- a/libmpx/mpxrt/mpxrt.c
+++ b/libmpx/mpxrt/mpxrt.c
@@ -268,7 +268,7 @@ handler (int sig __attribute__ ((unused)),
   __mpxrt_write_uint (VERB_ERROR, trapno, 10);
   __mpxrt_write (VERB_ERROR, ", ip = 0x");
   __mpxrt_write_uint (VERB_ERROR, ip, 16);
-  __mpxrt_write (VERB_BR, "\n");
+  __mpxrt_write (VERB_ERROR, "\n");
   exit (255);
 }
   else
@@ -277,7 +277,7 @@ handler (int sig __attribute__ ((unused)),
   __mpxrt_write_uint (VERB_ERROR, trapno, 10);
   __mpxrt_write (VERB_ERROR, "! at 0x");
   __mpxrt_write_uint (VERB_ERROR, ip, 16);
-  __mpxrt_write (VERB_BR, "\n");
+  __mpxrt_write (VERB_ERROR, "\n");
   exit (255);
 }
 }
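
The effect of the mismatched level can be sketched as follows. Everything here is hypothetical: the demo_* names, the ordering of the levels, and the rule that a message is written only when its level does not exceed the configured one are assumptions, not the libmpx implementation.

```c
#include <string.h>

/* Assumed ordering, from least to most verbose.  */
enum demo_verbosity
{
  DEMO_VERB_ERROR,
  DEMO_VERB_INFO,
  DEMO_VERB_BR,
  DEMO_VERB_DEBUG
};

static char demo_log[128];
static enum demo_verbosity demo_level = DEMO_VERB_ERROR;

/* Append STR to the log only if its level is enabled.  */
static void demo_write (enum demo_verbosity vt, const char *str)
{
  if (vt <= demo_level
      && strlen (demo_log) + strlen (str) < sizeof demo_log)
    strcat (demo_log, str);
}

/* Before: the newline uses a more verbose level than the message.  */
const char *report_before_fix (void)
{
  demo_log[0] = '\0';
  demo_write (DEMO_VERB_ERROR, "error at 0x1234");
  demo_write (DEMO_VERB_BR, "\n");
  return demo_log;
}

/* After: message and newline share the same level.  */
const char *report_after_fix (void)
{
  demo_log[0] = '\0';
  demo_write (DEMO_VERB_ERROR, "error at 0x1234");
  demo_write (DEMO_VERB_ERROR, "\n");
  return demo_log;
}
```

Under these assumptions, at the lowest verbosity the old sequence prints the error text without its terminating newline.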

Re: [PATCH] OpenACC use_device clause ICE fix

2016-01-25 Thread Jakub Jelinek

On Mon, Jan 25, 2016 at 11:02:05AM +0100, Jakub Jelinek wrote:
> On Mon, Jan 25, 2016 at 10:58:17AM +0100, Jakub Jelinek wrote:
> > --- gcc/testsuite/c-c++-common/goacc/use_device-1.c.jj  2016-01-25 
> > 10:56:33.472310437 +0100
> > +++ gcc/testsuite/c-c++-common/goacc/use_device-1.c 2016-01-25 
> > 10:56:43.128176481 +0100
> > @@ -0,0 +1,15 @@
> > +/* { dg-do compile } */
> > +
> > +void
> > +foo (float *x, float *y)
> > +{
> > +  int n = 1 << 20;
> > +#pragma acc data create(x[0:n]) copyout(y[0:n])
> > +  {
> > +#pragma acc host_data use_device(x,y)
> > +{
> > +  for (int i = 1; i < n; i++)
> > +   y[0] += x[i] * y[i];
> > +}
> > +  }
> > +}
> 
> Though the testcase looks invalid to me, how can you dereference
> the device pointer on the host?  Though, for a testcase that it doesn't ICE
> maybe good enough.

The following ICEs without the patch and works with it, so I think it is
better:

2016-01-25  Jakub Jelinek  

* omp-low.c (lower_omp_target) : Set
DECL_VALUE_EXPR of new_var even for the non-array case.  Look
through DECL_VALUE_EXPR for expansion.

* c-c++-common/goacc/use_device-1.c: New test.

--- gcc/omp-low.c.jj2016-01-21 00:55:19.0 +0100
+++ gcc/omp-low.c   2016-01-25 10:45:30.995510057 +0100
@@ -15878,6 +15878,14 @@ lower_omp_target (gimple_stmt_iterator *
SET_DECL_VALUE_EXPR (new_var, x);
DECL_HAS_VALUE_EXPR_P (new_var) = 1;
  }
+   else
+ {
+   tree new_var = lookup_decl (var, ctx);
+   x = create_tmp_var_raw (TREE_TYPE (new_var), get_name (new_var));
+   gimple_add_tmp_var (x);
+   SET_DECL_VALUE_EXPR (new_var, x);
+   DECL_HAS_VALUE_EXPR_P (new_var) = 1;
+ }
break;
   }
 
@@ -16493,6 +16501,7 @@ lower_omp_target (gimple_stmt_iterator *
x = build_fold_addr_expr (v);
  }
  }
+   new_var = DECL_VALUE_EXPR (new_var);
x = fold_convert (TREE_TYPE (new_var), x);
gimplify_expr (&x, &new_body, NULL, is_gimple_val, fb_rvalue);
gimple_seq_add_stmt (&new_body,
--- gcc/testsuite/c-c++-common/goacc/use_device-1.c.jj  2016-01-25 
10:56:33.472310437 +0100
+++ gcc/testsuite/c-c++-common/goacc/use_device-1.c 2016-01-25 
10:56:43.128176481 +0100
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+
+void bar (float *, float *);
+
+void
+foo (float *x, float *y)
+{
+  int n = 1 << 10;
+#pragma acc data create(x[0:n]) copyout(y[0:n])
+  {
+#pragma acc host_data use_device(x,y)
+bar (x, y);
+  }
+}


Jakub

[PATCH] Fix PR69376

2016-01-25 Thread Richard Biener


The following makes SCCVN properly save/restore SSA_NAME_ANTI_RANGE_P.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.
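
A toy model may make the failure mode easier to see.  The sketch below is
illustrative C, not GCC code: toy_name, toy_aux, toy_override and
toy_restore are hypothetical stand-ins for an SSA name, its vn_ssa_aux
entry, and the save/restore done around value numbering.  It assumes range
info is a pair of bounds plus a separate flag saying whether the pair is an
anti-range, so restoring the bounds without the flag would silently change
what the range means.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy range info: bounds only.  The range/anti-range kind lives on the
   name itself, separately from the bounds.  */
struct toy_range { long min, max; };

struct toy_name {
  const struct toy_range *range;  /* NULL when no range is known.  */
  bool anti_range_p;              /* true: value is NOT in [min, max].  */
};

/* Side table entry holding the original info while it is overridden.  */
struct toy_aux {
  const struct toy_range *saved_range;
  bool saved_anti_range_p;
};

/* Does VALUE satisfy the range info currently attached to NAME?  */
bool toy_value_allowed (const struct toy_name *name, long value)
{
  if (name->range == NULL)
    return true;
  bool inside = value >= name->range->min && value <= name->range->max;
  return name->anti_range_p ? !inside : inside;
}

/* Give TO the range info of FROM, saving the old bounds AND the old
   anti-range flag the first time this happens.  */
void toy_override (struct toy_name *to, const struct toy_name *from,
                   struct toy_aux *aux)
{
  if (aux->saved_range == NULL)
    {
      aux->saved_range = to->range;
      aux->saved_anti_range_p = to->anti_range_p;
    }
  to->range = from->range;
  to->anti_range_p = from->anti_range_p;
}

/* Put back both halves of the saved info.  */
void toy_restore (struct toy_name *to, const struct toy_aux *aux)
{
  if (aux->saved_range != NULL)
    {
      to->range = aux->saved_range;
      to->anti_range_p = aux->saved_anti_range_p;
    }
}
```

If toy_restore put back only the bounds, a name that started as "not in
[0, 9]" would come back as "in [0, 9]", which is the kind of mismatch the
new flag is meant to prevent.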

2016-01-25  Richard Biener  

PR tree-optimization/69376
* tree-ssa-sccvn.h (struct vn_ssa_aux): Add range_info_anti_range_p
flag.
(VN_INFO_ANTI_RANGE_P): New inline.
(VN_INFO_RANGE_TYPE): Likewise.
* tree-ssa-sccvn.c (set_ssa_val_to): Also record and copy
SSA_NAME_ANTI_RANGE_P.
(free_scc_vn): Restore SSA_NAME_ANTI_RANGE_P.
* tree-ssa-pre.c (eliminate_dom_walker::before_dom_children):
Properly query VN_INFO_RANGE_TYPE.

* gcc.dg/torture/pr69376.c: New testcase.

Index: gcc/tree-ssa-sccvn.h
===
*** gcc/tree-ssa-sccvn.h(revision 232717)
--- gcc/tree-ssa-sccvn.h(working copy)
*** typedef struct vn_ssa_aux
*** 191,196 
--- 191,199 
   insertion of such with EXPR as definition is required before
   a use can be created of it.  */
unsigned needs_insertion : 1;
+ 
+   /* Whether range-info is anti-range.  */
+   unsigned range_info_anti_range_p : 1;
  } *vn_ssa_aux_t;
  
  enum vn_lookup_kind { VN_NOWALK, VN_WALK, VN_WALKREWRITE };
*** VN_INFO_RANGE_INFO (tree name)
*** 253,258 
--- 256,279 
  : SSA_NAME_RANGE_INFO (name));
  }
  
+ /* Whether the original range info of NAME is an anti-range.  */
+ 
+ inline bool
+ VN_INFO_ANTI_RANGE_P (tree name)
+ {
+   return (VN_INFO (name)->info.range_info
+ ? VN_INFO (name)->range_info_anti_range_p
+ : SSA_NAME_ANTI_RANGE_P (name));
+ }
+ 
+ /* Get at the original range info kind for NAME.  */
+ 
+ inline value_range_type
+ VN_INFO_RANGE_TYPE (tree name)
+ {
+   return VN_INFO_ANTI_RANGE_P (name) ? VR_ANTI_RANGE : VR_RANGE;
+ }
+ 
  /* Get at the original pointer info for NAME.  */
  
  inline ptr_info_def *
Index: gcc/tree-ssa-pre.c
===
*** gcc/tree-ssa-pre.c  (revision 232717)
--- gcc/tree-ssa-pre.c  (working copy)
*** eliminate_dom_walker::before_dom_childre
*** 4047,4053 
   && ! VN_INFO_RANGE_INFO (sprime)
   && b == sprime_b)
duplicate_ssa_name_range_info (sprime,
!  SSA_NAME_RANGE_TYPE (lhs),
   VN_INFO_RANGE_INFO (lhs));
}
  
--- 4047,4053 
   && ! VN_INFO_RANGE_INFO (sprime)
   && b == sprime_b)
duplicate_ssa_name_range_info (sprime,
!  VN_INFO_RANGE_TYPE (lhs),
   VN_INFO_RANGE_INFO (lhs));
}
  
Index: gcc/testsuite/gcc.dg/torture/pr69376.c
===
*** gcc/testsuite/gcc.dg/torture/pr69376.c  (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr69376.c  (working copy)
***
*** 0 
--- 1,45 
+ /* { dg-do run } */
+ /* { dg-require-effective-target int32plus } */
+ 
+ int printf (const char *, ...); 
+ 
+ unsigned a, c, *d, f;
+ char b, e;
+ short g;
+ 
+ void
+ fn1 ()
+ {
+   unsigned h = 4294967290;
+   if (b >= 0)
+ {
+   h = b;
+   c = b / 290;
+   f = ~(c - (8 || h));
+   if (f)
+   printf ("%d\n", 1);
+   if (f)
+   printf ("%d\n", f);
+   g = ~f;
+   if (c < 3)
+   {
+ int i = -h < ~c;
+ unsigned j;
+ if (i)
+   j = h;
+ h = -j * g;
+   }
+   c = h;
+ }
+   unsigned k = ~h;
+   char l = e || g;
+   if (l < 1 || k < 7)
+ *d = a;
+ }
+ 
+ int
+ main ()
+ {
+   fn1 ();
+   return 0;
+ }
Index: gcc/tree-ssa-sccvn.c
===
*** gcc/tree-ssa-sccvn.c(revision 232717)
--- gcc/tree-ssa-sccvn.c(working copy)
*** set_ssa_val_to (tree from, tree to)
*** 3139,3153 
{
  /* Save old info.  */
  if (! VN_INFO (to)->info.range_info)
!   VN_INFO (to)->info.range_info = SSA_NAME_RANGE_INFO (to);
  /* Use that from the dominator.  */
  SSA_NAME_RANGE_INFO (to) = SSA_NAME_RANGE_INFO (from);
}
  else
{
  /* Save old info.  */
  if (! VN_INFO (to)->info.range_info)
!   VN_INFO (to)->info.range_info = SSA_NAME_RANGE_INFO (to);
  /* Rather than allocating memory and unioning the info
 just clear it.  */
  SSA_NAME_RANGE_INFO (to) = NULL;
--- 3139,3162 
{
  /* Save old info.  */
  if (! VN_INFO (to)->info.range_info)
!   {
!

Re: [PATCH] OpenACC use_device clause ICE fix

2016-01-25 Thread Jakub Jelinek

On Mon, Jan 25, 2016 at 10:58:17AM +0100, Jakub Jelinek wrote:
> --- gcc/testsuite/c-c++-common/goacc/use_device-1.c.jj2016-01-25 
> 10:56:33.472310437 +0100
> +++ gcc/testsuite/c-c++-common/goacc/use_device-1.c   2016-01-25 
> 10:56:43.128176481 +0100
> @@ -0,0 +1,15 @@
> +/* { dg-do compile } */
> +
> +void
> +foo (float *x, float *y)
> +{
> +  int n = 1 << 20;
> +#pragma acc data create(x[0:n]) copyout(y[0:n])
> +  {
> +#pragma acc host_data use_device(x,y)
> +{
> +  for (int i = 1; i < n; i++)
> + y[0] += x[i] * y[i];
> +}
> +  }
> +}

Though the testcase looks invalid to me: how can you dereference
the device pointer on the host?  Still, as a test that the compiler doesn't
ICE it may be good enough.

Jakub

Re: [gomp4, PR68977, Committed] Don't gimplify in ssa mode if seen_error in oacc_xform_loop

2016-01-25 Thread Tom de Vries


On 14/01/16 10:43, Richard Biener wrote:
> On Wed, Jan 13, 2016 at 9:04 PM, Tom de Vries wrote:
> > Hi,
> >
> > At r231739, there was an ICE when checking code generated by
> > oacc_xform_loop, in case the source contained an error.
> >
> > Due to seen_error (), gimplification during oacc_xform_loop bailed out, and
> > an uninitialized var was introduced.  Because of gimplifying in ssa mode,
> > that caused an ICE.
> >
> > I can't reproduce this any longer, but I think the fix still makes sense.
> > The patch makes sure oacc_xform_loop gimplifies in non-ssa mode if
> > seen_error ().
>
> I don't think it makes "sense" in any way.  After seen_error () a following ICE
> will be "confused after earlier errors" in release mode and thus I think that's
> not an important problem to paper over with this kind of "hack".
>
> I'd rather avoid doing any of omp-low if seen_error ()?

The error triggered in oacc_device_lower, so there's nothing we can do
before (in omp-low).

How about this fix, which replaces the oacc ifn calls with
zero-assignments if seen_error ()?

Thanks,
- Tom

Ignore oacc ifn if seen_error in execute_oacc_device_lower

---
 gcc/omp-low.c | 39 ++-
 1 file changed, 34 insertions(+), 5 deletions(-)

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 2de3aeb..f678f05 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -20201,7 +20201,7 @@ execute_oacc_device_lower ()
 
 	/* Rewind to allow rescan.  */
 	gsi_prev (&gsi);
-	bool rescan = false, remove = false;
+	bool rescan = false, remove = false, assign_zero = false;
 	enum  internal_fn ifn_code = gimple_call_internal_fn (call);
 
 	switch (ifn_code)
@@ -20209,11 +20209,25 @@ execute_oacc_device_lower ()
 	  default: break;
 
 	  case IFN_GOACC_LOOP:
+	if (seen_error ())
+	  {
+		remove = true;
+		assign_zero = true;
+		break;
+	  }
+
 	oacc_xform_loop (call);
 	rescan = true;
 	break;
 
 	  case IFN_GOACC_REDUCTION:
+	if (seen_error ())
+	  {
+		remove = true;
+		assign_zero = true;
+		break;
+	  }
+
 	/* Mark the function for SSA renaming.  */
 	mark_virtual_operands_for_renaming (cfun);
 
@@ -20228,6 +20242,13 @@ execute_oacc_device_lower ()
 
 	  case IFN_UNIQUE:
 	{
+	  if (seen_error ())
+		{
+		  remove = true;
+		  assign_zero = true;
+		  break;
+		}
+
 	  enum ifn_unique_kind kind
 		= ((enum ifn_unique_kind)
 		   TREE_INT_CST_LOW (gimple_call_arg (call, 0)));
@@ -20266,11 +20287,19 @@ execute_oacc_device_lower ()
 	  {
 	if (gimple_vdef (call))
 	  replace_uses_by (gimple_vdef (call), gimple_vuse (call));
-	if (gimple_call_lhs (call))
+	tree lhs = gimple_call_lhs (call);
+	if (lhs != NULL_TREE)
 	  {
-		/* Propagate the data dependency var.  */
-		gimple *ass = gimple_build_assign (gimple_call_lhs (call),
-		   gimple_call_arg (call, 1));
+		gimple *ass;
+		if (assign_zero)
+		  {
+		tree zero = build_zero_cst (TREE_TYPE (lhs));
+		ass = gimple_build_assign (lhs, zero);
+		  }
+		else
+		  /* Propagate the data dependency var.  */
+		  ass = gimple_build_assign (lhs, gimple_call_arg (call, 1));
+
 		gsi_replace (&gsi, ass,  false);
 	  }
 	else

Re: [AArch64] Remove AARCH64_EXTRA_TUNE_RECIP_SQRT from Cortex-A57 tuning

2016-01-25 Thread James Greenhalgh

On Mon, Jan 11, 2016 at 12:04:43PM +0000, James Greenhalgh wrote:
> 
> Hi,
> 
> I've seen a couple of large performance issues caused by expanding
> the high-precision reciprocal square root for Cortex-A57, so I'd like
> to turn it off by default.
> 
> This is good for art (~2%) from Spec2000, bad (~3.5%) for fma3d from
> Spec2000, good (~5.5%) for gromacs from Spec2006, and very good (>10%) for
> some private microbenchmark kernels which stress the divide/sqrt/multiply
> units. It therefore seems to me to be the correct choice to make across
> a number of workloads.
> 
> Bootstrapped and tested on aarch64-none-linux-gnu with no issues.
> 
> OK?

*Ping*

Thanks,
James

> ---
> 2015-12-11  James Greenhalgh  
> 
>   * config/aarch64/aarch64.c (cortexa57_tunings): Remove
>   AARCH64_EXTRA_TUNE_RECIP_SQRT.
> 

> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 1d5d898..999c9fc 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -484,8 +484,7 @@ static const struct tune_params cortexa57_tunings =
>0, /* max_case_values.  */
>0, /* cache_line_size.  */
>tune_params::AUTOPREFETCHER_WEAK,  /* autoprefetcher_model.  */
> -  (AARCH64_EXTRA_TUNE_RENAME_FMA_REGS
> -   | AARCH64_EXTRA_TUNE_RECIP_SQRT)  /* tune_flags.  */
> +  (AARCH64_EXTRA_TUNE_RENAME_FMA_REGS)   /* tune_flags.  */
>  };
>  
>  static const struct tune_params cortexa72_tunings =

[PATCH, PR69421] Check vector types of COND_EXPR operands are compatible when vectorizing it

2016-01-25 Thread Ilya Enkovich

Hi,

This patch covers one more case when boolean operands get different
vectypes and we don't detect it.

Bootstrapped and regtested on x86_64-pc-linux-gnu.  OK for trunk?

Thanks,
Ilya
--
gcc/

2016-01-25  Ilya Enkovich  

PR target/69421
* tree-vect-stmts.c (vectorizable_condition): Check vectype
of operands is compatible with a statement vectype.

gcc/testsuite/

2016-01-25  Ilya Enkovich  

PR target/69421
* gcc.dg/pr69421.c: New test.


diff --git a/gcc/testsuite/gcc.dg/pr69421.c b/gcc/testsuite/gcc.dg/pr69421.c
new file mode 100644
index 000..252e22c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr69421.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+struct A { double a; };
+double a;
+
+void
+foo (_Bool *x)
+{
+  long i;
+  for (i = 0; i < 64; i++)
+{
+  struct A c;
+  x[i] = c.a || a;
+}
+}
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 1d2246d..ed2ce07 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -7528,6 +7528,7 @@ vectorizable_condition (gimple *stmt, 
gimple_stmt_iterator *gsi,
 
   tree vectype = STMT_VINFO_VECTYPE (stmt_info);
   int nunits = TYPE_VECTOR_SUBPARTS (vectype);
+  tree vectype1 = NULL_TREE, vectype2 = NULL_TREE;
 
   if (slp_node || PURE_SLP_STMT (stmt_info))
 ncopies = 1;
@@ -7547,9 +7548,17 @@ vectorizable_condition (gimple *stmt, 
gimple_stmt_iterator *gsi,
 return false;
 
   gimple *def_stmt;
-  if (!vect_is_simple_use (then_clause, stmt_info->vinfo, &def_stmt, &dt))
+  if (!vect_is_simple_use (then_clause, stmt_info->vinfo, &def_stmt, &dt,
+  &vectype1))
+return false;
+  if (!vect_is_simple_use (else_clause, stmt_info->vinfo, &def_stmt, &dt,
+  &vectype2))
 return false;
-  if (!vect_is_simple_use (else_clause, stmt_info->vinfo, &def_stmt, &dt))
+
+  if (vectype1 && !useless_type_conversion_p (vectype, vectype1))
+return false;
+
+  if (vectype2 && !useless_type_conversion_p (vectype, vectype2))
 return false;
 
   masked = !COMPARISON_CLASS_P (cond_expr);

RE: [PATCH] [ARC] Add basic support for double load and store instructions

2016-01-25 Thread Claudiu Zissulescu

Committed r232788

Thanks,
Claudiu

> -Original Message-
> From: Joern Wolfgang Rennecke [mailto:g...@amylaar.uk]
> Sent: Sunday, January 24, 2016 3:26 PM
> To: Claudiu Zissulescu; gcc-patches@gcc.gnu.org
> Cc: Francois Bedard; jeremy.benn...@embecosm.com
> Subject: Re: [PATCH] [ARC] Add basic support for double load and store
> instructions
> 
> 
> 
> On 22/01/16 11:59, Claudiu Zissulescu wrote:
> > Thank you for the feedback. I hope this new patch solves the outstanding
> > issues. Please find it attached.
> 
> This is OK.

Re: [Patch AArch64] Use software sqrt expansion always for -mlow-precision-recip-sqrt

2016-01-25 Thread James Greenhalgh

On Mon, Jan 11, 2016 at 11:53:39AM +0000, James Greenhalgh wrote:
> 
> Hi,
> 
> I'd like to switch the logic around in aarch64.c such that
> -mlow-precision-recip-sqrt causes us to always emit the low-precision
> software expansion for reciprocal square root. I have two reasons to do
> this; first is consistency across -mcpu targets, second is enabling more
> -mcpu targets to use the flag for peak tuning.
> 
> I don't much like that the precision we use for -mlow-precision-recip-sqrt
> differs between cores (and possibly compiler revisions). Yes, we're
> under -ffast-math but I take this flag to mean the user explicitly wants the
> low-precision expansion, and we should not diverge from that based on an
> internal decision as to what is optimal for performance in the
> high-precision case. I'd prefer to keep things as predictable as possible,
> and here that means always emitting the low-precision expansion when asked.
> 
> Judging by the comments in the thread proposing the reciprocal square
> root optimisation, this will benefit all cores currently supported by GCC.
> To be clear, we would still not expand in the high-precision case for any
> cores which do not explicitly ask for it. Currently that is Cortex-A57
> and xgene, though I will be proposing a patch to remove Cortex-A57 from
> that list shortly.
> 
> Which gives my second motivation for this patch. -mlow-precision-recip-sqrt
> is intended as a tuning flag for situations where performance is more
> important than precision, but the current logic requires setting an
> internal flag which also changes the performance characteristics where
> high-precision is needed. This conflates two decisions the target might
> want to make, and reduces the applicability of an option targets might
> want to enable for performance. In particular, I'd still like to see
> -mlow-precision-recip-sqrt continue to emit the cheaper, low-precision
> sequence for floats under Cortex-A57.
> 
> Based on that reasoning, this patch makes the appropriate change to the
> logic. I've checked with the current -mcpu values to ensure that behaviour
> without -mlow-precision-recip-sqrt does not change, and that behaviour
> with -mlow-precision-recip-sqrt is to emit the low precision sequences.
> 
> I've also put this through bootstrap and test on aarch64-none-linux-gnu
> with no issues.
> 
> OK?

*Ping*

Thanks,
James

> 2015-12-10  James Greenhalgh  
> 
>   * config/aarch64/aarch64.c (use_rsqrt_p): Always use software
>   reciprocal sqrt for -mlow-precision-recip-sqrt.
> 

> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 9142ac0..1d5d898 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -7485,8 +7485,9 @@ use_rsqrt_p (void)
>  {
>return (!flag_trapping_math
> && flag_unsafe_math_optimizations
> -   && (aarch64_tune_params.extra_tuning_flags
> -   & AARCH64_EXTRA_TUNE_RECIP_SQRT));
> +   && ((aarch64_tune_params.extra_tuning_flags
> +& AARCH64_EXTRA_TUNE_RECIP_SQRT)
> +   || flag_mrecip_low_precision_sqrt));
>  }
>  
>  /* Function to decide when to use
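
The change to use_rsqrt_p above is a small piece of boolean logic, and a
standalone model may help when checking which flag combinations enable the
expansion.  This is a rough sketch in plain C, not the aarch64.c code: the
function name, the parameters (which stand in for GCC's global flags and
tuning structure) and the bit value of TOY_TUNE_RECIP_SQRT are all
hypothetical.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the AARCH64_EXTRA_TUNE_RECIP_SQRT bit.  */
#define TOY_TUNE_RECIP_SQRT 0x2u

/* Model of the patched predicate: the software expansion is used under
   non-trapping, unsafe math when EITHER the core's tuning asks for it
   OR the user passed -mlow-precision-recip-sqrt.  */
bool toy_use_rsqrt_p (bool trapping_math, bool unsafe_math,
                      unsigned tune_flags, bool low_precision_recip_sqrt)
{
  return (!trapping_math
          && unsafe_math
          && ((tune_flags & TOY_TUNE_RECIP_SQRT) != 0
              || low_precision_recip_sqrt));
}
```

Before the patch the last operand of the || was absent, so the option alone
could not enable the expansion on a core whose tuning flags lacked the bit.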

Re: Patch RFA: Add option -fcollectible-pointers, use it in ivopts

2016-01-25 Thread Bernd Schmidt


On 01/23/2016 12:52 AM, Ian Lance Taylor wrote:
> 2016-01-22  Ian Lance Taylor
>
> * common.opt (fkeep-gc-roots-live): New option.
> * tree-ssa-loop-ivopts.c (add_candidate_1): If
> -fkeep-gc-roots-live, skip pointers.
> (add_iv_candidate_for_biv): Handle add_candidate_1 returning
> NULL.
> * doc/invoke.texi (Optimize Options): Document
> -fkeep-gc-roots-live.
>
> gcc/testsuite/ChangeLog:
>
> 2016-01-22  Ian Lance Taylor
>
> * gcc.dg/tree-ssa/ivopt_5.c: New test.

Patch not attached?

Bernd

Re: [AARCH64][ACLE][NEON] Implement vcvt_s64_f64 and vcvt_u64_f64 NEON intrinsics.

2016-01-25 Thread James Greenhalgh

On Thu, Jan 21, 2016 at 12:32:07PM +0000, James Greenhalgh wrote:
> On Wed, Jan 13, 2016 at 05:44:30PM +0000, Bilyan Borisov wrote:
> > This patch implements all the vcvtR_s64_f64 and vcvtR_u64_f64 vector
> > intrinsics, where R is ['',a,m,n,p]. Since these intrinsics are
> > identical in semantics to the corresponding scalar variants, they are
> > implemented in terms of them, with appropriate packing and unpacking
> > of vector arguments. New test cases, covering all the intrinsics were
> > also added.
> 
> This patch is very low risk, gets us another step towards closing pr58693,
> and was posted before the Stage 3 deadline. This is OK for trunk.

I realised you don't have commit access, so I've committed this on your
behalf as revision 232789.

Thanks,
James

> > gcc/
> > 
> > 2015-XX-XX  Bilyan Borisov  
> > 
> > * config/aarch64/arm_neon.h (vcvt_s64_f64): New intrinsic.
> > (vcvt_u64_f64): Likewise.
> > (vcvta_s64_f64): Likewise.
> > (vcvta_u64_f64): Likewise.
> > (vcvtm_s64_f64): Likewise.
> > (vcvtm_u64_f64): Likewise.
> > (vcvtn_s64_f64): Likewise.
> > (vcvtn_u64_f64): Likewise.
> > (vcvtp_s64_f64): Likewise.
> > (vcvtp_u64_f64): Likewise.
> > 
> > gcc/testsuite/
> > 
> > 2015-XX-XX  Bilyan Borisov  
> > 
> > * gcc.target/aarch64/simd/vcvt_s64_f64_1.c: New.
> > * gcc.target/aarch64/simd/vcvt_u64_f64_1.c: Likewise.
> > * gcc.target/aarch64/simd/vcvta_s64_f64_1.c: Likewise.
> > * gcc.target/aarch64/simd/vcvta_u64_f64_1.c: Likewise.
> > * gcc.target/aarch64/simd/vcvtm_s64_f64_1.c: Likewise.
> > * gcc.target/aarch64/simd/vcvtm_u64_f64_1.c: Likewise.
> > * gcc.target/aarch64/simd/vcvtn_s64_f64_1.c: Likewise.
> > * gcc.target/aarch64/simd/vcvtn_u64_f64_1.c: Likewise.
> > * gcc.target/aarch64/simd/vcvtp_s64_f64_1.c: Likewise.
> > * gcc.target/aarch64/simd/vcvtp_u64_f64_1.c: Likewise.
>
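
The R in vcvtR_s64_f64 selects a rounding mode, and a scalar model can make
the five variants easier to tell apart.  The sketch below is plain C and
does not use arm_neon.h; the toy_* names are hypothetical, only the signed
conversions are modelled, and inputs are assumed to be finite and well
inside the int64_t range (no NaN or saturation handling, which the real
instructions do provide).

```c
#include <stdint.h>

/* vcvt: round toward zero (what a C cast does).  */
int64_t toy_cvt_zero (double x)
{
  return (int64_t) x;
}

/* vcvtm: round toward minus infinity.  */
int64_t toy_cvt_minus_inf (double x)
{
  int64_t t = (int64_t) x;
  return ((double) t > x) ? t - 1 : t;
}

/* vcvtp: round toward plus infinity.  */
int64_t toy_cvt_plus_inf (double x)
{
  int64_t t = (int64_t) x;
  return ((double) t < x) ? t + 1 : t;
}

/* vcvta: round to nearest, ties away from zero.  */
int64_t toy_cvt_nearest_away (double x)
{
  int64_t f = toy_cvt_minus_inf (x);
  double frac = x - (double) f;   /* In [0, 1).  */
  if (frac > 0.5)
    return f + 1;
  if (frac < 0.5)
    return f;
  return x < 0 ? f : f + 1;
}

/* vcvtn: round to nearest, ties to even.  */
int64_t toy_cvt_nearest_even (double x)
{
  int64_t f = toy_cvt_minus_inf (x);
  double frac = x - (double) f;   /* In [0, 1).  */
  if (frac > 0.5)
    return f + 1;
  if (frac < 0.5)
    return f;
  return (f % 2 == 0) ? f : f + 1;
}
```

For example, -2.5 converts to -2 with toy_cvt_zero, -3 with
toy_cvt_minus_inf and toy_cvt_nearest_away, and -2 with toy_cvt_plus_inf
and toy_cvt_nearest_even.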

Minor tweaks to documentation of scalar_storage_order

2016-01-25 Thread Eric Botcazou

Tested on x86_64-suse-linux, applied on the mainline as obvious.


2016-01-25  Eric Botcazou  

* doc/extend.texi (scalar_storage_order type attribute): Fix typo and
improve wording for mixed storage order support.

-- 
Eric Botcazou

Index: doc/extend.texi
===
--- doc/extend.texi	(revision 232773)
+++ doc/extend.texi	(working copy)
@@ -6481,10 +6481,10 @@ integral type should be used.
 When attached to a @code{union} or a @code{struct}, this attribute sets
 the storage order, aka endianness, of the scalar fields of the type, as
 well as the array fields whose component is scalar.  The supported
-endianness are @code{big-endian} and @code{little-endian}.  The attribute
+endiannesses are @code{big-endian} and @code{little-endian}.  The attribute
 has no effects on fields which are themselves a @code{union}, a @code{struct}
 or an array whose component is a @code{union} or a @code{struct}, and it is
-possible to have fields with a different scalar storage order than the
+possible for these fields to have a different scalar storage order than the
 enclosing type.
 
 This attribute is supported only for targets that use a uniform default
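
To see what the attribute described in this hunk does to the layout of a
scalar field, a small C sketch follows.  It is an illustration under
assumptions, not part of the patch: the toy_* names and the TOY_* macros
are hypothetical, and the attribute is only applied when the compiler looks
like GCC 6 or later (other compilers fall back to the native layout).  The
bytes are inspected through memcpy of the whole struct, since taking the
address of a scalar field with reverse storage order is not permitted.

```c
#include <stdint.h>
#include <string.h>

#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 6
# define TOY_HAVE_SSO 1
# define TOY_BIG_ENDIAN __attribute__ ((scalar_storage_order ("big-endian")))
#else
# define TOY_HAVE_SSO 0
# define TOY_BIG_ENDIAN
#endif

/* The scalar field is stored big-endian whatever the host endianness,
   provided the attribute is available.  */
struct TOY_BIG_ENDIAN toy_header
{
  uint32_t magic;
};

/* Return byte INDEX of the in-memory image of a header holding VALUE.  */
unsigned toy_header_byte (uint32_t value, size_t index)
{
  struct toy_header h;
  unsigned char bytes[sizeof h];

  h.magic = value;
  memcpy (bytes, &h, sizeof h);
  return bytes[index];
}

/* Ordinary field accesses are unaffected: the compiler swaps bytes on
   both the store and the load.  */
uint32_t toy_header_roundtrip (uint32_t value)
{
  struct toy_header h;

  h.magic = value;
  return h.magic;
}
```

With the attribute in effect, toy_header_byte (0x01020304, 0) yields the
most significant byte even on a little-endian host.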

Re: [PATCH, AArch64] Fix for PR67896 (C++ FE cannot distinguish __Poly{8,16,64,128}_t types)

2016-01-25 Thread James Greenhalgh

On Wed, Jan 20, 2016 at 09:27:41PM +0100, Roger Ferrer Ibáñez wrote:
> Hi James,
> 
> > This patch looks technically correct to me, though there is a small
> > style issue to correct (in-line below), and your ChangeLogs don't fit
> > our usual style.
> 
> thank you very much for the useful comments. I'm attaching a new
> version of the patch with the style issues (hopefully) ironed out.

Thanks, this version of the patch looks correct to me.

> > > P.S.: I haven't signed the copyright assignment to the FSF. The change
> > > is really small but I can do the paperwork if required.

I can't commit it on your behalf until we've heard back regarding whether
this needs a copyright assignment to the FSF, but once I've heard I'd
be happy to commit this for you. I'll expand the CC list a bit further
to see if we can get an answer on that.

Thanks again for the analysis and patch.

James

> gcc/ChangeLog:
> 
> 2016-01-19  Roger Ferrer Ibáñez  
> 
> PR target/67896
> * config/aarch64/aarch64-builtins.c
> (aarch64_init_simd_builtin_types): Do not set structural
> equality to __Poly{8,16,64,128}_t types.
> 
> gcc/testsuite/ChangeLog:
> 
> 2016-01-19  Roger Ferrer Ibáñez  
> 
> PR target/67896
> * gcc.target/aarch64/simd/pr67896.C: New.
> 
> -- 
> Roger Ferrer Ibáñez

> From 72c065f6a3f9d168baf357de1b567faa6042c03b Mon Sep 17 00:00:00 2001
> From: Roger Ferrer Ibanez 
> Date: Wed, 20 Jan 2016 21:11:42 +0100
> Subject: [PATCH] Do not set structural equality on polynomial types
> 
> ---
>  gcc/config/aarch64/aarch64-builtins.c   | 10 ++
>  gcc/testsuite/gcc.target/aarch64/simd/pr67896.C |  7 +++
>  2 files changed, 13 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/simd/pr67896.C
> 
> diff --git a/gcc/config/aarch64/aarch64-builtins.c 
> b/gcc/config/aarch64/aarch64-builtins.c
> index bd7a8dd..40272ed 100644
> --- a/gcc/config/aarch64/aarch64-builtins.c
> +++ b/gcc/config/aarch64/aarch64-builtins.c
> @@ -610,14 +610,16 @@ aarch64_init_simd_builtin_types (void)
>enum machine_mode mode = aarch64_simd_types[i].mode;
>  
>if (aarch64_simd_types[i].itype == NULL)
> - aarch64_simd_types[i].itype =
> -   build_distinct_type_copy
> - (build_vector_type (eltype, GET_MODE_NUNITS (mode)));
> + {
> +   aarch64_simd_types[i].itype
> + = build_distinct_type_copy
> +   (build_vector_type (eltype, GET_MODE_NUNITS (mode)));
> +   SET_TYPE_STRUCTURAL_EQUALITY (aarch64_simd_types[i].itype);
> + }
>  
>tdecl = add_builtin_type (aarch64_simd_types[i].name,
>   aarch64_simd_types[i].itype);
>TYPE_NAME (aarch64_simd_types[i].itype) = tdecl;
> -  SET_TYPE_STRUCTURAL_EQUALITY (aarch64_simd_types[i].itype);
>  }
>  
>  #define AARCH64_BUILD_SIGNED_TYPE(mode)  \
> diff --git a/gcc/testsuite/gcc.target/aarch64/simd/pr67896.C 
> b/gcc/testsuite/gcc.target/aarch64/simd/pr67896.C
> new file mode 100644
> index 000..1f916e0
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/simd/pr67896.C
> @@ -0,0 +1,7 @@
> +typedef __Poly8_t A;
> +typedef __Poly16_t A; /* { dg-error "conflicting declaration" } */
> +typedef __Poly64_t A; /* { dg-error "conflicting declaration" } */
> +typedef __Poly128_t A; /* { dg-error "conflicting declaration" } */
> +
> +typedef __Poly8x8_t B;
> +typedef __Poly16x8_t B; /* { dg-error "conflicting declaration" } */ 
> -- 
> 2.1.4
>

Re: Speedup configure and build with system.h

2016-01-25 Thread Uros Bizjak

Hello!

> * system.h (string, algorithm): Include only conditionally.
> (new): Include always under C++.
> * bb-reorder.c (toplevel): Define USES_ALGORITHM.
> * final.c (toplevel): Ditto.
> * ipa-chkp.c (toplevel): Define USES_STRING.
> * genconditions.c (write_header): Make gencondmd.c define
> USES_STRING.
> * mem-stats.h (mem_usage::print_dash_line): Don't use std::string.
>
> * config/aarch64/aarch64.c (toplevel): Define USES_STRING.
> * common/config/aarch64/aarch64-common.c (toplevel): Ditto.

This patch caused a bootstrap failure with a non-C++11 bootstrap compiler
[1], e.g. on CentOS 5.11.

The problem is with std::swap, which was defined in header <algorithm>
until C++11 [2].

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69464
[2] http://en.cppreference.com/w/cpp/algorithm/swap

Uros.

Re: Wonly-top-basic-asm

2016-01-25 Thread Bernd Schmidt


On 01/24/2016 11:23 PM, David Wohlferd wrote:
> +Wonly-top-basic-asm
> +C ObjC ObjC++ C++ Var(warn_only_top_basic_asm) Warning
> +Warn on unsafe uses of basic asm.

Maybe just -Wbasic-asm?

> +/* Warn on basic asm used inside of functions,
> +   EXCEPT when in naked functions. Also allow asm(""). */

Two spaces after a sentence.

> +if (warn_only_top_basic_asm && (TREE_STRING_LENGTH (str) != 1) )

Unnecessary parens, and extra space before closing paren.

> + if (warn_only_top_basic_asm &&
> + (TREE_STRING_LENGTH (string) != 1))

Extra parens, and && goes first on the next line.

> + warning_at(asm_loc, OPT_Wonly_top_basic_asm,

Space before "(".

> +   "asm statement in function does not use extended syntax");

Could break that into ".." "..." over two lines so as to keep indentation.

> -asm ("");
> +asm ("":::);

Is that necessary? As far as I can tell we're treating these equally.

> @@ -7487,6 +7490,8 @@
>   consecutive in the output, put them in a single multi-instruction @code{asm}
>   statement. Note that GCC's optimizers can move @code{asm} statements
>   relative to other code, including across jumps.
> +Using inputs and outputs with extended @code{asm} can help correctly position
> +your asm.

Not sure this is needed either. Sounds a bit like advertising :) In
general the doc changes seem much too verbose to me.

> +Extended @code{asm}'s @samp{%=} may help resolve this.

Same here. I think the block that recommends extended asm is good
enough. I think the next part could be shrunk significantly too.

> -Here is an example of basic @code{asm} for i386:
> +Basic @code{asm} statements within functions do not perform an implicit
> +"memory" clobber (@pxref{Clobbers}).  Also, there is no implicit clobbering
> +of @emph{any} registers, so (other than "naked" functions which follow the

"other than in"? Also @code{naked} maybe. I'd place a note about
clobbering after the existing "To access C data, it is better to use
extended asm".

> +ABI rules) changed registers must be restored to their original value before
> +exiting the @code{asm}.  While this behavior has not always been documented,
> +GCC has worked this way since at least v2.95.3.  Also, lacking inputs and
> +outputs means that GCC's optimizers may have difficulties consistently
> +positioning the basic @code{asm} in the generated code.

The existing text already mentions ordering issues. Lose this block.

> +The concept of ``clobbering'' does not apply to basic @code{asm} statements
> +outside of functions (aka top-level asm).

Stating the obvious?

> +@strong{Warning!} This "clobber nothing" behavior may be different than how

Ok there is precedent for this, but it's spelt "@strong{Warning:}" in
all other cases. Still, I'd probably also shrink this paragraph and put
a note about lack of C semantics and possibly different behaviour from
other compilers near the top, where we say that extended asm is better
in most cases.

> +other compilers treat basic @code{asm}, since the C standards for the
> +@code{asm} statement provide no guidance regarding these semantics.  As a
> +result, @code{asm} statements that work correctly on other compilers may not
> +work correctly with GCC (and vice versa), even though they both compile
> +without error.  Also, there is discussion underway about changing GCC to
> +have basic @code{asm} clobber at least memory and perhaps some (or all)
> +registers.  If implemented, this change may fix subtle problems with
> +existing @code{asm} statements.  However it may break or slow down ones that
> +were working correctly.

How would such a change break anything? I'd also not mention discussion
underway, just say "Future versions of GCC may treat basic @code{asm} as
clobbering memory".

> +If your existing code needs clobbers that GCC's basic @code{asm} is not
> +providing, or if you want to 'future-proof' your asm against possible
> +changes to basic @code{asm}'s semantics, use extended @code{asm}.

Recommending it too often. Lose this.

> +Extended @code{asm} allows you to specify what (if anything) needs to be
> +clobbered for your code to work correctly.

And again.

> You can use @ref{Warning
> +Options, @option{-Wonly-top-basic-asm}} to locate basic @code{asm}

I think just plain @option is usual.

> +statements that may need changes, and refer to
> +@uref{https://gcc.gnu.org/wiki/ConvertBasicAsmToExtended, How to convert
> +from basic asm to extended asm} for information about how to perform the
> +conversion.

A link is probably good if we have such a page.

> +Here is an example of top-level basic @code{asm} for i386 that defines an
> +asm macro.  That macro is then invoked from within a function using
> +extended @code{asm}:

The updated example also looks good.

I think I'm fine with the concept but I'd like to see an updated patch 
with better docs.



Bernd
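
For readers weighing the basic-versus-extended distinction discussed in
this review, here is a minimal sketch of the extended form.  It is an
illustration only, not text from the patch: toy_launder is a hypothetical
name, and the empty template is used so that the example does not depend on
any particular target's instruction set.

```c
/* Extended asm with an empty template.  X is declared as a read-write
   register operand and "memory" is declared clobbered, so the compiler
   must assume the statement may change X and may touch memory.  A basic
   asm has no operand or clobber list in which to say either thing.  */
int toy_launder (int x)
{
  __asm__ volatile ("" : "+r" (x) : : "memory");
  return x;
}
```

Since no instructions are emitted, toy_launder returns its argument
unchanged; the operand and clobber lists only affect what the compiler is
allowed to assume around the statement.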

[PATCH] rs6000: Put back the 's' output modifier

2016-01-25 Thread Segher Boessenkool

It turns out the 's' output modifier is used in some glibc math code,
and is even in an installed header.  So let's put it back; supporting it
a bit longer is much less of a burden than dealing with the fallout.
(It is also being fixed for glibc.)

Tested on powerpc64-linux-gcc; is this okay for mainline?


Segher


2016-01-26  Segher Boessenkool  

* config/rs6000/rs6000.c (print_operand): Rollback 's' removal.

---
 gcc/config/rs6000/rs6000.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index ba4aeab..2a3a441 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -19949,6 +19949,14 @@ print_operand (FILE *file, rtx x, int code)
fprintf (file, "%d", 128 >> (REGNO (x) - CR0_REGNO));
   return;
 
+case 's':
+  /* Low 5 bits of 32 - value */
+  if (! INT_P (x))
+   output_operand_lossage ("invalid %%s value");
+  else
+   fprintf (file, HOST_WIDE_INT_PRINT_DEC, (32 - INTVAL (x)) & 31);
+  return;
+
 case 't':
   /* Like 'J' but get to the OVERFLOW/UNORDERED bit.  */
   gcc_assert (REG_P (x) && GET_MODE (x) == CCmode);
-- 
1.9.3
