date:20161013

On 10/13/2016 04:31 PM, Rainer Orth wrote:
> Hi Martin,
> 
>> Good! How does it look with the former Solaris targets that do not support
>> prioritized ctors?
> 
> still no failures there, neither with ld (which lacks constructor
> priority support) nor with gld (which has it).  Only Solaris 12 shows
> the failures, both with ld and gld (both of which support constructor
> priorities).
> 
>   Rainer
> 

I see. So please send me some example of a binary that still fails
on Solaris 12.

Thanks,
Martin

[PATCH] Fold __builtin_str{n}{case}cmp functions (simplified version 4)

Simplified version that supports only null-terminated strings.
Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin
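
For readers skimming the diff, the folds being moved out of builtins.c rely on a few simple identities, spelled out in the comments of the removed functions. The following is only a sketch of those identities in plain C++ (the helper names are made up for illustration and are not GCC functions); it compares signs because strcmp only guarantees the sign of its result:

```cpp
#include <cassert>
#include <cstring>

// strcmp only promises the sign of its result, so compare signs.
static int sign(int v) { return (v > 0) - (v < 0); }

// Identity used by the fold: strcmp (s, "") == *(const unsigned char *) s.
static int strcmp_with_empty_rhs(const char *s) {
  return *reinterpret_cast<const unsigned char *>(s);
}

// Identity used by the fold: strcmp ("", s) == -*(const unsigned char *) s.
static int strcmp_with_empty_lhs(const char *s) {
  return -static_cast<int>(*reinterpret_cast<const unsigned char *>(s));
}
```

The strncmp variant adds one more case: a zero length always compares equal, whatever the arguments are.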
From 41c49024a02cff43774903206ad77b2ae161e81a Mon Sep 17 00:00:00 2001
From: marxin 
Date: Thu, 13 Oct 2016 10:25:25 +0200
Subject: [PATCH 2/4] Fold __builtin_str{n}{case}cmp functions

gcc/ChangeLog:

2016-10-13  Martin Liska  

	* builtins.c (fold_builtin_strcmp): Remove function.
	(fold_builtin_strncmp): Likewise.
	(fold_builtin_2): Remove call of the function.
	(fold_builtin_3): Likewise.
	* fold-const-call.c (fold_const_call): Add constant folding
	for CFN_BUILT_IN_STRCASECMP and CFN_BUILT_IN_STRNCASECMP.
	* fold-const-call.h (build_cmp_result): Declare the function.
	* gimple-fold.c (gimple_load_first_char): New function.
	(gimple_fold_builtin_string_compare): Likewise.
	(gimple_fold_builtin): Call the function.
---
 gcc/builtins.c| 138 
 gcc/fold-const-call.c |  45 +---
 gcc/fold-const-call.h |   1 +
 gcc/gimple-fold.c | 189 +-
 4 files changed, 226 insertions(+), 147 deletions(-)

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 43a9db0..ed5a635 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -150,8 +150,6 @@ static rtx expand_builtin_fabs (tree, rtx, rtx);
 static rtx expand_builtin_signbit (tree, rtx);
 static tree fold_builtin_memchr (location_t, tree, tree, tree, tree);
 static tree fold_builtin_memcmp (location_t, tree, tree, tree);
-static tree fold_builtin_strcmp (location_t, tree, tree);
-static tree fold_builtin_strncmp (location_t, tree, tree, tree);
 static tree fold_builtin_isascii (location_t, tree);
 static tree fold_builtin_toascii (location_t, tree);
 static tree fold_builtin_isdigit (location_t, tree);
@@ -7331,136 +7329,6 @@ fold_builtin_memcmp (location_t loc, tree arg1, tree arg2, tree len)
   return NULL_TREE;
 }
 
-/* Fold function call to builtin strcmp with arguments ARG1 and ARG2.
-   Return NULL_TREE if no simplification can be made.  */
-
-static tree
-fold_builtin_strcmp (location_t loc, tree arg1, tree arg2)
-{
-  if (!validate_arg (arg1, POINTER_TYPE)
-  || !validate_arg (arg2, POINTER_TYPE))
-return NULL_TREE;
-
-  /* If ARG1 and ARG2 are the same (and not volatile), return zero.  */
-  if (operand_equal_p (arg1, arg2, 0))
-return integer_zero_node;
-
-  /* If the second arg is "", return *(const unsigned char*)arg1.  */
-  const char *p2 = c_getstr (arg2);
-  if (p2 && *p2 == '\0')
-{
-  tree cst_uchar_node = build_type_variant (unsigned_char_type_node, 1, 0);
-  tree cst_uchar_ptr_node
-	= build_pointer_type_for_mode (cst_uchar_node, ptr_mode, true);
-
-  return fold_convert_loc (loc, integer_type_node,
-			   build1 (INDIRECT_REF, cst_uchar_node,
-   fold_convert_loc (loc,
-			 cst_uchar_ptr_node,
-			 arg1)));
-}
-
-  /* If the first arg is "", return -*(const unsigned char*)arg2.  */
-  const char *p1 = c_getstr (arg1);
-  if (p1 && *p1 == '\0')
-{
-  tree cst_uchar_node = build_type_variant (unsigned_char_type_node, 1, 0);
-  tree cst_uchar_ptr_node
-	= build_pointer_type_for_mode (cst_uchar_node, ptr_mode, true);
-
-  tree temp
-	= fold_convert_loc (loc, integer_type_node,
-			build1 (INDIRECT_REF, cst_uchar_node,
-fold_convert_loc (loc,
-		  cst_uchar_ptr_node,
-		  arg2)));
-  return fold_build1_loc (loc, NEGATE_EXPR, integer_type_node, temp);
-}
-
-  return NULL_TREE;
-}
-
-/* Fold function call to builtin strncmp with arguments ARG1, ARG2, and LEN.
-   Return NULL_TREE if no simplification can be made.  */
-
-static tree
-fold_builtin_strncmp (location_t loc, tree arg1, tree arg2, tree len)
-{
-  if (!validate_arg (arg1, POINTER_TYPE)
-  || !validate_arg (arg2, POINTER_TYPE)
-  || !validate_arg (len, INTEGER_TYPE))
-return NULL_TREE;
-
-  /* If the LEN parameter is zero, return zero.  */
-  if (integer_zerop (len))
-return omit_two_operands_loc (loc, integer_type_node, integer_zero_node,
-			  arg1, arg2);
-
-  /* If ARG1 and ARG2 are the same (and not volatile), return zero.  */
-  if (operand_equal_p (arg1, arg2, 0))
-return omit_one_operand_loc (loc, integer_type_node, integer_zero_node, len);
-
-  /* If the second arg is "", and the length is greater than zero,
- return *(const unsigned char*)arg1.  */
-  const char *p2 = c_getstr (arg2);
-  if (p2 && *p2 == '\0'
-  && TREE_CODE (len) == INTEGER_CST
-  && tree_int_cst_sgn (len) == 1)
-{
-  tree cst_uchar_node = build_type_variant (unsigned_char_type_node, 1, 0);
-  tree cst_uchar_ptr_node
-	= build_pointer_type_for_mode (cst_uchar_node, ptr_mode, true);
-
-  return fold_convert_loc (loc, integer_type_node,
-			   build1 (INDIRECT_REF, cst_uchar_node,
-   fold_convert_loc (loc,
-			 cst_uchar_ptr_node,
-			 arg1)));
-

[PATCH] Replace non-constexpr decrement in std::chrono::floor


Decrementing a duration is not constexpr (yet ... I made an NB comment
about it).

I'm not sure if these functions are correct for floating-point
durations, because we could end up with a duration which is very
slightly lower or higher than the desired value, but then we subtract
1.0 from it. That's what the reference implementation in Howard's
proposal does, so I'll worry about it another day.
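
As a rough illustration of the change (a minimal sketch, not the libstdc++ source; `floor_sketch` is a made-up name and C++14 or later is assumed), the decrement is replaced by a subtraction that is usable in a constant expression:

```cpp
#include <cassert>
#include <chrono>

// Sketch of a constexpr floor for durations: duration_cast truncates toward
// zero, so step down by one tick when the truncated value overshoots.
template <typename ToDur, typename Rep, typename Period>
constexpr ToDur floor_sketch(const std::chrono::duration<Rep, Period> &d)
{
  auto to = std::chrono::duration_cast<ToDur>(d);
  if (to > d)
    return to - ToDur{1};  // subtraction instead of the non-constexpr --to
  return to;
}

static_assert(floor_sketch<std::chrono::seconds>(std::chrono::milliseconds{-1500})
                == std::chrono::seconds{-2},
              "floor rounds toward negative infinity");
```

With a floating-point target type the same code subtracts 1.0 from the converted value, which is the case the message above is unsure about.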

* include/std/chrono (floor): Replace non-constexpr operation.
* testsuite/20_util/duration_cast/rounding.cc: Test conversion to
durations with floating-point representations.

Tested powerpc64le-linux, committed to trunk.

commit 7f55016a186e4df17ab3ab4fd6dd1821508e540e
Author: Jonathan Wakely 
Date:   Thu Oct 13 14:03:57 2016 +0100

Replace non-constexpr decrement in std::chrono::floor

* include/std/chrono (floor): Replace non-constexpr operation.
* testsuite/20_util/duration_cast/rounding.cc: Test conversion to
durations with floating-point representations.

diff --git a/libstdc++-v3/include/std/chrono b/libstdc++-v3/include/std/chrono
index cb8c876..ceae7f8 100644
--- a/libstdc++-v3/include/std/chrono
+++ b/libstdc++-v3/include/std/chrono
@@ -224,7 +224,7 @@ _GLIBCXX_END_NAMESPACE_VERSION
   {
auto __to = chrono::duration_cast<_ToDur>(__d);
if (__to > __d)
- --__to;
+ return __to - _ToDur{1};
return __to;
   }
 
diff --git a/libstdc++-v3/testsuite/20_util/duration_cast/rounding.cc b/libstdc++-v3/testsuite/20_util/duration_cast/rounding.cc
index a753323..2a1df74 100644
--- a/libstdc++-v3/testsuite/20_util/duration_cast/rounding.cc
+++ b/libstdc++-v3/testsuite/20_util/duration_cast/rounding.cc
@@ -27,6 +27,8 @@
 using namespace std::chrono_literals;
 using std::chrono::seconds;
 
+using fp_seconds = std::chrono::duration<float>;
+
 static_assert( std::chrono::floor<seconds>(1000ms) == 1s );
 static_assert( std::chrono::floor<seconds>(1001ms) == 1s );
 static_assert( std::chrono::floor<seconds>(1500ms) == 1s );
@@ -34,6 +36,7 @@ static_assert( std::chrono::floor<seconds>(1999ms) == 1s );
 static_assert( std::chrono::floor<seconds>(2000ms) == 2s );
 static_assert( std::chrono::floor<seconds>(2001ms) == 2s );
 static_assert( std::chrono::floor<seconds>(2500ms) == 2s );
+static_assert( std::chrono::floor<fp_seconds>(500ms) == fp_seconds{0.5f} );
 
 static_assert( std::chrono::ceil<seconds>(1000ms) == 1s );
 static_assert( std::chrono::ceil<seconds>(1001ms) == 2s );
@@ -42,6 +45,7 @@ static_assert( std::chrono::ceil<seconds>(1999ms) == 2s );
 static_assert( std::chrono::ceil<seconds>(2000ms) == 2s );
 static_assert( std::chrono::ceil<seconds>(2001ms) == 3s );
 static_assert( std::chrono::ceil<seconds>(2500ms) == 3s );
+static_assert( std::chrono::ceil<fp_seconds>(500ms) == fp_seconds{0.5f} );
 
 static_assert( std::chrono::round<seconds>(1000ms) == 1s );
 static_assert( std::chrono::round<seconds>(1001ms) == 1s );

[PATCH, libfortran] PR 48587 Newunit allocator

2016-10-13 Thread Janne Blomqvist

Currently GFortran never reuses unit numbers allocated with NEWUNIT=,
instead having a simple counter that is decremented each time such a
unit is opened.  For a long running program which repeatedly opens
files with NEWUNIT= and closes them, the counter can wrap around and
cause an abort.  This patch replaces the counter with an allocator
that keeps track of which unit numbers are allocated, and can reuse
them once they have been deallocated.  Since operating systems tend to
limit the number of simultaneous open files for a process to a
relatively modest number, a relatively simple approach with a linear
scan through an array suffices.  Though as a small optimization there
is a low water indicator keeping track of the index for which all unit
numbers below are already allocated.  This linear scan also ensures
that we always allocate the smallest available unit number.
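
The scheme described above can be sketched roughly as follows. This is an illustration only, written in C++ rather than the C of libgfortran, without the locking the library needs; the class and member names are hypothetical, and only the starting value of -10 is taken from the patch:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the newunit table: slot i tracks unit kStart - i.
class NewunitTable {
public:
  static constexpr int kStart = -10;  // first NEWUNIT value, as in the patch

  // Return the free unit number smallest in absolute value, growing on demand.
  int alloc() {
    // Everything below the low-water index is known to be allocated.
    for (std::size_t i = lwi_; i < used_.size(); ++i) {
      if (!used_[i]) {
        used_[i] = true;
        lwi_ = i + 1;
        return kStart - static_cast<int>(i);
      }
    }
    used_.push_back(true);  // no free slot found: extend the table
    lwi_ = used_.size();
    return kStart - static_cast<int>(used_.size() - 1);
  }

  // Mark a unit as free again so a later alloc() can reuse it.
  void release(int unit) {
    std::size_t i = static_cast<std::size_t>(kStart - unit);
    assert(i < used_.size() && used_[i]);
    used_[i] = false;
    if (i < lwi_)
      lwi_ = i;  // the low-water index must not skip a free slot
  }

private:
  std::vector<bool> used_;
  std::size_t lwi_ = 0;
};
```

Allocating three units yields -10, -11 and -12; after releasing -11, the next allocation returns -11 again rather than -13.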

2016-10-13  Janne Blomqvist  

PR libfortran/48587
* io/io.h (get_unique_unit_number): Remove prototype.
(newunit_alloc): New prototype.
* io/open.c (st_open): Call newunit_alloc.
* io/unit.c (newunits,newunit_size,newunit_lwi): New static
variables.
(GFC_FIRST_NEWUNIT): Rename to NEWUNIT_START.
(next_available_newunit): Remove variable.
(get_unit): Call newunit_alloc.
(close_unit_1): Call newunit_free.
(close_units): Free newunits array.
(get_unique_number): Remove function.
(newunit_alloc): New function.
(newunit_free): New function.

Regtested on x86_64-pc-linux-gnu. Ok for trunk?
---
 libgfortran/io/io.h   |   5 ++-
 libgfortran/io/open.c |   2 +-
 libgfortran/io/unit.c | 103 --
 3 files changed, 86 insertions(+), 24 deletions(-)

diff --git a/libgfortran/io/io.h b/libgfortran/io/io.h
index ea93fba..aaacc08 100644
--- a/libgfortran/io/io.h
+++ b/libgfortran/io/io.h
@@ -715,8 +715,9 @@ internal_proto (finish_last_advance_record);
 extern int unit_truncate (gfc_unit *, gfc_offset, st_parameter_common *);
 internal_proto (unit_truncate);
 
-extern GFC_INTEGER_4 get_unique_unit_number (st_parameter_common *);
-internal_proto(get_unique_unit_number);
+extern int newunit_alloc (void);
+internal_proto(newunit_alloc);
+
 
 /* open.c */
 
diff --git a/libgfortran/io/open.c b/libgfortran/io/open.c
index d074b02..2e7163d 100644
--- a/libgfortran/io/open.c
+++ b/libgfortran/io/open.c
@@ -812,7 +812,7 @@ st_open (st_parameter_open *opp)
   if ((opp->common.flags & IOPARM_LIBRETURN_MASK) == IOPARM_LIBRETURN_OK)
 {
   if ((opp->common.flags & IOPARM_OPEN_HAS_NEWUNIT))
-   opp->common.unit = get_unique_unit_number(&opp->common);
+   opp->common.unit = newunit_alloc ();
   else if (opp->common.unit < 0)
{
  u = find_unit (opp->common.unit);
diff --git a/libgfortran/io/unit.c b/libgfortran/io/unit.c
index 274b24b..cc24ca7 100644
--- a/libgfortran/io/unit.c
+++ b/libgfortran/io/unit.c
@@ -29,6 +29,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #include "unix.h"
 #include 
 #include 
+#include 
 
 
 /* IO locking rules:
@@ -68,12 +69,34 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
on it.  unlock_unit or close_unit must be always called only with the
private lock held.  */
 
-/* Subroutines related to units */
 
-/* Unit number to be assigned when NEWUNIT is used in an OPEN statement.  */
-#define GFC_FIRST_NEWUNIT -10
+
+/* Table of allocated newunit values.  A simple solution would be to
+   map OS file descriptors (fd's) to unit numbers, e.g. with newunit =
+   -fd - 2, however that doesn't work since Fortran allows an existing
+   unit number to be reassociated with a new file. Thus the simple
+   approach may lead to a situation where we'd try to assign a
+   (negative) unit number which already exists. Hence we must keep
+   track of allocated newunit values ourselves. This is the purpose of
+   the newunits array. The indices map to newunit values as newunit =
+   -index + NEWUNIT_START. E.g. newunits[0] having the value true
+   means that a unit with number NEWUNIT_START exists. Similar to
+   POSIX file descriptors, we always allocate the lowest (in absolute
+   value) available unit number.
+ */
+static bool *newunits;
+static int newunit_size; /* Total number of elements in the newunits array.  */
+/* Low water indicator for the newunits array. Below the LWI all the
+   units are allocated, above and equal to the LWI there may be both
+   allocated and free units. */
+static int newunit_lwi;
+static void newunit_free (int);
+
+/* Unit numbers assigned with NEWUNIT start from here.  */
+#define NEWUNIT_START -10
+
+
 #define NEWUNIT_STACK_SIZE 16
-static GFC_INTEGER_4 next_available_newunit = GFC_FIRST_NEWUNIT;
 
 /* A stack to save previously used newunit-assigned unit numbers to
allow them to be reused without reallocating the gfc_unit structure
@@ -81,6 +104,7 @@

[PATCH] Fix PR77937

2016-10-13 Thread Bill Schmidt

The previous patch for
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77937 is necessary, but not
sufficient in all cases.  It allows -1 to be used with a pointer
increment, which we really do not want given that this is generally not
profitable.  Disable this case for now.  We can add logic later to
estimate the cost for the rare case where it can be useful.
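
To make the resulting decision explicit, here is a small sketch of how an increment ends up classified after this change. It is not the GCC code: the enum, the function name and the boolean parameter are hypothetical, and the 0 and 1 cases are an assumption about the part of the condition that the hunk below does not show.

```cpp
#include <cassert>

// Illustrative cost classes; the real pass stores integer costs.
enum class IncrCost { Neutral, Infinite, NeedsEstimate };

// Hypothetical helper mirroring the patched branch of analyze_increments.
IncrCost classify_increment(long incr, bool is_pointer) {
  // Assumed: increments of 0 and 1 are free; -1 is free only for non-pointers.
  if (incr == 0 || incr == 1 || (incr == -1 && !is_pointer))
    return IncrCost::Neutral;
  // New in this patch: a pointer with a -1 increment is never replaced.
  if (incr == -1 && is_pointer)
    return IncrCost::Infinite;
  // Everything else goes through the usual cost estimation.
  return IncrCost::NeedsEstimate;
}
```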

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
regressions, committed.

Thanks,
Bill


2016-10-13  Bill Schmidt  

PR tree-optimization/77937
* gimple-ssa-strength-reduction.c (analyze_increments): Set cost
to infinite when we have a pointer with an increment of -1.


Index: gcc/gimple-ssa-strength-reduction.c
===
--- gcc/gimple-ssa-strength-reduction.c (revision 241120)
+++ gcc/gimple-ssa-strength-reduction.c (working copy)
@@ -2818,6 +2818,11 @@ analyze_increments (slsr_cand_t first_dep, machine
   || (incr == -1
   && !POINTER_TYPE_P (first_dep->cand_type)))
incr_vec[i].cost = COST_NEUTRAL;
+
+  /* FIXME: We don't handle pointers with a -1 increment yet.
+ They are usually unprofitable anyway.  */
+  else if (incr == -1 && POINTER_TYPE_P (first_dep->cand_type))
+   incr_vec[i].cost = COST_INFINITE;
   
   /* FORNOW: If we need to add an initializer, give up if a cast from
 the candidate's type to its stride's type can lose precision.

Re: [PATCH] Replace non-constexpr decrement in std::chrono::floor


On 13/10/16 15:24 +0100, Jonathan Wakely wrote:

Decrementing a duration is not constexpr (yet ... I made an NB comment
about it).

I'm not sure if these functions are correct for floating-point
durations, because we could end up with a duration which is very
slightly lower or higher than the desired value, but then we subtract
1.0 from it. That's what the reference implementation in Howard's
proposal does, so I'll worry about it another day.

* include/std/chrono (floor): Replace non-constexpr operation.
* testsuite/20_util/duration_cast/rounding.cc: Test conversion to
durations with floating-point representations.

Tested powerpc64le-linux, committed to trunk.


Not actually committed yet because I'm stuck at:

 Committing to svn+ssh://r...@gcc.gnu.org/svn/gcc/trunk ...


It should finish eventually.

Re: [PATCH] Introduce -Wimplicit-fallthrough={0,1,2,3,4,5}

2016-10-13 Thread Allan Sandfeld Jensen

On Tuesday 11 October 2016, Jakub Jelinek wrote:
> Hi!
> 
> The following patch introduces different warning levels for the
> -Wimplicit-fallthrough warning, so projects can choose if they want to
> honor only attributes (-Wimplicit-fallthrough=5), or what kind of comments.
> =4 is very picky and accepts only a very small set of comments, =3 is what
> we had before this patch, =2 looks case insensitively for falls?[
> \t-]*thr(u|ough) anywhere in the comment, =1 accepts any comment, =0 is
> the same as -Wno-implicit-fallthrough - disables the warning.

I would suggest also looking for comments with "no break" in them, as that is 
another common way to annotate the intentional lack of a 'break'.
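
To show what the quoted =2 pattern does and does not accept, here is a quick check using std::regex. This is only a sketch: GCC does its own comment matching, and the function name is made up for illustration.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Case-insensitive search for the pattern quoted for level 2,
// falls?[ \t-]*thr(u|ough), anywhere in the comment text.
bool matches_level2_pattern(const std::string &comment) {
  static const std::regex re("falls?[ \t-]*thr(u|ough)",
                             std::regex::ECMAScript | std::regex::icase);
  return std::regex_search(comment, re);
}
```

A comment such as "/* FALLTHRU */" or "// falls through" matches, while "/* no break */" does not, so it would only be accepted at level 1.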

If you want another example besides the linux kernel, I unified all our fall 
through comments in qtbase in August: 
https://codereview.qt-project.org/#/c/163595/

Though Qt is far from -Wimplicit-fallthrough clean after that, another
colleague is working on that since we have traditionally aimed for -Wall
-Wextra -Werror, though it will seriously wreak havoc with readability in
several places with unrolled loops and switches on integers.

Best regards
Allan

Re: [patch] aarch64--freebsd support for gcc.


On 10/12/2016 01:43 PM, Andreas Tobler wrote:


libgcc:

2016-10-10  Andreas Tobler  

* config.host: Add support for aarch64-*-freebsd*.

gcc:

2016-10-10  Andreas Tobler  

* config.gcc: Add aarch64-*-freebsd* support.
* config.host: Likewise.
* config/aarch64/aarch64-freebsd.h: New file.
* config/aarch64/t-aarch64-freebsd: Ditto.

toplevel:

2016-10-10  Andreas Tobler 

* configure.ac: Add aarch64-*-freebsd*.
* configure: Regenerate.

Certainly OK for the trunk.  Jakub, Richi & Joseph make the rules for
the release branches.


I had a chat with Jakub and I learned as long as there is no branch
freeze or such, every global reviewer can approve such a patch backport.
So may I ask you, would you mind approving this patch for 6.x and 5.x?

Yes.  Approved for 5.x and 6.x.

jeff

[PATCH] Introduce -fprofile-update=maybe-atomic

Hello.

As it's very hard to guess from the GCC driver whether a target supports atomic
updates for GCOV counters or not, I decided to come up with a new option value
(maybe-atomic) that is transformed into a corresponding value (single or
atomic) in tree-profile.c.
The GCC driver selects the option when -pthread is present on the command line.
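
The intended resolution step can be pictured like this. It is a sketch only: the enum mirrors the three option values, but the function name and the boolean target query are hypothetical, and passing explicit values through unchanged is an assumption, not something the patch text states.

```cpp
#include <cassert>

enum class ProfileUpdate { Single, Atomic, MaybeAtomic };

// Hypothetical helper: turn maybe-atomic into a concrete update method.
ProfileUpdate resolve_profile_update(ProfileUpdate requested,
                                     bool target_has_atomics) {
  if (requested == ProfileUpdate::MaybeAtomic)
    return target_has_atomics ? ProfileUpdate::Atomic : ProfileUpdate::Single;
  return requested;  // assumed: explicit single/atomic is left untouched here
}
```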

That should fix all test failures seen on the AIX target.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin
From 1d00b7b4d42d080fe4d6cd51a03829b0fe525c9d Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 12 Oct 2016 15:05:49 +0200
Subject: [PATCH] Introduce -fprofile-update=maybe-atomic

gcc/ChangeLog:

2016-10-12  Martin Liska  

	* common.opt: Add maybe-atomic as a new enum value for
	-fprofile-update.
	* coretypes.h: Likewise.
	* doc/invoke.texi: Document the new option value.
	* gcc.c: Replace atomic with maybe-atomic.  Remove warning.
	* tree-profile.c (tree_profiling): Select default value
	of -fprofile-update when 'maybe-atomic' is selected.

gcc/testsuite/ChangeLog:

2016-10-12  Martin Liska  

	* gcc.dg/no_profile_instrument_function-attr-1.c: Update test
	to match scanned pattern.
	* gcc.dg/tree-ssa/ssa-lim-11.c: Likewise.
---
 gcc/common.opt |  5 +++-
 gcc/coretypes.h|  3 +-
 gcc/doc/invoke.texi| 11 +--
 gcc/gcc.c  |  6 +---
 .../gcc.dg/no_profile_instrument_function-attr-1.c |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-11.c |  2 +-
 gcc/tree-profile.c | 35 +++---
 7 files changed, 35 insertions(+), 29 deletions(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index 15679c5..d6c5acd 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1937,7 +1937,7 @@ Enable correction of flow inconsistent profile data input.
 
 fprofile-update=
 Common Joined RejectNegative Enum(profile_update) Var(flag_profile_update) Init(PROFILE_UPDATE_SINGLE)
--fprofile-update=[single|atomic]	Set the profile update method.
+-fprofile-update=[single|atomic|maybe-atomic]	Set the profile update method.
 
 Enum
 Name(profile_update) Type(enum profile_update) UnknownError(unknown profile update method %qs)
@@ -1948,6 +1948,9 @@ Enum(profile_update) String(single) Value(PROFILE_UPDATE_SINGLE)
 EnumValue
 Enum(profile_update) String(atomic) Value(PROFILE_UPDATE_ATOMIC)
 
+EnumValue
+Enum(profile_update) String(maybe-atomic) Value(PROFILE_UPDATE_MAYBE_ATOMIC)
+
 fprofile-generate
 Common
 Enable common options for generating profile info for profile feedback directed optimizations.
diff --git a/gcc/coretypes.h b/gcc/coretypes.h
index fe1e984..aec2a6e 100644
--- a/gcc/coretypes.h
+++ b/gcc/coretypes.h
@@ -177,7 +177,8 @@ enum offload_abi {
 /* Types of profile update methods.  */
 enum profile_update {
   PROFILE_UPDATE_SINGLE,
-  PROFILE_UPDATE_ATOMIC
+  PROFILE_UPDATE_ATOMIC,
+  PROFILE_UPDATE_MAYBE_ATOMIC
 };
 
 /* Types of unwind/exception handling info that can be generated.  */
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index c11f1d5..eb6cae3 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -10315,13 +10315,18 @@ To optimize the program based on the collected profile information, use
 
 Alter the update method for an application instrumented for profile
 feedback based optimization.  The @var{method} argument should be one of
-@samp{single} or @samp{atomic}.  The first one is useful for single-threaded
-applications, while the second one prevents profile corruption by emitting
-thread-safe code.
+@samp{single}, @samp{atomic} or @samp{maybe-atomic}.
+The first one is useful for single-threaded applications,
+while the second one prevents profile corruption by emitting thread-safe code.
 
 @strong{Warning:} When an application does not properly join all threads
 (or creates an detached thread), a profile file can be still corrupted.
 
+Using @samp{maybe-atomic} would be transformed either to @samp{atomic},
+when supported by a target, or to @samp{single} otherwise. The GCC driver
+automatically selects @samp{maybe-atomic} when @option{-pthread}
+is present in the command line.
+
 @item -fsanitize=address
 @opindex fsanitize=address
 Enable AddressSanitizer, a fast memory error detector.
diff --git a/gcc/gcc.c b/gcc/gcc.c
index 5213cb0..1959fc7 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -1144,11 +1144,7 @@ static const char *cc1_options =
  %{coverage:-fprofile-arcs -ftest-coverage}\
  %{fprofile-arcs|fprofile-generate*|coverage:\
%{!fprofile-update=single:\
- %{pthread:-fprofile-update=atomic}}}\
- %{fprofile-update=single:\
-   %{fprofile-arcs|fprofile-generate*|coverage:\
- %{pthread:%n-fprofile-update=atomic should be used\
- for a multithreaded application}}}";
+ %{pthread:-fprofile-update=maybe-atomic}}}";
 
 static const char

Re: [PATCH, RFC] gcov: dump in a static dtor instead of in an atexit handler

On 10/13/2016 04:04 PM, Rainer Orth wrote:
> no, it's from r240990 unless I'm completely mistaken.  However, current
> trunk bootstraps are running as we speak.
> 
>   Rainer

Good! How does it look with the former Solaris targets that do not support
prioritized ctors?

Thanks,
Martin

[PATCH] Check \0-termination of string in c_getstr (simplified version)

Hello.

After receiving feedback from Richi and Wilco Dijkstra, I decided not to
support non-null-terminated strings at all. Supporting them brings more
complications and the code had started to become overengineered. Thus c_getstr
accepts only null-terminated strings and, as a bonus, it returns the length of
the string.
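
Stripped of the tree handling, the check amounts to something like the sketch below. It works on a raw buffer instead of a STRING_CST, and `checked_cstr` and its parameters are invented names for illustration; the conditions follow the ones in the patch.

```cpp
#include <cassert>
#include <cstddef>

// Return data + offset if the buffer is NUL-terminated and offset is in range,
// else nullptr.  *remaining receives the length from offset to the end of the
// buffer, terminating NUL included, or 0 on failure.
const char *checked_cstr(const char *data, std::size_t size,
                         std::size_t offset, std::size_t *remaining = nullptr) {
  if (remaining)
    *remaining = 0;
  // Same conditions as the patch: reject empty or unterminated buffers and
  // offsets past the end.  offset == size yields a zero-length result.
  if (size == 0 || data[size - 1] != '\0' || offset > size)
    return nullptr;
  if (remaining)
    *remaining = size - offset;
  return data + offset;
}
```

For the four-byte buffer "abc" plus its terminator and an offset of 1, this returns a pointer to "bc" and a length of 3.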

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin
From bee44f0dedc86b1c354e21dd87dad6313147dcc3 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Thu, 13 Oct 2016 10:20:12 +0200
Subject: [PATCH 1/4] Support only \0-terminated string in c_getstr and return
 strlen

gcc/ChangeLog:

2016-10-13  Martin Liska  

	* fold-const.c (c_getstr): Support of properly \0-terminated
	string constants.  New argument is added.
	* fold-const.h: New argument is added.
---
 gcc/fold-const.c | 38 +-
 gcc/fold-const.h |  2 +-
 2 files changed, 30 insertions(+), 10 deletions(-)

diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 02aa484..57a9243 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -14440,24 +14440,44 @@ fold_build_pointer_plus_hwi_loc (location_t loc, tree ptr, HOST_WIDE_INT off)
 }
 
 /* Return a char pointer for a C string if it is a string constant
-   or sum of string constant and integer constant.  */
+   or sum of string constant and integer constant.  We only support
+   string constants properly terminated with '\0' character.
+   If STRLEN is a valid pointer, length (including terminating character)
+   of returned string is stored to the argument.  */
 
 const char *
-c_getstr (tree src)
+c_getstr (tree src, unsigned HOST_WIDE_INT *strlen)
 {
   tree offset_node;
 
+  if (strlen)
+*strlen = 0;
+
   src = string_constant (src, &offset_node);
   if (src == 0)
-return 0;
+return NULL;
 
-  if (offset_node == 0)
-return TREE_STRING_POINTER (src);
-  else if (!tree_fits_uhwi_p (offset_node)
-	   || compare_tree_int (offset_node, TREE_STRING_LENGTH (src) - 1) > 0)
-return 0;
+  unsigned HOST_WIDE_INT offset = 0;
+  if (offset_node != NULL_TREE)
+{
+  if (!tree_fits_uhwi_p (offset_node))
+	return NULL;
+  else
+	offset = tree_to_uhwi (offset_node);
+}
+
+  unsigned HOST_WIDE_INT string_length = TREE_STRING_LENGTH (src);
+  const char *string = TREE_STRING_POINTER (src);
+
+  /* Support only properly null-terminated strings.  */
+  if (string_length == 0
+  || string[string_length - 1] != '\0'
+  || offset > string_length)
+return NULL;
 
-  return TREE_STRING_POINTER (src) + tree_to_uhwi (offset_node);
+  if (strlen)
+*strlen = string_length - offset;
+  return string + offset;
 }
 
 #if CHECKING_P
diff --git a/gcc/fold-const.h b/gcc/fold-const.h
index 637e46b..bc22c88 100644
--- a/gcc/fold-const.h
+++ b/gcc/fold-const.h
@@ -182,7 +182,7 @@ extern bool expr_not_equal_to (tree t, const wide_int &);
 extern tree const_unop (enum tree_code, tree, tree);
 extern tree const_binop (enum tree_code, tree, tree, tree);
 extern bool negate_mathfn_p (combined_fn);
-extern const char *c_getstr (tree);
+extern const char *c_getstr (tree, unsigned HOST_WIDE_INT *strlen = NULL);
 
 /* Return OFF converted to a pointer offset type suitable as offset for
POINTER_PLUS_EXPR.  Use location LOC for this conversion.  */
-- 
2.9.2

[PATCH, GCC/ARM 2/2] Allow combination of aprofile and rmprofile multilibs

2016-10-13 Thread Thomas Preudhomme


Hi ARM maintainers,

This patchset aims at adding multilib support for R and M profile ARM
architectures and allowing it to be built alongside multilib for A profile ARM
architectures. This specific patch is concerned with the latter. The patch works
by moving the bits shared by both aprofile and rmprofile multilib builds
(variable initialization as well as ISA and float ABI to build multilibs for) to
a new t-multilib file. Then, based on which profile was requested in the
--with-multilib-list option, that file includes t-aprofile and/or t-rmprofile,
where the architecture and FPU to build the multilibs for are specified.


Unfortunately the duplication of the CPU to A profile architecture mappings
could not be avoided because substitutions due to MULTILIB_MATCHES are not
transitive. Therefore, mapping armv7-a to armv7 for the rmprofile multilib build
does not have the expected effect. Two patches were written to allow this using
2 different approaches but I decided against them because this is not the right
solution IMO. See caveats below for what I believe is the correct approach.



*** combined build caveats ***

As the documentation in this patch warns, there are a few caveats to using a
combined multilib build due to the way the multilib framework works.


1) For instance, when using only rmprofile, the combination of options -mthumb
-march=armv7 -mfpu=neon selects the thumb/-march=armv7 multilib, but in a
combined multilib build the default multilib would be used. This is because in
the rmprofile build -mfpu=neon is not specified in MULTILIB_OPTIONS and thus the
option is ignored when considering MULTILIB_REQUIRED entries.


2) Another issue is the fact that aprofile and rmprofile multilib builds have
some conflicting requirements in terms of how to map options for which no
multilib is built to another option. (i) A first example of this is the
difference in CPU to architecture mapping mentioned above: the rmprofile
multilib build needs A profile CPUs and architectures to be mapped down to ARMv7
so that one of the v7-ar multilibs gets chosen in such a case, but aprofile needs
A profile architectures to stand on their own because multilibs are built for
several architectures.


(ii) Another example of this is that in the aprofile multilib build no multilib
is built with -mfpu=fpv5-d16 but some multilibs are built with -mfpu=fpv4-d16.
Therefore, aprofile defines a match rule to map fpv5-d16 onto fpv4-d16. However,
the rmprofile multilib build *does* build some multilibs with -mfpu=fpv5-d16.
This has the consequence that when building for -mthumb -march=armv7e-m
-mfpu=fpv5-d16 -mfloat-abi=hard the default multilib is chosen because this is
rewritten into -mthumb -march=armv7e-m -mfpu=fpv4-d16 -mfloat-abi=hard and there
is no multilib for that.


Both of these issues could be handled by using MULTILIB_REUSE instead of 
MULTILIB_MATCHES but this would require a large set of rules. I believe instead 
the right approach is to create a new mechanism to inform GCC on how options can 
be down mapped _when no multilib can be found_ which would require a smaller set 
of rules and would make it explicit that the options are not equivalent. A patch 
will be posted to this effect at a later time.


ChangeLog entry is as follows:


*** gcc/ChangeLog ***

2016-10-03  Thomas Preud'homme  

* config.gcc: Allow combinations of aprofile and rmprofile values for
--with-multilib-list.
* config/arm/t-multilib: New file.
* config/arm/t-aprofile: Remove initialization of MULTILIB_*
variables.  Remove setting of ISA and floating-point ABI in
MULTILIB_OPTIONS and MULTILIB_DIRNAMES.  Set architecture and FPU in
MULTI_ARCH_OPTS_A and MULTI_ARCH_DIRS_A rather than MULTILIB_OPTIONS
and MULTILIB_DIRNAMES respectively.  Add comment to introduce all
matches.  Add architecture matches for marvell-pj4 and generic-armv7-a
CPU options.
* config/arm/t-rmprofile: Likewise except for the matches changes.
* doc/install.texi (--with-multilib-list): Document the combination of
aprofile and rmprofile values and warn about pitfalls in doing that.


Testing:

* "tree install/lib/gcc/arm-none-eabi/7.0.0" is the same before and after the 
patchset for both aprofile and rmprofile
* "tree install/lib/gcc/arm-none-eabi/7.0.0" is the same for aprofile,rmprofile 
and rmprofile,aprofile
* default spec (gcc -dumpspecs) is the same for aprofile,rmprofile and 
rmprofile,aprofile


* Difference in --print-multi-directory between aprofile or rmprofile and
aprofile,rmprofile for all combinations of ISA (ARM/Thumb), architecture, CPU,
FPU and float ABI is as expected (best multilib for the combination is chosen),
modulo the caveat mentioned above and the new marvell-pj4 and generic-armv7-a
CPU to architecture mapping.



Is this ok for master?

Best regards,

Thomas
diff --git a/gcc/config.gcc b/gcc/config.gcc
index

Re: [PATCH] Tweaks to print_rtx_function


On 10/13/2016 04:08 PM, David Malcolm wrote:

I thought it might be useful to brainstorm [1] some ideas on this,
so  here are various possible ways it could be printed for this use-case:

* Offset by LAST_VIRTUAL_REGISTER + 1 (as in the patch), and printed
just as a number, giving:

  (reg:SI 3)


Unambiguous in the compact format, nice low register numbers, but some 
potential for confusion with hard regs based on what people are used to.



* Prefixed by a "sigil" character:


  (reg:SI %3)

Avoids the confusion issue and shouldn't overlap with hard register 
names. I think this is the one I prefer, followed by plain (reg:SI 3).



  (reg:SI P3)


Can't use this, as there are machines with P3 registers.


* Prefixed so it looks like a register name:

  (reg:SI pseudo-3)
  (reg:SI pseudo_3)
  (reg:SI pseudo+3)


Not too different from just a "%" prefix and probably too verbose.


Looking at print_rtx_operand_code_r there are also things like
ORIGINAL_REGNO, REG_EXPR and REG_OFFSET which get printed after the
main regno, e.g.:



  (reg:SI 1 [  ])


That's the REG_EXPR here presumably? The interesting part comes when 
parsing this.



Bernd

Re: [patch, avr, pr71676 and pr71678] Issues with casesi expand

2016-10-13 Thread Georg-Johann Lay


On 13.10.2016 13:44, Pitchumani Sivanupandi wrote:

On Monday 26 September 2016 08:19 PM, Georg-Johann Lay wrote:

On 26.09.2016 15:19, Pitchumani Sivanupandi wrote:

Attached patch for PR71676 and PR71678.

PR71676 is for AVR target that generates wrong code when switch case index is
more than 16 bits.

Switch case indices larger than SImode are checked for out of range before the
'casesi' expand. The RTL expansion of casesi gets the index in SImode, but the
index is compared in HImode, which ignores the upper 16 bits.

Attached patch changes the expansion for casesi to make the index comparison
in SImode and code generation accordingly.
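
To make the PR71676 failure mode concrete, here is a small host-side C sketch. It is not AVR back-end code; the function names and the example bounds are invented for illustration. It contrasts a range check done on the low 16 bits of the index with one done on the full 32-bit value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical range check as done before the fix: the index is
   truncated to 16 bits (HImode) before subtract and compare.  */
static int in_range_hi (uint32_t index, uint16_t lower, uint16_t range)
{
  uint16_t offset = (uint16_t) ((uint16_t) index - lower);
  return offset <= range;
}

/* The same check done on the full 32-bit value (SImode).  */
static int in_range_si (uint32_t index, uint32_t lower, uint32_t range)
{
  assert (lower + range >= lower);  /* no wrap-around in the bounds */
  uint32_t offset = index - lower;
  return offset <= range;
}
```

For case values 1..4, an index of 0x10002 is accepted by the 16-bit check (its low half is 2) but rejected by the 32-bit one, which is the wrong-code situation the PR describes.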

PR71678 is ICE because below pattern in 'casesi' is not recognized.
(set (reg:HI 47)
 (minus:HI (subreg:HI (subreg:SI (reg:DI 44) 0) 0)
   (reg:HI 45)))

Fix of PR71676 avoids the above pattern as it changes the comparison
to SImode.


But this means that all comparisons are now performed in SImode which is a
great performance loss for most programs which will switch on 16-bit values.

IMO we need a less intrusive (w.r.t. performance) approach.


Yes.

I tried to split 'casesi' into several patterns based on case values so that the
compare is done in less expensive modes (i.e. QI or HI). In a few cases it is not
possible without SImode subtract/compare.

Pattern casesi will have the index in SImode. So out-of-range checks will be
expensive, as the most common uses (on AVR) of case values will be in QI/HI mode.

e.g.
  if case values in QI range
    if upper three bytes index is set
      goto out_of_range

    offset = index - lower_bound (QImode)
    if offset > case_range   (QImode)
      goto out_of_range
    goto jump_table + offset

  else if case values in HI range
    if index[2,3] is set
      goto out_of_range

    offset = index - lower_bound (HImode)
    if offset > case_range   (HImode)
      goto out_of_range
    goto jump_table + offset

This modification will not work for negative index values, because the code to
check the upper bytes of the index would be more expensive than the SImode
subtract/compare.

So I'm trying to update the fix to use SImode subtract/compare if the case values
include negative integers. For the others I will try to optimize as mentioned
above. Is that approach OK?
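
The HI-range branch of the pseudocode above could be sketched in plain C roughly as follows. This is only a host-side model under the stated assumption that all case values are non-negative and fit in 16 bits; casesi_offset_hi and the -1 "out of range" convention are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the jump-table offset, or -1 for "out of range".
   Assumes all case values fit in 16 bits and are non-negative.  */
static int32_t casesi_offset_hi (uint32_t index, uint16_t lower_bound,
                                 uint16_t case_range)
{
  assert ((uint32_t) lower_bound + case_range <= UINT16_MAX);

  /* index[2,3] set: the value cannot match any 16-bit case.  */
  if ((index >> 16) != 0)
    return -1;

  /* Subtract and compare in 16 bits (HImode).  */
  uint16_t offset = (uint16_t) ((uint16_t) index - lower_bound);
  if (offset > case_range)
    return -1;
  return offset;
}
```

Note that a negative 32-bit index has its upper bytes set and is therefore rejected by the first test, which is only correct as long as no case value is negative.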


But the above code will be executed at run time and add even more overhead, or 
am I missing something?  If you conclude statically at expand time from the 
case ranges then we might hit a similar problem as with the original subreg 
computation.


Unfortunately, the generated code (setting cc0, a reg and pc) cannot be wrapped 
into an unspec or parallel and then later be rectified...


I am thinking about a new avr target pass to tidy up the code if no 32-bit 
computation is needed, but this will be some effort.



Johann



Alternatively we can have flags to generate shorter code for 'casesi' using
HImode subtract/compare. But correctness is not guaranteed (PR71676).

Regards,
Pitchumani

[Fortran, patch, caf] Add unimplemented message for polymorphic objects with allocatable/pointer components

2016-10-13 Thread Andre Vehreschild

Hi all,

attached patch adds an unimplemented message, when a polymorphic coarray object
with allocatable/pointer components is declared for coarray mode library. This
is just an ad-hoc solution until handling those constructs is implemented.
There are already some prs that address ICEs caused by this issue: 77961, 77785.

Bootstrapped and regtests ok on x86_64-linux/F23. May have some fuzz when the
patch for polymorphic assign:

https://gcc.gnu.org/ml/fortran/2016-10/msg00091.html

is not present. The polymorphic assign patch is not necessary for this patch.

If no one objects, I will commit tomorrow morning.

Regards,
Andre
-- 
Andre Vehreschild * Email: vehre ad gmx dot de 

gcc/testsuite/ChangeLog:

2016-10-13  Andre Vehreschild  

* gfortran.dg/coarray_38.f90: Expect error message.

gcc/fortran/ChangeLog:

2016-10-13  Andre Vehreschild  

* resolve.c (resolve_symbol): Add unimplemented message for
polymorphic types with allocatable/pointer components and coarray=lib.

diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index 42e3421..2226227 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -13796,6 +13796,19 @@ resolve_symbol (gfc_symbol *sym)
  (just like derived type declaration symbols have flavor FL_DERIVED). */
   gcc_assert (sym->ts.type != BT_UNION);
 
+  /* Coarrayed polymorphic objects with allocatable or pointer components are
+ yet unsupported for -fcoarray=lib.  */
+  if (flag_coarray == GFC_FCOARRAY_LIB && sym->ts.type == BT_CLASS
+  && sym->ts.u.derived && CLASS_DATA (sym)
+  && CLASS_DATA (sym)->attr.codimension
+  && (sym->ts.u.derived->attr.alloc_comp
+	  || sym->ts.u.derived->attr.pointer_comp))
+{
+  gfc_error ("Sorry, allocatable/pointer components in polymorphic (CLASS) "
+		 "type coarrays at %L are unsupported", &sym->declared_at);
+  return;
+}
+
   if (sym->attr.artificial)
 return;
 
diff --git a/gcc/testsuite/gfortran.dg/coarray_38.f90 b/gcc/testsuite/gfortran.dg/coarray_38.f90
index 31155c5..c8011d4 100644
--- a/gcc/testsuite/gfortran.dg/coarray_38.f90
+++ b/gcc/testsuite/gfortran.dg/coarray_38.f90
@@ -71,7 +71,7 @@ end type t
 type t2
   class(t), allocatable :: caf2[:]
 end type t2
-class(t), allocatable :: caf[:]
+class(t), allocatable :: caf[:] ! { dg-error "Sorry, allocatable/pointer components in polymorphic" }
 type(t) :: x
 type(t2) :: y

[PATCH] Use normal mode containers in searchers


We never want the searchers to use the debug mode or profile mode
containers.

* include/experimental/functional (boyer_moore_searcher)
(__boyer_moore_map_base, __boyer_moore_array_base): Qualify containers
with _GLIBCXX_STD_C.
* include/std/functional: Likewise.

Tested powerpc64le-linux, committed to trunk.


commit 34bf8c32da5cc50b7ade97707f44e6b2cdcd86df
Author: Jonathan Wakely 
Date:   Thu Oct 13 16:24:14 2016 +0100

Use normal mode containers in searchers

* include/experimental/functional (boyer_moore_searcher)
(__boyer_moore_map_base, __boyer_moore_array_base): Qualify containers
with _GLIBCXX_STD_C.
* include/std/functional: Likewise.

diff --git a/libstdc++-v3/include/experimental/functional b/libstdc++-v3/include/experimental/functional
index db45665..77e6e66 100644
--- a/libstdc++-v3/include/experimental/functional
+++ b/libstdc++-v3/include/experimental/functional
@@ -119,7 +119,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _Pred
   _M_pred() const { return _M_bad_char.key_eq(); }
 
-  std::unordered_map<_Key, _Tp, _Hash, _Pred> _M_bad_char;
+  _GLIBCXX_STD_C::unordered_map<_Key, _Tp, _Hash, _Pred> _M_bad_char;
 };
 
   template
@@ -128,7 +128,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
__boyer_moore_array_base(_RAIter __pat, size_t __patlen,
 _Unused&&, _Pred&& __pred)
-   : _M_bad_char{ std::array<_Tp, _Len>{}, std::move(__pred) }
+   : _M_bad_char{ _GLIBCXX_STD_C::array<_Tp, _Len>{}, std::move(__pred) }
{
  std::get<0>(_M_bad_char).fill(__patlen);
  if (__patlen > 0)
@@ -156,7 +156,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   const _Pred&
   _M_pred() const { return std::get<1>(_M_bad_char); }
 
-  std::tuple<std::array<_Tp, _Len>, _Pred> _M_bad_char;
+  std::tuple<_GLIBCXX_STD_C::array<_Tp, _Len>, _Pred> _M_bad_char;
 };
 
   template
@@ -229,7 +229,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   _RAIter _M_pat;
   _RAIter _M_pat_end;
-  std::vector<__diff_type> _M_good_suffix;
+  _GLIBCXX_STD_C::vector<__diff_type> _M_good_suffix;
 };
 
   template _M_bad_char;
+  _GLIBCXX_STD_C::unordered_map<_Key, _Tp, _Hash, _Pred> _M_bad_char;
 };
 
   template
@@ -2215,7 +2215,7 @@ _GLIBCXX_MEM_FN_TRAITS(&&, false_type, true_type)
   template
__boyer_moore_array_base(_RAIter __pat, size_t __patlen,
 _Unused&&, _Pred&& __pred)
-   : _M_bad_char{ std::array<_Tp, _Len>{}, std::move(__pred) }
+   : _M_bad_char{ _GLIBCXX_STD_C::array<_Tp, _Len>{}, std::move(__pred) }
{
  std::get<0>(_M_bad_char).fill(__patlen);
  if (__patlen > 0)
@@ -2243,7 +2243,7 @@ _GLIBCXX_MEM_FN_TRAITS(&&, false_type, true_type)
   const _Pred&
   _M_pred() const { return std::get<1>(_M_bad_char); }
 
-  std::tuple<std::array<_Tp, _Len>, _Pred> _M_bad_char;
+  std::tuple<_GLIBCXX_STD_C::array<_Tp, _Len>, _Pred> _M_bad_char;
 };
 
   template
@@ -2316,7 +2316,7 @@ _GLIBCXX_MEM_FN_TRAITS(&&, false_type, true_type)
 
   _RAIter _M_pat;
   _RAIter _M_pat_end;
-  std::vector<__diff_type> _M_good_suffix;
+  _GLIBCXX_STD_C::vector<__diff_type> _M_good_suffix;
 };
 
   template

Re: [PATCH] Omit INSN_LOCATION from compact dumps


On 10/13/2016 05:52 PM, David Malcolm wrote:


Alternatively, it seems that we might want an additional flag for
this.


Probably - I imagine most testcases won't care (it's obviously easier to 
read without locations) but some will. The writing side is maybe less 
interesting than the reading side, making sure we parse either variant 
correctly.



If so, maybe it's time to introduce a "class rtx_writer" or
similar to hold the global state relating to dumping, and rewrite
the dumping in those terms?


Depends how invasive that's going to be. I have no clear picture of it.


Bernd

[PATCH] Create the *logue in the same order as before (PR77962)

2016-10-13 Thread Segher Boessenkool

PR77962 shows Go failing on 32-bit x86.  This happens because the i386
port requires the split stack prologue to be created before the normal
prologue, and my previous patch changed it to be the other way around.

This patch changes it back.  Things will be exactly as before for targets
that do not do shrink-wrapping for separate components.  For targets
that *do* support it, all three prologue/epilogue creation functions
will now be called twice for functions that have anything wrapped
separately (instead of just the prologue created twice).

Bootstrapping+testing on powerpc64-linux {-m64,-m32}, all languages;
and on x86_64-linux all,go,obj-c++ (i.e. no ada).

Is this okay for trunk if testing succeeds?  And sorry for the breakage.


Segher


2016-10-13  Segher Boessenkool  

PR bootstrap/77962
* function.c (thread_prologue_and_epilogue_insns): Call all
make_*logue_seq in the same order as traditional.  Call them
all a second time if shrink_wrapped-separate.

---
 gcc/function.c | 18 --
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/gcc/function.c b/gcc/function.c
index 5dafb8c..208f1a5 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -5919,7 +5919,9 @@ thread_prologue_and_epilogue_insns (void)
   edge entry_edge = single_succ_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun));
   edge orig_entry_edge = entry_edge;
 
+  rtx_insn *split_prologue_seq = make_split_prologue_seq ();
   rtx_insn *prologue_seq = make_prologue_seq ();
+  rtx_insn *epilogue_seq = make_epilogue_seq ();
 
   /* Try to perform a kind of shrink-wrapping, making sure the
  prologue/epilogue is emitted only around those parts of the
@@ -5931,13 +5933,17 @@ thread_prologue_and_epilogue_insns (void)
   try_shrink_wrapping_separate (entry_edge->dest);
 
   /* If that did anything for any component we now need the generate the
- "main" prologue again.  If that does not work for some target then
- that target should not enable separate shrink-wrapping.  */
+ "main" prologue again.  Because some targets require some of these
+ to be called in a specific order (i386 requires the split prologue
+ to be first, for example), we create all three sequences again here.
+ If this does not work for some target, that target should not enable
+ separate shrink-wrapping.  */
   if (crtl->shrink_wrapped_separate)
-prologue_seq = make_prologue_seq ();
-
-  rtx_insn *split_prologue_seq = make_split_prologue_seq ();
-  rtx_insn *epilogue_seq = make_epilogue_seq ();
+{
+  split_prologue_seq = make_split_prologue_seq ();
+  prologue_seq = make_prologue_seq ();
+  epilogue_seq = make_epilogue_seq ();
+}
 
   rtl_profile_for_bb (EXIT_BLOCK_PTR_FOR_FN (cfun));
 
-- 
1.9.3

[PATCH] Omit INSN_LOCATION from compact dumps

2016-10-13 Thread David Malcolm

On Thu, 2016-10-13 at 12:21 +0200, Bernd Schmidt wrote:
> On 10/12/2016 10:37 PM, David Malcolm wrote:
> > It didn't pass, due to this change:
> > 
> >  (print_rtx_operand_code_i): When printing source locations,
> > wrap
> >  xloc.file in quotes. [...snip...]
> [...]
> > The following is a revised version of the patch which updates this
> > test case.
> 
> Also ok.

(committed to trunk as r241120)

> This reminds me, wrapping the filename in quotes was a side
> issue - what I was really hoping for was to have testcases without
> this
> visual clutter unless they wanted to explicitly test functionality
> related to it.

The following patch omits the INSN_LOCATION in compact mode.
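
The intended behaviour can be modelled with a small stand-alone C sketch. It mirrors the " \"%s\":%i" format used by the dumper, but format_insn_location and its compact parameter are hypothetical names for this example, not the print-rtl.c interface.

```c
#include <assert.h>
#include <stdio.h>

/* Write the source location of an insn to BUF in the dump format
   " \"file\":line", unless COMPACT is nonzero or there is no file.
   Returns the number of characters written.  */
static int format_insn_location (char *buf, size_t size, const char *file,
                                 int line, int compact)
{
  assert (buf != NULL && size > 0);
  if (file == NULL || compact)
    {
      buf[0] = '\0';
      return 0;
    }
  return snprintf (buf, size, " \"%s\":%i", file, line);
}
```

In this model a non-compact dump of file t.c, line 3 yields the suffix ` "t.c":3`, while a compact dump yields nothing.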

Currently bootstrapping

OK for trunk if it passes?

Alternatively, it seems that we might want an additional flag for
this.  If so, maybe it's time to introduce a "class rtx_writer" or
similar to hold the global state relating to dumping, and rewrite
the dumping in those terms?

gcc/ChangeLog:
* print-rtl-function.c (print_rtx_function): Update comment for
omission of INSN_LOCATIONs in compact mode.
* print-rtl.c (print_rtx_operand_code_i): Omit INSN_LOCATIONs in
compact mode.
---
 gcc/print-rtl-function.c | 13 +++--
 gcc/print-rtl.c  |  5 +++--
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/gcc/print-rtl-function.c b/gcc/print-rtl-function.c
index 90a0ff7..87a6458 100644
--- a/gcc/print-rtl-function.c
+++ b/gcc/print-rtl-function.c
@@ -133,6 +133,7 @@ can_have_basic_block_p (const rtx_insn *insn)
- INSN_CODEs are omitted,
- register numbers are omitted for hard and virtual regs
- insn names are prefixed with "c" (e.g. "cinsn", "cnote", etc)
+   - INSN_LOCATIONs are omitted.
 
Example output (with COMPACT==true):
 
@@ -144,30 +145,30 @@ can_have_basic_block_p (const rtx_insn *insn)
 (cnote [bb 2] NOTE_INSN_BASIC_BLOCK)
 (cinsn (set (mem/c:SI (plus:DI (reg/f:DI virtual-stack-vars)
   (const_int -4)) [1 i+0 S4 A32])
-  (reg:SI di [ i ])) "t.c":2
+  (reg:SI di [ i ]))
   (nil))
 (cnote NOTE_INSN_FUNCTION_BEG)
 (cinsn (set (reg:SI 89)
   (mem/c:SI (plus:DI (reg/f:DI virtual-stack-vars)
-  (const_int -4)) [1 i+0 S4 A32])) "t.c":3
+  (const_int -4)) [1 i+0 S4 A32]))
   (nil))
 (cinsn (parallel [
   (set (reg:SI 87 [ _2 ])
   (ashift:SI (reg:SI 89)
   (const_int 1)))
   (clobber (reg:CC flags))
-  ]) "t.c":3
+  ])
  (expr_list:REG_EQUAL (ashift:SI (mem/c:SI (plus:DI (reg/f:DI virtual-stack-vars)
   (const_int -4)) [1 i+0 S4 A32])
   (const_int 1))
   (nil)))
 (cinsn (set (reg:SI 88 [  ])
-  (reg:SI 87 [ _2 ])) "t.c":3
+  (reg:SI 87 [ _2 ]))
   (nil))
 (cinsn (set (reg/i:SI ax)
-  (reg:SI 88 [  ])) "t.c":4
+  (reg:SI 88 [  ]))
   (nil))
-(cinsn (use (reg/i:SI ax)) "t.c":4
+(cinsn (use (reg/i:SI ax))
   (nil))
 (edge-to exit (flags "FALLTHRU"))
) ;; block 2
diff --git a/gcc/print-rtl.c b/gcc/print-rtl.c
index f114cb4..2bf7a13 100644
--- a/gcc/print-rtl.c
+++ b/gcc/print-rtl.c
@@ -288,8 +288,9 @@ print_rtx_operand_code_i (const_rtx in_rtx, int idx)
 
   /*  Pretty-print insn locations.  Ignore scoping as it is mostly
  redundant with line number information and do not print anything
- when there is no location information available.  */
-  if (INSN_HAS_LOCATION (in_insn))
+ when there is no location information available.
+ Don't print locations when in compact mode.  */
+  if (INSN_HAS_LOCATION (in_insn) && !flag_compact)
{
  expanded_location xloc = insn_location (in_insn);
  fprintf (outfile, " \"%s\":%i", xloc.file, xloc.line);
-- 
1.8.5.3

Re: [PATCH GCC]Simplify (convert)(X op const) -> (convert)X op (convert)const by match


On 10/12/2016 02:48 AM, Richard Biener wrote:

On Tue, Oct 11, 2016 at 11:34 PM, Marc Glisse  wrote:

On Tue, 11 Oct 2016, Bin Cheng wrote:


We missed folding (convert)(X op const) -> (convert)X op (convert)const
for unsigned narrowing because of reason reported at
https://gcc.gnu.org/ml/gcc/2016-07/msg00126.html
This patch fixes the issue by adding new match pattern, it also
adds a test case.  This is the prerequisite patch for next patch adding new
vectorization pattern.
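
For unsigned narrowing with addition, the identity behind this fold can be checked directly in C. The sketch below is only an illustration of why the transformation is value-preserving in that case; it says nothing about the exact conditions of the match.pd pattern, and the function names are invented.

```c
#include <assert.h>
#include <stdint.h>

/* Add in the wide type, then narrow.  */
static uint8_t narrow_after (uint32_t x, uint32_t c)
{
  return (uint8_t) (x + c);
}

/* Narrow both operands first, then add in the narrow type.  */
static uint8_t narrow_before (uint32_t x, uint32_t c)
{
  return (uint8_t) ((uint8_t) x + (uint8_t) c);
}

/* Check the identity over a range of inputs, including wrap-around.  */
static void check_narrowing (void)
{
  for (uint32_t x = 0xfffffe00u; x != 0x200u; x++)
    for (uint32_t c = 0; c < 300; c++)
      assert (narrow_after (x, c) == narrow_before (x, c));
}
```

Both forms agree because unsigned arithmetic is modular, so the low bits of the sum depend only on the low bits of the operands.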



Some technical comments below. I am sure Jeff and/or Richi will have more to
say on the approach. I am a bit surprised to see it as adding a new
transformation, instead of moving an old one.


The "old one" would be c-family/c-common.c:shorten_binary_op.  It's generally
preferred to move stuff, preserving semantics.

Right.   Kai and I hadn't looked much at shorten_binary_op (focusing 
more on shorten_compare).  But the same principles apply to both.


Namely that the existing routines should be twiddled to handle warnings 
only, but not modify the underlying IL.  IL modifications 
(canonicalization and optimization) should be moved into the match.pd 
framework.


When Kai left Red Hat, that work stalled.  I've got bits and pieces of 
that work lying around, but I don't think they'd help Bin's work right now.




There is also already a bunch of similar match.pd patterns here:

[ ... ]

Right.   Those were a first start at handling some of the desired 
narrowing, focused primarily on BZs that required narrowing to resolve. 
Like Kai's work, I have some generalizations and improvements in a 
half-completed state here, but haven't had time to work on them.



Jeff

[PATCH] Fold __builtin_memchr (simplified version 4)

Simplified version that supports only valid null-terminated string constants.
Apart from that, I added checking for constant folding of expressions that
have side effects.
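
The restriction to valid null-terminated constants can be illustrated with a host-side C sketch. It follows the length test of the removed fold_builtin_memchr (fold only when LEN does not exceed strlen + 1), but fold_memchr_cst and its return convention are made up for this example and are not the gimple-fold.c code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Try to evaluate memchr (S, C, LEN) for a NUL-terminated constant S.
   Returns 1 and stores the match offset (or -1 for "not found") in
   *OFFSET when the call can be folded; returns 0 when LEN reads past
   the terminating NUL and the call must be left alone.  */
static int fold_memchr_cst (const char *s, int c, size_t len,
                            ptrdiff_t *offset)
{
  assert (s != NULL && offset != NULL);
  if (len > strlen (s) + 1)
    return 0;

  const char *r = memchr (s, c, len);
  *offset = r ? r - s : -1;
  return 1;
}
```

Under this model, a call such as memchr ("", 'x', 1000) is not folded, since it would read past the end of the constant.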

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin
>From 5028bf5cf23cda31e72a342b821474ed0c3c07b9 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Thu, 13 Oct 2016 10:30:56 +0200
Subject: [PATCH 3/4] Fold __builtin_memchr function

gcc/ChangeLog:

2016-10-13  Martin Liska  

	* builtins.h (target_char_cst_p): Declare the function.
	* builtins.c (fold_builtin_memchr): Remove.
	(target_char_cst_p): Move the function from gimple-fold.c.
	(fold_builtin_3): Do not call the function.
	* gimple-fold.c (gimple_fold_builtin_memchr): New function.
	(gimple_fold_builtin): Call the function.
	* fold-const-call.c (fold_const_call_1): Handle CFN_BUILT_IN_MEMCHR.
---
 gcc/builtins.c| 59 ++-
 gcc/builtins.h|  1 +
 gcc/fold-const-call.c | 31 +
 gcc/gimple-fold.c | 77 +--
 4 files changed, 109 insertions(+), 59 deletions(-)

diff --git a/gcc/builtins.c b/gcc/builtins.c
index ed5a635..03d8563 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -148,7 +148,6 @@ static tree rewrite_call_expr (location_t, tree, int, tree, int, ...);
 static bool validate_arg (const_tree, enum tree_code code);
 static rtx expand_builtin_fabs (tree, rtx, rtx);
 static rtx expand_builtin_signbit (tree, rtx);
-static tree fold_builtin_memchr (location_t, tree, tree, tree, tree);
 static tree fold_builtin_memcmp (location_t, tree, tree, tree);
 static tree fold_builtin_isascii (location_t, tree);
 static tree fold_builtin_toascii (location_t, tree);
@@ -7242,47 +7241,6 @@ fold_builtin_sincos (location_t loc,
 			 fold_build1_loc (loc, REALPART_EXPR, type, call)));
 }
 
-/* Fold function call to builtin memchr.  ARG1, ARG2 and LEN are the
-   arguments to the call, and TYPE is its return type.
-   Return NULL_TREE if no simplification can be made.  */
-
-static tree
-fold_builtin_memchr (location_t loc, tree arg1, tree arg2, tree len, tree type)
-{
-  if (!validate_arg (arg1, POINTER_TYPE)
-  || !validate_arg (arg2, INTEGER_TYPE)
-  || !validate_arg (len, INTEGER_TYPE))
-return NULL_TREE;
-  else
-{
-  const char *p1;
-
-  if (TREE_CODE (arg2) != INTEGER_CST
-	  || !tree_fits_uhwi_p (len))
-	return NULL_TREE;
-
-  p1 = c_getstr (arg1);
-  if (p1 && compare_tree_int (len, strlen (p1) + 1) <= 0)
-	{
-	  char c;
-	  const char *r;
-	  tree tem;
-
-	  if (target_char_cast (arg2, &c))
-	return NULL_TREE;
-
-	  r = (const char *) memchr (p1, c, tree_to_uhwi (len));
-
-	  if (r == NULL)
-	return build_int_cst (TREE_TYPE (arg1), 0);
-
-	  tem = fold_build_pointer_plus_hwi_loc (loc, arg1, r - p1);
-	  return fold_convert_loc (loc, type, tem);
-	}
-  return NULL_TREE;
-}
-}
-
 /* Fold function call to builtin memcmp with arguments ARG1 and ARG2.
Return NULL_TREE if no simplification can be made.  */
 
@@ -8338,9 +8296,6 @@ fold_builtin_3 (location_t loc, tree fndecl,
 	return do_mpfr_remquo (arg0, arg1, arg2);
 break;
 
-case BUILT_IN_MEMCHR:
-  return fold_builtin_memchr (loc, arg0, arg1, arg2, type);
-
 case BUILT_IN_BCMP:
 case BUILT_IN_MEMCMP:
   return fold_builtin_memcmp (loc, arg0, arg1, arg2);;
@@ -9906,3 +9861,17 @@ is_inexpensive_builtin (tree decl)
 
   return false;
 }
+
+/* Return true if T is a constant and the value cast to a target char
+   can be represented by a host char.
+   Store the casted char constant in *P if so.  */
+
+bool
+target_char_cst_p (tree t, char *p)
+{
+  if (!tree_fits_uhwi_p (t) || CHAR_TYPE_SIZE != HOST_BITS_PER_CHAR)
+return false;
+
+  *p = (char)tree_to_uhwi (t);
+  return true;
+}
diff --git a/gcc/builtins.h b/gcc/builtins.h
index 8d0acd0..5e83646 100644
--- a/gcc/builtins.h
+++ b/gcc/builtins.h
@@ -97,6 +97,7 @@ extern unsigned HOST_WIDE_INT target_percent;
 extern char target_percent_s[3];
 extern char target_percent_c[3];
 extern char target_percent_s_newline[4];
+extern bool target_char_cst_p (tree t, char *p);
 
 extern internal_fn associated_internal_fn (tree);
 extern internal_fn replacement_internal_fn (gcall *);
diff --git a/gcc/fold-const-call.c b/gcc/fold-const-call.c
index f67b245..05a15f9 100644
--- a/gcc/fold-const-call.c
+++ b/gcc/fold-const-call.c
@@ -28,6 +28,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "fold-const-call.h"
 #include "case-cfn-macros.h"
 #include "tm.h" /* For C[LT]Z_DEFINED_AT_ZERO.  */
+#include "builtins.h"
 
 /* Functions that test for certain constant types, abstracting away the
decision about whether to check for overflow.  */
@@ -1463,6 +1464,36 @@ fold_const_call_1 (combined_fn fn, tree type, tree arg0, tree arg1, tree arg2)
   return NULL_TREE;
 }
 
+  switch (fn)
+{
+case

[PATCH] Test folding of str{n}{case}cmp and memchr (simplified version 4)

Simplified version of tests, where I added tests for side effects.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin
>From 83da10e2bd4f4e36028ca33d7d3a0472e8b46d7a Mon Sep 17 00:00:00 2001
From: marxin 
Date: Tue, 16 Aug 2016 15:56:01 +0200
Subject: [PATCH 4/4] Test folding of str{n}{case}cmp and memchr

gcc/testsuite/ChangeLog:

2016-08-16  Martin Liska  

	* gcc.dg/tree-ssa/builtins-folding-generic.c: New test.
	* gcc.dg/tree-ssa/builtins-folding-gimple.c: Likewise.
	* gcc.dg/tree-ssa/builtins-folding-gimple-ub.c: Likewise.
---
 .../gcc.dg/tree-ssa/builtins-folding-generic.c |  76 ++
 .../gcc.dg/tree-ssa/builtins-folding-gimple-ub.c   |  23 +++
 .../gcc.dg/tree-ssa/builtins-folding-gimple.c  | 161 +
 3 files changed, 260 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/builtins-folding-generic.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/builtins-folding-gimple-ub.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/builtins-folding-gimple.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/builtins-folding-generic.c b/gcc/testsuite/gcc.dg/tree-ssa/builtins-folding-generic.c
new file mode 100644
index 000..175feff
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/builtins-folding-generic.c
@@ -0,0 +1,76 @@
+/* { dg-do run } */
+/* { dg-options "-O1 -fdump-tree-original" } */
+
+char *buffer1;
+char *buffer2;
+
+#define SIZE 1000
+
+int
+main (void)
+{
+  const char* const foo1 = "hello world";
+
+  buffer1 = __builtin_malloc (SIZE);
+  __builtin_strcpy (buffer1, foo1);
+  buffer2 = __builtin_malloc (SIZE);
+  __builtin_strcpy (buffer2, foo1);
+
+  /* MEMCHR.  */
+  if (__builtin_memchr ("hello world", 'x', 11))
+__builtin_abort ();
+  if (__builtin_memchr ("hello world", 'x', 0) != 0)
+__builtin_abort ();
+  if (__builtin_memchr ("hello world", 'w', 2))
+__builtin_abort ();
+  if (__builtin_memchr ("hello world", 'd', 10))
+__builtin_abort ();
+  if (__builtin_memchr ("hello world", '\0', 11))
+__builtin_abort ();
+
+  /* STRCMP.  */
+  if (__builtin_strcmp ("hello", "a") <= 0)
+__builtin_abort ();
+  if (__builtin_strcmp ("a", "a") != 0)
+__builtin_abort ();
+  if (__builtin_strcmp ("a", "") <= 0)
+__builtin_abort ();
+  if (__builtin_strcmp ("", "a") >= 0)
+__builtin_abort ();
+  if (__builtin_strcmp ("ab", "ba") >= 0)
+__builtin_abort ();
+
+  /* STRNCMP.  */
+  if (__builtin_strncmp ("hello", "a", 0) != 0)
+__builtin_abort ();
+  if (__builtin_strncmp ("a", "a", 100) != 0)
+__builtin_abort ();
+  if (__builtin_strncmp ("a", "", 100) <= 0)
+__builtin_abort ();
+  if (__builtin_strncmp ("", "a", 100) >= 0)
+__builtin_abort ();
+  if (__builtin_strncmp ("ab", "ba", 1) >= 0)
+__builtin_abort ();
+  if (__builtin_strncmp ("aab", "aac", 2) != 0)
+__builtin_abort ();
+
+  /* STRCASECMP.  */
+  if (__builtin_strcasecmp ("a", "a") != 0)
+__builtin_abort ();
+
+  /* STRNCASECMP.  */
+  if (__builtin_strncasecmp ("hello", "a", 0) != 0)
+__builtin_abort ();
+  if (__builtin_strncasecmp ("a", "a", 100) != 0)
+__builtin_abort ();
+  if (__builtin_strncasecmp ("aab", "aac", 2) != 0)
+__builtin_abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-not "__builtin_strcmp" "original" } } */
+/* { dg-final { scan-tree-dump-not "__builtin_strcasecmp" "original" } } */
+/* { dg-final { scan-tree-dump-not "__builtin_strncmp" "original" } } */
+/* { dg-final { scan-tree-dump-not "__builtin_strncasecmp" "original" } } */
+/* { dg-final { scan-tree-dump-not "__builtin_memchr" "original" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/builtins-folding-gimple-ub.c b/gcc/testsuite/gcc.dg/tree-ssa/builtins-folding-gimple-ub.c
new file mode 100644
index 000..df0ede2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/builtins-folding-gimple-ub.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-optimized" } */
+
+char *buffer1;
+char *buffer2;
+
+#define SIZE 1000
+
+int
+main (void)
+{
+  const char* const foo1 = "hello world";
+
+  /* MEMCHR.  */
+  if (__builtin_memchr ("", 'x', 1000)) /* Not folded away.  */
+__builtin_abort ();
+  if (__builtin_memchr (foo1, 'x', 1000)) /* Not folded away.  */
+__builtin_abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "__builtin_memchr" 2 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/builtins-folding-gimple.c b/gcc/testsuite/gcc.dg/tree-ssa/builtins-folding-gimple.c
new file mode 100644
index 000..283bd1c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/builtins-folding-gimple.c
@@ -0,0 +1,161 @@
+/* { dg-do run } */
+/* { dg-options "-O1 -fdump-tree-optimized" } */
+
+char *buffer1;
+char *buffer2;
+
+#define SIZE 1000
+
+int
+main (void)
+{
+  const char* const foo1 = "hello world";
+
+  buffer1 = __builtin_malloc (SIZE);
+

[PATCH, GCC/ARM 1/2] Add multilib support for embedded bare-metal targets

2016-10-13 Thread Thomas Preudhomme


Hi ARM maintainers,

This patchset aims at adding multilib support for R and M profile ARM 
architectures and allowing it to be built alongside multilib for A profile ARM 
architectures. This specific patch adds the t-rmprofile multilib Makefile 
fragment for the former objective. Multilibs are built for all M profile 
architectures involved: ARMv6S-M, ARMv7-M and ARMv7E-M, as well as ARMv7. The 
ARMv7 multilib is used for R profile architectures but also A profile architectures.


ChangeLog entry is as follows:


*** gcc/ChangeLog ***

2016-10-03  Thomas Preud'homme  

* config.gcc: Allow new rmprofile value for configure option
--with-multilib-list.
* config/arm/t-rmprofile: New file.
* doc/install.texi (--with-multilib-list): Document new rmprofile value
for ARM.


Testing:

== aprofile ==
* "tree install/lib/gcc/arm-none-eabi/7.0.0" is the same before and after the 
patchset for both aprofile and rmprofile
* default spec (gcc -dumpspecs) is the same before and after the patchset for 
aprofile
* No difference in --print-multi-directory between before and after the patchset 
for aprofile for all combinations of ISA (ARM/Thumb), architecture, CPU, FPU and 
float ABI


== rmprofile ==
* aprofile and rmprofile use similar directory structure (ISA/arch/FPU/float 
ABI) and directory naming
* Difference in --print-multi-directory between before [1] and after the 
patchset for rmprofile for all combination of ISA (ARM/Thumb), architecture, 
CPU, FPU and float ABI modulo the name and directory structure changes


[1] as per patch applied in ARM embedded branches 
https://gcc.gnu.org/viewcvs/gcc/branches/ARM/embedded-5-branch/gcc/config/arm/t-baremetal?view=markup


== aprofile + rmprofile ==
* aprofile,rmprofile and rmprofile,aprofile builds give an error saying it is 
not supported



Is this ok for master branch?

Best regards,

Thomas
diff --git a/gcc/config.gcc b/gcc/config.gcc
index e544d767b4e364c8853d7ece3bffac22840fd51b..bfd1127d6e8e647ca8c3a57dd2d58b586dffe4a5 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -3721,6 +3721,16 @@ case "${target}" in
 # pragmatic.
 tmake_profile_file="arm/t-aprofile"
 ;;
+			rmprofile)
+# Note that arm/t-rmprofile is a
+# stand-alone make file fragment to be
+# used only with itself.  We do not
+# specifically use the
+# TM_MULTILIB_OPTION framework because
+# this shorthand is more
+# pragmatic.
+tmake_profile_file="arm/t-rmprofile"
+;;
 			default)
 ;;
 			*)
@@ -3730,9 +3740,10 @@ case "${target}" in
 			esac
 
 			if test "x${tmake_profile_file}" != x ; then
-# arm/t-aprofile is only designed to work
-# without any with-cpu, with-arch, with-mode,
-# with-fpu or with-float options.
+# arm/t-aprofile and arm/t-rmprofile are only
+# designed to work without any with-cpu,
+# with-arch, with-mode, with-fpu or with-float
+# options.
 if test "x$with_arch" != x \
 || test "x$with_cpu" != x \
 || test "x$with_float" != x \
diff --git a/gcc/config/arm/t-rmprofile b/gcc/config/arm/t-rmprofile
new file mode 100644
index ..c195a6590c2f8e1753a9f2498583f5be89df7d1e
--- /dev/null
+++ b/gcc/config/arm/t-rmprofile
@@ -0,0 +1,172 @@
+# Copyright (C) 2016 Free Software Foundation, Inc.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+# This is a target makefile fragment that attempts to get
+# multilibs built for the range of CPU's, FPU's and ABI's that
+# are relevant for the ARM architecture.  It should not be used in
+# conjunction with another make file fragment and assumes --with-arch,
+# --with-cpu, --with-fpu, --with-float, --with-mode have their default
+# values during the configure step.  We enforce this during the
+# top-level configury.
+
+MULTILIB_OPTIONS =
+MULTILIB_DIRNAMES=
+MULTILIB_EXCEPTIONS  =
+MULTILIB_MATCHES =
+MULTILIB_REUSE   =
+
+# We have the following hierarchy:
+#   ISA: A32 (.) or T16/T32 (thumb).
+#   Architecture: ARMv6S-M (v6-m), ARMv7-M (v7-m), ARMv7E-M (v7e-m),
+# ARMv8-M Baseline (v8-m.base) or ARMv8-M Mainline (v8-m.main).
+#   FPU: VFPv3-D16 (fpv3), FPV4-SP-D16 (fpv4-sp), FPV5-SP-D16 (fpv5-sp),
+#VFPv5-D16 (fpv5), or None (.).
+#   Float-abi: Soft (.), softfp (softfp), or hard (hardfp).
+
+# Options to

Re: [ping * 2] remove optab functions for [us]divmod_optab in optabs.def


On 10/13/2016 07:18 PM, Prathamesh Kulkarni wrote:

On 13 October 2016 at 16:56, Bernd Schmidt  wrote:

On 10/06/2016 07:43 AM, Prathamesh Kulkarni wrote:


Pinging patch: https://gcc.gnu.org/ml/gcc-patches/2016-09/msg01038.html



If I understand correctly this is a latent issue where nonexistent libfunc
names are stored (but not currently used). If that's correct, then OK.

Hi Bernd,
AFAIU it's indeed a latent issue with optab_libfunc() returning
non-existent libfunc
names.


[...]


It seems this code path was probably never triggered to generate a call
to "__udivmoddi4" or "__divmoddi4",
and so the issue remained latent.
Is the patch OK to commit ?


Yes, I think so. Thanks,


Bernd

Re: [PATCH] Remove a few -Wno-error from Makefile.in


On 10/13/2016 12:09 PM, Marek Polacek wrote:

I thought I had already done this, but apparently not.  I added these because
of -Wimplicit-fallthrough, but they're no longer needed, so remove them so as
not to suppress any potentially useful warnings.

Bootstrapped/regtested on x86_64-linux, ppc64-linux and aarch64-linux,
ok for trunk?

2016-10-13  Marek Polacek  

* Makefile.in (insn-attrtab.o-warn, insn-dfatab.o-warn,
insn-latencytab.o-warn, insn-output.o-warn, insn-emit.o-warn): Don't
use -Wno-error.

OK.
jeff

Re: [PATCH] Remove a few -Wno-error from Makefile.in




On 10/13/2016 08:09 PM, Marek Polacek wrote:

I thought I had already done this, but apparently not.  I added these because
of -Wimplicit-fallthrough, but they're no longer needed, so remove them so as
not to suppress any potentially useful warnings.

Bootstrapped/regtested on x86_64-linux, ppc64-linux and aarch64-linux,
ok for trunk?

2016-10-13  Marek Polacek  

* Makefile.in (insn-attrtab.o-warn, insn-dfatab.o-warn,
insn-latencytab.o-warn, insn-output.o-warn, insn-emit.o-warn): Don't
use -Wno-error.


Ok.


Bernd

Go patch committed: don't get backend version of redefinition

2016-10-13 Thread Ian Lance Taylor

This patch to the Go frontend changes it to not try to get the backend
version of a redefinition.  A redefinition is an error anyhow, and
getting the backend version can cause the compiler to crash as it
walks over a list of statements for the second time.  No test case
added as I don't think it's worth adding a test case for a
crash-on-invalid.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 241124)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-681580a3afc687ba3ff9ef240c67e8630e4306e6
+e3913d96fb024b916c87a4dc01f413523467ead9
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/gogo.cc
===
--- gcc/go/gofrontend/gogo.cc   (revision 240942)
+++ gcc/go/gofrontend/gogo.cc   (working copy)
@@ -7214,6 +7214,14 @@ Named_object::get_backend(Gogo* gogo, st
   std::vector<Btype*>& type_decls,
   std::vector<Bfunction*>& func_decls)
 {
+  // If this is a definition, avoid trying to get the backend
+  // representation, as that can crash.
+  if (this->is_redefinition_)
+{
+  go_assert(saw_errors());
+  return;
+}
+
   switch (this->classification_)
 {
 case NAMED_OBJECT_CONST:

[PATCH] Change test to use VERIFY not assert


This used to pass because  included , but
it doesn't now. The test wasn't failing because the stdc++.h PCH
included , but it fails without PCH.

* testsuite/26_numerics/random/default_random_engine.cc: Use VERIFY
instead of assert.

Tested x86_64-linux, committed to trunk.

commit 81c756a8a213ca217602aa1fa90d513989adb2e5
Author: Jonathan Wakely 
Date:   Thu Oct 13 17:30:43 2016 +0100

Change test to use VERIFY not assert

* testsuite/26_numerics/random/default_random_engine.cc: Use VERIFY
instead of assert.

diff --git a/libstdc++-v3/testsuite/26_numerics/random/default_random_engine.cc 
b/libstdc++-v3/testsuite/26_numerics/random/default_random_engine.cc
index 99d5e1f..e21c7ae 100644
--- a/libstdc++-v3/testsuite/26_numerics/random/default_random_engine.cc
+++ b/libstdc++-v3/testsuite/26_numerics/random/default_random_engine.cc
@@ -20,7 +20,7 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
-// 26.4.5 Engines and egine adaptors with predefined parameters [rand.predef]
+// 26.4.5 Engines and engine adaptors with predefined parameters [rand.predef]
 // 26.4.5 [10]
 
 #include 
@@ -38,7 +38,7 @@ test01()
   std::minstd_rand0 b;
   b.discard();
 
-  assert( a() == b() );
+  VERIFY( a() == b() );
 }
 
 int main()

Re: [PATCH] Create the *logue in the same order as before (PR77962)

2016-10-13 Thread Segher Boessenkool

On Thu, Oct 13, 2016 at 03:15:37PM +, Segher Boessenkool wrote:
> Bootstrapping+testing on powerpc64-linux {-m64,-m32}, all languages;
> and on x86_64-linux all,go,obj-c++ (i.e. no ada).
> 
> Is this okay for trunk if testing succeeds?  And sorry for the breakage.

No new failures for either.


Segher

[PATCH] Add missing header in testcase


Another one that fails without PCH.

* testsuite/experimental/algorithm/sample.cc: Add missing header.

Tested x86_64-linux, committed to trunk.

commit b6ba4386fc81777bd4a06b2807b03b38f7f85d76
Author: Jonathan Wakely 
Date:   Thu Oct 13 17:42:03 2016 +0100

Add missing  header in testcase

* testsuite/experimental/algorithm/sample.cc: Add missing header.

diff --git a/libstdc++-v3/testsuite/experimental/algorithm/sample.cc 
b/libstdc++-v3/testsuite/experimental/algorithm/sample.cc
index 19681d7..0d84e9d 100644
--- a/libstdc++-v3/testsuite/experimental/algorithm/sample.cc
+++ b/libstdc++-v3/testsuite/experimental/algorithm/sample.cc
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 std::mt19937 rng;

[PATCH] Avoid #include in other headers


The  header is pretty large, especially in C++17 mode
because the searchers include  and . This
tweaks other headers to avoid including it unnecessarily.

We could reduce things much further by splitting std::function and
std::reference_wrapper into their own headers, which I'm working on.
This kind of internal refactoring helps counteract the fact that the
library keeps growing with each new standard revision so the headers
are larger and larger.

Any time we reduce interdependencies within the library headers we cause
compilation errors for some user code, but the fix is just to add the
missing headers. I'll mention this in the GCC 7 porting-to doc nearer
the release.

* include/bits/shared_ptr_base.h: Include .
[!__cpp_rtti]: Do not include .
* include/experimental/array: Do not include .
* include/experimental/memory: Include 
instead of .
* include/experimental/propagate_const: Include ,
, and  instead of .
* include/experimental/tuple: Do not include .
* include/std/future: Include .
* include/std/memory: Do not include .
* include/std/mutex: [_GLIBCXX_HAVE_TLS]: Likewise.
* testsuite/20_util/shared_ptr/thread/default_weaktoshared.cc: Add
missing includes.
* testsuite/20_util/shared_ptr/thread/mutex_weaktoshared.cc: Likewise.
* testsuite/20_util/specialized_algorithms/memory_management_tools/
1.cc: Likewise.
* testsuite/30_threads/call_once/60497.cc: Likewise.
* testsuite/30_threads/lock/2.cc: Likewise.
* testsuite/30_threads/thread/native_handle/cancel.cc: Likewise.
* testsuite/experimental/algorithm/sample.cc: Likewise.
* testsuite/experimental/array/make_array.cc: Likewise.
* testsuite/experimental/array/neg.cc: Likewise. Adjust dg-error line.
* testsuite/experimental/propagate_const/assignment/move_neg.cc:
Adjust dg-error lines.
* testsuite/experimental/propagate_const/cons/move_neg.cc: Likewise.
* testsuite/experimental/propagate_const/requirements2.cc: Likewise.
* testsuite/experimental/propagate_const/requirements3.cc: Likewise.
* testsuite/experimental/propagate_const/requirements4.cc: Likewise.
* testsuite/experimental/propagate_const/requirements5.cc: Likewise.

Tested powerpc64le-linux (without PCH), committed to trunk.


commit b125f5397462eee46dec2693232afc677df4e421
Author: Jonathan Wakely 
Date:   Thu Oct 13 15:55:18 2016 +0100

Avoid #include  in other headers

* include/bits/shared_ptr_base.h: Include .
[!__cpp_rtti]: Do not include .
* include/experimental/array: Do not include .
* include/experimental/memory: Include 
instead of .
* include/experimental/propagate_const: Include ,
, and  instead of .
* include/experimental/tuple: Do not include .
* include/std/future: Include .
* include/std/memory: Do not include .
* include/std/mutex: [_GLIBCXX_HAVE_TLS]: Likewise.
* testsuite/20_util/shared_ptr/thread/default_weaktoshared.cc: Add
missing includes.
* testsuite/20_util/shared_ptr/thread/mutex_weaktoshared.cc: Likewise.
* testsuite/20_util/specialized_algorithms/memory_management_tools/
1.cc: Likewise.
* testsuite/30_threads/call_once/60497.cc: Likewise.
* testsuite/30_threads/lock/2.cc: Likewise.
* testsuite/30_threads/thread/native_handle/cancel.cc: Likewise.
* testsuite/experimental/algorithm/sample.cc: Likewise.
* testsuite/experimental/array/make_array.cc: Likewise.
* testsuite/experimental/array/neg.cc: Likewise. Adjust dg-error line.
* testsuite/experimental/propagate_const/assignment/move_neg.cc:
Adjust dg-error lines.
* testsuite/experimental/propagate_const/cons/move_neg.cc: Likewise.
* testsuite/experimental/propagate_const/requirements2.cc: Likewise.
* testsuite/experimental/propagate_const/requirements3.cc: Likewise.
* testsuite/experimental/propagate_const/requirements4.cc: Likewise.
* testsuite/experimental/propagate_const/requirements5.cc: Likewise.

diff --git a/libstdc++-v3/include/bits/shared_ptr_base.h 
b/libstdc++-v3/include/bits/shared_ptr_base.h
index e8820a1..422e3b5 100644
--- a/libstdc++-v3/include/bits/shared_ptr_base.h
+++ b/libstdc++-v3/include/bits/shared_ptr_base.h
@@ -49,7 +49,10 @@
 #ifndef _SHARED_PTR_BASE_H
 #define _SHARED_PTR_BASE_H 1
 
-#include 
+#include 
+#if __cpp_rtti
+# include 
+#endif
 #include 
 #include 
 
diff --git a/libstdc++-v3/include/experimental/array 
b/libstdc++-v3/include/experimental/array
index 31a066b..34d75cc 100644
--- a/libstdc++-v3/include/experimental/array
+++ b/libstdc++-v3/include/experimental/array
@@ -36,7 +36,6 @@
 #else
 
 #include 
-#include 
 #include 
 
 namespace std _GLIBCXX_VISIBILITY(default)
diff --git

[committed, arm, testsuite] fix dg-skip-if logic in Xscale-specific tests

2016-10-13 Thread Sandra Loosemore

I noticed that two Xscale-specific tests, gcc.target/arm/scd42-1.c and 
gcc.target/arm/scd42-2.c, were incorrectly being run in test 
configurations explicitly specifying some other incompatible -mcpu.  The 
similar test gcc.target/arm/scd42-3.c was correctly being skipped in 
that case, so I copied the logic from that file to correct the other two 
tests.  Because this was a straight cut-and-paste, I thought this 
qualified as an obvious fix, and have committed it.


-Sandra

2016-10-13  Sandra Loosemore 

	gcc/testsuite/
	* scd42-1.c: Skip if -mcpu incompatible with Xscale is specified,
	not just -march.
	* scd42-2.c: Fix existing logic to skip if -mcpu is incompatible
	with Xscale.
Index: gcc.target/arm/scd42-1.c
===
--- gcc.target/arm/scd42-1.c	(revision 470710)
+++ gcc.target/arm/scd42-1.c	(working copy)
@@ -1,6 +1,7 @@
 /* Verify that mov is preferred on XScale for loading a 1 byte constant. */
 /* { dg-do compile } */
-/* { dg-skip-if "incompatible options" { arm*-*-* } { "-march=*" } { "" } } */
+/* { dg-skip-if "Test is specific to Xscale" { arm*-*-* } { "-march=*" } { "-march=xscale" } } */
+/* { dg-skip-if "Test is specific to Xscale" { arm*-*-* } { "-mcpu=*" } { "-mcpu=xscale" } } */
 /* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=softfp" } } */
 /* { dg-options "-mcpu=xscale -O -mfloat-abi=softfp" } */
 
Index: gcc.target/arm/scd42-2.c
===
--- gcc.target/arm/scd42-2.c	(revision 470710)
+++ gcc.target/arm/scd42-2.c	(working copy)
@@ -1,10 +1,10 @@
 /* Verify that mov is preferred on XScale for loading a 2 byte constant. */
 /* { dg-do compile } */
-/* { dg-options "-mcpu=xscale -O" } */
 /* { dg-skip-if "Test is specific to the Xscale" { arm*-*-* } { "-march=*" } { "-march=xscale" } } */
 /* { dg-skip-if "Test is specific to the Xscale" { arm*-*-* } { "-mcpu=*" } { "-mcpu=xscale" } } */
 /* { dg-skip-if "Test is specific to ARM mode" { arm*-*-* } { "-mthumb" } { "" } } */
 /* { dg-require-effective-target arm32 } */
+/* { dg-options "-mcpu=xscale -O" } */
 
 unsigned load2(void) __attribute__ ((naked));
 unsigned load2(void)

[PATCH] DWARF: make signedness explicit for enumerator const values

2016-10-13 Thread Pierre-Marie de Rodat

Hello,

Currently, the DWARF description does not specify the signedness of the
representation of enumeration types.  This is a problem in some
contexts where DWARF consumers need to determine if value X is greater
than value Y.

For instance in Ada:

type Enum_Type is (A, B, C, D);
for Enum_Type use (-1, 0, 1, 2);

type Rec_Type (E : Enum_Type) is record
   case E is
      when A .. B => null;
      when others => B : Boolean;
   end case;
end record;

The above can be described in DWARF the following way:

DW_TAG_enumeration_type(Enum_Type)
| DW_AT_byte_size: 1
  DW_TAG_enumerator(A)
  | DW_AT_const_value: -1
  DW_TAG_enumerator(B)
  | DW_AT_const_value: 0
  DW_TAG_enumerator(C)
  | DW_AT_const_value: 1
  DW_TAG_enumerator(D)
  | DW_AT_const_value: 2

DW_TAG_structure_type(Rec_Type)
  DW_TAG_member(E)
  | DW_AT_type: 
  DW_TAG_variant_part
  | DW_AT_discr: 
DW_TAG_variant
| DW_AT_discr_list: DW_DSC_range 0x7f 0
DW_TAG_variant
| DW_TAG_member(b)

DWARF consumers need to know that enumerators (A, B, C and D) are signed
in order to determine the set of E values for which Rec_Type has a B
field.  In practice, they need to know how to interpret the 0x7f LEB128
number above (-1, not 127).
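To make the ambiguity concrete, here is a minimal sketch of the two LEB128 decodings. The function names decode_uleb128 and decode_sleb128 are made up for this illustration and are not taken from GCC or GDB; the point is only that the same byte gives different values depending on which decoding the consumer picks.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Decode an unsigned LEB128 number: seven payload bits per byte, least
// significant group first, high bit set on every byte except the last.
std::uint64_t decode_uleb128(const std::vector<std::uint8_t>& bytes)
{
  std::uint64_t result = 0;
  unsigned shift = 0;
  for (std::size_t i = 0; i < bytes.size() && shift < 64; ++i)
    {
      result |= static_cast<std::uint64_t>(bytes[i] & 0x7f) << shift;
      shift += 7;
      if ((bytes[i] & 0x80) == 0)
        break;
    }
  return result;
}

// Decode a signed LEB128 number: same byte layout, but bit 6 of the last
// byte is a sign bit that is extended through the upper bits.
std::int64_t decode_sleb128(const std::vector<std::uint8_t>& bytes)
{
  std::uint64_t result = 0;
  unsigned shift = 0;
  std::uint8_t last = 0;
  for (std::size_t i = 0; i < bytes.size() && shift < 64; ++i)
    {
      last = bytes[i];
      result |= static_cast<std::uint64_t>(last & 0x7f) << shift;
      shift += 7;
      if ((last & 0x80) == 0)
        break;
    }
  if (shift < 64 && (last & 0x40))
    result |= ~std::uint64_t{0} << shift;
  return static_cast<std::int64_t>(result);
}
```

With these helpers, decode_uleb128({0x7f}) yields 127 while decode_sleb128({0x7f}) yields -1, which is exactly the choice a consumer cannot make without knowing the signedness of the enumerators.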

There seem to be only two alternatives to solve this issue: one is to
add a DW_AT_type attribute to DW_TAG_enumeration_type DIEs to make it
point to a base type that specifies the signedness.  The other is to
make sure the form of the DW_AT_const_value attribute carries the
signedness information.  This patch implements the latter.

Currently, most of these attributes are generated with DW_FORM_data*
forms (dw_val_class_unsigned_const).  This patch changes the enumerator
description generation to always use the DW_FORM_[us]data forms instead.
It does so by adding a new dw_val_class ("explicit unsigned const"), using
it for unsigned enumerators and using "[signed] const" for the signed
ones.

Bootstrapped and regtested (GCC+GDB testsuites) successfully on
x86_64-linux.  I also checked that the new testcase fails with current
trunk.  Ok to commit?

Thank you in advance!

gcc/

* dwarf2out.h (enum dw_val_class): Add a
dw_val_class_explicit_unsigned_const class.
(struct dw_val_node): Add a val_explicit_unsigned variant.
* dwarf2out.c (dw_val_equal_p, print_dw_val, attr_checksum,
attr_checksum_ordered, same_dw_val_p, size_of_die, value_format,
output_die): Handle dw_val_class_explicit_unsigned_const.
(add_AT_explicit_unsigned, AT_explicit_unsigned): New functions.
(gen_enumeration_type_die): Use the explicit unsigned const form
for all unsigned enumerator values and use the explicit [signed]
const form for all signed ones.

gcc/testsuite/

* gnat.dg/debug10.adb: New testcase.
---
 gcc/dwarf2out.c   | 61 ++-
 gcc/dwarf2out.h   |  3 ++
 gcc/testsuite/gnat.dg/debug10.adb | 39 +
 3 files changed, 95 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gnat.dg/debug10.adb

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index b5787ef..7022e6c 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -1361,6 +1361,7 @@ dw_val_equal_p (dw_val_node *a, dw_val_node *b)
 
 case dw_val_class_offset:
 case dw_val_class_unsigned_const:
+case dw_val_class_explicit_unsigned_const:
 case dw_val_class_const:
 case dw_val_class_range_list:
 case dw_val_class_lineptr:
@@ -3947,6 +3948,29 @@ AT_unsigned (dw_attr_node *a)
   return a->dw_attr_val.v.val_unsigned;
 }
 
+/* Add an explicitely unsigned integer attribute value to a DIE.  */
+
+static inline void
+add_AT_explicit_unsigned (dw_die_ref die, enum dwarf_attribute attr_kind,
+ unsigned HOST_WIDE_INT unsigned_val)
+{
+  dw_attr_node attr;
+
+  attr.dw_attr = attr_kind;
+  attr.dw_attr_val.val_class = dw_val_class_explicit_unsigned_const;
+  attr.dw_attr_val.val_entry = NULL;
+  attr.dw_attr_val.v.val_explicit_unsigned = unsigned_val;
+  add_dwarf_attr (die, &attr);
+}
+
+static inline unsigned HOST_WIDE_INT
+AT_explicit_unsigned (dw_attr_node *a)
+{
+  gcc_assert (a != NULL
+ && AT_class (a) == dw_val_class_explicit_unsigned_const);
+  return a->dw_attr_val.v.val_explicit_unsigned;
+}
+
 /* Add an unsigned wide integer attribute value to a DIE.  */
 
 static inline void
@@ -5600,6 +5624,7 @@ print_dw_val (dw_val_node *val, bool recurse, FILE 
*outfile)
   fprintf (outfile, HOST_WIDE_INT_PRINT_DEC, val->v.val_int);
   break;
 case dw_val_class_unsigned_const:
+case dw_val_class_explicit_unsigned_const:
   fprintf (outfile, HOST_WIDE_INT_PRINT_UNSIGNED, val->v.val_unsigned);
   break;
 case dw_val_class_const_double:
@@ -5998,6 +6023,7 @@ attr_checksum (dw_attr_node *at, struct md5_ctx *ctx, int 
*mark)
   CHECKSUM (at->dw_attr_val.v.val_int);

Re: [PATCH] - improve sprintf buffer overflow detection (middle-end/49905)

2016-10-13 Thread Martin Sebor

No worries: I've refreshed your patch on top of Thomas Preud'homme's for
PR testsuite/77710 and found that one more bit is needed to fix this
completely. 32-bit Solaris shows three more warnings:

/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:1355:3:
warning: format '%lc' expects argument of type 'wint_t', but argument 6 has
type 'int' [-Wformat=]
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:1356:3:
warning: format '%lc' expects argument of type 'wint_t', but argument 6 has
type 'int' [-Wformat=]
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:1357:3:
warning: format '%lc' expects argument of type 'wint_t', but argument 6 has
type 'int' [-Wformat=]

Rats! I overlooked those in the follow-up patch I committed to fix
the others. I had tested the change with a 32-bit cross compiler
but I still see them in the 32-bit Solaris cross compiler, though
not in the i386 one. I assumed the i386 compiler was a good enough
proxy but now that I've checked more carefully I see that it warns
for %lc with a wchar_t argument such as L'a' but not for int such
as 0, while the 32-bit Solaris compiler warns for %lc with an int
argument and not for wchar_t.

In the i386 compiler wchar_t is long and wint_t is unsigned int while
in the Solaris one both wchar_t and wint_t are long int. Even though
these types and arguments are the same width (and on Solaris even the
same sign), -Wformat still warns.
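One portable way to sidestep the mismatch in user code is to convert the argument to wint_t explicitly. The following is only a sketch under that assumption; format_wide_char is a hypothetical helper and is not what the testsuite fix in this thread does.

```cpp
#include <cstddef>
#include <cstdio>
#include <cwchar>

// Format one wide character with %lc.  The explicit conversion gives the
// argument the exact type %lc expects, whatever the target happens to use
// for wchar_t and wint_t.  (Hypothetical helper, for illustration only.)
int format_wide_char(char* buf, std::size_t size, wchar_t wc)
{
  return std::snprintf(buf, size, "%lc", static_cast<std::wint_t>(wc));
}
```

Since the argument type then always matches the conversion specifier, -Wformat should have nothing to complain about on either target.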

I've fixed this in the test in r241123. Since I didn't manage
to convince Joseph that the warning is unhelpful in our discussion
last week I wasn't going to pursue it but I've now changed my mind.
The warning is obviously detrimental to portability so I've raised
bug 77970 for it.

Thanks
Martin

Fixed as follows:

With this one and your refreshed patch, all failures are gone now for
i386-pc-solaris2.12, sparc-sun-solaris2.12, and x86_64-pc-linux-gnu.

Rainer

[PATCH][AArch64] Use new target pass registration framework for FMA steering pass

2016-10-13 Thread Kyrill Tkachov


Hi all,

This patch moves the aarch64-specific FMA steering pass registration into the
new framework that Jakub introduced. With this patch the RTL dump for the
steering pass is now numbered properly so that it appears after the regrename
pass, rather than getting a dump number that puts it after all the other
passes.

I've followed a similar approach to [1] and added an aarch64-passes.def file
and updated PASSES_EXTRA in t-aarch64. I deleted cortex-a57-fma-steering.h as
I don't think it adds any value. The prototype for the make_pass* function
works just as well in aarch64-protos.h I think.

Bootstrapped and tested on aarch64-none-linux-gnu.
Manually checked that the pass still runs when tuning for Cortex-A57 and 
doesn't run otherwise.

Ok for trunk?

Thanks,
Kyrill

[1] https://gcc.gnu.org/ml/gcc-patches/2016-10/msg00615.html

2016-10-13  Kyrylo Tkachov  

* config/aarch64/aarch64.c: Delete inclusion of
cortex-a57-fma-steering.h.
(aarch64_override_options): Delete call
to aarch64_register_fma_steering.
* config/aarch64/aarch64-protos.h (make_pass_fma_steering): Declare.
* config/aarch64/cortex-a57-fma-steering.h: Delete.
* config/aarch64/aarch64-passes.def: New file.
* config/aarch64/cortex-a57-fma-steering.c
(aarch64_register_fma_steering): Delete definition.
(make_pass_fma_steering): Remove static qualifier.
* config/aarch64/t-aarch64 (PASSES_EXTRA): New directive.
(cortex-a57-fma-steering.o): Remove dependency on
cortex-a57-fma-steering.h.
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 4c551ef143d3b32e94bd58989c85ebd3352cdd9b..b6ca3dfacb0dc88e5d688905d9d013263d4e8d7f 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -464,4 +464,6 @@ enum aarch64_parse_opt_result aarch64_parse_extension (const char *,
 std::string aarch64_get_extension_string_for_isa_flags (unsigned long,
 			unsigned long);
 
+rtl_opt_pass *make_pass_fma_steering (gcc::context *ctxt);
+
 #endif /* GCC_AARCH64_PROTOS_H */
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index ef8b8a24388d2f8e21271e0285b8d9d48078e759..e7556632901177c04f9884be4f3ee40e5f677917 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -64,7 +64,6 @@
 #include "rtl-iter.h"
 #include "tm-constrs.h"
 #include "sched-int.h"
-#include "cortex-a57-fma-steering.h"
 #include "target-globals.h"
 #include "common/common-target.h"
 
@@ -8561,9 +8560,6 @@ aarch64_override_options (void)
  while processing functions with potential target attributes.  */
   target_option_default_node = target_option_current_node
   = build_target_option_node (&global_options);
-
-  aarch64_register_fma_steering ();
-
 }
 
 /* Implement targetm.override_options_after_change.  */
diff --git a/gcc/config/aarch64/cortex-a57-fma-steering.h b/gcc/config/aarch64/aarch64-passes.def
similarity index 78%
rename from gcc/config/aarch64/cortex-a57-fma-steering.h
rename to gcc/config/aarch64/aarch64-passes.def
index 65bf5acc132d2db645d1b00ef031dc33a195bb78..7fe80391a3fb0dc79715b9fb23fd4c08a9d26d74 100644
--- a/gcc/config/aarch64/cortex-a57-fma-steering.h
+++ b/gcc/config/aarch64/aarch64-passes.def
@@ -1,6 +1,5 @@
-/* This file contains declarations for the FMA steering optimization
-   pass for Cortex-A57.
-   Copyright (C) 2015-2016 Free Software Foundation, Inc.
+/* AArch64-specific passes declarations.
+   Copyright (C) 2016 Free Software Foundation, Inc.
Contributed by ARM Ltd.
 
This file is part of GCC.
@@ -19,4 +18,4 @@
along with GCC; see the file COPYING3.  If not see
.  */
 
-void aarch64_register_fma_steering (void);
+INSERT_PASS_AFTER (pass_regrename, 1, pass_fma_steering);
diff --git a/gcc/config/aarch64/cortex-a57-fma-steering.c b/gcc/config/aarch64/cortex-a57-fma-steering.c
index 1bf804b4873c6b32e0eb3d640a74c2e52843e796..b5f329f75a6ccfcdf5a1d1dda6758bdd87ba 100644
--- a/gcc/config/aarch64/cortex-a57-fma-steering.c
+++ b/gcc/config/aarch64/cortex-a57-fma-steering.c
@@ -35,7 +35,6 @@
 #include "context.h"
 #include "tree-pass.h"
 #include "regrename.h"
-#include "cortex-a57-fma-steering.h"
 #include "aarch64-protos.h"
 
 /* For better performance, the destination of FMADD/FMSUB instructions should
@@ -1068,21 +1067,8 @@ public:
 
 /* Create a new fma steering pass instance.  */
 
-static rtl_opt_pass *
+rtl_opt_pass *
 make_pass_fma_steering (gcc::context *ctxt)
 {
   return new pass_fma_steering (ctxt);
 }
-
-/* Register the FMA steering pass to the pass manager.  */
-
-void
-aarch64_register_fma_steering ()
-{
-  opt_pass *pass_fma_steering = make_pass_fma_steering (g);
-
-  struct register_pass_info fma_steering_info
-= { pass_fma_steering, "rnreg", 1, PASS_POS_INSERT_AFTER };
-
-  register_pass (&fma_steering_info);
-}
diff --git a/gcc/config/aarch64/t-aarch64 b/gcc/config/aarch64/t-aarch64
index

Re: [PATCH] Create the *logue in the same order as before (PR77962)


On 10/13/2016 09:15 AM, Segher Boessenkool wrote:

PR77962 shows Go failing on 32-bit x86.  This happens because the i386
port requires the split stack prologue to be created before the normal
prologue, and my previous patch changed it to be the other way around.

This patch changes it back.  Things will be exactly as before for targets
that do not do shrink-wrapping for separate components.  For targets
that *do* support it, all three prologue/epilogue creation functions
will now be called twice for functions that have anything wrapped
separately (instead of just the prologue created twice).

Bootstrapping+testing on powerpc64-linux {-m64,-m32}, all languages;
and on x86_64-linux all,go,obj-c++ (i.e. no ada).

Is this okay for trunk if testing succeeds?  And sorry for the breakage.


Segher


2016-10-13  Segher Boessenkool  

PR bootstrap/77962
* function.c (thread_prologue_and_epilogue_insns): Call all
make_*logue_seq in the same order as traditional.  Call them
all a second time if shrink_wrapped-separate.

OK.
jeff

Re: [ping * 2] remove optab functions for [us]divmod_optab in optabs.def

2016-10-13 Thread Prathamesh Kulkarni

On 13 October 2016 at 16:56, Bernd Schmidt  wrote:
> On 10/06/2016 07:43 AM, Prathamesh Kulkarni wrote:
>>
>> Pinging patch: https://gcc.gnu.org/ml/gcc-patches/2016-09/msg01038.html
>
>
> If I understand correctly this is a latent issue where nonexistent libfunc
> names are stored (but not currently used). If that's correct, then OK.
Hi Bernd,
AFAIU it's indeed a latent issue with optab_libfunc() returning
non-existent libfunc
names.

In expand_twoval_binop_libfunc() we have:
 mode = GET_MODE (op0);
  libfunc = optab_libfunc (binoptab, mode);
  if (!libfunc)
return false;

When binoptab is sdivmod_optab:
optab_libfunc () could return a bogus libfunc like "__divmoddi4",
resulting in a link error. That's because optab_libfunc() calls gen_int_libfunc()
which lazily constructs a libfunc for "__divmoddi4" and returns it.
This is the same issue I came across when implementing the divmod transform.

When binoptab is udivmod_optab:
optab_libfunc() could return "__udivmoddi4", which would result
in wrong code. That's because expand_twoval_binop_libfunc() expects
the libfunc to take two arguments and return a result whose mode is
twice that of its arguments, whereas __udivmoddi4() takes 3 arguments,
the 3rd being a pointer through which the remainder is stored, and
returns the quotient.
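The mismatch between the two calling conventions can be sketched as follows. Both function names are hypothetical stand-ins written for this illustration (neither is a libgcc entry point), and the packing order of quotient and remainder in the first one is chosen arbitrarily.

```cpp
#include <cstdint>

// Convention A (hypothetical): two operands in, one result out that is
// twice as wide as the operands and carries both values.  For this sketch
// the quotient sits in the low half and the remainder in the high half.
std::uint64_t packed_udivmod32(std::uint32_t a, std::uint32_t b)
{
  std::uint64_t quotient = a / b;
  std::uint64_t remainder = a % b;
  return (remainder << 32) | quotient;
}

// Convention B (hypothetical): three arguments, mirroring the shape
// described above for __udivmoddi4.  The quotient is the return value and
// the remainder is stored through the pointer.
std::uint64_t pointer_udivmod64(std::uint64_t a, std::uint64_t b,
                                std::uint64_t* rem)
{
  if (rem)
    *rem = a % b;
  return a / b;
}
```

A caller that sets up a call for convention A but ends up in a function using convention B passes no valid third argument and misreads the result, which is why the udivmod case would produce wrong code rather than a link error.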

Currently the only way to generate a call to a divmod libfunc is via
expand_twoval_binop_libfunc(),
which is called *only* from expand_divmod() if the mod libfunc does not exist:

/* No remainder function.  Try a quotient-and-remainder
   function, keeping the remainder.  */
  if (!remainder)
{
  remainder = gen_reg_rtx (compute_mode);
  if (!expand_twoval_binop_libfunc
  (unsignedp ? udivmod_optab : sdivmod_optab,
   op0, op1,
   NULL_RTX, remainder,
   unsignedp ? UMOD : MOD))
remainder = NULL_RTX;
}

It seems this code path was probably never triggered to generate a call
to "__udivmoddi4" or "__divmoddi4",
and so the issue remained latent.
Is the patch OK to commit ?

Thanks,
Prathamesh
>
>
> Bernd

[PATCH] Qualify use of std::declval to avoid ADL


* include/experimental/propagate_const (element_type): Qualify
declval.

Tested x86_64-linux, committed to trunk.
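As a generic illustration of why library code qualifies its internal calls, the sketch below shows argument-dependent lookup picking up a function from the caller's namespace. The namespaces lib and user and the function describe are hypothetical, and this is not the declval case itself, only the general mechanism.

```cpp
namespace user {
  struct Widget {};

  // Found by ADL for any unqualified describe(...) call with a Widget.
  inline int describe(const Widget&) { return 2; }
}

namespace lib {
  template<typename T>
  int describe(const T&) { return 1; }

  // Unqualified call: ADL adds user::describe to the overload set, and
  // the non-template overload wins for a Widget argument.
  template<typename T>
  int unqualified(const T& t) { return describe(t); }

  // Qualified call: ADL is suppressed, so only lib::describe is considered.
  template<typename T>
  int qualified(const T& t) { return lib::describe(t); }
}
```

Here lib::unqualified(user::Widget{}) returns 2, while lib::qualified(user::Widget{}) returns 1; qualifying the name keeps the library's behaviour independent of whatever the user's namespace declares.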

commit 1df5cda6740d67ac7074fbbf03178d25b45549bb
Author: Jonathan Wakely 
Date:   Thu Oct 13 17:39:18 2016 +0100

Qualify use of std::declval to avoid ADL

* include/experimental/propagate_const (element_type): Qualify
declval.

diff --git a/libstdc++-v3/include/experimental/propagate_const 
b/libstdc++-v3/include/experimental/propagate_const
index 15ffe4a..e1fb4e4 100644
--- a/libstdc++-v3/include/experimental/propagate_const
+++ b/libstdc++-v3/include/experimental/propagate_const
@@ -63,7 +63,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 class propagate_const
 {
 public:
-  typedef remove_reference_t<decltype(*declval<_Tp&>())> element_type;
+  typedef remove_reference_t<decltype(*std::declval<_Tp&>())> element_type;
 
 private:
   template

Re: [PATCH, libfortran] PR 48587 Newunit allocator

2016-10-13 Thread Jerry DeLisle


On 10/13/2016 08:16 AM, Janne Blomqvist wrote:

Currently GFortran never reuses unit numbers allocated with NEWUNIT=,
instead having a simple counter that is decremented each time such a
unit is opened.


I am going to have to study this a bit. After I added the newunit stack, the
units are reused. I need to see how this plays with internal units,
recursive I/O, and DTIO.
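For readers following along, the difference between a plain decrementing counter and one backed by a stack of released numbers can be sketched like this. The class NewunitAllocator and the starting value of -10 are assumptions made for the illustration; this is not the libgfortran implementation.

```cpp
#include <vector>

// Hypothetical allocator for NEWUNIT= style negative unit numbers.
class NewunitAllocator
{
public:
  // Hand out a unit number, preferring one that was released earlier;
  // otherwise fall back to the decrementing counter.
  int allocate()
  {
    if (!free_.empty())
      {
        int unit = free_.back();
        free_.pop_back();
        return unit;
      }
    return next_--;
  }

  // Return a unit number to the pool so a later allocate() can reuse it.
  void release(int unit) { free_.push_back(unit); }

private:
  int next_ = -10;         // assumed first value, purely illustrative
  std::vector<int> free_;  // released unit numbers available for reuse
};
```

Without the release/reuse path the counter only ever moves further from zero; with it, a unit that is closed can be handed out again by the next allocation.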


Jerry

[PATCH] Remove a few -Wno-error from Makefile.in

2016-10-13 Thread Marek Polacek

I thought I had already done this, but apparently not.  I added these because
of -Wimplicit-fallthrough, but they're no longer needed, so remove them so as
not to suppress any potentially useful warnings.

Bootstrapped/regtested on x86_64-linux, ppc64-linux and aarch64-linux,
ok for trunk?

2016-10-13  Marek Polacek  

* Makefile.in (insn-attrtab.o-warn, insn-dfatab.o-warn,
insn-latencytab.o-warn, insn-output.o-warn, insn-emit.o-warn): Don't
use -Wno-error.

diff --git gcc/Makefile.in gcc/Makefile.in
index 76b77ab..d6e44e4 100644
--- gcc/Makefile.in
+++ gcc/Makefile.in
@@ -218,11 +218,6 @@ libgcov-merge-tool.o-warn = -Wno-error
 gimple-match.o-warn = -Wno-unused
 generic-match.o-warn = -Wno-unused
 dfp.o-warn = -Wno-strict-aliasing
-insn-attrtab.o-warn = -Wno-error
-insn-dfatab.o-warn = -Wno-error
-insn-latencytab.o-warn = -Wno-error
-insn-output.o-warn = -Wno-error
-insn-emit.o-warn = -Wno-error
 
 # All warnings have to be shut off in stage1 if the compiler used then
 # isn't gcc; configure determines that.  WARN_CFLAGS will be either

Marek

Re: [PATCH] Allow `make tags` to work from top-level directory


On 10/06/2016 07:21 AM, Eric Gallager wrote:

The libdecnumber, libgcc, and libobjc subdirectories are missing TAGS
targets in their Makefiles. The attached patch causes them to be
skipped when running `make tags`.

ChangeLog entry:

2016-10-06  Eric Gallager  

* Makefile.def: Mark libdecnumber, libgcc, and libobjc as missing
TAGS target.
* Makefile.in: Regenerate.


OK.  Please install.

Thanks,
Jeff

Re: RFC: Split into smaller pieces


On 13/10/16 19:19 +0100, Jonathan Wakely wrote:

On 13/10/16 18:34 +0100, Jonathan Wakely wrote:

Code which doesn't need the whole of  should include the
relevant  header instead.

This means that we don't need to pull the whole of  (and
 and ) into  just because shared_ptr
wants to use reference_wrapper in one place.  This reduces 
from 48kloc to 30kloc!


With a few additional changes to remove  from other
headers we can get:

 old  |  new  | Header
------|-------|-------------------
47571 | 30449 | memory
49620 | 32498 | thread
49049 | 30861 | condition_variable
49459 | 31271 | shared_mutex
54215 | 37745 | future
75063 | 68509 | regex

Apart from , which is enormous even without , these
are pretty dramatic improvements.


And to show it's not just line-count that changes, here are the
-ftime-report numbers for including each header in an otherwise empty
file, compiled with -O0:

memory:
TOTAL  :   0.66  0.19  0.86   56342 kB
TOTAL  :   0.41  0.09  0.49   40487 kB

thread:
TOTAL  :   0.73  0.15  0.90   63030 kB
TOTAL  :   0.59  0.11  0.71   47508 kB

condition_variable:
TOTAL  :   0.77  0.15  0.93   63641 kB
TOTAL  :   0.50  0.11  0.61   47360 kB

shared_mutex:
TOTAL  :   0.79  0.14  0.94   63985 kB
TOTAL  :   0.50  0.10  0.61   47705 kB

future:
TOTAL  :   1.18  0.20  1.40   92564 kB
TOTAL  :   0.90  0.17  1.09   78584 kB

regex:
TOTAL  :   1.14  0.24  1.39  100089 kB
TOTAL  :   1.04  0.24  1.30   91322 kB

Re: RFC: Split into smaller pieces


Apparently this got spam-filtered and didn't make it to the lists...

On 13/10/16 18:34 +0100, Jonathan Wakely wrote:

This splits the large (2200 lines)  header into smaller
pieces, so there are separate headers for:

- std::less, std::equal_to etc. (already in their own header)
- std::__invoke (already in its own header)
- std::reference_wrapper (often used on its own, e.g. in )
- std::function (used in  and )

Everything else (std::mem_fn, std::bind, std::not_fn, searchers) stays
in , because we don't actually need them elsewhere in the
library.

Code which doesn't need the whole of  should include the
relevant  header instead.

This means that we don't need to pull the whole of  (and
 and ) into  just because shared_ptr
wants to use reference_wrapper in one place.  This reduces 
from 48kloc to 30kloc!

The patch is compressed because it's quite large, but it's mostly just
moving big blocks of code from  into new headers.

Any objections?

* include/Makefile.am: Add <bits/refwrap.h> and <bits/std_function.h>.
Order alphabetically.
* include/Makefile.in: Regenerate.
* include/bits/refwrap.h: New header.
(_Maybe_get_result_type, _Weak_result_type_impl, _Weak_result_type)
(_Reference_wrapper_base_impl, _Reference_wrapper_base)
(reference_wrapper, ref, cref): Move here from <functional>.
* include/bits/shared_ptr_base.h: Include  and
 instead of <functional>.
* include/bits/std_function.h: New header.
(_Maybe_unary_or_binary_function, bad_function_call)
(__is_location_invariant, _Nocopy_types, _Any_data)
(_Simple_type_wrapper, _Function_base, _Function_handler, function):
Move here from <functional>.
* include/bits/unique_ptr.h: Include .
* include/std/functional: Include new headers and move components to
them.
* include/std/future: Include .
* include/std/memory: Don't include <functional>.
* testsuite/20_util/default_delete/48631_neg.cc: Adjust dg-error line.
* testsuite/20_util/default_delete/void_neg.cc: Likewise.
* testsuite/20_util/shared_ptr/thread/default_weaktoshared.cc:
Include .
* testsuite/20_util/shared_ptr/thread/mutex_weaktoshared.cc: Likewise.
* testsuite/20_util/specialized_algorithms/memory_management_tools/
1.cc: Include .
* testsuite/20_util/unique_ptr/assign/48635_neg.cc: Adjust dg-error
lines.
* testsuite/20_util/unique_ptr/assign/cv_qual.cc: Likewise.
* testsuite/20_util/unique_ptr/cons/cv_qual.cc: Likewise.
* testsuite/20_util/unique_ptr/modifiers/cv_qual.cc: Likewise.
* testsuite/30_threads/thread/native_handle/cancel.cc: Include
.

Re: [PATCH] Test cases for PR77937

2016-10-13 Thread Rainer Orth

Hi Bill,

> Here are torture test cases for
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77937.  Markus Trippelsdorf
> kindly provided the source for the tests and verified the correct
> dejagnu options on x86_64-pc-linux-gnu.  Committed.
>
> Thanks,
> Bill
>
>
> 2016-10-13  Bill Schmidt  
>
>   PR tree-optimization/77937
>   * gcc.dg/torture/pr77937-1.c: New.
>   * gcc.dg/torture/pr77937-2.c: New.
>
>
> Index: gcc/testsuite/gcc.dg/torture/pr77937-1.c
> ===
> --- gcc/testsuite/gcc.dg/torture/pr77937-1.c  (revision 0)
> +++ gcc/testsuite/gcc.dg/torture/pr77937-1.c  (working copy)
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-do options "-O3 -march=amdfam10" { target { x86_64-*-* } } } */

this can't be right: you always need target { i?86-*-* x86_64-*-* } and,
if need be, restrict it to 64-bit only with lp64.  This makes sure
the test is run correctly for multilib x86 configurations
(e.g. i686-pc-linux-gnu with -m64).  Same in the other test.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University
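
A minimal sketch of the directive form Rainer describes above. The target list is his; splitting the options into dg-options plus dg-additional-options is one possible spelling and an assumption here, not necessarily what was eventually committed:

```c
/* { dg-do compile } */
/* { dg-options "-O3" } */
/* Pass the arch flag on both triplets, so that a multilib run such as
   i686-pc-linux-gnu with -m64 gets it too.  */
/* { dg-additional-options "-march=amdfam10" { target i?86-*-* x86_64-*-* } } */
```

If a test really had to be 64-bit only, the selector on dg-do would combine the same triplet list with the lp64 effective target, along the lines of target { { i?86-*-* x86_64-*-* } && lp64 }.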

[PATCH] Test cases for PR77937

2016-10-13 Thread Bill Schmidt

Here are torture test cases for
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77937.  Markus Trippelsdorf
kindly provided the source for the tests and verified the correct
dejagnu options on x86_64-pc-linux-gnu.  Committed.

Thanks,
Bill


2016-10-13  Bill Schmidt  

PR tree-optimization/77937
* gcc.dg/torture/pr77937-1.c: New.
* gcc.dg/torture/pr77937-2.c: New.


Index: gcc/testsuite/gcc.dg/torture/pr77937-1.c
===
--- gcc/testsuite/gcc.dg/torture/pr77937-1.c(revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr77937-1.c(working copy)
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-do options "-O3 -march=amdfam10" { target { x86_64-*-* } } } */
+
+int *a;
+int b, c, d;
+void fn1(char *p1, int p2) {
+  int x;
+  while (1) {
+x = 0;
+for (; x < 8; x++)
+  p1[0] = -a[0] * d + p1[0] * c + 1 >> b >> 1;
+p1 += p2;
+  }
+}
Index: gcc/testsuite/gcc.dg/torture/pr77937-2.c
===
--- gcc/testsuite/gcc.dg/torture/pr77937-2.c(revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr77937-2.c(working copy)
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-do options "-O3 -march=amdfam10" { target { x86_64-*-* } } } */
+
+extern int fn2(int);
+extern int fn3(int);
+int a, b, c;
+void fn1(long p1) {
+  char *d;
+  for (;; d += p1) {
+d[0] = fn2(1 >> a);
+fn3(0);
+fn3(c >> a);
+d[1] = fn3(d[1] * b + c >> a);
+d[4] = fn3(d[4] * b + c >> a);
+d[5] = fn3(d[5] * b + c >> a);
+  }
+}

Re: RFC: Split <functional> into smaller pieces


On 13/10/16 18:34 +0100, Jonathan Wakely wrote:

Code which doesn't need the whole of <functional> should include the
relevant  header instead.

This means that we don't need to pull the whole of <functional> (and
 and ) into <memory> just because shared_ptr
wants to use reference_wrapper in one place.  This reduces <memory>
from 48kloc to 30kloc!


With a few additional changes to remove <functional> from other
headers we can get:

old  |  new  | Header
--|---|---
47571 | 30449 | memory
49620 | 32498 | thread
49049 | 30861 | condition_variable
49459 | 31271 | shared_mutex
54215 | 37745 | future
75063 | 68509 | regex


Apart from <regex>, which is enormous even without <functional>, these
are pretty dramatic improvements.

[v3] Remove 'test' variables from a few more testsuite dirs

2016-10-13 Thread Paolo Carlini


Hi,

nothing especially interesting here... Tested x86_64-linux.

Thanks, Paolo.

//

2016-10-13  Paolo Carlini  

* testsuite/24_iterators/container_access.cc: Remove 'test' variables.
* testsuite/24_iterators/istream_iterator/2.cc: Likewise.
* testsuite/24_iterators/istreambuf_iterator/2.cc: Likewise.
* testsuite/24_iterators/istreambuf_iterator/2627.cc: Likewise.
* testsuite/24_iterators/operations/next.cc: Likewise.
* testsuite/24_iterators/operations/prev.cc: Likewise.
* testsuite/24_iterators/ostreambuf_iterator/2.cc: Likewise.
* testsuite/24_iterators/random_access_iterator/26020.cc: Likewise.
* testsuite/24_iterators/range_access_cpp14.cc: Likewise.
* testsuite/24_iterators/reverse_iterator/11729.cc: Likewise.
* testsuite/24_iterators/reverse_iterator/3.cc: Likewise.
* testsuite/25_algorithms/adjacent_find/vectorbool.cc: Likewise.
* testsuite/25_algorithms/all_of/1.cc: Likewise.
* testsuite/25_algorithms/any_of/1.cc: Likewise.
* testsuite/25_algorithms/binary_search/2.cc: Likewise.
* testsuite/25_algorithms/binary_search/partitioned.cc: Likewise.
* testsuite/25_algorithms/clamp/1.cc: Likewise.
* testsuite/25_algorithms/clamp/2.cc: Likewise.
* testsuite/25_algorithms/copy/1.cc: Likewise.
* testsuite/25_algorithms/copy/2.cc: Likewise.
* testsuite/25_algorithms/copy/3.cc: Likewise.
* testsuite/25_algorithms/copy/34595.cc: Likewise.
* testsuite/25_algorithms/copy/4.cc: Likewise.
* testsuite/25_algorithms/copy/deque_iterators/1.cc: Likewise.
* testsuite/25_algorithms/copy/move_iterators/1.cc: Likewise.
* testsuite/25_algorithms/copy/streambuf_iterators/char/1.cc: Likewise.
* testsuite/25_algorithms/copy/streambuf_iterators/char/2.cc: Likewise.
* testsuite/25_algorithms/copy/streambuf_iterators/char/3.cc: Likewise.
* testsuite/25_algorithms/copy/streambuf_iterators/char/4.cc: Likewise.
* testsuite/25_algorithms/copy/streambuf_iterators/wchar_t/1.cc:
Likewise.
* testsuite/25_algorithms/copy/streambuf_iterators/wchar_t/2.cc:
Likewise.
* testsuite/25_algorithms/copy/streambuf_iterators/wchar_t/3.cc:
Likewise.
* testsuite/25_algorithms/copy/streambuf_iterators/wchar_t/4.cc:
Likewise.
* testsuite/25_algorithms/copy_backward/deque_iterators/1.cc: Likewise.
* testsuite/25_algorithms/copy_backward/move_iterators/1.cc: Likewise.
* testsuite/25_algorithms/copy_n/1.cc: Likewise.
* testsuite/25_algorithms/copy_n/2.cc: Likewise.
* testsuite/25_algorithms/copy_n/3.cc: Likewise.
* testsuite/25_algorithms/copy_n/4.cc: Likewise.
* testsuite/25_algorithms/copy_n/50119.cc: Likewise.
* testsuite/25_algorithms/copy_n/move_iterators/1.cc: Likewise.
* testsuite/25_algorithms/equal_range/2.cc: Likewise.
* testsuite/25_algorithms/equal_range/partitioned.cc: Likewise.
* testsuite/25_algorithms/fill/1.cc: Likewise.
* testsuite/25_algorithms/fill/2.cc: Likewise.
* testsuite/25_algorithms/fill/3.cc: Likewise.
* testsuite/25_algorithms/fill/4.cc: Likewise.
* testsuite/25_algorithms/fill_n/1.cc: Likewise.
* testsuite/25_algorithms/find/39546.cc: Likewise.
* testsuite/25_algorithms/find/istreambuf_iterators/char/1.cc: Likewise.
* testsuite/25_algorithms/find/istreambuf_iterators/char/2.cc: Likewise.
* testsuite/25_algorithms/find/istreambuf_iterators/wchar_t/1.cc:
Likewise.
* testsuite/25_algorithms/find/istreambuf_iterators/wchar_t/2.cc:
Likewise.
* testsuite/25_algorithms/find_if/1.cc: Likewise.
* testsuite/25_algorithms/find_if_not/1.cc: Likewise.
* testsuite/25_algorithms/for_each/1.cc: Likewise.
* testsuite/25_algorithms/heap/1.cc: Likewise.
* testsuite/25_algorithms/heap/moveable.cc: Likewise.
* testsuite/25_algorithms/heap/moveable2.cc: Likewise.
* testsuite/25_algorithms/heap/vectorbool.cc: Likewise.
* testsuite/25_algorithms/includes/1.cc: Likewise.
* testsuite/25_algorithms/inplace_merge/1.cc: Likewise.
* testsuite/25_algorithms/inplace_merge/49559.cc: Likewise.
* testsuite/25_algorithms/inplace_merge/moveable.cc: Likewise.
* testsuite/25_algorithms/inplace_merge/moveable2.cc: Likewise.
* testsuite/25_algorithms/is_heap/1.cc: Likewise.
* testsuite/25_algorithms/is_heap_until/1.cc: Likewise.
* testsuite/25_algorithms/is_partitioned/1.cc: Likewise.
* testsuite/25_algorithms/is_permutation/1.cc: Likewise.
* testsuite/25_algorithms/is_permutation/2.cc: Likewise.
* testsuite/25_algorithms/is_permutation/vectorbool.cc: Likewise.
* testsuite/25_algorithms/is_sorted/1.cc:

Re: [PATCH, libfortran] PR 48587 Newunit allocator

2016-10-13 Thread Jerry DeLisle


On 10/13/2016 08:16 AM, Janne Blomqvist wrote:

Currently GFortran never reuses unit numbers allocated with NEWUNIT=,
instead having a simple counter that is decremented each time such a
unit is opened.  For a long running program which repeatedly opens
files with NEWUNIT= and closes them, the counter can wrap around and
cause an abort.  This patch replaces the counter with an allocator
that keeps track of which unit numbers are allocated, and can reuse
them once they have been deallocated.  Since operating systems tend to
limit the number of simultaneous open files for a process to a
relatively modest number, a relatively simple approach with a linear
scan through an array suffices.  Though as a small optimization there
is a low water indicator keeping track of the index for which all unit
numbers below are already allocated.  This linear scan also ensures
that we always allocate the smallest available unit number.

2016-10-13  Janne Blomqvist  

PR libfortran/48587
* io/io.h (get_unique_unit_number): Remove prototype.
(newunit_alloc): New prototype.
* io/open.c (st_open): Call newunit_alloc.
* io/unit.c (newunits,newunit_size,newunit_lwi): New static
variables.
(GFC_FIRST_NEWUNIT): Rename to NEWUNIT_START.
(next_available_newunit): Remove variable.
(get_unit): Call newunit_alloc.
(close_unit_1): Call newunit_free.
(close_units): Free newunits array.
(get_unique_number): Remove function.
(newunit_alloc): New function.
(newunit_free): New function.

Regtested on x86_64-pc-linux-gnu. Ok for trunk?



Yes, OK, clever! Thanks!

Jerry
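
The allocator Janne describes above can be sketched roughly as follows. This is not the code from the patch: the names newunits, newunit_size, newunit_lwi, NEWUNIT_START, newunit_alloc and newunit_free come from the ChangeLog, but the function bodies, the initial array size of 16 and the value -10 for NEWUNIT_START are assumptions made for illustration, and locking is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define NEWUNIT_START (-10)   /* assumed value of the first NEWUNIT= number */

static bool *newunits;        /* newunits[i] is true while unit NEWUNIT_START - i is in use */
static int newunit_size;      /* number of slots in newunits */
static int newunit_lwi;       /* low water indicator: every slot below this index is in use */

/* Return the free unit number with the lowest index, growing the array if needed. */
int newunit_alloc(void)
{
    if (newunits == NULL) {
        newunit_size = 16;
        newunits = calloc(newunit_size, sizeof *newunits);
        if (newunits == NULL)
            abort();
    }

    /* Linear scan, starting at the low water indicator. */
    for (int i = newunit_lwi; i < newunit_size; i++) {
        if (!newunits[i]) {
            newunits[i] = true;
            newunit_lwi = i + 1;
            return NEWUNIT_START - i;
        }
    }

    /* Every slot is taken: double the array and hand out the first new slot. */
    int old_size = newunit_size;
    bool *grown = realloc(newunits, 2 * (size_t)old_size * sizeof *grown);
    if (grown == NULL)
        abort();
    memset(grown + old_size, 0, (size_t)old_size * sizeof *grown);
    newunits = grown;
    newunit_size = 2 * old_size;
    newunits[old_size] = true;
    newunit_lwi = old_size + 1;
    return NEWUNIT_START - old_size;
}

/* Mark a unit number as free so that a later newunit_alloc can reuse it. */
void newunit_free(int unit)
{
    int ind = NEWUNIT_START - unit;
    assert(ind >= 0 && ind < newunit_size);
    newunits[ind] = false;
    if (ind < newunit_lwi)
        newunit_lwi = ind;
}
```

In this sketch, freeing a unit and allocating again hands back the freed number instead of a fresh one, which is the reuse the patch is after.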

Re: PR35503 - warn for restrict pointer

2016-10-13 Thread Prathamesh Kulkarni

On 7 October 2016 at 10:33, Prathamesh Kulkarni
 wrote:
> On 22 September 2016 at 23:15, Joseph Myers  wrote:
>> On Thu, 22 Sep 2016, Prathamesh Kulkarni wrote:
>>
>>> Would that be acceptable ? I am not sure how to make %Z check if the
>>> argument has type vec<int> *
>>> since vec<int> is not really a builtin C type.
>>> Could you suggest me a better solution so that the format checker will check
>>> if arg has type vec<int> * instead of checking if it's just a pointer ?
>>> Also for testing, should I create a testcase in g++.dg since
>>> gcc.dg/format/ tests are C-only ?
>>
>> If it's C++-only then it would need to be in g++.dg.
>>
>> The way we handle GCC-specific types in checking these formats is that the
>> code using these formats has to define typedefs which the format-checking
>> code then looks up.  In most cases it can just look up names like
>> location_t or tree, but for HOST_WIDE_INT it looks up
>> __gcc_host_wide_int__ which the user must have defined as a typedef.
>> Probably that's the way to go in this case: the user must do "typedef
>> vec<int> __gcc_vec_int__;" or similar, and the code looks up
>> __gcc_vec_int__.
> Thanks for the suggestions. To keep it simple, instead of vec<int>,
> I made %Z take two args: int *v, unsigned len, and prints elements in
> v having length == len.
> Is that OK ?
>
> Bootstrapped+tested on x86_64-unknown-linux-gnu.
> As pointed out earlier in the thread, the patch can give false positives
> because it only checks whether parameters are qualified with restrict,
> not how parameters are used inside the function. For instance it warned
> for example 10 mentioned in n1570 under section 6.7.3.1 - "Formal
> definition of restrict".
> Should we keep the warning in Wall or keep it in Wextra ?
> The attached patch enables it with Wall.
Ping for c, c-family changes:
https://gcc.gnu.org/ml/gcc-patches/2016-10/msg00446.html

Thanks,
Prathamesh
>
> Thanks,
> Prathamesh
>>
>> --
>> Joseph S. Myers
>> jos...@codesourcery.com
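
To make the false-positive concern above concrete, here is a small sketch in the style of the standard's restrict examples. The function names are made up for illustration, and nothing here claims what the patch under review actually prints. The same array is passed for two restrict-qualified parameters, which is well defined because neither parameter is used to modify it, yet a check that looks only at the qualifiers cannot tell this apart from a real aliasing bug.

```c
#include <stddef.h>

/* Hypothetical helper: p is written, q and r are only read. */
void add_arrays(size_t n, int *restrict p,
                const int *restrict q, const int *restrict r)
{
    for (size_t i = 0; i < n; i++)
        p[i] = q[i] + r[i];
}

/* b is passed twice, but it is never modified inside add_arrays,
   so the call has defined behavior as long as a and b are disjoint. */
void sum_with_self(size_t n, int *a, int *b)
{
    add_arrays(n, a, b, b);
}
```

A call such as add_arrays(n, a, a, b), by contrast, writes through p while q refers to the same elements, and that is the kind of call the warning is meant to catch.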

Re: [PATCH v4] PR48344: Fix unrecognizable insn error with -fstack-limit-register=r2

2016-10-13 Thread Andreas Schwab

I've committed this to fix the ICE.

Andreas.

* config/m68k/m68k.c (m68k_option_override): Check
opt_fstack_limit_symbol_arg and opt_fstack_limit_register_no
instead of stack_limit_rtx.

* gcc.target/m68k/stack-limit-1.c: Expect warning on line 0.

diff --git a/gcc/config/m68k/m68k.c b/gcc/config/m68k/m68k.c
index e6bcfa0caf..a883e42514 100644
--- a/gcc/config/m68k/m68k.c
+++ b/gcc/config/m68k/m68k.c
@@ -638,10 +638,12 @@ m68k_option_override (void)
 }
 #endif
 
-  if (stack_limit_rtx != NULL_RTX && !TARGET_68020)
+  if ((opt_fstack_limit_symbol_arg != NULL || opt_fstack_limit_register_no >= 0)
+  && !TARGET_68020)
 {
   warning (0, "-fstack-limit- options are not supported on this cpu");
-  stack_limit_rtx = NULL_RTX;
+  opt_fstack_limit_symbol_arg = NULL;
+  opt_fstack_limit_register_no = -1;
 }
 
   SUBTARGET_OVERRIDE_OPTIONS;
diff --git a/gcc/testsuite/gcc.target/m68k/stack-limit-1.c b/gcc/testsuite/gcc.target/m68k/stack-limit-1.c
index b1e9b99b26..5086edd77f 100644
--- a/gcc/testsuite/gcc.target/m68k/stack-limit-1.c
+++ b/gcc/testsuite/gcc.target/m68k/stack-limit-1.c
@@ -1,6 +1,6 @@
 /* -fstack-limit- should be ignored without an ICE if not supported.  */
 /* { dg-do compile } */
 /* { dg-options "-fstack-limit-symbol=_stack_limit -m68000" } */
-/* { dg-warning "not supported" "" { target *-*-* } 1 } */
+/* { dg-warning "not supported" "" { target *-*-* } 0 } */
 
 void dummy (void) { }
-- 
2.10.1


-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

Re: [PATCH] PR68212, Correct frequencies/counts when unrolling


On 09/20/2016 03:27 PM, Pat Haugen wrote:

The following patch corrects frequency/count values computed when generating
the switch blocks/peeled loop copies before the loop. There were two main
problem areas. First, for the peeled copies, duplicate_loop_to_header_edge
was not using the preheader frequency as part of the scale factor when
peeling a copy of the loop to the preheader edge. Second, the switch block
generation lacked any code to compute correct freq/count values. Verified by
comparing freq/count values in the unroller dump before/after.

Bootstrap/regtest on powerpc64le with no new regressions. Ok for trunk?

-Pat



2016-09-20  Pat Haugen  

PR rtl-optimization/68212
* cfgloopmanip.c (duplicate_loop_to_header_edge): Use preheader edge
frequency when computing scale factor for peeled copies.
* loop-unroll.c (unroll_loop_runtime_iterations): Fix freq/count
values for switch/peel blocks/edges.



OK.  Thanks for your patience.

Jeff

Re: [PATCH] Don't peel extra copy of loop in unroller for loops with exit at end


On 09/22/2016 01:10 PM, Pat Haugen wrote:

I noticed the loop unroller peels an extra copy of the loop before it enters 
the switch block code to round the iteration count to a multiple of the unroll 
factor. This peeled copy is only needed for the case where the exit test is at 
the beginning of the loop since in that case it inserts the test for zero peel 
iterations before that peeled copy.

This patch bumps the iteration count by 1 for loops with the exit at the end so 
that it represents the number of times the loop body is executed, and therefore 
removes the need to always execute that first peeled copy. With this change, 
when the number of executions of the loop is an even multiple of the unroll 
factor then the code will jump to the unrolled loop immediately instead of 
executing all the switch code and peeled copies of the loop and then falling 
into the unrolled loop. This change also reduces code size by removing a peeled 
copy of the loop.

Bootstrap/regtest on powerpc64le with no new regressions. Ok for trunk?



2016-09-22  Pat Haugen  

* loop-unroll.c (unroll_loop_runtime_iterations): Condition initial
loop peel to loops with exit test at the beginning.



OK.
jeff
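
As a source-level analogy only (the unroller works on RTL, and this is not what the patch emits), runtime unrolling by a hypothetical factor of 4 can be pictured as below. Here n is the number of times the loop body executes, the switch runs the n % 4 leftover iterations, and when n is a multiple of 4 control goes straight to the unrolled loop, which is the behavior Pat describes above. The name sum_unrolled is made up.

```c
/* Sum n ints with the loop body unrolled 4 times. */
long sum_unrolled(const int *v, unsigned long n)
{
    long sum = 0;
    unsigned long i = 0;

    /* Peeled copies: execute the n % 4 leftover iterations first. */
    switch (n % 4) {
    case 3: sum += v[i++]; /* fall through */
    case 2: sum += v[i++]; /* fall through */
    case 1: sum += v[i++]; /* fall through */
    case 0: break;         /* multiple of 4: nothing to peel */
    }

    /* Unrolled loop: the remaining iteration count is a multiple of 4. */
    for (; i < n; i += 4) {
        sum += v[i];
        sum += v[i + 1];
        sum += v[i + 2];
        sum += v[i + 3];
    }
    return sum;
}
```

With n equal to 8 the switch falls into case 0 and no peeled copy runs; with n equal to 10 two peeled iterations run before the unrolled loop takes over.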

Re: [C++ PATCH] RFC: implement P0386R2 - C++17 inline variables

2016-10-13 Thread Jason Merrill

On Tue, Oct 11, 2016 at 9:39 AM, Jakub Jelinek  wrote:
> Here is an attempt to implement C++17 inline variables.
> Bootstrapped/regtested on x86_64-linux and i686-linux.
>
> The main question is if the inline variables, which are vague linkage,
> should be !DECL_EXTERNAL or DECL_EXTERNAL DECL_NOT_REALLY_EXTERN while
> in the FE.  In the patch, they are !DECL_EXTERNAL, except for inline
> static data members in instantiated templates.  As the inline-var3.C
> testcase shows, even if they are !DECL_EXTERNAL (except for the instantiated
> static data members), even at -O0 we do not actually emit them unless
> odr-used, so to some extent I don't see the point in forcing them to
> be DECL_EXTERNAL DECL_NOT_REALLY_EXTERN.

Yeah, I ended up agreeing with you.  There's no need to work hard to
make them work with an obsolete system.  So I've checked in the patch
with a few minor tweaks, attached.

> Another thing is I've noticed (with Jonathan's help to look it up) that
> we aren't implementing DR1511, I'm willing to try to implement that, but
> as it will need to touch the same spots as the patch, I think it should be
> resolved incrementally.

Sounds good.

> Yet another thing are thread_local inline vars.  E.g. on:
> struct S { S (); ~S (); };
> struct T { ~T (); };
> int foo ();
> thread_local inline S s;
> inline thread_local T t;
> inline thread_local int u = foo ();
> it seems both clang++ (~2 months old) and g++ with the patch emits:
> 8 byte TLS _ZGV1[stu] variables
> 1 byte TLS __tls_guard variable
> and _ZTH1[stu] being aliases to __tls_init, which does essentially:
>   if (__tls_guard) return;
>   __tls_guard = 1;
>   if (*(char *)&_ZGV1s == 0) {
>     *(char *)&_ZGV1s = 1;
>     S::S ();
>     __cxa_thread_atexit (S::~S, , &__dso_handle);
>   }
>   if (*(char *)&_ZGV1t == 0) {
>     *(char *)&_ZGV1t = 1;
>     __cxa_thread_atexit (T::~T, , &__dso_handle);
>   }
>   if (*(char *)&_ZGV1u == 0) {
>     *(char *)&_ZGV1u = 1;
>     u = foo ();
>   }
> Is that what we want to emit?  At first I doubted this could work properly,
> now thinking about it more, perhaps it can.

I think so; I don't see a reason for inline vars to work differently
from templates here.

> And, do we really want all the
> _ZGV* vars for the TLS inline vars (and other TLS comdats) to be 8 byte,
> even when we are using just a single byte?  Or is it too late to change (ABI
> break)?

Right, the ABI specifies that the guard variable is 8 bytes.  A
comment says, "The intent of specifying an 8-byte structure for the
guard variable, but only describing one byte of its contents, is to
allow flexibility in the implementation of the API above. On systems
with good small lock support, the second word might be used for a
mutex lock. On others, it might identify (as a pointer or index) a
more complex lock structure to use."  This seems unnecessary for
systems with byte atomic instructions, and we might be able to get
away with changing it without breaking any actual usage, but it would
indeed be an ABI change.

> And, as mentioned in the DWARF mailing list, I think we should emit
> DW_AT_inline on the inline vars (both explicit and implicit - static
> constexpr data members in C++17 mode).  I hope that can be done as a
> follow-up.

Makes sense.
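
A rough C rendering of the guarded initialization discussed above, under several assumptions: guard_t, guard_u, get_u and this foo are illustrative stand-ins rather than the symbols either compiler emits, and no locking is shown. It only pictures the point about size, namely that the guard is 8 bytes wide while the check reads and writes just its first byte.

```c
#include <stdint.h>

/* An 8-byte guard of which only the first byte serves as the "initialized" flag. */
typedef union {
    uint64_t whole;
    unsigned char first_byte;
} guard_t;

static int foo_calls;                  /* counts how often the initializer runs */
static int foo(void) { return 10 * ++foo_calls; }

static _Thread_local guard_t guard_u;  /* stands in for the variable's guard */
static _Thread_local int u;            /* stands in for the thread_local inline variable */

/* Run the initializer on first use in each thread, then return the variable. */
int *get_u(void)
{
    if (guard_u.first_byte == 0) {
        guard_u.first_byte = 1;
        u = foo();
    }
    return &u;
}
```

The other seven bytes of the guard stay untouched here, which is the space the quoted ABI comment reserves for a lock on systems that want one.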
diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c
index 94af585..06b5aa3 100644
--- a/gcc/c-family/c-cppbuiltin.c
+++ b/gcc/c-family/c-cppbuiltin.c
@@ -935,6 +935,7 @@ c_cpp_builtins (cpp_reader *pfile)
  cpp_define (pfile, "__cpp_constexpr=201603");
  cpp_define (pfile, "__cpp_if_constexpr=201606");
  cpp_define (pfile, "__cpp_capture_star_this=201603");
+ cpp_define (pfile, "__cpp_inline_variables=201606");
}
   if (flag_concepts)
/* Use a value smaller than the 201507 specified in
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 7670162..f761d0d 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -2917,9 +2917,11 @@ redeclaration_error_message (tree newdecl, tree olddecl)
  && !DECL_INITIAL (newdecl))
{
  DECL_EXTERNAL (newdecl) = 1;
- if (warning_at (DECL_SOURCE_LOCATION (newdecl), OPT_Wdeprecated,
- "redundant redeclaration of %<constexpr%> static "
- "data member %qD", newdecl))
+ /* For now, only warn with explicit -Wdeprecated.  */
+ if (global_options_set.x_warn_deprecated
+ && warning_at (DECL_SOURCE_LOCATION (newdecl), OPT_Wdeprecated,
+"redundant redeclaration of %<constexpr%> static "
+"data member %qD", newdecl))
inform (DECL_SOURCE_LOCATION (olddecl),
"previous declaration of %qD", olddecl);
  return NULL;
@@ -5479,18 +5481,19 @@ maybe_commonize_var (tree decl)
 be merged.  */
  TREE_PUBLIC (decl) = 0;
  DECL_COMMON (decl) = 0;
+ const char *msg;

Re: [PATCH] Allow `make tags` to work from top-level directory

2016-10-13 Thread Eric Gallager

On 10/13/16, Jeff Law  wrote:
> On 10/06/2016 07:21 AM, Eric Gallager wrote:
>> The libdecnumber, libgcc, and libobjc subdirectories are missing TAGS
>> targets in their Makefiles. The attached patch causes them to be
>> skipped when running `make tags`.
>>
>> ChangeLog entry:
>>
>> 2016-10-06  Eric Gallager  
>>
>>  * Makefile.def: Mark libdecnumber, libgcc, and libobjc as missing
>>  TAGS target.
>>  * Makefile.in: Regenerate.
>>
> OK.  Please install.
>
> Thanks,
> Jeff
>


I'm still waiting to hear back from  about my request
for copyright assignment, which I'll need to get sorted out before I
can start committing stuff (like this patch).

Thanks,
Eric

Re: [PATCH] Remove x86 pcommit instruction

2016-10-13 Thread H.J. Lu

On Thu, Oct 13, 2016 at 5:09 AM, Andrew Senkevich
 wrote:
> 2016-10-11 20:09 GMT+03:00 H.J. Lu :
>> On Tue, Oct 11, 2016 at 10:04 AM, Andrew Senkevich
>>  wrote:
>>> 2016-10-06 1:07 GMT+03:00 H.J. Lu :
 On Wed, Oct 5, 2016 at 1:42 PM, Andrew Senkevich
  wrote:
> 2016-10-05 18:06 GMT+03:00 Uros Bizjak :
>> On Wed, Oct 5, 2016 at 3:47 PM, Andrew Senkevich
>>  wrote:
 -mpcommit
 -Target Report Mask(ISA_PCOMMIT) Var(ix86_isa_flags) Save
 -Support PCOMMIT instruction.
 -

 You should not simply delete a option that was in the released
 compiler, but a warning should be emitted instead. Please see how
 msse5 is handled in i386.opt.
>>>
>>> Thank you, it is fixed in patch below. Ok for trunk?
>>
>> OK.
>>
>>> Is it subject for backport for 5.* and 6.* releases?
>>
>> Yes, but please wait a couple of days if any problem arises in trunk.
>>
>> (Please also provide an entry for Release Changes, since this is
>> user-facing change. Also for release branches.)
>
> Hi HJ,
>
> could you please commit this patch for trunk since I have no commit 
> rights.
> Attached in format for git am.
>
>

 Done.
>>>
>>> Thanks, HJ!
>>>
>>> Should I ask you or somebody else for backports for to 5.* and 6.* or
>>> may be I can somehow get commit after approval rights to don't disturb
>>> others with commits? I am preparing several patches.
>>>
>>
>> Please provide patches for GCC 5 and 6.
>
> Attached.

I checked them into GCC 5 and GCC 6 branches.

> Have you possibility to update according changes.html files?
>

Here is the patch for GCC 7.  I am not sure what to do with GCC
5 and 6.

-- 
H.J.
---
Index: gcc-7/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-7/changes.html,v
retrieving revision 1.18
diff -u -p -r1.18 changes.html
--- gcc-7/changes.html 12 Oct 2016 11:08:25 - 1.18
+++ gcc-7/changes.html 13 Oct 2016 21:37:18 -
@@ -318,7 +318,14 @@ const int* get_address (unsigned idx)

 

-
+IA-32/x86-64
+   
+ 
+   Support for
+   https://software.intel.com/en-us/blogs/2016/09/12/deprecate-pcommit-instruction;>deprecated
+   pcommit instruction has been removed.
+ 
+

Re: PING [PATCH] accept flexible arrays in struct in unions (c++/71912 - [6/7 regression])

2016-10-13 Thread Martin Sebor


On 10/12/2016 07:43 AM, Jason Merrill wrote:

On Tue, Oct 11, 2016 at 9:45 PM, Martin Sebor  wrote:

Are there any other changes you want me to make to the patch?
I leave this weekend for the WG14 meeting and would like to
get this change finalized and hopefully committed before then.

  https://gcc.gnu.org/ml/gcc-patches/2016-10/msg00410.html


OK, thanks.  Sorry I overlooked your earlier mail.


I forgot to also request approval to backport it to the 6 branch.
Are you fine with that as well?

Thanks
Martin

Re: Questionable code in gcov-io.c

2016-10-13 Thread Andrew Pinski

On Wed, Oct 12, 2016 at 11:24 AM, Nathan Sidwell  wrote:
> On 10/12/16 11:04, Andreas Schwab wrote:
>
>> Do we still need to call fstat?  I don't think it can ever fail here.
>
>
> Update removing the fstat.  Survived a profiled bootstrap, so I'll commit
> tomorrow, unless there are further comments.  Thanks for spotting this!


This breaks the build for aarch64-elf:

In file included from
/home/jenkins/workspace/BuildToolchainAARCH64_thunder_elf_upstream/toolchain/scripts/../src/libgcc/libgcov-driver.c:53:0:
/home/jenkins/workspace/BuildToolchainAARCH64_thunder_elf_upstream/toolchain/scripts/../src/libgcc/../gcc/gcov-io.c:
In function ‘__gcov_open’:
/home/jenkins/workspace/BuildToolchainAARCH64_thunder_elf_upstream/toolchain/scripts/../src/libgcc/../gcc/gcov-io.c:184:10:
error: assignment of read-only variable ‘mode’
 mode = 1;
  ^

Thanks,
Andrew


>
> nathan
>
>

Re: [PATCH, libfortran] PR 48587 Newunit allocator

2016-10-13 Thread Dominique d'Humières

With the patch, the following code

integer :: i, j
i = -10
write(unit=i,fmt=*, iostat=j) 10
print *, j
end

fails at run time with

Assertion failed: (ind >= 0 && ind < newunit_size), function newunit_free, file ../../../work/libgfortran/io/unit.c, line 966.

Without the patch the output is 5002.

TIA

Dominique

Re: Set nonnull attribute to ptr_info_def based on VRP

2016-10-13 Thread kugan


Hi Richard,

On 13/10/16 20:44, Richard Biener wrote:

On Thu, Oct 13, 2016 at 6:49 AM, kugan
 wrote:

Hi Richard,



what does this try to do?  Preserve info VRP computed across PTA?

I think we didn't yet sort out the nonlocal/escaped vs. null handling
properly
(or how PTA should handle get_ptr_nonnull).  The way you are using it
asks for pt.null to be orthogonal to nonlocal/escaped and thus having
nonlocal or escaped would also require setting ptr.null in PTA.  It then
would be also more canonical to set it for pt.anything as well.  Which
means conservatively handling it would be equivalent to flipping its
semantic and changing its name to pt.nonnull.

That said, you seem to be simply "reserving" the bit for VRP, keeping it
conservatively true when "not computed".  I guess I'm fine with this for
now
but it should be documented in the header file that way.



Thanks for the comments.

To summarize, currently I am not relying on PTA analysis at all. Just saving
null from VRP (or rather nonnull) and preserving it across PTA. Primary
intention is to pass it for PARM_DECL SSA names (from ipa-vrp).

In this case, using  pt.anything/nonlocal/escaped will only make the result
more pessimistic.

Ideally, we should improve pt.null within PTA but for now as you said, I
will document it.

When we start using pt.null from PTA analysis, we would also have to take
into account pt.anything/nonlocal/escaped.

Does that make sense?


Yes.



Here is the revised patch based on the review. I also had to adjust two
testcases since we set pt.null conservatively and dump that too.


Thanks,
Kugan

gcc/ChangeLog:

2016-10-14  Kugan Vivekanandarajah  

* tree-ssa-alias.h (pt_solution_singleton_or_null_p): Renamed from
pt_solution_singleton_p.
* tree-ssa-ccp.c (fold_builtin_alloca_with_align): Use renamed
pt_solution_singleton_or_null_p from pt_solution_singleton_p.
* tree-ssa-structalias.c (find_what_var_points_to): Conservatively set
pt.null to 1.
(find_what_p_points_to): Preserve pointer nonnull computed by VRP.
(pt_solution_singleton_or_null_p): Renamed from
pt_solution_singleton_p.
* tree-ssanames.h (set_ptr_nonnull): Declare.
(get_ptr_nonnull): Likewise.
* tree-ssanames.c (set_ptr_nonnull): New.
(get_ptr_nonnull): Likewise.
* tree-vrp.c (vrp_finalize): Set ptr that are nonnull.
(evrp_dom_walker::before_dom_children): Likewise.


gcc/testsuite/ChangeLog:

2016-10-14  Kugan Vivekanandarajah  

* gcc.dg/torture/pr39074-2.c: Adjust testcase.
* gcc.dg/torture/pr39074.c: Likewise.
>From 0bc1c2efb600854148889bfdd9d121a3edc2841b Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah 
Date: Wed, 12 Oct 2016 13:48:07 +1100
Subject: [PATCH 1/3] Add-nonnull-to-pointer-from-VRP

---
 gcc/testsuite/gcc.dg/torture/pr39074-2.c |  2 +-
 gcc/testsuite/gcc.dg/torture/pr39074.c   |  2 +-
 gcc/tree-ssa-alias.h |  2 +-
 gcc/tree-ssa-ccp.c   |  2 +-
 gcc/tree-ssa-structalias.c   | 11 ++--
 gcc/tree-ssanames.c  | 29 +
 gcc/tree-ssanames.h  |  2 ++
 gcc/tree-vrp.c   | 44 +---
 8 files changed, 73 insertions(+), 21 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/torture/pr39074-2.c b/gcc/testsuite/gcc.dg/torture/pr39074-2.c
index 740b463..0693f2d 100644
--- a/gcc/testsuite/gcc.dg/torture/pr39074-2.c
+++ b/gcc/testsuite/gcc.dg/torture/pr39074-2.c
@@ -31,4 +31,4 @@ int main()
 }
 
 /* { dg-final { scan-tree-dump "y.._. = { i }" "alias" } } */
-/* { dg-final { scan-tree-dump "y.._., points-to vars: { D. }" "alias" } } */
+/* { dg-final { scan-tree-dump "y.._., points-to NULL, points-to vars: { D. }" "alias" } } */
diff --git a/gcc/testsuite/gcc.dg/torture/pr39074.c b/gcc/testsuite/gcc.dg/torture/pr39074.c
index 31ed499..54c444e 100644
--- a/gcc/testsuite/gcc.dg/torture/pr39074.c
+++ b/gcc/testsuite/gcc.dg/torture/pr39074.c
@@ -30,4 +30,4 @@ int main()
 }
 
 /* { dg-final { scan-tree-dump "y.._. = { i }" "alias" } } */
-/* { dg-final { scan-tree-dump "y.._., points-to vars: { D. }" "alias" } } */
+/* { dg-final { scan-tree-dump "y.._., points-to NULL, points-to vars: { D. }" "alias" } } */
diff --git a/gcc/tree-ssa-alias.h b/gcc/tree-ssa-alias.h
index 6680cc0..27a06fc 100644
--- a/gcc/tree-ssa-alias.h
+++ b/gcc/tree-ssa-alias.h
@@ -146,7 +146,7 @@ extern void dump_alias_stats (FILE *);
 /* In tree-ssa-structalias.c  */
 extern unsigned int compute_may_aliases (void);
 extern bool pt_solution_empty_p (struct pt_solution *);
-extern bool pt_solution_singleton_p (struct pt_solution *, unsigned *);
+extern bool pt_solution_singleton_or_null_p (struct pt_solution *, unsigned *);
 extern bool pt_solution_includes_global (struct

Re: [ipa-vrp] Use get/set_ptr_nonnull in ipa-vrp

2016-10-13 Thread kugan


Hi Honza,

On 12/10/16 22:16, Jan Hubicka wrote:

Hi,

This patch uses the get/set_ptr_nonnull so that ipa-vrp also
propagates nonnull ranges for pointers.

Bootstrapped and regression tested this with the other patches without
any new regressions on x86_64-linux-gnu.

Is this OK for trunk?

Thanks,
Kugan




gcc/ChangeLog:

2016-10-12  Kugan Vivekanandarajah  

* ipa-prop.c (ipa_compute_jump_functions_for_edge): Set value range
  for pointer type too.
(ipcp_update_vr): set_ptr_nonnull for pointer.

gcc/testsuite/ChangeLog:

2016-10-12  Kugan Vivekanandarajah  

* gcc.dg/ipa/vrp4.c: New test.

OK, thank you!
We should be able to derive a lot of (useful) non-null information from the
fact that the pointers are dereferenced either prior to the function call or in a
statement that postdominates the function entry.
I will try with spec2k/2006. Do you have any specific benchmark in mind 
that I can try first?


I guess we could also give (semi) useful
-Wmissing-attribute=nonnull hints in that case.


I will send a follow up patch for this.

Thanks,
Kugan


Honza
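
As an illustration of the non-null information discussed in this thread, here is a minimal C sketch. The names scale and scale_after_read are hypothetical and do not come from the patch. Because the caller dereferences p before the call, p cannot be NULL when the callee is reached, so a propagated non-null range could, in principle, let the compiler drop the callee's NULL check. Whether a given GCC version actually does so depends on the passes and options in use.

```c
#include <stddef.h>

/* Hypothetical callee: the NULL check is what a propagated
   non-null range could let the compiler remove.  */
static int
scale (const int *p, int factor)
{
  if (p == NULL)
    return -1;
  return *p * factor;
}

/* Hypothetical caller: p is dereferenced before the call, so p
   cannot be NULL when scale is reached (otherwise behavior is
   already undefined).  */
int
scale_after_read (const int *p)
{
  int first = *p;
  return first + scale (p, 2);
}
```

Calling scale_after_read with the address of an int holding 5 returns 5 + 5 * 2.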

[patch, fortran] PR77972 ICE on broken character continuation with -Wall etc.

2016-10-13 Thread Jerry DeLisle

This patch is straightforward. We were sending bogus locus info to the
diagnostics machinery and hitting an assert in error.c.


The patch avoids doing this.

Regression tested on x86-64-linux.

OK for trunk?

Regards,

Jerry

2016-10-13  Jerry DeLisle  

* scanner.c (gfc_next_char_literal): If nextc is null do not
decrement the pointer and call the diagnostics.

diff --git a/gcc/fortran/scanner.c b/gcc/fortran/scanner.c
index be9c5091..5e355359 100644
--- a/gcc/fortran/scanner.c
+++ b/gcc/fortran/scanner.c
@@ -1414,10 +1414,9 @@ restart:

   if (c != '&')
{
- if (in_string)
+ if (in_string && gfc_current_locus.nextc)
{
- if (gfc_current_locus.nextc)
-   gfc_current_locus.nextc--;
+ gfc_current_locus.nextc--;
  if (warn_ampersand && in_string == INSTRING_WARN)
gfc_warning (OPT_Wampersand,
 "Missing %<&%> in continued character "

Re: [patch, fortran] PR77972 ICE on broken character continuation with -Wall etc.

2016-10-13 Thread Steve Kargl

On Thu, Oct 13, 2016 at 07:04:04PM -0700, Jerry DeLisle wrote:
> This patch is straightforward. We were sending bogus locus info to the
> diagnostics machinery and hitting an assert in error.c.
> 
> The patch avoids doing this.
> 
> Regression tested on x86-64-linux.
> 
> OK for trunk?
> 

Yes, but see below.

> Regards,
> 
> Jerry
> 
> 2016-10-13  Jerry DeLisle  
> 
>   * scanner.c (gfc_next_char_literal): If nextc is null do not
>   decrement the pointer and call the diagnostics.
> 
> diff --git a/gcc/fortran/scanner.c b/gcc/fortran/scanner.c
> index be9c5091..5e355359 100644
> --- a/gcc/fortran/scanner.c
> +++ b/gcc/fortran/scanner.c
> @@ -1414,10 +1414,9 @@ restart:
> 
> if (c != '&')
>  {
> - if (in_string)
> + if (in_string && gfc_current_locus.nextc)
>  {
> - if (gfc_current_locus.nextc)
> -   gfc_current_locus.nextc--;
> + gfc_current_locus.nextc--;
>if (warn_ampersand && in_string == INSTRING_WARN)
>  gfc_warning (OPT_Wampersand,
>   "Missing %<&%> in continued character "

If this is a "missing '&' in a continued..." and the '&' is 
required by the standard, then why is this just a warning?

-- 
Steve

Re: [Patch, reload, tentative, PR 71627] Tweak conditions in find_valid_class_1

2016-10-13 Thread Senthil Kumar Selvaraj


Bernd Schmidt writes:

> On 09/16/2016 09:02 PM, Senthil Kumar Selvaraj wrote:
>>   Does this make sense? I ran a reg test for the avr target with a
>>   slightly older version of this patch, it did not show any regressions.
>>   If this is the right fix, I'll make sure to run reg tests on x86_64
>>   after backporting to a gcc version where that target used reload.
>
> It's hard to say, and could have different effects on different targets.
> One thing though, at the very least the reg_class_size test would have 
> to be adapted - the idea is to find the largest class, and there's a 
> risk here of ending up with a large class that only has one valid register.

Agreed - I've updated the patch to compute rclass sizes based on regno
availability i.e., only if in_hard_reg_set_p and HARD_REGNO_MODE_OK, and
then use the computed sizes when calculating best_size.
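
A rough, self-contained sketch of that idea, outside of GCC: count, per class, only the registers that are both members of the class and usable in the requested mode, then pick the class with the largest such count. The bitmask model, N_CLASSES, N_REGS and largest_usable_class are all assumptions made for illustration, and the sketch deliberately ignores the register move cost that the real function also weighs.

```c
#include <stdint.h>

#define N_CLASSES 4   /* hypothetical; class 0 plays the role of NO_REGS */
#define N_REGS 8      /* hypothetical number of hard registers */

/* Return the class with the most registers that are both members
   of the class and usable in the requested mode; 0 if none.
   Bit r of class_members[c] says register r belongs to class c;
   bit r of mode_ok says register r can hold the mode.  */
int
largest_usable_class (const uint8_t class_members[N_CLASSES],
                      uint8_t mode_ok)
{
  int best_class = 0;
  unsigned best_size = 0;

  for (int c = 1; c < N_CLASSES; c++)
    {
      unsigned size = 0;
      for (int r = 0; r < N_REGS; r++)
        if (((class_members[c] >> r) & 1) && ((mode_ok >> r) & 1))
          size++;
      if (size > best_size)
        {
          best_size = size;
          best_class = c;
        }
    }
  return best_class;
}
```

With this counting, a nominally large class that has only one usable register no longer wins over a smaller class whose registers are all usable.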

>
> You'll also want to verify this against
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54814

Yes, this patch doesn't break the fix for PR54814. The change to
in_hard_reg_set_p was what fixed that, and that remains unmodified.

Reg tested this on top of trunk@190252 with the in_hard_reg_set_p
backport. x86_64-pc-linux bootstrapped and regtested ok. avr showed
no regressions either.

Ok for trunk?

Regards
Senthil

gcc/ChangeLog:

2016-10-13  Senthil Kumar Selvaraj  

* reload.c (find_valid_class_1): Allow regclass if at least one
regno in class is ok. Compute and use rclass size based on
actually available regnos for mode in rclass.

gcc/testsuite/ChangeLog:

2016-10-13  Senthil Kumar Selvaraj  

* gcc.target/avr/pr71627.c: New.


Index: gcc/reload.c
===
--- gcc/reload.c	(revision 240989)
+++ gcc/reload.c	(working copy)
@@ -711,31 +711,36 @@
   enum reg_class best_class = NO_REGS;
   unsigned int best_size = 0;
   int cost;
+  unsigned int computed_rclass_sizes[N_REG_CLASSES] = { 0 };
 
   for (rclass = 1; rclass < N_REG_CLASSES; rclass++)
 {
-  int bad = 0;
-  for (regno = 0; regno < FIRST_PSEUDO_REGISTER && !bad; regno++)
-   {
- if (in_hard_reg_set_p (reg_class_contents[rclass], mode, regno)
- && !HARD_REGNO_MODE_OK (regno, mode))
-   bad = 1;
-   }
-  
-  if (bad)
-   continue;
+  int atleast_one_regno_ok = 0;
 
+  for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
+{
+  if (in_hard_reg_set_p (reg_class_contents[rclass], mode, regno))
+{
+  atleast_one_regno_ok = 1;
+  if (HARD_REGNO_MODE_OK (regno, mode))
+computed_rclass_sizes[rclass]++;
+}
+}
+
+  if (!atleast_one_regno_ok)
+continue;
+
   cost = register_move_cost (outer, (enum reg_class) rclass, dest_class);
 
-  if ((reg_class_size[rclass] > best_size
-  && (best_cost < 0 || best_cost >= cost))
- || best_cost > cost)
-   {
- best_class = (enum reg_class) rclass;
- best_size = reg_class_size[rclass];
- best_cost = register_move_cost (outer, (enum reg_class) rclass,
- dest_class);
-   }
+  if ((computed_rclass_sizes[rclass] > best_size
+   && (best_cost < 0 || best_cost >= cost))
+  || best_cost > cost)
+{
+  best_class = (enum reg_class) rclass;
+  best_size = computed_rclass_sizes[rclass];
+  best_cost = register_move_cost (outer, (enum reg_class) rclass,
+  dest_class);
+}
 }
 
   gcc_assert (best_size != 0);

Index: gcc/testsuite/gcc.target/avr/pr71627.c
===
--- gcc/testsuite/gcc.target/avr/pr71627.c  (nonexistent)
+++ gcc/testsuite/gcc.target/avr/pr71627.c  (working copy)
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O1" } */
+
+
+extern volatile __memx const long  a, b, c, d, e, f;
+extern volatile long result;
+
+extern void vfunc (const char*, ...);
+
+void foo (void)
+{
+   result = a + b + c + d + e + f;
+   vfunc ("text", a, b, c, d, e, f, result);
+}

[PATCH] More early LTO dwarf2out bits


This merges a few more bits guarding stuff with ! early_dwarf, mostly
to avoid creating locations that involve addresses of decls early
but also to avoid wasting work for BLOCK_NONLOCALIZED_VARs.

Bootstrapped and tested on x86_64-unknown-linux-gnu, gdb testsuite
tested on the same arch (from the gdb 7.12 branch), applied to trunk.

Richard.

2016-10-13  Richard Biener  

* dwarf2out.c (tree_add_const_value_attribute): Do not try
rtl_for_decl_init during early phase.
(gen_variable_die): Do not create locations during early phase.
(gen_label_die): Likewise.
(decls_for_scope): Do not waste time handling BLOCK_NONLOCALIZED_VARs
twice.

Index: gcc/dwarf2out.c
===
--- gcc/dwarf2out.c (revision 241022)
+++ gcc/dwarf2out.c (working copy)
@@ -17958,12 +17958,15 @@ tree_add_const_value_attribute (dw_die_r
   init = t;
   gcc_assert (!DECL_P (init));
 
-  rtl = rtl_for_decl_init (init, type);
-  if (rtl)
-return add_const_value_attribute (die, rtl);
+  if (! early_dwarf)
+{
+  rtl = rtl_for_decl_init (init, type);
+  if (rtl)
+   return add_const_value_attribute (die, rtl);
+}
   /* If the host and target are sane, try harder.  */
-  else if (CHAR_BIT == 8 && BITS_PER_UNIT == 8
-  && initializer_constant_valid_p (init, type))
+  if (CHAR_BIT == 8 && BITS_PER_UNIT == 8
+  && initializer_constant_valid_p (init, type))
 {
   HOST_WIDE_INT size = int_size_in_bytes (TREE_TYPE (init));
   if (size > 0 && (int) size == size)
@@ -21269,13 +21272,13 @@ gen_variable_die (tree decl, tree origin
   if (com_decl)
 {
   dw_die_ref com_die;
-  dw_loc_list_ref loc;
+  dw_loc_list_ref loc = NULL;
   die_node com_die_arg;
 
   var_die = lookup_decl_die (decl_or_origin);
   if (var_die)
{
- if (get_AT (var_die, DW_AT_location) == NULL)
+ if (! early_dwarf && get_AT (var_die, DW_AT_location) == NULL)
{
  loc = loc_list_from_tree (com_decl, off ? 1 : 2, NULL);
  if (loc)
@@ -21309,7 +21312,8 @@ gen_variable_die (tree decl, tree origin
   com_die_arg.decl_id = DECL_UID (com_decl);
   com_die_arg.die_parent = context_die;
   com_die = common_block_die_table->find (&com_die_arg);
-  loc = loc_list_from_tree (com_decl, 2, NULL);
+  if (! early_dwarf)
+   loc = loc_list_from_tree (com_decl, 2, NULL);
   if (com_die == NULL)
{
  const char *cnam
@@ -21550,7 +21554,7 @@ gen_label_die (tree decl, dw_die_ref con
 
   if (DECL_ABSTRACT_P (decl))
 equate_decl_number_to_die (decl, lbl_die);
-  else
+  else if (! early_dwarf)
 {
   insn = DECL_RTL_IF_SET (decl);
 
@@ -23327,9 +23331,13 @@ decls_for_scope (tree stmt, dw_die_ref c
 {
   for (decl = BLOCK_VARS (stmt); decl != NULL; decl = DECL_CHAIN (decl))
process_scope_var (stmt, decl, NULL_TREE, context_die);
-  for (i = 0; i < BLOCK_NUM_NONLOCALIZED_VARS (stmt); i++)
-   process_scope_var (stmt, NULL, BLOCK_NONLOCALIZED_VAR (stmt, i),
-  context_die);
+  /* BLOCK_NONLOCALIZED_VARs simply generate DIE stubs with abstract
+origin - avoid doing this twice as we have no good way to see
+if we've done it once already.  */
+  if (! early_dwarf)
+   for (i = 0; i < BLOCK_NUM_NONLOCALIZED_VARS (stmt); i++)
+ process_scope_var (stmt, NULL, BLOCK_NONLOCALIZED_VAR (stmt, i),
+context_die);
 }
 
   /* Even if we're at -g1, we need to process the subblocks in order to get

Re: PING! [Fortran, Patch, PR72832, v1] [6/7 Regression] [OOP] ALLOCATE with SOURCE fails to allocate requested dimensions

2016-10-13 Thread Andre Vehreschild

Hi Steve,

thanks for the review. Committed as r241088 on trunk.

Letting it mature for one week in trunk before backporting to gcc-6.

Regards,
Andre

On Wed, 12 Oct 2016 10:18:29 -0700
Steve Kargl  wrote:

> On Wed, Oct 12, 2016 at 11:50:10AM +0200, Andre Vehreschild wrote:
> > Ping!
> > 
> > Updated patch with the comments gotten so far.
> > 
> > Ok for trunk?
> >   
> 
> Looks good to me.
> 


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 
Index: gcc/fortran/ChangeLog
===
--- gcc/fortran/ChangeLog	(revision 241086)
+++ gcc/fortran/ChangeLog	(working copy)
@@ -1,3 +1,13 @@
+2016-10-13  Andre Vehreschild  
+
+	PR fortran/72832
+	* trans-expr.c (gfc_copy_class_to_class): Add generation of
+	runtime array bounds check.
+	* trans-intrinsic.c (gfc_conv_intrinsic_size): Add a crutch to
+	get the descriptor of a function returning a class object.
+	* trans-stmt.c (gfc_trans_allocate): Use the array spec on the
+	array to allocate instead of the array spec from source=.
+
 2016-10-12  Andre Vehreschild  
 
 	* trans-expr.c (gfc_find_and_cut_at_last_class_ref): Fixed style.
Index: gcc/fortran/trans-expr.c
===
--- gcc/fortran/trans-expr.c	(revision 241086)
+++ gcc/fortran/trans-expr.c	(working copy)
@@ -1235,6 +1235,7 @@
   stmtblock_t body;
   stmtblock_t ifbody;
   gfc_loopinfo loop;
+  tree orig_nelems = nelems; /* Needed for bounds check.  */
 
   gfc_init_block ();
   tmp = fold_build2_loc (input_location, MINUS_EXPR,
@@ -1262,6 +1263,31 @@
 	}
   vec_safe_push (args, to_ref);
 
+  /* Add bounds check.  */
+  if ((gfc_option.rtcheck & GFC_RTCHECK_BOUNDS) > 0 && is_from_desc)
+	{
+	  char *msg;
+	  const char *name = "<>";
+	  tree from_len;
+
+	  if (DECL_P (to))
+	name = (const char *)(DECL_NAME (to)->identifier.id.str);
+
+	  from_len = gfc_conv_descriptor_size (from_data, 1);
+	  tmp = fold_build2_loc (input_location, NE_EXPR,
+  boolean_type_node, from_len, orig_nelems);
+	  msg = xasprintf ("Array bound mismatch for dimension %d "
+			   "of array '%s' (%%ld/%%ld)",
+			   1, name);
+
+	  gfc_trans_runtime_check (true, false, tmp, ,
+				   &gfc_current_locus, msg,
+			 fold_convert (long_integer_type_node, orig_nelems),
+			   fold_convert (long_integer_type_node, from_len));
+
+	  free (msg);
+	}
+
   tmp = build_call_vec (fcn_type, fcn, args);
 
   /* Build the body of the loop.  */
Index: gcc/fortran/trans-intrinsic.c
===
--- gcc/fortran/trans-intrinsic.c	(revision 241086)
+++ gcc/fortran/trans-intrinsic.c	(working copy)
@@ -6544,9 +6544,20 @@
   if (actual->expr->ts.type == BT_CLASS)
 gfc_add_class_array_ref (actual->expr);
 
-  argse.want_pointer = 1;
   argse.data_not_needed = 1;
-  gfc_conv_expr_descriptor (&argse, actual->expr);
+  if (gfc_is_alloc_class_array_function (actual->expr))
+{
+  /* For functions that return a class array conv_expr_descriptor is not
+	 able to get the descriptor right.  Therefore this special case.  */
+  gfc_conv_expr_reference (&argse, actual->expr);
+  argse.expr = gfc_build_addr_expr (NULL_TREE,
+	gfc_class_data_get (argse.expr));
+}
+  else
+{
+  argse.want_pointer = 1;
+  gfc_conv_expr_descriptor (&argse, actual->expr);
+}
   gfc_add_block_to_block (>pre, );
   gfc_add_block_to_block (>post, );
   arg1 = gfc_evaluate_now (argse.expr, >pre);
Index: gcc/fortran/trans-stmt.c
===
--- gcc/fortran/trans-stmt.c	(revision 241086)
+++ gcc/fortran/trans-stmt.c	(working copy)
@@ -5489,7 +5489,8 @@
 		  desc = tmp;
 		  tmp = gfc_class_data_get (tmp);
 		}
-	  e3_is = E3_DESC;
+	  if (code->ext.alloc.arr_spec_from_expr3)
+		e3_is = E3_DESC;
 	}
 	  else
 	desc = !is_coarray ? se.expr
Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog	(revision 241086)
+++ gcc/testsuite/ChangeLog	(working copy)
@@ -1,3 +1,10 @@
+2016-10-13  Andre Vehreschild  
+
+	PR fortran/72832
+	* gfortran.dg/allocate_with_source_22.f03: New test.
+	* gfortran.dg/allocate_with_source_23.f03: New test.  Expected to
+	fail.
+
 2016-10-13  Thomas Preud'homme  
 
 	* gcc.target/arm/movhi_movw.c: Enable test for ARM mode.
Index: gcc/testsuite/gfortran.dg/allocate_with_source_22.f03
===
--- gcc/testsuite/gfortran.dg/allocate_with_source_22.f03	(nonexistent)
+++ gcc/testsuite/gfortran.dg/allocate_with_source_22.f03	(working copy)
@@ -0,0 +1,48 @@
+! { dg-do run }
+!
+! Test that pr72832 is fixed now.
+! Contributed by Daan van Vugt
+
+program allocate_source
+  type :: t
+integer :: i
+

[PATCH] Do not merge BBs with a different EH landing pads (PR tree-optimization/77943)

Hi.

The following patch adds a check that is already present in IPA ICF.
The patch bootstraps on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin
From 542c318af84ca561661b42baca3da7c340971dd8 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 12 Oct 2016 16:38:31 +0200
Subject: [PATCH] Do not merge BBs with a different EH landing pads (PR
 tree-optimization/77943)

gcc/testsuite/ChangeLog:

2016-10-12  Martin Liska  

	PR tree-optimization/77943
	* g++.dg/tree-ssa/pr77943.C: New test.

gcc/ChangeLog:

2016-10-12  Martin Liska  

	PR tree-optimization/77943
	* tree-ssa-tail-merge.c (merge_stmts_p): Do not merge BBs with
	a different EH landing pads.
---
 gcc/testsuite/g++.dg/tree-ssa/pr77943.C | 25 +
 gcc/tree-ssa-tail-merge.c   |  5 +
 2 files changed, 30 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/tree-ssa/pr77943.C

diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr77943.C b/gcc/testsuite/g++.dg/tree-ssa/pr77943.C
new file mode 100644
index 000..ef7954a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/pr77943.C
@@ -0,0 +1,25 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -std=c++11" } */
+
+void thrower[[gnu::noinline]]() {
+throw 1;
+}
+
+inline void fatal() noexcept {thrower();}
+inline void notFatal() {thrower();}
+
+void func(bool callFatal) {
+if (callFatal) {
+fatal();
+} else { 
+notFatal();
+}
+}
+
+int main(int argc, const char* argv[]) {
+try {
+bool callFatal = argc > 1;
+func(callFatal);
+} catch (...) {
+}
+}
diff --git a/gcc/tree-ssa-tail-merge.c b/gcc/tree-ssa-tail-merge.c
index 5e815ec..c292ee7 100644
--- a/gcc/tree-ssa-tail-merge.c
+++ b/gcc/tree-ssa-tail-merge.c
@@ -204,6 +204,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "params.h"
 #include "tree-ssa-sccvn.h"
 #include "cfgloop.h"
+#include "tree-eh.h"
 
 /* Describes a group of bbs with the same successors.  The successor bbs are
cached in succs, and the successor edge flags are cached in succ_flags.
@@ -1222,6 +1223,10 @@ merge_stmts_p (gimple *stmt1, gimple *stmt2)
   if (is_tm_ending (stmt1))
 return false;
 
+  /* Verify EH landing pads.  */
+  if (lookup_stmt_eh_lp_fn (cfun, stmt1) != lookup_stmt_eh_lp_fn (cfun, stmt2))
+return false;
+
   if (is_gimple_call (stmt1)
   && gimple_call_internal_p (stmt1))
 switch (gimple_call_internal_fn (stmt1))
-- 
2.9.2

Re: [RFC] Possible folding opportunities for string built-ins

On Wed, Oct 12, 2016 at 3:48 PM, Martin Liška  wrote:
> Hi.
>
> As you have probably noticed, a simple folding improvement has grown into
> multiple patches and multiple iterations. Apart from that, I also noticed
> that we do not do the best for a couple of cases, and I would like feedback
> on whether it is worth improving or not.
>
> $ cat /tmp/string-folding-missing.c
> const char global_1[4] = {'a', 'b', 'c', 'd' };
> const char global_2[6] = "abcdefghijk";
>
> int main()
> {
>   const char local1[] = "asdfasdfasdf";
>
>   /* Case 1 */
>   __builtin_memchr (global_1, 'c', 5);
>
>   /* Case 2 */
>   __builtin_memchr (global_2, 'c', 5);
>
>   /* Case 3 */
>   __builtin_memchr (local1, 'a', 5);
>
>   return 0;
> }
>
> Cases:
> 1) Currently, calling c_getstr (which calls string_constant) can't handle 
> CONSTRUCTOR. Potential
> solution can be to create on demand STRING_CST, however as string_constant is 
> called multiple times,
> it can be overkill.

I believe somewhere during GENERICization / gimplification we should
simply turn those CONSTRUCTORs into STRING_CSTs ... (probably not in
string_constant itself as that would be somewhat gross of a place).

> 2) /tmp/x.c:2:26: warning: initializer-string for array of chars is too 
> long
>  const char global_2[6] = "abcdefghijk";
> Here I'm not sure whether one can consider global_2 == "abcdef" (w/o trailing 
> zero char) or not?
> If so, adding new output argument (string_length) to string_constant can be 
> solution.

Likewise, if we are able to warn, the FE should be able to truncate the
STRING_CST itself.  The question is still whether a non-NUL-terminated
string should be constant folded (it looks like the STRING_PTR in a
STRING_CST is always '\0' terminated).
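
For reference, a small C sketch of the two kinds of non-NUL-terminated arrays under discussion. The names four, six and find_byte are made up for this example, and the second array uses an initializer that fits exactly rather than the over-long one from the test case. The sketch searches only within the array bounds, so it says nothing about what the folder should do for lengths that exceed the object.

```c
#include <stddef.h>
#include <string.h>

/* Exactly four bytes, no terminating NUL.  */
static const char four[4] = { 'a', 'b', 'c', 'd' };

/* Valid C (not C++): the initializer fills all six elements and
   the terminating NUL is dropped.  */
static const char six[6] = "abcdef";

/* Offset of ch within the first n bytes of buf, or -1.  */
ptrdiff_t
find_byte (const char *buf, int ch, size_t n)
{
  const char *hit = memchr (buf, ch, n);
  return hit ? hit - buf : -1;
}
```

Since neither array contains a '\0', any folding based on their contents has to carry the length separately instead of relying on a terminator.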

> 3) Currently, ctor_for_folding return error_mark_node for local variables. 
> I'm wondering whether returning
> DECL_INITIAL for these would be doable? Will it make any issue for LTO?

They do not prevail (ok, you might see this during GENERIC folding).
They get lowered to runtime initialization either from a constant or a
CONST_DECL (with DECL_INITIAL).  For those CONST_DECLs we should return
a ctor_for_folding.

Richard.

> Last question is whether one can aggressively fold strcasecmp in a host 
> compiler? Or are there any situations
> where results depends on locale?
>
> Thanks for thoughts.
> Martin

RE: [PATCH] [ARC] New option handling, refurbish multilib support.

2016-10-13 Thread Claudiu Zissulescu

Hi Andrew,

> Sorry it's taken so long to review this patch.
> 
This is understandable, it is a quite large patch and changes a lot of items.

> In general I like this, and think it's a step in the right direction.
> I have a few pretty minor concerns, but hopefully nothing too
> contentious.
> 
> In general I like the use of the *.def files, but my only concern is
> the lack of explanation in any of them about what they're for.  It
> would be nice if each had a summary of what each field represents to
> make it easier to add new entries.

This is a good point, I will add extra info on how to add new entries.

> 
> The use of arc_seen_options can, I think be replaced by using
> global_opts_seen.x_target_flags, or maybe there's a detail I'm not
> understanding, in which case maybe a comment somewhere to explain why
> those two things are different.

I cannot remember why I didn't use global_options_set, but I will try as you
suggested.

> 
> The only thing I dislike in this patch is the switch from 'arc_cpu' to
> a set of global boolean flags.  I can't see why we'd ever want more
> than one of those architecture flags to be true, so I'd rather we
> switched back to an enum if that's possible.
> 
I remember that I initially wanted to use enums, but I bumped into an issue; I
don't remember exactly what.  As far as I can see now, your comment makes
perfect sense.

I will get back to you with extra info/a revised patch asap,
Claudiu

Re: [vrp] use get_ptr_nonnull in tree-vrp

On Thu, Oct 13, 2016 at 6:38 AM, kugan
 wrote:
> Hi Richard,
>
>
> On 13/10/16 05:53, kugan wrote:
>>
>> Hi Richard,
>>
>> On 12/10/16 23:24, Richard Biener wrote:
>>>
>>> On Wed, Oct 12, 2016 at 8:56 AM, kugan
>>>  wrote:

>>>> Hi,
>>>>
>>>> This patch uses get_ptr_nonnull in tree-vrp.
>>>>
>>>> Bootstrapped and regression tested this with other patches without any
>>>> new regressions on x86_64-linux-gnu.
>>>>
>>>> Is this OK for trunk?
>>>
>>>
>>> Um.  Doesn't make much sense given nothing provides this info before
>>> EVRP?
>>> And if it makes sense then it makes sense not only for PARM_DECL SSA
>>> names.
>>
>> Not before EVRP. But when in TREE-VRP, EVRP + IPA-VRP should provide this.
>
>
> My primary intention was to pass it for PARM_DECL SSA names which comes from
> ipa-vrp. I have changed this now.

Ok, I see.  The new patch still has it inside the SSA_NAME_IS_DEFAULT_DEF ()
'if', so it isn't any better in this regard.  To handle non-default-defs you'd
have to intersect with non-NULL in update_value_range, where we also intersect
with get_range_info info for integer types.

Your original patch was better so that is ok (once the prerequisites
are approved).

We can decide later if it's worth handling get_ptr_nonnull in
update_value_range.

Thanks,
Richard.

> Thanks,
> Kugan
>
>
>> I am not sure if this is the question?
>>
>> Thanks,
>> Kugan
>>>
>>>
>>> Richard.
>>>
>>>> Thanks,
>>>> Kugan
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>>
>>>> 2016-10-12  Kugan Vivekanandarajah  
>>>>
>>>> * gcc.dg/ipa/vrp4.c: Adjust testcase.
>>>>
>>>> gcc/ChangeLog:
>>>>
>>>> 2016-10-12  Kugan Vivekanandarajah  
>>>>
>>>> * tree-vrp.c (get_value_range): Check get_ptr_nonnull.

Re: [PATCH] Don't unnecessarily create stack protector guard decls and MEMs (PR target/77957)

On Thu, 13 Oct 2016, Jakub Jelinek wrote:

> Hi!
> 
> PR77957 is likely a rs6000 backend bug where a useless memory load causes
> .LCTOC0 undefined reference in the end at -O0 and as such should be fixed,
> but I think it is completely unnecessary to create those loads at all
> if we know we are going to ignore it in the stack_protect_{set,test}
> patterns because TARGET_THREAD_SSP_OFFSET is defined and we are going to
> load from ABI selected TLS word.
> 
> So, this patch tweaks the -fstack-protector* expansion (both prologue and
> epilogue) so that if targetm.stack_protector_guard hook returns NULL it just
> passes a dummy const0_rtx down to the expander that is going to ignore it
> anyway, and adjusts the affected targets to return NULL in those cases.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Thanks,
Richard.
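
The hook-returns-NULL convention described in the quoted mail can be sketched in plain C as follows. This is only an illustration under stated assumptions: struct guard_decl, the two hook functions and guard_operand are hypothetical stand-ins, not GCC's tree, rtx or targetm interfaces. The point is simply that the caller substitutes a dummy operand when the hook reports that no guard declaration is needed.

```c
#include <stddef.h>

/* Hypothetical stand-in for a guard declaration.  */
struct guard_decl { int id; };
typedef struct guard_decl *(*guard_hook_fn) (void);

static struct guard_decl default_guard = { 1 };

/* Default hook: always provides a guard declaration.  */
struct guard_decl *
default_guard_hook (void)
{
  return &default_guard;
}

/* Hook for targets that read the guard from a TLS slot instead.  */
struct guard_decl *
null_guard_hook (void)
{
  return NULL;
}

/* Returns the operand to pass on: the guard's id, or the dummy 0
   when the hook reports that no guard declaration is needed.  */
int
guard_operand (guard_hook_fn hook)
{
  struct guard_decl *decl = hook ();
  return decl ? decl->id : 0;
}
```

The caller never creates or loads anything for the NULL case, which is the work the patch aims to avoid.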

> 2016-10-12  Jakub Jelinek  
> 
>   PR target/77957
>   * hooks.h (hook_tree_void_null): Declare.
>   * hooks.c (hook_tree_void_null): New function.
>   * langhooks.c (lhd_return_null_tree_v): Remove.
>   * langhooks-def.h (lhd_return_null_tree_v): Remove.
>   * cfgexpand.c (stack_protect_prologue): If guard_decl is NULL,
>   set y to const0_rtx.
>   * function.c (stack_protect_epilogue): Likewise.
>   * config/tilepro/tilepro.c (TARGET_STACK_PROTECT_GUARD): Redefine
>   if TARGET_THREAD_SSP_OFFSET is defined.
>   * config/s390/s390.c (TARGET_STACK_PROTECT_GUARD): Likewise.
>   * config/sparc/sparc.c (TARGET_STACK_PROTECT_GUARD): Likewise.
>   * config/tilegx/tilegx.c (TARGET_STACK_PROTECT_GUARD): Likewise.
>   * config/rs6000/rs6000.c (TARGET_STACK_PROTECT_GUARD): Likewise.
>   * config/i386/i386.c (TARGET_STACK_PROTECT_GUARD): Likewise.
>   (ix86_stack_protect_guard): New function.
> c/
>   * c-objc-common.h (LANG_HOOKS_GETDECLS): Use hook_tree_void_null
>   instead of lhd_return_null_tree_v.
> ada/
>   * gcc-interface/misc.c (LANG_HOOKS_GETDECLS): Use hook_tree_void_null
>   instead of lhd_return_null_tree_v.
> 
> --- gcc/config/tilepro/tilepro.c.jj   2016-09-27 20:12:45.0 +0200
> +++ gcc/config/tilepro/tilepro.c  2016-10-12 18:18:41.001998284 +0200
> @@ -4948,6 +4948,11 @@ tilepro_file_end (void)
>  #undef  TARGET_OPTION_OVERRIDE
>  #define TARGET_OPTION_OVERRIDE tilepro_option_override
>  
> +#ifdef TARGET_THREAD_SSP_OFFSET
> +#undef TARGET_STACK_PROTECT_GUARD
> +#define TARGET_STACK_PROTECT_GUARD hook_tree_void_null
> +#endif
> +
>  #undef  TARGET_SCALAR_MODE_SUPPORTED_P
>  #define TARGET_SCALAR_MODE_SUPPORTED_P tilepro_scalar_mode_supported_p
>  
> --- gcc/config/s390/s390.c.jj 2016-09-27 09:46:13.0 +0200
> +++ gcc/config/s390/s390.c2016-10-12 18:17:24.873959377 +0200
> @@ -15124,6 +15124,11 @@ s390_invalid_binary_op (int op ATTRIBUTE
>  #undef TARGET_OPTION_OVERRIDE
>  #define TARGET_OPTION_OVERRIDE s390_option_override
>  
> +#ifdef TARGET_THREAD_SSP_OFFSET
> +#undef TARGET_STACK_PROTECT_GUARD
> +#define TARGET_STACK_PROTECT_GUARD hook_tree_void_null
> +#endif
> +
>  #undef   TARGET_ENCODE_SECTION_INFO
>  #define TARGET_ENCODE_SECTION_INFO s390_encode_section_info
>  
> --- gcc/config/i386/i386.c.jj 2016-10-08 14:04:16.859911786 +0200
> +++ gcc/config/i386/i386.c2016-10-12 18:10:00.344570177 +0200
> @@ -44023,6 +44023,18 @@ ix86_mangle_type (const_tree type)
>  }
>  }
>  
> +#ifdef TARGET_THREAD_SSP_OFFSET
> +/* If using TLS guards, don't waste time creating and expanding
> +   __stack_chk_guard decl and MEM as we are going to ignore it.  */
> +static tree
> +ix86_stack_protect_guard (void)
> +{
> +  if (TARGET_SSP_TLS_GUARD)
> +return NULL_TREE;
> +  return default_stack_protect_guard ();
> +}
> +#endif
> +
>  /* For 32-bit code we can save PIC register setup by using
> __stack_chk_fail_local hidden function instead of calling
> __stack_chk_fail directly.  64-bit code doesn't need to setup any PIC
> @@ -50614,6 +50626,11 @@ ix86_addr_space_zero_address_valid (addr
>  #undef TARGET_MANGLE_TYPE
>  #define TARGET_MANGLE_TYPE ix86_mangle_type
>  
> +#ifdef TARGET_THREAD_SSP_OFFSET
> +#undef TARGET_STACK_PROTECT_GUARD
> +#define TARGET_STACK_PROTECT_GUARD ix86_stack_protect_guard
> +#endif
> +
>  #if !TARGET_MACHO
>  #undef TARGET_STACK_PROTECT_FAIL
>  #define TARGET_STACK_PROTECT_FAIL ix86_stack_protect_fail
> --- gcc/config/sparc/sparc.c.jj   2016-10-12 01:16:42.0 +0200
> +++ gcc/config/sparc/sparc.c  2016-10-12 18:17:44.595710443 +0200
> @@ -798,6 +798,11 @@ char sparc_hard_reg_printed[8];
>  #undef TARGET_OPTION_OVERRIDE
>  #define TARGET_OPTION_OVERRIDE sparc_option_override
>  
> +#ifdef TARGET_THREAD_SSP_OFFSET
> +#undef TARGET_STACK_PROTECT_GUARD
> +#define TARGET_STACK_PROTECT_GUARD hook_tree_void_null
> +#endif
> +
>  #if TARGET_GNU_TLS && defined(HAVE_AS_SPARC_UA_PCREL)
>  #undef TARGET_ASM_OUTPUT_DWARF_DTPREL
>  #define TARGET_ASM_OUTPUT_DWARF_DTPREL sparc_output_dwarf_dtprel
> ---

Re: [PATCH] Fix PR77826

On Wed, 12 Oct 2016, Richard Biener wrote:

> On Wed, 12 Oct 2016, Marc Glisse wrote:
> 
> > On Wed, 12 Oct 2016, Richard Biener wrote:
> > 
> > > So with doing the same on GENERIC I hit
> > > 
> > > FAIL: g++.dg/cpp1y/constexpr-array4.C  -std=c++14 (test for excess errors)
> > > 
> > > with the pattern
> > > 
> > >  /* (T)(P + A) - (T)P -> (T) A */
> > >  (for add (plus pointer_plus)
> > >   (simplify
> > >(minus (convert (add @0 @1))
> > > (convert @0))
> > 
> > Ah, grep missed that one because it is on 2 lines :-(
> > 
> > >...
> > > (convert @1
> > > 
> > > 
> > > no longer applying to
> > > 
> > > (long int) ((int *)  + 12) - (long int) 
> > > 
> > > because while (int *)  is equal to (long int)  (operand_equal_p
> > > does STRIP_NOPS) they obviously do not have the same type.
> > 
> > I believe we are comparing (int *)  to , not to (long int)  But 
> > yes,
> > indeed.
> > 
> > > So on GENERIC we have to consider that we feed operand_equal_p with
> > > non-ATOMs (arbitrary expressions).  The pattern above is "safe" as it
> > > doesn't reference @0 in the result (not sure if we should take advantage 
> > > of
> > > that in genmatch).
> > 
> > Since we are in the process of defining an
> > operand_equal_for_(generic|gimple)_match_p, we could tweak it to check the
> > type only for INTEGER_CST, or to still STRIP_NOPS, or similar.
> > 
> > Or we could remain very strict and refine the pattern, allowing a convert on
> > the pointer (we might want to split the plus and pointer_plus versions then,
> > for clarity). There are not many optimizations that are mandated by 
> > front-ends
> > and for which this is an issue.
> > 
> > > The other FAILs with doing the same on GENERIC are
> > > 
> > > FAIL: g++.dg/gomp/declare-simd-3.C  -std=gnu++11 (test for excess errors)
> > > FAIL: g++.dg/torture/pr71448.C   -O0  (test for excess errors)
> > > FAIL: g++.dg/vect/simd-clone-6.cc  -std=c++11 (test for excess errors)
> > > 
> > > the simd ones are 'warning: ignoring large linear step' and the pr71448.C
> > > case is very similar to the above.
> > 
> > Yes, I expect they all come from the same 1 or 2 transformations.
> > 
> > > > > > If we stick to the old behavior, maybe we could have some genmatch
> > > > > > magic to help with the constant capture weirdness. With matching
> > > > > > captures, we could select which operand (among those supposed to be
> > > > > > equivalent) is actually captured more cleverly, either with an
> > > > > > explicit marker, or by giving priority to the one that is not
> > > > > > > immediately below convert? in the pattern.
> > > > > 
> > > > > This route is a difficult one to take
> > > > 
> > > > The simplest version I was thinking about was @0 for a true capture, and
> > > > @@0
> > > > for something that just has to be checked for equality with @0.
> > > 
> > > Hmm, ok.  So you'd have @@0 having to match @0 and we'd get the @0 for
> > > the result in a reliable way?  Sounds like a reasonable idea, I'll see
> > > how that works out (we could auto-detect '@@' if the capture is not
> > > used in the result, see above).
> > 
> > It probably doesn't bring much compared to the @0@4 syntax you were 
> > suggesting below, so if that is easier...
> 
> Will figure that out ...
> 
> > > > > -- what would be possible is to
> > > > > capture a specific operand.  Like allow one to write
> > > > > 
> > > > > (op (op @0@4 @1) (op @0@3 @2))
> > > > > 
> > > > > and thus actually have three "names" for @0.  We have this internally
> > > > > already when you write
> > > > > 
> > > > > (convert?@0 @1)
> > > > > 
> > > > > for the case where there is no conversion.  @0 and @1 are the same
> > > > > in this case.
> > > > 
> > > > Looks nice and convenient (assuming lax constant matching).
> > > 
> > > Yes, w/o lax matching it has of course little value.
> > > 
> > > > > Not sure if this helps enough cases.
> > > > 
> > > > > IIRC, in all cases where we had trouble with operand_equal_p, choosing
> > > > which
> > > > operand to capture would have solved the issue.
> > > 
> > > Yes.  We'd still need to actually catch all those cases...
> > 
> > > Being forced to choose which operand we capture (say with @ vs @@, making 2
> > > occurrences of @0 a syntax error) might help, but it would also require
> > updating many patterns for which this was never an issue (easy but boring).
> 
> We can even have today
> 
>  (plus (minus @0 @1) (plus @0 @2) @0)
> 
> thus three matching @0.  Now if we have @@ telling to use value equality
> rather than "node equality" _and_ making the @@ select the canonical
> operand to choose then we'd have at most one @@ per ID.  Somewhat tricky
> but not impossible to implement I think.

So here it is.  It allows us to remove the operand_equal_p hacks:

@@ -334,11 +334,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)

 /* X - (X / Y) * Y is the same as X % Y.  */
 (simplify
- (minus (convert1? @2) (convert2? (mult:c (trunc_div @0 @1) @1)))
- /* We cannot use matching captures here,

Re: [PATCH, GCC] Move MEMMODEL_* and enum memmodel from coretypes.h to memmodel.h

On Wed, 12 Oct 2016, Joseph Myers wrote:

> On Wed, 12 Oct 2016, Thomas Preudhomme wrote:
> 
> > This patch is a follow up of [1] which aims to have all memory model related
> > declarations in memmodel.h. To achieve that, this patch moves memory model
> > related declaration from coretypes.h into memmodel.h. Note that since
> > memmodel.h is now included from libgcc it needs to have a runtime library
> > exception.
> 
> I think libgcc should be using the __ATOMIC_* predefines instead of the 
> MEMMODEL_* host-side constants.  (In general, we should be moving away 
> from including host-side headers in target-side code.)

Ah, if we have those then I agree.

Richard.

[PATCH] Fix exception-specification of std::invoke


The arguments to the is_nothrow_callable traits were not adding && to
the types, so they failed with anything that formed an invalid
function type.

* include/bits/invoke.h (__invoke): Fix exception-specification.
* include/std/functional (invoke): Likewise.
* testsuite/20_util/function_objects/invoke/1.cc: New test.

Tested powerpc64le-linux, committed to trunk.
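
To illustrate the failure mode described above: a function type such as
F(Args...) is invalid when F is an abstract class, because a function cannot
return an abstract type by value, whereas F&&(Args...) is well-formed.  The
following is only a minimal sketch, not the libstdc++ implementation;
nothrow_call and thrower are hypothetical names introduced for illustration,
and the helper sidesteps the problem by never naming F(Args...) at all.

```cpp
#include <utility>

// Hypothetical stand-in for the library trait (an assumption, not the
// libstdc++ code): reports whether calling an F&& with Args&&... is
// noexcept.  It only ever forms F&&, so F may be an abstract class.
template<typename F, typename... Args>
struct nothrow_call
{
  static constexpr bool value
    = noexcept(std::declval<F&&>()(std::declval<Args&&>()...));
};

// Same shape as the type used in the new test: abstract, with a
// noexcept call operator.
struct abstract
{
  virtual ~abstract() = 0;
  void operator()() noexcept;
};

// Hypothetical counter-example whose call operator may throw.
struct thrower
{
  void operator()();
};

static_assert(nothrow_call<abstract>::value,
              "abstract functor is nothrow-callable");
static_assert(!nothrow_call<thrower>::value,
              "potentially-throwing call is detected");
```

Everything here is evaluated in unevaluated context, so no definitions of the
member functions are required.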

commit e799357d09cf2d731a50a36ade71a39104eef6e5
Author: Jonathan Wakely 
Date:   Thu Oct 13 01:38:29 2016 +0100

Fix exception-specification of std::invoke

* include/bits/invoke.h (__invoke): Fix exception-specification.
* include/std/functional (invoke): Likewise.
* testsuite/20_util/function_objects/invoke/1.cc: New test.

diff --git a/libstdc++-v3/include/bits/invoke.h b/libstdc++-v3/include/bits/invoke.h
index 60405b5..2bbdab7 100644
--- a/libstdc++-v3/include/bits/invoke.h
+++ b/libstdc++-v3/include/bits/invoke.h
@@ -87,7 +87,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template<typename _Callable, typename... _Args>
 constexpr typename result_of<_Callable&&(_Args&&...)>::type
 __invoke(_Callable&& __fn, _Args&&... __args)
-noexcept(__is_nothrow_callable<_Callable(_Args&&...)>::value)
+noexcept(__is_nothrow_callable<_Callable&&(_Args&&...)>::value)
 {
   using __result_of = result_of<_Callable&&(_Args&&...)>;
   using __type = typename __result_of::type;
diff --git a/libstdc++-v3/include/std/functional b/libstdc++-v3/include/std/functional
index 2587392..6a45314 100644
--- a/libstdc++-v3/include/std/functional
+++ b/libstdc++-v3/include/std/functional
@@ -200,7 +200,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template<typename _Callable, typename... _Args>
 inline result_of_t<_Callable&&(_Args&&...)>
 invoke(_Callable&& __fn, _Args&&... __args)
-noexcept(is_nothrow_callable_v<_Callable(_Args&&...)>)
+noexcept(is_nothrow_callable_v<_Callable&&(_Args&&...)>)
 {
   return std::__invoke(std::forward<_Callable>(__fn),
   std::forward<_Args>(__args)...);
diff --git a/libstdc++-v3/testsuite/20_util/function_objects/invoke/1.cc b/libstdc++-v3/testsuite/20_util/function_objects/invoke/1.cc
new file mode 100644
index 000..81bf25a
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/function_objects/invoke/1.cc
@@ -0,0 +1,30 @@
+// Copyright (C) 2016 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-do compile { target c++11 } }
+
+#include <functional>
+
+struct abstract {
+  virtual ~abstract() = 0;
+  void operator()() noexcept;
+};
+
+static_assert( noexcept(std::__invoke(std::declval<abstract>())), "" );
+#if __cpp_lib_invoke
+static_assert( noexcept(std::invoke(std::declval<abstract>())), "" );
+#endif

Re: [PATCH] - improve sprintf buffer overflow detection (middle-end/49905)

2016-10-13 Thread Rainer Orth

Hi Martin,

>> as it happens, I'd already started bootstraps with your patch before
>> your mail arrived :-)
>
> Thanks for your help getting to the bottom of this!
>
>>
>> We're left with
>>
>> FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c (test for excess errors)
>> FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-4.c (test for excess errors)
>>
>> for 32 bit and
>>
>> FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-4.c (test for excess errors)
>>
>> for 64 bit on both i386-pc-solaris2.12 and sparc-sun-solaris2.12.
>>
>> In the 32-bit builtin-sprintf-warn-1.c case, there are many instances of
>>
>> Excess errors:
>> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:224:3:
>> warning: format '%lc' expects argument of type 'wint_t', but argument 5
>> has type 'int' [-Wformat=]
>
> I've built the sparc-sun-solaris2.12 toolchain and reproduced these
> warnings.  They are vestiges of those I saw and some of which I fixed
> before.  The problem is that %lc expects a wint_t argument which on
> this target is an alias for long, but the argument of 0 has type
> int.  The warning is coming out of the -Wformat checker which doesn't
> seem to care that int and long have the same size.  I've committed
> r240758 that should fix the remaining warnings of this kind but long
> term I think GCC should change to avoid warning in this case (Clang
> doesn't).
>
>>
>> while the second is
>>
>> Excess errors:
>> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-4.c:15:23:
>> warning: writing a terminating nul past the end of the destination
>> [-Wformat-length=]/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-4.c:30:21:
>> warning: writing format character '4' at offset 3 past the end of the
>> destination [-Wformat-length=]
>> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-4.c:46:21:
>> warning: writing format character '4' at offset 3 past the end of the
>> destination [-Wformat-length=]
>> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-4.c:61:25:
>> warning: writing a terminating nul past the end of the destination
>> [-Wformat-length=]
>> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-4.c:74:22:
>> warning: '%-s' directive writing 4 bytes into a region of size 1
>> [-Wformat-length=]
>>
>> I've no idea yet why in the first error message two different messages
>> are joined into one line.  Probably something with DejaGnu mangling the
>> output...
>
> I've reproduced this as well and it took me a while to see the
> problem.  It turns out that the target specifier I used in the
> test (*-*-*-*) happened to match my native target
> x86_64-pc-linux-gnu but not sparc-sun-solaris2.12.  Let me fix
> that in the next patch.  Hopefully with that all the remaining
> failures should clear up.
>
> Thanks again for your help and patience!

No worries: I've refreshed your patch on top of Thomas Preud'homme's for
PR testsuite/77710 and found that one more bit is needed to fix this
completely.  32-bit Solaris shows three more warnings:

/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:1355:3:
 warning: format '%lc' expects argument of type 'wint_t', but argument 6 has 
type 'int' [-Wformat=]
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:1356:3:
 warning: format '%lc' expects argument of type 'wint_t', but argument 6 has 
type 'int' [-Wformat=]
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:1357:3:
 warning: format '%lc' expects argument of type 'wint_t', but argument 6 has 
type 'int' [-Wformat=]

Fixed as follows:

# HG changeset patch
# Parent  1aaf616a61b8ea3ecff9313e059a1e85571cdde1
[testsuite] Fix 32-bit gcc.dg/tree-ssa/builtin-sprintf-warn-1.c on Solaris

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c
--- a/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c
@@ -1352,9 +1352,9 @@ void test_snprintf_chk_c_const (void)
   T (3, "%c_%c", '1', '2');  /* { dg-warning "output truncated" } */
 
   /* Wide characters.  */
-  T (0, "%lc",  0);
-  T (1, "%lc",  0);
-  T (2, "%lc",  0);
+  T (0, "%lc",  (wint_t)0);
+  T (1, "%lc",  (wint_t)0);
+  T (2, "%lc",  (wint_t)0);
 
   /* The following could result in as few as a single byte and in as many
  as MB_CUR_MAX, but since the MB_CUR_MAX value is a runtime property

With this one and your refreshed patch, all failures are gone now for
i386-pc-solaris2.12, sparc-sun-solaris2.12, and x86_64-pc-linux-gnu.
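
For reference, the pattern applied in the fix can be sketched as follows: the
argument passed for %lc is converted to wint_t explicitly, so the format
checker sees the expected type regardless of whether the target defines
wint_t as int, long or something else.  This is only an illustrative sketch;
format_wide_char is a hypothetical helper, not part of the testsuite.

```cpp
#include <cstddef>
#include <cstdio>
#include <cwchar>

// Hypothetical helper (an assumption for illustration): formats one wide
// character with %lc.  The explicit conversion to wint_t mirrors the
// (wint_t)0 casts in the patch and keeps -Wformat quiet on targets where
// wint_t is not int.
int format_wide_char(char* buf, std::size_t size, wchar_t wc)
{
  return std::snprintf(buf, size, "%lc", static_cast<std::wint_t>(wc));
}
```

The return value is the number of bytes the conversion would write, which
depends on the current locale for non-ASCII characters.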

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [PATCH][check_GNU_style.sh] More aggressively ignore dg-xxx directives

2016-10-13 Thread Kyrill Tkachov

On 12/10/16 17:49, Martin Sebor wrote:

On 10/12/2016 06:43 AM, Kyrill Tkachov wrote:

On 12/10/16 11:18, Kyrill Tkachov wrote:

On 12/10/16 10:57, Kyrill Tkachov wrote:

On 11/10/16 20:19, Jakub Jelinek wrote:

On Tue, Oct 11, 2016 at 01:11:04PM -0600, Martin Sebor wrote:

Also, the pattern that starts with "/\+\+\+" looks like it's missing
the ^ anchor.  Presumably it should be "/^\+\+\+ \/testsuite\//".

No, it will be almost never +++ /testsuite/
There needs to be .* in between "+++ " and "/testsuite/", and perhaps
it should also ignore "+++ testsuite/".
So /^\+\+\+ (.*\/)?testsuite\// ?
Also, normally (when matching $0) there won't be newlines in the text.

Jakub

Thanks.
Here is the updated patch with your suggestions.

Actually, I've encountered a problem:

  # Remove the testsuite part of the diff.  We don't care about GNU style
  # in testcases and the dg-* directives give too many false positives.
  remove_testsuite ()
  {
    awk 'BEGIN{testsuite=0} /\+\+\+ / && ! /testsuite\//{testsuite=0} \
         {if (!testsuite) print} /^\+\+\+ (.*\/)?testsuite\//{testsuite=1}'
  }

  grep $format '^+' $files \
  | remove_testsuite \
  | grep -v ':+++' \
  > $inp

The /^\+\+\+ (.*\/)?testsuite\// doesn't ever match when the ^ anchor is used.
The awk command matches fine by itself but not when fed from the
"grep $format '^+' $files" command because grep adds the line numbers and
file names.
So is it okay to omit the ^ here?

I think the AWK regex will not work correctly when the patch has
the line number prefix like "1234: " (AFAICT, this can only happen
in the second invocation of the remove_testsuite function which
also has the problem below making me wonder if your testing
exercised that mode).

Huh, you're right, but it didn't cause problems in my testing, which is weird.

I think the AWK regex needs to be changed to handle that.  It should
start with something like "^([1-9][0-9]*:)?\+\+\+"

I think it needs to be
^(.*:)?([1-9][0-9]*:)?\+\+\+
because grep -nH would add the filename as well as the line number in the first
invocation of remove_testsuite.
This revision does that.
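
To make the effect of the optional "file:" and "line:" prefixes concrete, the
filter logic can be sketched outside of awk.  The following is only an
illustration under stated assumptions: it reimplements the state machine with
std::regex (ECMAScript syntax rather than awk's ERE), and the C++ function
remove_testsuite below is a hypothetical stand-in for the shell function of
the same name.

```cpp
#include <regex>
#include <string>
#include <vector>

// Sketch of the awk filter: drop every line that follows a "+++" header
// naming a testsuite file, until the next non-testsuite "+++" header.
// The header line itself is still printed, as in the awk script (the
// later "grep -v ':+++'" stage removes it).
std::vector<std::string>
remove_testsuite(const std::vector<std::string>& lines)
{
  // Optional "file:" and "line:" prefixes, as added by grep -nH.
  static const std::regex header(R"(^(.*:)?([1-9][0-9]*:)?\+\+\+ )");
  static const std::regex ts_header(
    R"(^(.*:)?([1-9][0-9]*:)?\+\+\+ (.*/)?testsuite/)");

  std::vector<std::string> kept;
  bool in_testsuite = false;
  for (const std::string& line : lines)
    {
      if (std::regex_search(line, header)
          && line.find("testsuite/") == std::string::npos)
        in_testsuite = false;
      if (!in_testsuite)
        kept.push_back(line);
      if (std::regex_search(line, ts_header))
        in_testsuite = true;
    }
  return kept;
}
```

With the anchored pattern but without the optional prefixes, none of the
prefixed header lines would match, which is the problem described above.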

I tried to test the patch but it doesn't seem to work.  When passed
a patch as an argument it hangs.  The hunk below isn't quite right:

 # Don't reuse $inp, which may be generated using -H and thus contain a
-# file prefix.
-grep -n '^+' $f \
+# file prefix.  Re-remove the testsuite since we're not using $inp.
+remove_testsuite $f \
+| grep -n '^+' \
 | grep -v ':+++' \
 > $tmp

The remove_testsuite function ignores arguments so passing $f to it
won't do anything except hang waiting for input.  This should look
closer to this (it worked in my very limited testing):

cat $f | remove_testsuite \

Thanks for the help,
Kyrill

2016-10-13  Kyrylo Tkachov  

* check_GNU_style.sh (remove_testsuite): New function.
Use it to remove testsuite from the diff.

Martin

diff --git a/contrib/check_GNU_style.sh b/contrib/check_GNU_style.sh
index 87a276c9cf47b5e07c4407f740ce05dce1928c30..fb7494661ee8ff4d4e58ed05137bb69aab7c46a7 100755
--- a/contrib/check_GNU_style.sh
+++ b/contrib/check_GNU_style.sh
@@ -81,7 +81,17 @@ if [ $nfiles -eq 1 ]; then
 else
 format="-nH"
 fi
+
+# Remove the testsuite part of the diff.  We don't care about GNU style
+# in testcases and the dg-* directives give too many false positives.
+remove_testsuite ()
+{
+  awk 'BEGIN{testsuite=0} /^(.*:)?([1-9][0-9]*:)?\+\+\+ / && ! /testsuite\//{testsuite=0} \
+   {if (!testsuite) print} /^(.*:)?([1-9][0-9]*:)?\+\+\+ (.*\/)?testsuite\//{testsuite=1}'
+}
+
 grep $format '^+' $files \
+| remove_testsuite \
 | grep -v ':+++' \
 > $inp

@@ -160,8 +170,9 @@ col (){
 	fi

 	# Don't reuse $inp, which may be generated using -H and thus contain a
-	# file prefix.
-	grep -n '^+' $f \
+	# file prefix.  Re-remove the testsuite since we're not using $inp.
+	cat $f | remove_testsuite \
+	| grep -n '^+' \
 	| grep -v ':+++' \
 	> $tmp

@@ -174,11 +185,10 @@ col (){
 	# Expand tabs to spaces according to tab positions.
 	# Keep long lines, make short lines empty.  Print the part past 80 chars
 	# in red.
-# Don't complain about dg-xxx directives in tests.
 	cat "$tmp" \
 	| sed 's/^[0-9]*:+//' \
 	| expand \
-	| awk '$0 !~ /{[[:space:]]*dg-(error|warning|message)[[:space:]]/ { \
+	| awk '{ \
 		 if (length($0) > 80) \
 		   printf "%s\033[1;31m%s\033[0m\n", \
 			  substr($0,1,80), \

Re: [PATCH] Do not merge BBs with a different EH landing pads (PR, tree-optimization/77943)

On Thu, Oct 13, 2016 at 9:40 AM, Martin Liška  wrote:
> Hi.
>
> Following patch adds code that is already present in IPA ICF.
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>
> Ready to be installed?

Ok, though technically this is conservative and we could still merge if
the blocks associated with the landing pads are merged as well?

Thanks,
Richard.

> Martin

Re: C++ PATCH for c++/77742 (-Waligned-new and placement new)

2016-10-13 Thread Andreas Schwab

FAIL: g++.dg/cpp1z/aligned-new7.C   (test for excess errors)
Excess errors:
/daten/aranym/gcc/gcc-20161013/gcc/testsuite/g++.dg/cpp1z/aligned-new7.C:13:41: 
warning: requested alignment 8 is larger than 2 [-Wattributes]

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

Re: [PATCH] Do not merge BBs with a different EH landing pads (PR, tree-optimization/77943)

On 10/13/2016 10:46 AM, Richard Biener wrote:
> On Thu, Oct 13, 2016 at 9:40 AM, Martin Liška  wrote:
>> Hi.
>>
>> Following patch adds code that is already present in IPA ICF.
>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>>
>> Ready to be installed?
> 
> Ok, though technically this is conservative and we could still merge if
> the blocks associated with the landing pads are merged as well?

I agree; however, as find_duplicates operates BB by BB, this would require
introducing some kind of dependency tracking (e.g. BB1 can be merged with BB2
only if BBx is merged with BBy).  I can put it on my TODO list for after we
transform tail-merge to the IPA ICF infrastructure.

Does it work for you?
Martin

> 
> Thanks,
> Richard.
> 
>> Martin

Re: [PATCH GCC 2/9]Add interface reseting original copy tables in cfg.c

2016-10-13 Thread Bin.Cheng

On Wed, Sep 7, 2016 at 1:20 PM, Jeff Law  wrote:
> On 09/06/2016 12:50 PM, Bin Cheng wrote:
>>
>> Hi,
>> This simple patch adds an interface for resetting the original copy tables in cfg.c.
>> This will be used in rewriting vect_do_peeling_* functions in vectorizer so
>> that we don't need to release/allocate tables between prolog and epilog
>> peeling.
>>
>> Thanks,
>> bin
>>
>> 2016-09-01  Bin Cheng  
>>
>> * cfg.c (reset_original_copy_tables): New func.
>> * cfg.h (reset_original_copy_tables): New decl.
>>
> Needs a function comment for reset_original_copy_tables.  Should be fine
> with that change.
Hi,
Comment added as suggested.  Attached patch committed.

Thanks,
bin
>
> Jeff
diff --git a/gcc/cfg.c b/gcc/cfg.c
index cab66c6..ee2e42c 100644
--- a/gcc/cfg.c
+++ b/gcc/cfg.c
@@ -1066,6 +1066,18 @@ initialize_original_copy_tables (void)
   loop_copy = new hash_table (10);
 }
 
+/* Reset the data structures to maintain mapping between blocks and
+   its copies.  */
+
+void
+reset_original_copy_tables (void)
+{
+  gcc_assert (original_copy_bb_pool);
+  bb_original->empty ();
+  bb_copy->empty ();
+  loop_copy->empty ();
+}
+
 /* Free the data structures to maintain mapping between blocks and
its copies.  */
 void
diff --git a/gcc/cfg.h b/gcc/cfg.h
index 6c8ba7e..ad935e3 100644
--- a/gcc/cfg.h
+++ b/gcc/cfg.h
@@ -108,6 +108,7 @@ extern void scale_bbs_frequencies_int (basic_block *, int, int, int);
 extern void scale_bbs_frequencies_gcov_type (basic_block *, int, gcov_type,
 gcov_type);
 extern void initialize_original_copy_tables (void);
+extern void reset_original_copy_tables (void);
 extern void free_original_copy_tables (void);
 extern void set_bb_original (basic_block, basic_block);
 extern basic_block get_bb_original (basic_block);
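
The point of a reset interface, as opposed to a free followed by an
initialize, is that the tables stay allocated and are merely emptied between
uses.  A minimal sketch of that lifecycle is shown below; it uses
std::unordered_map as a stand-in for GCC's hash_table, and copy_tables,
initialize_tables, reset_tables and free_tables are hypothetical names, not
the functions in cfg.c.

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>

// Hypothetical stand-in for the original/copy mapping tables.
struct copy_tables
{
  std::unordered_map<int, int> bb_original;
  std::unordered_map<int, int> bb_copy;
  std::unordered_map<int, int> loop_copy;
};

static std::unique_ptr<copy_tables> tables;

// Allocate the tables; must not already be initialized.
void initialize_tables()
{
  assert(!tables);
  tables.reset(new copy_tables);
}

// Empty the tables but keep them allocated, so a second user (for
// example a later peeling step) can reuse them without reallocation.
void reset_tables()
{
  assert(tables);
  tables->bb_original.clear();
  tables->bb_copy.clear();
  tables->loop_copy.clear();
}

// Release the tables entirely.
void free_tables()
{
  tables.reset();
}
```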

Re: [patch, avr, pr71676 and pr71678] Issues with casesi expand

2016-10-13 Thread Pitchumani Sivanupandi


On Monday 26 September 2016 08:19 PM, Georg-Johann Lay wrote:

On 26.09.2016 15:19, Pitchumani Sivanupandi wrote:

Attached patch for PR71676 and PR71678.

PR71676 is for the AVR target, which generates wrong code when the switch
case index is wider than 16 bits.

Switch case indexes wider than SImode are checked for out of range before
the 'casesi' expand.  The RTL expand of casesi gets the index as SImode, but
the index is compared in HImode and the upper 16 bits are ignored.

The attached patch changes the expansion for casesi to make the index
comparison in SImode and generates code accordingly.

PR71678 is ICE because below pattern in 'casesi' is not recognized.
(set (reg:HI 47)
 (minus:HI (subreg:HI (subreg:SI (reg:DI 44) 0) 0)
   (reg:HI 45)))

Fix of PR71676 avoids the above pattern as it changes the comparison
to SImode.


But this means that all comparisons are now performed in SImode which 
is a great performance loss for most programs which will switch on 
16-bit values.


IMO we need a less intrusive (w.r.t. performance) approach.


Yes.

I tried to split 'casesi' into several variants based on the case values so
that the compare is done in a less expensive mode (i.e. QI or HI).  In a few
cases this is not possible without an SImode subtract/compare.

The casesi pattern will have its index in SImode, so the out-of-range checks
will be expensive, as the most common uses (on AVR) of case values will be in
QI/HI mode.

e.g.
  if case values in QI range
    if upper three bytes index is set
      goto out_of_range

    offset = index - lower_bound (QImode)
    if offset > case_range       (QImode)
      goto out_of_range
    goto jump_table + offset

  else if case values in HI range
    if index[2,3] is set
      goto out_of_range

    offset = index - lower_bound (HImode)
    if offset > case_range       (HImode)
      goto out_of_range
    goto jump_table + offset
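
The QImode branch of the scheme above can be sketched in portable code as
follows.  This is only an illustration, assuming non-negative case values that
fit in 8 bits; casesi_offset_qi is a hypothetical name and the function models
the checks, not the RTL the backend would emit.

```cpp
#include <cstdint>

// Returns true and stores the jump-table offset when INDEX lies in
// [lower_bound, lower_bound + case_range]; returns false for the
// out_of_range path.
bool casesi_offset_qi(std::uint32_t index, std::uint8_t lower_bound,
                      std::uint8_t case_range, std::uint8_t& offset)
{
  // Any bit set in the upper three bytes means the index cannot match.
  if ((index >> 8) != 0)
    return false;

  // 8-bit subtract; an index below lower_bound wraps to a large value
  // and is rejected by the single unsigned compare below.
  std::uint8_t off = static_cast<std::uint8_t>(
    static_cast<std::uint8_t>(index) - lower_bound);
  if (off > case_range)
    return false;

  offset = off;
  return true;
}
```

The HImode branch has the same shape with 16-bit types and a check of the
upper two bytes only.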

This modification will not work for negative index values, because the
code to check the upper bytes of the index would be more expensive than
the SImode subtract/compare.

So I'm trying to update the fix to use an SImode subtract/compare if the
case values include negative integers.  For the others, I will try to
optimize as described above.  Is that approach OK?


Alternatively, we can have flags to generate shorter code for 'casesi'
using an HImode subtract/compare, but then correctness is not guaranteed
(PR71676).

Regards,
Pitchumani

Re: Set nonnull attribute to ptr_info_def based on VRP

On Thu, Oct 13, 2016 at 6:49 AM, kugan
 wrote:
> Hi Richard,
>
>>
>> what does this try to do?  Preserve info VRP computed across PTA?
>>
>> I think we didn't yet sort out the nonlocal/escaped vs. null handling
>> properly
>> (or how PTA should handle get_ptr_nonnull).  The way you are using it
>> asks for pt.null to be orthogonal to nonlocal/escaped and thus having
>> nonlocal or escaped would also require setting ptr.null in PTA.  It then
>> would be also more canonical to set it for pt.anything as well.  Which
>> means conservatively handling it would be equivalent to flipping its
>> semantic and changing its name to pt.nonnull.
>>
>> That said, you seem to be simply "reserving" the bit for VRP, keeping it
>> conservatively true when "not computed".  I guess I'm fine with this for
>> now
>> but it should be documented in the header file that way.
>>
>
> Thanks for the comments.
>
> To summarize, currently I am not relying on PTA analysis at all. Just saving
> null from VRP (or rather nonnull) and preserving it across PTA. Primary
> intention is to pass it for PARM_DECL SSA names (from ipa-vrp).
>
> In this case, using  pt.anything/nonlocal/escaped will only make the result
> more pessimistic.
>
> Ideally, we should improve pt.null within PTA but for now as you said, I
> will document it.
>
> When we start using pt.null from PTA analysis, we would also have to take
> into account pt.anything/nonlocal/escaped.
>
> Does that make sense?

Yes.

Thanks,
Richard.

> Thanks,
> Kugan
>

Re: [PATCH] Fix -Wimplicit-fallthrough ICE (PR c/77946)

2016-10-13 Thread Marek Polacek

On Thu, Oct 13, 2016 at 01:25:22AM +0200, Jakub Jelinek wrote:
> Hi!
> 
> Seems 2 functions in varasm.c just use TREE_PUBLIC on LABEL_DECLs together
> with other kinds of decls, but as TREE_PUBLIC on LABEL_DECLs means now
> something different, it breaks badly.
> While I could change those 2 functions in varasm.c, I'm afraid other
> functions might be doing something similar, so I think TREE_PRIVATE which is
> used far less often is a better choice for the flag bit here.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
 
Given this is a part of fallthru machinery: LGTM, but can't really approve.

Marek

Re: [PATCH] Tweaks to print_rtx_function


On 10/12/2016 11:04 PM, David Malcolm wrote:


This patch implements:
* the renumbering of non-virtual pseudos, using
  LAST_VIRTUAL_REGISTER + 1 as a base.
* omitting the edge "(flags)" directive if there aren't any

Bootstrap & regrtest in progress.

OK for trunk if they pass?


I tend to think probably yes. Let's say ok if I don't object by tomorrow 
:) I'm still wondering whether we want to use some sort of prefix like 
$p or %p which is distinct from any hard register name for clarity.



Bernd

Re: fix -fmax-errors & notes

2016-10-13 Thread Nathan Sidwell


On 10/11/16 16:07, David Malcolm wrote:


This logic is running when the next diagnostic is about to be emitted.
But what if the user has selected -Wfatal-errors and there's a single
error and no further diagnostics?  Could this change the observable
behavior?  (I'm trying to think of a case here, but failing).



This version only moves the -fmax-errors handling.  I've addressed your testcase 
comments too.  WDYT?


nathan

2016-10-13  Nathan Sidwell  

	* diagnostic.c (diagnostic_action_after_output): Remove max error
	handling here 
	(diagnostic_report_diagnostic): ... do it here instead.

	testsuite/
	* c-c++-common/fmax-errors.c: Make sure note is emitted.

Index: diagnostic.c
===
--- diagnostic.c	(revision 241027)
+++ diagnostic.c	(working copy)
@@ -470,18 +470,8 @@ diagnostic_action_after_output (diagnost
 	  diagnostic_finish (context);
 	  exit (FATAL_EXIT_CODE);
 	}
-  if (context->max_errors != 0
-	  && ((unsigned) (diagnostic_kind_count (context, DK_ERROR)
-			  + diagnostic_kind_count (context, DK_SORRY)
-			  + diagnostic_kind_count (context, DK_WERROR))
-	  >= context->max_errors))
-	{
-	  fnotice (stderr,
-		   "compilation terminated due to -fmax-errors=%u.\n",
-		   context->max_errors);
-	  diagnostic_finish (context);
-	  exit (FATAL_EXIT_CODE);
-	}
+  /* -fmax-error handling is just before the next diagnostic is
+	 emitted.  */
   break;
 
 case DK_ICE:
@@ -834,9 +824,7 @@ diagnostic_report_diagnostic (diagnostic
  -Wno-error=*.  */
   if (context->warning_as_error_requested
   && diagnostic->kind == DK_WARNING)
-{
-  diagnostic->kind = DK_ERROR;
-}
+diagnostic->kind = DK_ERROR;
 
   if (diagnostic->option_index
   && diagnostic->option_index != permissive_error_option (context))
@@ -892,6 +880,25 @@ diagnostic_report_diagnostic (diagnostic
 	return false;
 }
 
+  if (diagnostic->kind != DK_NOTE && context->max_errors)
+{
+  /* Check, before emitting the diagnostic, whether we would
+	 exceed the limit.  This way we will emit notes relevant to
+	 the final emitted error.  */
+  int count = (diagnostic_kind_count (context, DK_ERROR)
+		   + diagnostic_kind_count (context, DK_SORRY)
+		   + diagnostic_kind_count (context, DK_WERROR));
+
+  if ((unsigned) count >= context->max_errors)
+	{
+	  fnotice (stderr,
+		   "compilation terminated due to -fmax-errors=%u.\n",
+		   context->max_errors);
+	  diagnostic_finish (context);
+	  exit (FATAL_EXIT_CODE);
+	}
+}
+
   context->lock++;
 
   if (diagnostic->kind == DK_ICE || diagnostic->kind == DK_ICE_NOBT)
Index: testsuite/c-c++-common/fmax-errors.c
===
--- testsuite/c-c++-common/fmax-errors.c	(revision 241027)
+++ testsuite/c-c++-common/fmax-errors.c	(working copy)
@@ -1,11 +1,21 @@
 /* PR c/44782 */
 /* { dg-do compile } */
-/* { dg-options "-fmax-errors=3" } */
+/* { dg-options "-fmax-errors=3 -Wall" } */
 
 void foo (unsigned int i, unsigned int j)
 {
   (i) ();			/* { dg-error "" } */
   (j) ();			/* { dg-error "" } */
-  (i+j) ();			/* { dg-error "" } */
+
+  i + j; /* { dg-warning "" }  */
+
+  (k) ();			/* { dg-error "" } */
+  /* Make sure we see the notes related to the final error we emit.  */
+  /* { dg-message "identifier" "" { target c } 12 } */
+
+  /* Warnings after the final error should not appear.  */
+  i + j; /* no warning.  */
+
   (i*j) ();			/* no error here due to -fmax-errors */
+
 } /* { dg-prune-output "compilation terminated" } */
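
The behavioural change can be summarised with a small model: the limit is
checked just before a non-note diagnostic is emitted, so notes belonging to
the last permitted error still come out.  The sketch below is only an
illustration; the types and the report function are hypothetical, and where
the real code calls exit the model just sets a flag.

```cpp
#include <string>
#include <vector>

enum class kind { note, warning, error };

// Hypothetical miniature diagnostic context.
struct context
{
  explicit context(unsigned max) : max_errors(max) {}
  unsigned max_errors;                // 0 means no limit
  unsigned error_count = 0;
  bool terminated = false;
  std::vector<std::string> emitted;
};

// Returns true if the diagnostic was emitted.
bool report(context& ctx, kind k, const std::string& msg)
{
  if (ctx.terminated)
    return false;

  // Check before emitting, and never on a note, so that notes attached
  // to the final permitted error are still emitted.
  if (k != kind::note && ctx.max_errors
      && ctx.error_count >= ctx.max_errors)
    {
      ctx.terminated = true;          // the real code exits here
      return false;
    }

  if (k == kind::error)
    ++ctx.error_count;
  ctx.emitted.push_back(msg);
  return true;
}
```

With a limit of 2, the sequence error, note, error, note, warning emits the
first four diagnostics and stops at the warning.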

Re: [PATCH GCC 9/9]Prove no-overflow in computation of LOOP_VINFO_NITERS and improve code generation

2016-10-13 Thread Bin.Cheng

On Mon, Sep 12, 2016 at 8:58 PM, Jeff Law  wrote:
> On 09/06/2016 12:54 PM, Bin Cheng wrote:
>>
>> Hi,
>> LOOP_VINFO_NITERS is computed as LOOP_VINFO_NITERSM1 + 1, which could
>> overflow in loop niters' type.  Vectorizer needs to generate more code
>> computing vectorized niters if overflow does happen.  However, for common
>> loops, there is no overflow actually, this patch tries to prove the
>> no-overflow information and use that to improve code generation.  At the
>> moment, no-overflow information comes either from loop niter analysis, or
>> the truth that we know loop is peeled for non-zero iterations in prologue
>> peeling.  For the latter case, it doesn't matter if the original
>> LOOP_VINFO_NITERS overflows or not, because computation LOOP_VINFO_NITERS -
>> LOOP_VINFO_PEELING_FOR_ALIGNMENT cancels the overflow by underflow.
>>
>> Thanks,
>> bin
>>
>> 2016-09-01  Bin Cheng  
>>
>> * tree-vect-loop.c (loop_niters_no_overflow): New func.
>> (vect_transform_loop): Call loop_niters_no_overflow.  Pass the
>> no-overflow information to vect_do_peeling_for_loop_bound and
>> vect_gen_vector_loop_niters.
>>
> OK when prereqs are all approved.
Hi,
I revised this patch to use widest_int comparison for trees, rather
than int.  The attached new patch is committed.  I also committed all
patches in the peel refactoring patch set; they are posted at:
https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00326.html
https://gcc.gnu.org/ml/gcc-patches/2016-10/msg01012.html
https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00328.html
https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00329.html
https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00330.html
https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00331.html
https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00332.html
https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00333.html

The patch set was bootstrapped and tested again on x86_64 and AArch64.  No
regressions found.
I will keep an eye on possible fallout.

Thanks,
bin

>
> jeff
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 0470445..9cca9b7 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -6620,6 +6620,39 @@ vect_loop_kill_debug_uses (struct loop *loop, gimple *stmt)
 }
 }
 
+/* Given loop represented by LOOP_VINFO, return true if computation of
+   LOOP_VINFO_NITERS (= LOOP_VINFO_NITERSM1 + 1) doesn't overflow, false
+   otherwise.  */
+
+static bool
+loop_niters_no_overflow (loop_vec_info loop_vinfo)
+{
+  /* Constant case.  */
+  if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
+{
+  tree cst_niters = LOOP_VINFO_NITERS (loop_vinfo);
+  tree cst_nitersm1 = LOOP_VINFO_NITERSM1 (loop_vinfo);
+
+  gcc_assert (TREE_CODE (cst_niters) == INTEGER_CST);
+  gcc_assert (TREE_CODE (cst_nitersm1) == INTEGER_CST);
+  if (wi::to_widest (cst_nitersm1) < wi::to_widest (cst_niters))
+   return true;
+}
+
+  widest_int max;
+  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
+  /* Check the upper bound of loop niters.  */
+  if (get_max_loop_iterations (loop, &max))
+{
+  tree type = TREE_TYPE (LOOP_VINFO_NITERS (loop_vinfo));
+  signop sgn = TYPE_SIGN (type);
+  widest_int type_max = widest_int::from (wi::max_value (type), sgn);
+  if (max < type_max)
+   return true;
+}
+  return false;
+}
+
 /* Function vect_transform_loop.
 
The analysis phase has determined that the loop is vectorizable.
@@ -6707,8 +6740,9 @@ vect_transform_loop (loop_vec_info loop_vinfo)
   tree niters = vect_build_loop_niters (loop_vinfo);
   LOOP_VINFO_NITERS_UNCHANGED (loop_vinfo) = niters;
   tree nitersm1 = unshare_expr (LOOP_VINFO_NITERSM1 (loop_vinfo));
+  bool niters_no_overflow = loop_niters_no_overflow (loop_vinfo);
   vect_do_peeling (loop_vinfo, niters, nitersm1, &niters_vector, th,
-  check_profitability, false);
+  check_profitability, niters_no_overflow);
   if (niters_vector == NULL_TREE)
 {
   if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
@@ -6717,7 +6751,7 @@ vect_transform_loop (loop_vec_info loop_vinfo)
   LOOP_VINFO_INT_NITERS (loop_vinfo) / vf);
   else
 vect_gen_vector_loop_niters (loop_vinfo, niters, &niters_vector,
-false);
+niters_no_overflow);
 }
 
   /* 1) Make sure the loop header has exactly two entries
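
The two checks in loop_niters_no_overflow can be modelled with fixed-width
integers.  The sketch below is an illustration only: it assumes a 32-bit
unsigned niters type, uses 64-bit arithmetic where the patch uses widest_int,
and both function names are hypothetical.

```cpp
#include <cstdint>
#include <limits>

// Constant case: niters is computed as nitersm1 + 1 in the 32-bit
// type.  If the addition wrapped, niters is 0 and the comparison in
// the wider type fails.
bool constant_niters_no_overflow(std::uint32_t nitersm1)
{
  std::uint32_t niters = nitersm1 + 1u;
  return static_cast<std::uint64_t>(nitersm1)
         < static_cast<std::uint64_t>(niters);
}

// Bound case: if the known upper bound on the iteration count is
// strictly below the maximum of the niters type, adding one cannot
// overflow.
bool bound_proves_no_overflow(std::uint64_t max_iterations)
{
  return max_iterations < std::numeric_limits<std::uint32_t>::max();
}
```

Either check succeeding is enough to conclude that the increment is safe; if
neither succeeds, the conservative answer is that overflow is possible.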

Re: [RFC] Speed-up -fprofile-update=atomic

On Wed, Oct 12, 2016 at 3:52 PM, Martin Liška  wrote:
> On 10/04/2016 11:45 AM, Richard Biener wrote:
>> On Thu, Sep 15, 2016 at 12:00 PM, Martin Liška  wrote:
>>> On 09/07/2016 02:09 PM, Richard Biener wrote:
 On Wed, Sep 7, 2016 at 1:37 PM, Martin Liška  wrote:
> On 08/18/2016 06:06 PM, Richard Biener wrote:
>> On August 18, 2016 5:54:49 PM GMT+02:00, Jakub Jelinek 
>>  wrote:
>>> On Thu, Aug 18, 2016 at 08:51:31AM -0700, Andi Kleen wrote:
> I'd prefer to make updates atomic in multi-threaded applications.
> The best proxy we have for that is -pthread.
>
> Is it slower, most definitely, but odds are we're giving folks
> garbage data otherwise, which in many ways is even worse.

 It will likely be catastrophically slower in some cases.

 Catastrophically as in too slow to be usable.

 An atomic instruction is a lot more expensive than a single
>>> increment. Also
 they sometimes are really slow depending on the state of the machine.
>>>
>>> Can't we just have thread-local copies of all the counters (perhaps
>>> using
>>> __thread pointer as base) and just atomically merge at thread
>>> termination?
>>
>> I suggested that as well but of course it'll have its own class of 
>> issues (short lived threads, so we need to somehow re-use counters from 
>> terminated threads, large number of threads and thus using too much 
>> memory for the counters)
>>
>> Richard.
>
> Hello.
>
> I've added this approach to my TODO list; let's see whether it is
> doable in a reasonable amount of time.
>
> I've just finished some measurements to illustrate slow-down of 
> -fprofile-update=atomic approach.
> All numbers are: no profile, -fprofile-generate, -fprofile-generate 
> -fprofile-update=atomic
> c-ray benchmark (utilizing 8 threads, -O3): 1.7, 15.5, 38.1s
> unrar (utilizing 8 threads, -O3): 3.6, 11.6, 38s
> tramp3d (1 thread, -O3): 18.0, 46.6, 168s
>
> So the slow-down is roughly 300% compared to -fprofile-generate. I'm not 
> having much experience with default option
> selection, but these numbers can probably help.
>
> Thoughts?

 Look at the generated code for an instrumented simple loop and see that for
 the non-atomic updates we happily apply store-motion to the counter update
 and thus we only get one counter update per loop exit rather than one per
 loop iteration.  Now see what happens for the atomic case (I suspect you
 get one per iteration).

 I'll bet this accounts for most of the slowdown.

 Back in time ICC which had atomic counter updates (but using function
 calls - ugh!) had a > 1000% overhead with FDO for tramp3d (they also
 didn't have early inlining -- removing abstraction helps reducing the 
 number
 of counters significantly).

 Richard.
>>>
>>> Hi.
>>>
>>> During Cauldron I discussed with Richi approaches how to speed-up ARCS
>>> profile counter updates. My first attempt is to utilize TLS storage, where
>>> every function is accumulating arcs counters. These are eventually added
>>> (using atomic operations) to the global one at the very end of a function.
>>> Currently I rely on target support for TLS. It is questionable whether
>>> -fprofile-update=atomic should have such a requirement, or whether a new
>>> option value like -fprofile-update=atomic-tls should be added.
>>>
>>> Running the patch on tramp3d, compared to previous numbers, it takes 88s to 
>>> finish.
>>> Time shrinks to 50%, compared to the current implementation.
>>>
>>> Thoughts?
>>
>> Hmm, I thought I suggested that you can simply use automatic storage
>> (which effectively
>> is TLS...) for regions that are not forked or abnormally left (which
>> means SESE regions
>> that have no calls that eventually terminate or throw externally).
>>
>> So why did you end up with TLS?
>
> Hi.
>
> Usage of TLS does not make sense, stupid mistake ;)
>
> By using SESE regions, do you mean the infrastructure that is utilized
> by Graphite machinery?

No, just as "single-entry single-exit region" which means placing of
initializations of the internal counters to zero and the updates of the
actual counters is "obvious".

Note that this "optimization" isn't one if the SESE region does not contain
cycle(s).  Unless there is a way to do an atomic update of a bunch of
counters faster than doing them separately.  This optimization will also
increase register pressure (or force the internal counters to the stack).
Thus selecting which counters to "optimize" and which ones to leave in place
might be necessary.

Richard.
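
The region-local counter idea above can be sketched in plain C. This is only an illustration under stated assumptions: arc_counter and the three functions are hypothetical names, not what GCC's instrumentation emits, and the __atomic builtins are used as GCC and Clang provide them. The first function updates the shared counter atomically on every iteration; the second accumulates in automatic storage and merges once at the single exit of the region.

```c
#include <stdint.h>

/* Hypothetical shared arc counter, standing in for a gcov counter.  */
static int64_t arc_counter;

/* One atomic read-modify-write per loop iteration.  */
void
count_per_iteration (int n)
{
  for (int i = 0; i < n; i++)
    __atomic_fetch_add (&arc_counter, 1, __ATOMIC_RELAXED);
}

/* Region-local variant: the loop is a single-entry single-exit region
   with no calls, so a local counter starts at zero on entry and is
   merged with one atomic add on exit.  */
void
count_per_region (int n)
{
  int64_t local = 0;
  for (int i = 0; i < n; i++)
    local++;
  __atomic_fetch_add (&arc_counter, local, __ATOMIC_RELAXED);
}

/* Read the shared counter.  */
int64_t
read_counter (void)
{
  return __atomic_load_n (&arc_counter, __ATOMIC_RELAXED);
}
```

If the region contained no cycle, the second form would still execute exactly one atomic add, so nothing would be saved; that is the caveat raised above.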

> Thanks,
> Martin
>
>>
>> Richard.
>>
>>> Martin
>>>

> Martin
>
>>
>>>  Jakub
>>
>>
>
>>>
>

Re: [PATCH] Fix -Wimplicit-fallthrough ICE (PR c/77946)


On 10/13/2016 01:25 AM, Jakub Jelinek wrote:

Seems 2 functions in varasm.c just use TREE_PUBLIC on LABEL_DECLs together
with other kinds of decls, but as TREE_PUBLIC on LABEL_DECLs means now
something different, it breaks badly.


Which functions are these?


PR c/77946
* tree.h (FALLTHROUGH_LABEL_P): Use private_flag instead of
public_flag.
* varasm.c (default_binds_local_p_3): Formatting fix.

* c-c++-common/Wimplicit-fallthrough-34.c: New test.


Ok. Let's hope this one works.


Bernd

Re: [patch] Fix GC issue triggered by arithmetic overflow checking

2016-10-13 Thread Eric Botcazou

> Yes.  But that's not the only source for DECL_UID differences.  Btw,
> I see lots of FOR_EACH_HASH_TABLE_ELEMENT in var-tracking.c
> but they don't look like their outcome is supposed to be dependent on
> element ordering.

This leads to NOTE_INSN_VAR_LOCATION notes emitted in a different order, which 
are then interpreted by dwarf2out_var_location.  In particular:

(note 6350 6349 6351 (var_location temp (nil)) NOTE_INSN_VAR_LOCATION)
(note 6351 6350 6352 (var_location temp$low (mem/c:DI (plus:SI (reg/f:SI 30 
%fp)
(const_int -112 [0xff90])) [10 MEM[(struct cpp_num 
*) + 8B]+0 S8 A64])) NOTE_INSN_VAR_LOCATION)
(note 6352 6351 6353 (var_location temp$8 (nil)) NOTE_INSN_VAR_LOCATION)
[...]
(code_label 2091 6355 2092 79 912 "" [1 uses])
(note 2092 2091 5271 79 [bb 79] NOTE_INSN_BASIC_BLOCK)

is interpreted differently from:

(note 6350 6349 6351 (var_location temp (nil)) NOTE_INSN_VAR_LOCATION)
(note 6351 6350 6352 (var_location temp$8 (nil)) NOTE_INSN_VAR_LOCATION)
(note 6352 6351 6353 (var_location temp$low (mem/c:DI (plus:SI (reg/f:SI 30 
%fp)
(const_int -112 [0xff90])) [10 MEM[(struct cpp_num 
*) + 8B]+0 S8 A64])) NOTE_INSN_VAR_LOCATION)
[...]
(note 2092 2091 5271 79 [bb 79] NOTE_INSN_BASIC_BLOCK)

@@ -32608,6 +32608,17 @@
.uleb128 0x8
.byte   0x93! DW_OP_piece
.uleb128 0x8
+   .uaword .LLVL592-.LLtext0   ! Location list begin address 
(*.LLLST153)
+   .uaword .LLVL597-.LLtext0   ! Location list end address 
(*.LLLST153)
+   .uahalf 0x9 ! Location expression size
+   .byte   0x93! DW_OP_piece
+   .uleb128 0x8
+   .byte   0x8e! DW_OP_breg30
+   .sleb128 -112
+   .byte   0x93! DW_OP_piece
+   .uleb128 0x8
+   .byte   0x93! DW_OP_piece
+   .uleb128 0x8
.uaword .LLVL695-.LLtext0   ! Location list begin address 
(*.LLLST153)
.uaword .LLVL696-.LLtext0   ! Location list end address 
(*.LLLST153)
.uahalf 0xe ! Location expression size

probably because the non-null location comes last in the second case.

-- 
Eric Botcazou
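
The "Location expression size" of 0x9 in the added entry can be cross-checked by encoding its operands by hand. The following is a minimal sketch; the helper names are hypothetical and are not GCC functions. It assumes each DW_OP opcode takes one byte and that the operands are LEB128-encoded as the .uleb128 and .sleb128 directives indicate.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode VALUE as SLEB128 into OUT (room for 10 bytes); return the length.  */
size_t
sleb128_encode (int64_t value, uint8_t *out)
{
  size_t n = 0;
  for (;;)
    {
      uint8_t low7 = (uint8_t) (value & 0x7f);
      /* Exact division by 128 acts as an arithmetic shift by 7 bits.  */
      value = (value - low7) / 128;
      /* Stop once the remaining bits are pure sign extension.  */
      int done = (value == 0 && !(low7 & 0x40))
                 || (value == -1 && (low7 & 0x40));
      out[n++] = done ? low7 : (uint8_t) (low7 | 0x80);
      if (done)
        return n;
    }
}

/* Encode VALUE as ULEB128 into OUT (room for 10 bytes); return the length.  */
size_t
uleb128_encode (uint64_t value, uint8_t *out)
{
  size_t n = 0;
  do
    {
      uint8_t low7 = (uint8_t) (value & 0x7f);
      value >>= 7;
      out[n++] = value ? (uint8_t) (low7 | 0x80) : low7;
    }
  while (value);
  return n;
}

/* Byte size of: DW_OP_piece 8; DW_OP_breg30 -112; DW_OP_piece 8;
   DW_OP_piece 8.  Each opcode contributes one byte.  */
size_t
added_expr_size (void)
{
  uint8_t buf[10];
  size_t piece = 1 + uleb128_encode (8, buf);
  size_t breg = 1 + sleb128_encode (-112, buf);
  return piece + breg + piece + piece;
}
```

With these helpers -112 encodes as the two bytes 0x90 0x7f, and added_expr_size () returns 9, which agrees with the .uahalf 0x9 in the diff.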

Re: [PATCH] Fix -Wimplicit-fallthrough ICE (PR c/77946)


On 10/13/2016 12:20 PM, Jakub Jelinek wrote:


both relied on TREE_PUBLIC actually being false for LABEL_DECLs, because
otherwise they have code later on that can't handle LABEL_DECLs (plus callers
that do not expect that LABEL_DECLs might not bind locally or might not bind
to the current def).


Ok, thanks. Guess I'll be testing the following:

@@ -6867,6 +6877,7 @@ default_binds_local_p_3 (const_tree exp,
   /* Static variables are always local.  */
   if (! TREE_PUBLIC (exp))
 return true;
+  gcc_assert (TREE_CODE (exp) != LABEL_DECL);

   /* With resolution file in hand, take look into resolutions.
  We can't just return true for resolved_locally symbols,
@@ -6978,6 +6989,7 @@ decl_binds_to_current_def_p (const_tree
 return false;
   if (!TREE_PUBLIC (decl))
 return true;
+  gcc_assert (TREE_CODE (decl) != LABEL_DECL);

   /* When resolution is available, just use it.  */
   if (symtab_node *node = symtab_node::get (decl))


Bernd

Re: [PATCH v4] PR48344: Fix unrecognizable insn error with -fstack-limit-register=r2


On 09/30/2016 10:02 PM, Andreas Schwab wrote:

On Feb 11 2016, Kelvin Nilsen  wrote:


* opts-global.c (handle_common_deferred_options): Introduce and
initialize two global variables to remember command-line options
specifying a stack-limiting register.
* opts.h: Add extern declarations of the two new global variables.
* emit-rtl.c (init_emit_once): Initialize the stack_limit_rtx
variable based on the values of the two new global variables.


That breaks gcc.target/m68k/stack-limit-1.c:

/daten/aranym/gcc/test/gcc/testsuite/gcc.target/m68k/stack-limit-1.c: In 
function 'dummy':
/daten/aranym/gcc/test/gcc/testsuite/gcc.target/m68k/stack-limit-1.c:6:1: 
error: unrecognizable insn:
(insn 10 9 11 (trap_if (ltu (cc0)
(const_int 0 [0]))
(const_int 1 [0x1])) 
/daten/aranym/gcc/test/gcc/testsuite/gcc.target/m68k/stack-limit-1.c:6 -1
 (nil))
/daten/aranym/gcc/test/gcc/testsuite/gcc.target/m68k/stack-limit-1.c:6:1: 
internal compiler error: in extract_insn, at recog.c:2287


Please file a PR if this isn't fixed yet.


Bernd

Re: [patch] Fix GC issue triggered by arithmetic overflow checking

2016-10-13 Thread Eric Botcazou

> I guess it depends on whether temp$8 and temp$low overlap or not.

I don't think so, the code is identical so that would be a more serious bug:

struct cpp_num
{
  cpp_num_part high;
  cpp_num_part low;
  bool unsignedp;
  bool overflow;
};

I think it's just dwarf2out_var_location being sensitive to the ordering.

-- 
Eric Botcazou
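
For reference, the layout of the struct can be checked with offsetof. This is a sketch under an assumption: cpp_num_part is modelled here as a 64-bit unsigned integer, which matches the "+ 8B" offset and S8 size in the MEM expression of the RTL dump but is not stated in this thread. The two helper functions are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: a 64-bit part, as the 8-byte MEM in the dump suggests.  */
typedef uint64_t cpp_num_part;

struct cpp_num
{
  cpp_num_part high;
  cpp_num_part low;
  bool unsignedp;
  bool overflow;
};

/* Byte offset of the 'low' field within struct cpp_num.  */
size_t
low_offset (void)
{
  return offsetof (struct cpp_num, low);
}

/* Size in bytes of the 'low' field.  */
size_t
low_size (void)
{
  return sizeof (((struct cpp_num *) 0)->low);
}
```

Under that assumption 'low' occupies bytes 8 to 15 of the object, the same bytes the MEM for temp$low refers to.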

Re: [PATCH] Do not merge BBs with a different EH landing pads (PR, tree-optimization/77943)

On Thu, Oct 13, 2016 at 11:15 AM, Martin Liška  wrote:
> On 10/13/2016 10:46 AM, Richard Biener wrote:
>> On Thu, Oct 13, 2016 at 9:40 AM, Martin Liška  wrote:
>>> Hi.
>>>
>>> Following patch adds code that is already present in IPA ICF.
>>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>>>
>>> Ready to be installed?
>>
>> Ok, though technically this is conservative and we could still merge if
>> the blocks associated with the landing pads are merged as well?
>
> Agree with you, however as find_duplicates operates BB by BB, it would require
> introduction of some kind of dependencies (e.g. you can merge BB1 with BB2, if
> BBx would be merged with BBy). I can write it on my TODO list after we'll
> transform tail-merge to IPA ICF infrastructure.
>
> Does it work for you?

Yes, as said, the patch is ok as-is.

Richard.

> Martin
>
>>
>> Thanks,
>> Richard.
>>
>>> Martin
>

Re: [PATCH] (v2) Add a "compact" mode to print_rtx_function


On 10/12/2016 10:37 PM, David Malcolm wrote:

It didn't pass, due to this change:

 (print_rtx_operand_code_i): When printing source locations, wrap
 xloc.file in quotes. [...snip...]

[...]

The following is a revised version of the patch which updates this test case.


Also ok. This reminds me, wrapping the filename in quotes was a side 
issue - what I was really hoping for was to have testcases without this 
visual clutter unless they wanted to explicitly test functionality 
related to it.



Bernd

Re: [ping * 2] remove optab functions for [us]divmod_optab in optabs.def


On 10/06/2016 07:43 AM, Prathamesh Kulkarni wrote:

Pinging patch: https://gcc.gnu.org/ml/gcc-patches/2016-09/msg01038.html


If I understand correctly this is a latent issue where nonexistent 
libfunc names are stored (but not currently used). If that's correct, 
then OK.



Bernd

Re: Compile-time improvement for if conversion.

On Wed, Oct 12, 2016 at 3:37 PM, Yuri Rumyantsev  wrote:
> Richard,
>
> Here is the updated patch. I avoided creating new entry/exit blocks
> and instead added a check for the border cases: blocks that are
> outside the region are not considered either.
>
> Any comments will be appreciated.

I mostly like it now.  Can you split out all the common parts from the
dom_info constructor
to a helper method please and just compute m_n_basic_blocks and a max_index and
do all the memory allocation in that common function?

@@ -276,9 +334,10 @@ dom_info::calc_dfs_tree_nonrec (basic_block bb)
  bn = e->src;

  /* If the next node BN is either already visited or a border
-block the current edge is useless, and simply overwritten
-with the next edge out of the current node.  */
- if (bn == m_end_block || m_dfs_order[bn->index])
+block or out of region the current edge is useless, and simply
+overwritten with the next edge out of the current node.  */
+ if (bn == m_end_block || bn->dom[d_i] == NULL

clever idea ;)  Just thinking if this means we support single entry,
multiple exit
regions for CDI_DOMINATORs and multiple entry, single exit regions for
CDI_POST_DOMINATORs.  I think so.  Please update the function comments
accordingly.


+  m_dfsnum = 1;
+  m_nodes = 0;
+  m_fake_exit_edge = NULL; /* Assume that region is SCC.  */

you mean SESE rather than SCC?

@@ -511,7 +573,7 @@ dom_info::calc_idoms ()
   : ei_start (bb->preds);
   edge_iterator einext;

-  if (m_reverse)
+  if (m_reverse && !in_region)
{
  /* If this block has a fake edge to exit, process that first.  */
  if (bitmap_bit_p (m_fake_exit_edge, bb->index))

I think it's better to simply change the if (m_reverse) test to a if
(m_fake_exit_edge) test.

I noticed you do not initialize n_bbs_in_dom_tree (just set it to
zero), it's not really used
anywhere (but in an assert).

free_dominance_info_for_region needs a function level comment (per
coding guidelines).

Please make free_dominance_info_for_region take a struct function * pointer.

 @@ -1318,11 +1350,13 @@ if_convertible_loop_p_1 (struct loop *loop,
vec *refs)
 {
   unsigned int i;
   basic_block exit_bb = NULL;
+  vec region;

   if (find_data_references_in_loop (loop, refs) == chrec_dont_know)
 return false;

-  calculate_dominance_info (CDI_DOMINATORS);
+  if (dom_info_state (CDI_DOMINATORS) != DOM_OK)
+calculate_dominance_info (CDI_DOMINATORS);

This is a premature (unwanted) change.

Other than that the only O(function-size) part left is the zeroing in

+  /* Determine max basic block index in region.  */
+  int max_index = region[0]->index;
+  for (size_t i = 1; i < num; i++)
+if (region[i]->index > max_index)
+  max_index = region[i]->index;
+  max_index += 1;
+  m_dfs_order = new_zero_array  (max_index + 1);
+  m_dfs_last = &m_dfs_order[max_index];

I suppose we can try addressing this as followup.

Thanks a lot for doing this.
Richard.
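
The region test discussed above (a block belongs to the region if and only if its per-block dominator slot is non-NULL) can be sketched outside of GCC as follows. Everything here is hypothetical: struct block, its fields and dfs_in_region are illustrative stand-ins, and the walk is recursive for brevity, whereas calc_dfs_tree_nonrec is iterative.

```c
#include <stddef.h>

#define MAX_SUCCS 2

/* Hypothetical stand-in for a basic block.  */
struct block
{
  int index;
  void *dom;                        /* non-NULL only inside the region */
  struct block *succs[MAX_SUCCS];   /* successor edges, NULL if unused */
  int visited;
};

/* Count the blocks reachable from BB without leaving the region.
   Edges to out-of-region or already visited blocks are skipped.  */
int
dfs_in_region (struct block *bb)
{
  if (bb == NULL || bb->dom == NULL || bb->visited)
    return 0;
  bb->visited = 1;
  int count = 1;
  for (int i = 0; i < MAX_SUCCS; i++)
    count += dfs_in_region (bb->succs[i]);
  return count;
}
```

No separate in_region callback or fake entry/exit block is needed in this form; the marker that the region setup already writes into each block doubles as the membership test.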

> 2016-10-11 16:48 GMT+03:00 Richard Biener :
>> On Tue, Oct 11, 2016 at 3:23 PM, Yuri Rumyantsev  wrote:
>>> Richard,
>>>
>>> I implemented this by passing callback function in_region which
>>> returns true if block belongs to region.
>>> I am testing it now
>>>
>>> I attach modified patch for your quick review.
>>
>> +  FOR_EACH_VEC_ELT (region, i, bb)
>> +{
>> +  bb->dom[dir_index] = et_new_tree (bb);
>> +}
>> +  /* Check that region is SESE region.  */
>> +  entry = region[0];
>> +  if ( EDGE_COUNT (entry->succs) > 1)
>> +{
>> +  /* Create new entry block with the only successor.  */
>> +  basic_block succ = NULL;
>> +  bool found = false;
>> +  FOR_EACH_EDGE (e, ei, entry->succs)
>> +   if (in_region (region, e->dest))
>> + {
>> +   gcc_assert (!found);
>>
>> is that check equal to e->dest->dom[dir_index] != NULL?  I think we
>> shouldn't need the callback.
>>
>> +new_entry = create_empty_bb (entry);
>> +unchecked_make_edge (new_entry, succ, 0);
>>
>> I still think this is somewhat gross and we should try to avoid it
>> somehow - at least it's well-hidden now ;)
>>
>> +/* Put it to region as entry node.  */
>> +region[0] = new_entry;
>>
>> so region[0] is overwritten now?
>>
>> As far as I understand calc_dfs_tree it should be possible to compute
>> the region on-the-fly
>> from calling calc_dfs_tree_nonrec on the entry/exit (also maybe
>> avoiding some of the still
>> "large" complexity of zeroing arrays in the constructor).
>>
>> And if we use that DFS walk directly we should be able to avoid
>> creating those fake entry/exit
>> blocks by using entry/exit edges instead... (somehow).
>>
>> Richard.
>>
>>
>>
>>> Thanks.
>>>
>>> 2016-10-11 13:33 GMT+03:00 Richard Biener :
 On Mon, Oct 10, 2016 at 4:17 PM, Yuri

Re: [patch] Fix GC issue triggered by arithmetic overflow checking

On Thu, Oct 13, 2016 at 12:15 PM, Eric Botcazou  wrote:
>> Yes.  But that's not the only source for DECL_UID differences.  Btw,
>> I see lots of FOR_EACH_HASH_TABLE_ELEMENT in var-tracking.c
>> but they don't look like their outcome is supposed to be dependent on
>> element ordering.
>
> This leads to NOTE_INSN_VAR_LOCATION notes emitted in a different order, which
> are then interpreted by dwarf2out_var_location.  In particular:
>
> (note 6350 6349 6351 (var_location temp (nil)) NOTE_INSN_VAR_LOCATION)
> (note 6351 6350 6352 (var_location temp$low (mem/c:DI (plus:SI (reg/f:SI 30
> %fp)
> (const_int -112 [0xff90])) [10 MEM[(struct cpp_num
> *) + 8B]+0 S8 A64])) NOTE_INSN_VAR_LOCATION)
> (note 6352 6351 6353 (var_location temp$8 (nil)) NOTE_INSN_VAR_LOCATION)
> [...]
> (code_label 2091 6355 2092 79 912 "" [1 uses])
> (note 2092 2091 5271 79 [bb 79] NOTE_INSN_BASIC_BLOCK)
>
> is interpreted differently from:
>
> (note 6350 6349 6351 (var_location temp (nil)) NOTE_INSN_VAR_LOCATION)
> (note 6351 6350 6352 (var_location temp$8 (nil)) NOTE_INSN_VAR_LOCATION)
> (note 6352 6351 6353 (var_location temp$low (mem/c:DI (plus:SI (reg/f:SI 30
> %fp)
> (const_int -112 [0xff90])) [10 MEM[(struct cpp_num
> *) + 8B]+0 S8 A64])) NOTE_INSN_VAR_LOCATION)
> [...]
> (note 2092 2091 5271 79 [bb 79] NOTE_INSN_BASIC_BLOCK)
>
> @@ -32608,6 +32608,17 @@
> .uleb128 0x8
> .byte   0x93! DW_OP_piece
> .uleb128 0x8
> +   .uaword .LLVL592-.LLtext0   ! Location list begin address
> (*.LLLST153)
> +   .uaword .LLVL597-.LLtext0   ! Location list end address
> (*.LLLST153)
> +   .uahalf 0x9 ! Location expression size
> +   .byte   0x93! DW_OP_piece
> +   .uleb128 0x8
> +   .byte   0x8e! DW_OP_breg30
> +   .sleb128 -112
> +   .byte   0x93! DW_OP_piece
> +   .uleb128 0x8
> +   .byte   0x93! DW_OP_piece
> +   .uleb128 0x8
> .uaword .LLVL695-.LLtext0   ! Location list begin address
> (*.LLLST153)
> .uaword .LLVL696-.LLtext0   ! Location list end address
> (*.LLLST153)
> .uahalf 0xe ! Location expression size
>
> probably because the non-null location comes last in the second case.

Definitely looks like a bug to me.  Can you open a PR for this so it doesn't get
lost?

Thanks,
Richard.

> --
> Eric Botcazou

Re: [PATCH] Fix -Wimplicit-fallthrough ICE (PR c/77946)

2016-10-13 Thread Jakub Jelinek

On Thu, Oct 13, 2016 at 12:11:36PM +0200, Bernd Schmidt wrote:
> On 10/13/2016 01:25 AM, Jakub Jelinek wrote:
> >Seems 2 functions in varasm.c just use TREE_PUBLIC on LABEL_DECLs together
> >with other kinds of decls, but as TREE_PUBLIC on LABEL_DECLs means now
> >something different, it breaks badly.
> 
> Which functions are these?

The ones I've noticed were:

bool
default_binds_local_p_3 (const_tree exp, bool shlib, bool weak_dominate,
 bool extern_protected_data, bool common_local_p)
{
  /* A non-decl is an entry in the constant pool.  */
  if (!DECL_P (exp))
return true;
...
  /* Static variables are always local.  */
  if (! TREE_PUBLIC (exp))
return true;
...
}

bool
decl_binds_to_current_def_p (const_tree decl)
{
  gcc_assert (DECL_P (decl));
  if (!targetm.binds_local_p (decl))
return false;
  if (!TREE_PUBLIC (decl))
return true;
...
}

both relied on TREE_PUBLIC actually being false for LABEL_DECLs, because
otherwise they have code later on that can't handle LABEL_DECLs (plus callers
that do not expect that LABEL_DECLs might not bind locally or might not bind
to the current def).

Jakub
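
The failure mode can be illustrated with a small stand-alone model. This is only a sketch: struct node, the enum and all function names are hypothetical and merely mimic the idea of a flag bit whose meaning depends on the node kind; they are not GCC's tree representation.

```c
#include <stdbool.h>

enum node_code { VAR_NODE, LABEL_NODE };

/* Hypothetical node with two generic flag bits.  */
struct node
{
  enum node_code code;
  unsigned public_flag : 1;
  unsigned private_flag : 1;
};

/* Generic query used for all kinds of decls.  */
bool
node_public_p (const struct node *n)
{
  return n->public_flag;
}

/* Problematic scheme: the fallthrough marker for labels reuses public_flag.  */
void
mark_fallthrough_via_public (struct node *label)
{
  label->public_flag = 1;
}

/* Scheme after the fix: the marker lives in private_flag instead.  */
void
mark_fallthrough_via_private (struct node *label)
{
  label->private_flag = 1;
}

/* Label-specific query matching the fixed scheme.  */
bool
fallthrough_label_p (const struct node *n)
{
  return n->code == LABEL_NODE && n->private_flag;
}
```

A label marked through public_flag makes node_public_p return true, so generic code that tests "not public" first no longer returns early for it; with the marker in private_flag the generic query keeps returning false for labels.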

[PATCH] Fix exception-specifications for std::_Not_fn