[Patch, x86_64] Fix znver1 imov/imovx load reservations.

2016-10-10 Thread Kumar, Venkataramanan
Hi Maintainers,

The patch below fixes the integer load (imov/imovx) reservations for -march=znver1.

Bootstrapped and regtested on x86_64-pc-linux-gnu.

OK to commit to trunk?

(-Snip)
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 9659fbf..19b4066 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,7 @@
+2016-10-11  Venkataramanan Kumar  
+
+   * config/i386/znver1.md : Fix imov/imovx load type reservations.
+
 2016-10-09  Prathamesh Kulkarni  

* ipa-cp.c (ipcp_alignment_lattice): Remove.
diff --git a/gcc/config/i386/znver1.md b/gcc/config/i386/znver1.md
index 7db0562..93a707a 100644
--- a/gcc/config/i386/znver1.md
+++ b/gcc/config/i386/znver1.md
@@ -228,18 +228,18 @@
   (eq_attr "memory" "store")))
   "znver1-direct,znver1-ieu,znver1-store")

-(define_insn_reservation "znver1_load_imov_double_load" 6
+(define_insn_reservation "znver1_load_imov_double_load" 5
 (and (eq_attr "cpu" "znver1")
  (and (eq_attr "znver1_decode" "double")
   (and (eq_attr "type" "imovx")
 (eq_attr "memory" "load"))))
-"znver1-double,znver1-load,znver1-ieu")
+"znver1-double,znver1-load")

-(define_insn_reservation "znver1_load_imov_direct_load" 5
+(define_insn_reservation "znver1_load_imov_direct_load" 4
 (and (eq_attr "cpu" "znver1")
  (and (eq_attr "type" "imov,imovx")
   (eq_attr "memory" "load")))
-"znver1-direct,znver1-load,znver1-ieu")
+"znver1-direct,znver1-load")

 ;; INTEGER/GENERAL instructions
 ;; register/imm operands only: ALU, ICMP, NEG, NOT, ROTATE, ISHIFT, TEST
(-Snip)

Regards,
Venkat.


Re: [RFC][VRP] Improve intersect_ranges

2016-10-10 Thread kugan

Hi Richard,

On 10/10/16 20:13, Richard Biener wrote:

On Sat, Oct 8, 2016 at 9:38 PM, kugan  wrote:

Hi Richard,

Thanks for the review.
On 07/10/16 20:11, Richard Biener wrote:


On Fri, Oct 7, 2016 at 12:00 AM, kugan
 wrote:


Hi,

In vrp intersect_ranges, Richard recently changed it to create integer
value ranges when it is an integer singleton.

Maybe we should do the same when the other range is a complex range with
an SSA_NAME (like [x+2, +INF])?

The attached patch tries to do this. There are cases where it is
beneficial, as in the testcase in the patch. (For this testcase to work
with Early VRP, we need the patch posted at
https://gcc.gnu.org/ml/gcc-patches/2016-10/msg00413.html)

Bootstrapped and regression tested on x86_64-linux-gnu with no new
regressions.



This is not clearly a win, in fact it can completely lose an ASSERT_EXPR
because there is no way to add its effect back as an equivalence.  The
current choice of always using the "left" keeps the ASSERT_EXPR range
and is able to record the other range via an equivalence.



How about changing the order in Early VRP when we are dealing with the same
SSA_NAME in inner and outer scope? Here is a patch that does this. Is this
OK if there are no new regressions?


I'm not sure if this is a good way forward.  The failure with the testcase is
that we don't extract a range for k from if (j < k) which I believe another
patch from you addresses?


Yes, I have committed that. I am trying to add test cases for this and
that's when I stumbled on this:


For:
foo (int k, int j)
{
   :
   if (j_1(D) > 9)
 goto ;
   else
 goto ;

   :
   if (j_1(D) < k_2(D))
 goto ;
   else
 goto ;

   :
   k_3 = k_2(D) + 1;
   if (k_2(D) <= 8)
 goto ;
   else
 goto ;

   :
   abort ();

   :
   return j_1(D);

}

Before we look at - if (j_1(D) < k_2(D))
j_1 (D) has [10, +INF]  EQUIVALENCES: { j_1(D) } (1 elements)

When we look at  if (j_1(D) < k_2(D))
The range is [-INF, k_2(D) + -1]  EQUIVALENCES: { j_1(D) } (1 elements)

We intersect:
[-INF, k_2(D) + -1]  EQUIVALENCES: { j_1(D) } (1 elements)
and
[10, +INF]  EQUIVALENCES: { j_1(D) } (1 elements)

to
[-INF, k_2(D) + -1]  EQUIVALENCES: { j_1(D) } (1 elements)

Due to this, at if (j_1(D) < k_2(D)) we get a pessimistic value range
for k_2(D).
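
To illustrate why keeping the symbolic range is pessimistic here, the following is a minimal sketch, not GCC code. The Bound and Range types and the function lower_bound_from_less_than are hypothetical names invented for this example; they only model the single inference "j >= c and j < k imply k >= c + 1".

```cpp
#include <cassert>
#include <climits>
#include <string>

// Hypothetical, simplified bound: a constant plus an optional symbol name.
// This is not GCC's value_range representation.
struct Bound {
  long value;       // constant part of the bound
  std::string sym;  // empty for a purely constant bound, e.g. "k_2" otherwise
};

struct Range {
  Bound lo;
  Bound hi;
};

// Given the range of j and the knowledge that "j < k" holds, try to derive
// a constant lower bound for k.  Returns false when j's lower bound is
// symbolic or unbounded, because then nothing can be concluded about k.
bool lower_bound_from_less_than(const Range& j, long* k_min) {
  assert(k_min != nullptr);
  if (!j.lo.sym.empty() || j.lo.value == LONG_MIN || j.lo.value == LONG_MAX)
    return false;
  *k_min = j.lo.value + 1;  // j >= lo and j < k imply k >= lo + 1
  return true;
}
```

With j_1 in [10, +INF] the sketch yields k_2 >= 11, which is enough to see that k_2(D) <= 8 is false and the abort () is unreachable. With j_1 in [-INF, k_2(D) + -1] it yields nothing.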


Thanks,
Kugan



As said, the issue is with the equivalence / value-range representation, so
you can't do something like

  /* Discover VR when condition is true.  */
  extract_range_for_var_from_comparison_expr (op0, code, op0, op1, &vr);
  if (old_vr->type == VR_RANGE || old_vr->type == VR_ANTI_RANGE)
    vrp_intersect_ranges (&vr, old_vr);

  /* If we found any usable VR, set the VR to ssa_name and create a
 PUSH old value in the stack with the old VR.  */
  if (vr.type == VR_RANGE || vr.type == VR_ANTI_RANGE)
{
  new_vr = vrp_value_range_pool.allocate ();
  *new_vr = vr;
  push_value_range (op0, new_vr);
  ->>>  add equivalence to old_vr for new_vr.

because old_vr and new_vr are the 'same' (they are associated with SSA name op0)

Richard.


Thanks,
Kugan






My thought on this was that we need to separate "ranges" and associated
SSA names so we can introduce new ranges w/o the need for an SSA name
(and thus we can create an equivalence to the ASSERT_EXPR range).
IIRC I started on this at some point but never finished it ...

Richard.


Thanks,
Kugan


gcc/testsuite/ChangeLog:

2016-10-07  Kugan Vivekanandarajah  

* gcc.dg/tree-ssa/evrp6.c: New test.

gcc/ChangeLog:

2016-10-07  Kugan Vivekanandarajah  

* tree-vrp.c (intersect_ranges): If we failed to handle
the intersection and the other range involves computation with
symbolic values, choose integer range if available.







Go patch committed: move Backend/Linemap creation out of frontend

2016-10-10 Thread Ian Lance Taylor
This patch by Than McIntosh moves the calls that create the
GCC-specific Backend and Linemap objects out of the Go frontend into
the gccgo interface code.  This allows for more flexibility creating
those objects.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian

2016-10-10  Than McIntosh  

* go-gcc.h: New file.
* go-c.h (struct go_create_gogo_args): Add backend and linemap
fields.
* go-lang.c: Include "go-gcc.h".
(go_langhook_init): Set linemap and backend fields of args.
* go-gcc.cc: Include "go-gcc.h".
* go-linemap.cc: Include "go-gcc.h".
Index: gcc/go/go-c.h
===
--- gcc/go/go-c.h   (revision 240942)
+++ gcc/go/go-c.h   (working copy)
@@ -22,6 +22,8 @@ along with GCC; see the file COPYING3.
 
 #define GO_EXTERN_C
 
+class Linemap;
+class Backend;
 
 /* Functions defined in the Go frontend proper called by the GCC
interface.  */
@@ -36,9 +38,11 @@ struct go_create_gogo_args
   int int_type_size;
   int pointer_size;
   const char* pkgpath;
-  const char *prefix;
-  const char *relative_import_path;
-  const char *c_header;
+  const char* prefix;
+  const char* relative_import_path;
+  const char* c_header;
+  Backend* backend;
+  Linemap* linemap;
   bool check_divide_by_zero;
   bool check_divide_overflow;
   bool compiling_runtime;
Index: gcc/go/go-gcc.cc
===
--- gcc/go/go-gcc.cc(revision 240942)
+++ gcc/go/go-gcc.cc(working copy)
@@ -43,6 +43,7 @@
 #include "builtins.h"
 
 #include "go-c.h"
+#include "go-gcc.h"
 
 #include "gogo.h"
 #include "backend.h"
Index: gcc/go/go-gcc.h
===
--- gcc/go/go-gcc.h (revision 0)
+++ gcc/go/go-gcc.h (working copy)
@@ -0,0 +1,33 @@
+/* go-gcc.h -- Header file for go backend-specific interfaces.
+   Copyright (C) 2016 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef GO_GO_GCC_BACKEND_H
+#define GO_GO_GCC_BACKEND_H
+
+class Backend;
+
+// Create and return a Backend object for use with the GCC backend.
+
+extern Backend *go_get_backend();
+
+// Create and return a Linemap object for use with the GCC backend.
+
+extern Linemap *go_get_linemap();
+
+#endif // !defined(GO_GCC_BACKEND_H)
Index: gcc/go/go-lang.c
===
--- gcc/go/go-lang.c(revision 240942)
+++ gcc/go/go-lang.c(working copy)
@@ -37,6 +37,7 @@ along with GCC; see the file COPYING3.
 #include 
 
 #include "go-c.h"
+#include "go-gcc.h"
 
 /* Language-dependent contents of a type.  */
 
@@ -111,6 +112,8 @@ go_langhook_init (void)
   args.check_divide_overflow = go_check_divide_overflow;
   args.compiling_runtime = go_compiling_runtime;
   args.debug_escape_level = go_debug_escape_level;
+  args.linemap = go_get_linemap();
+  args.backend = go_get_backend();
   go_create_gogo (&args);
 
   build_common_builtin_nodes ();
Index: gcc/go/go-linemap.cc
===
--- gcc/go/go-linemap.cc(revision 240942)
+++ gcc/go/go-linemap.cc(working copy)
@@ -6,6 +6,8 @@
 
 #include "go-linemap.h"
 
+#include "go-gcc.h"
+
 // This class implements the Linemap interface defined by the
 // frontend.
 
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 240956)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-ecf9b645cefc5c3b4e6339adeb452b2d8642cf3e
+a700fa1908aa2a36f05b3ee09932f814fd94a10d
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/backend.h
===
--- gcc/go/gofrontend/backend.h (revision 240942)
+++ gcc/go/gofrontend/backend.h (working copy)
@@ -740,8 +740,4 @@ class Backend
const std::vector& variable_decls) = 0;
 };
 
-// The backend interface has to define this function.
-
-extern Backend* go_get_backend();
-
 #endif // !defined(GO_BACKEND_H)
Index: gcc/go/gofrontend/go-linemap.h
===
--- gcc/go/gofrontend/go-linemap.h  (revision 240942)

[PING] [PATCH] fix outstanding -Wformat-length failures (pr77735 et al.)

2016-10-10 Thread Martin Sebor

I'm looking for a review of the patch below:

  https://gcc.gnu.org/ml/gcc-patches/2016-10/msg00043.html

The patch should clean up the remaining test suite failures on
ILP32 targets and also fix some remaining issues in the
gimple-ssa-sprintf pass that stand in the way of re-enabling
the printf-return-value optimization.

I'm traveling next week so I'm hoping to enable the optimization
shortly after this patch goes in so that if there's any fallout
from it I can fix it before I leave.

Thanks
Martin

On 10/02/2016 02:10 PM, Martin Sebor wrote:

The attached patch fixes a number of outstanding test failures
and ILP32-related bugs in the gimple-ssa-sprintf patch (pointed
out in bugs 77676 and 77735).  The patch also fixes c_strlen to
correctly handle wide strings (previously it accepted them but
treated them as nul-terminated byte sequences), and adjusts the
handling of "%a" to avoid assuming a specific number of decimal
digits (this is likely a defect in C11 that I'm pursuing with
WG14).

Tested on powerpc64le, i386, and x86_64.

There is one outstanding failure in the builtin-sprintf-warn-1.c
test on powerpc64le that looks like it might be due to the
printf_pointer_format target hook not having been set up entirely
correctly.  I'll look into that separately, along with pr77819.

Martin




Go patch committed: copy print code from Go 1.7 runtime

2016-10-10 Thread Ian Lance Taylor
This patch copies the code that implements the print and println
predeclared functions from the Go 1.7 runtime.  The compiler is
changed to use the new names, and to call the printlock and
printunlock functions around a sequence of print calls.  The writebuf
field in the g struct changes to a slice.  Bootstrapped and ran Go
testsuite on x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 240942)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-f3658aea2493c7f1c4a72502f9e7da562c7764c4
+ecf9b645cefc5c3b4e6339adeb452b2d8642cf3e
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc(revision 240942)
+++ gcc/go/gofrontend/expressions.cc(working copy)
@@ -7018,6 +7018,26 @@ Builtin_call_expression::do_lower(Gogo*
  }
   }
   break;
+
+case BUILTIN_PRINT:
+case BUILTIN_PRINTLN:
+  // Force all the arguments into temporary variables, so that we
+  // don't try to evaluate something while holding the print lock.
+  if (this->args() == NULL)
+   break;
+  for (Expression_list::iterator pa = this->args()->begin();
+  pa != this->args()->end();
+  ++pa)
+   {
+ if (!(*pa)->is_variable())
+   {
+ Temporary_statement* temp =
+   Statement::make_temporary(NULL, *pa, loc);
+ inserter->insert(temp);
+ *pa = Expression::make_temporary_reference(temp, loc);
+   }
+   }
+  break;
 }
 
   return this;
@@ -8336,7 +8356,9 @@ Builtin_call_expression::do_get_backend(
 case BUILTIN_PRINTLN:
   {
const bool is_ln = this->code_ == BUILTIN_PRINTLN;
-Expression* print_stmts = NULL;
+
+   Expression* print_stmts = Runtime::make_call(Runtime::PRINTLOCK,
+location, 0);
 
const Expression_list* call_args = this->args();
if (call_args != NULL)
@@ -8348,8 +8370,7 @@ Builtin_call_expression::do_get_backend(
if (is_ln && p != call_args->begin())
  {
 Expression* print_space =
-Runtime::make_call(Runtime::PRINT_SPACE,
-   this->location(), 0);
+ Runtime::make_call(Runtime::PRINTSP, location, 0);
 
 print_stmts =
 Expression::make_compound(print_stmts, print_space,
@@ -8360,51 +8381,51 @@ Builtin_call_expression::do_get_backend(
Type* type = arg->type();
 Runtime::Function code;
if (type->is_string_type())
-  code = Runtime::PRINT_STRING;
+  code = Runtime::PRINTSTRING;
else if (type->integer_type() != NULL
 && type->integer_type()->is_unsigned())
  {
Type* itype = Type::lookup_integer_type("uint64");
arg = Expression::make_cast(itype, arg, location);
-code = Runtime::PRINT_UINT64;
+code = Runtime::PRINTUINT;
  }
else if (type->integer_type() != NULL)
  {
Type* itype = Type::lookup_integer_type("int64");
arg = Expression::make_cast(itype, arg, location);
-code = Runtime::PRINT_INT64;
+code = Runtime::PRINTINT;
  }
else if (type->float_type() != NULL)
  {
 Type* dtype = Type::lookup_float_type("float64");
 arg = Expression::make_cast(dtype, arg, location);
-code = Runtime::PRINT_DOUBLE;
+code = Runtime::PRINTFLOAT;
  }
else if (type->complex_type() != NULL)
  {
 Type* ctype = Type::lookup_complex_type("complex128");
 arg = Expression::make_cast(ctype, arg, location);
-code = Runtime::PRINT_COMPLEX;
+code = Runtime::PRINTCOMPLEX;
  }
else if (type->is_boolean_type())
-  code = Runtime::PRINT_BOOL;
+  code = Runtime::PRINTBOOL;
else if (type->points_to() != NULL
 || type->channel_type() != NULL
 || type->map_type() != NULL
 || type->function_type() != NULL)
  {
 arg = Expression::make_cast(type, arg, location);
-code = Runtime::PRINT_POINTER;
+code = 

Re: [PATCH] Improve performance of list::reverse

2016-10-10 Thread Elliot Goodrich
I haven't yet but I will try and sort it out tomorrow.

If we're replacing the current method with one that takes a size
parameter when _GLIBCXX_USE_CXX11_ABI is defined, is this going to
cause any issues with ABI compatibility? If not, then I agree that we
should go with the #if version.

On 10 October 2016 at 17:12, Jonathan Wakely  wrote:
> On 09/10/16 16:23 +0100, Elliot Goodrich wrote:
>>
>> Hi,
>>
>> If we unroll the loop so that we iterate both forwards and backwards,
>> we can take advantage of memory-level parallelism when chasing
>> pointers. This means that reverse takes 35% less time when nodes are
>> randomly scattered in memory and about the same time if nodes are
>> contiguous.
>>
>> Further, as our node pointers will never alias, we can interleave the
>> swaps of the next and previous pointers to remove further data
>> dependencies. This takes another 5% off the time when nodes are
>> scattered in memory and takes 20% off when nodes are contiguous.
>>
>> All in all we save 20%-40% depending on the memory layout.
>
>
> Nice, thanks for the patch.
>
> Do you have (or are you willing to sign) a copyright assignment for
> GCC?
>
> See https://gcc.gnu.org/contribute.html#legal for details.
>
>> For future improvement, by passing whether there is an odd or even
>> number of nodes in the list we can hoist one of the ifs out of the
>> loop and gain another 5-10% but most likely this is only possible when
>> _GLIBCXX_USE_CXX11_ABI is defined and size() is O(1). This would bring
>> the saving to 30%-45%. Is it worth writing a new overload of
>> _M_reverse which takes the size of the list?
>
>
> That certainly seems worthwhile. Do we need an overload or can it just
> be done with #if? It seems to me we'd either want to use the size, or
> not use it, we wouldn't want both versions defined at once. That
> suggests #if to me.
>
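
For readers of the archive, here is a minimal sketch of the two-ended traversal described in the quoted message. It is not the submitted patch and not libstdc++'s _List_node_base::_M_reverse; the Node type and the names reverse_two_ended and reverse_values are assumptions made for this example. It assumes a circular doubly linked list with a sentinel node.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical node of a circular doubly linked list with a sentinel.
struct Node {
  Node* next;
  Node* prev;
  int value;
};

// Reverse the list by walking from both ends at once.  The two pointer
// chases are independent of each other, so the loads can overlap.
void reverse_two_ended(Node* head) {
  assert(head != nullptr);
  Node* front = head->next;  // walks forward in the original order
  Node* back = head->prev;   // walks backward in the original order
  std::swap(head->next, head->prev);  // the sentinel is swapped once
  while (front != back) {
    Node* front_next = front->next;
    Node* back_prev = back->prev;
    const bool adjacent = (front_next == back);
    // Swap next/prev of both nodes; the nodes are distinct, so the two
    // swaps do not depend on each other.
    front->next = front->prev;
    front->prev = front_next;
    back->prev = back->next;
    back->next = back_prev;
    if (adjacent)
      return;  // even number of nodes: the walkers have met
    front = front_next;
    back = back_prev;
  }
  // Odd number of nodes: the middle node is still unswapped.
  if (front != head)
    std::swap(front->next, front->prev);
}

// Helper for experiments: build a list from values, reverse it, read it back.
std::vector<int> reverse_values(const std::vector<int>& values) {
  std::vector<Node> nodes(values.size() + 1);  // nodes[0] is the sentinel
  const std::size_t n = nodes.size();
  for (std::size_t i = 0; i < n; ++i) {
    nodes[i].next = &nodes[(i + 1) % n];
    nodes[i].prev = &nodes[(i + n - 1) % n];
    nodes[i].value = (i == 0) ? 0 : values[i - 1];
  }
  reverse_two_ended(&nodes[0]);
  std::vector<int> out;
  for (Node* p = nodes[0].next; p != &nodes[0]; p = p->next)
    out.push_back(p->value);
  return out;
}
```

The adjacency test inside the loop is one of the branches that the message proposes hoisting out when the size, and therefore its parity, is known up front.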


Re: Always support float128 on ia64 (PR target/77586)

2016-10-10 Thread Jeff Law

On 10/04/2016 10:46 AM, Joseph Myers wrote:

Bug 77586, and previously
, reports
ia64-elf failing to build because of float128_type_node being NULL,
but being used by the back end for __float128.

The global float128_type_node is only available conditionally, if
target hooks indicate TFmode is not only available as a scalar mode
and of the right format, but also supported in libgcc.  The back-end
support, however, expects the type always to be available for
__float128 even if the libgcc support is missing.

Although a target-specific node could be restored in the case where
libgcc support is missing, it seems better to address the missing
libgcc support.  Thus, this patch enables TFmode soft-fp in libgcc
globally for all ia64 targets.  Support for XFmode in libgcc (that is,
for libgcc2.c XFmode functions, not soft-fp) is also enabled for all
ia64 targets so that ia64 no longer needs to define the
TARGET_LIBGCC_FLOATING_MODE_SUPPORTED_P hook.

I've confirmed that ia64-elf builds cc1 with this patch and it passes
-fself-test.  I have not otherwise tested the patch.  It's plausible
that ia64-elf and ia64-freebsd might work as-is, but ia64-vms probably
needs further changes, by someone familiar with VMS shared libraries,
to implement an equivalent of ia64/t-softfp-compat in that case
(avoiding conflicts between __divtf3 from soft-fp and the old alias
for __divxf3).

gcc:
2016-10-04  Joseph Myers  

PR target/77586
* config/ia64/ia64.c (ia64_libgcc_floating_mode_supported_p)
(TARGET_LIBGCC_FLOATING_MODE_SUPPORTED_P): Remove.
* config/ia64/elf.h (IA64_NO_LIBGCC_TFMODE): Likewise.
* config/ia64/freebsd.h (IA64_NO_LIBGCC_TFMODE): Likewise.
* config/ia64/vms.h (IA64_NO_LIBGCC_XFMODE)
(IA64_NO_LIBGCC_TFMODE): Likewise.

libgcc:
2016-10-04  Joseph Myers  

PR target/77586
* config.host (ia64*-*-elf*, ia64*-*-freebsd*, ia64-hp-*vms*): Use
soft-fp.

Given that it's a clear step forward, and given the inability to test the
least common platform (vms), I'm OK with this patch.


jeff



Re: [PATCH] Update docs on libstdc++ source-code layout

2016-10-10 Thread Jonathan Wakely

On 10/10/16 19:57 +0100, Jonathan Wakely wrote:

Self-explanatory updates to the docs, and regenerating after the
various recent changes.

* doc/xml/manual/appendix_contributing.xml (contrib.organization):
Describe other subdirectories and add markup. Remove outdated
reference to check-script target.
* doc/html/*: Regenerate.

Committed to trunk.


Some further markup improvements and corrections for outdated text.

Committed to trunk.

commit ae505b77cef62a4ee79dd374e75d88c223e945ed
Author: Jonathan Wakely 
Date:   Mon Oct 10 23:33:15 2016 +0100

Improve docs on libstdc++ source-code layout

	* doc/xml/manual/appendix_contributing.xml (contrib.organization):
	Replace  with nested  elements. Update
	some more outdated text.
	* doc/html/*: Regenerate.

diff --git a/libstdc++-v3/doc/xml/manual/appendix_contributing.xml b/libstdc++-v3/doc/xml/manual/appendix_contributing.xml
index ee35dd9..1ee848f 100644
--- a/libstdc++-v3/doc/xml/manual/appendix_contributing.xml
+++ b/libstdc++-v3/doc/xml/manual/appendix_contributing.xml
@@ -203,105 +203,195 @@
 GCC sources contains the files needed to create the GNU C++ Library.
   
 
-  
+
 It has subdirectories:
+
 
-  doc
+
+  
+  doc
+  
 Files in HTML and text format that document usage, quirks of the
 implementation, and contributor checklists.
+
+  
 
-  include
+  
+  include
+  
 All header files for the C++ library are within this directory,
 modulo specific runtime-related files that are in the libsupc++
 directory.
 
-include/std
-  Files meant to be found by #include name directives in
-  standard-conforming user programs.
+
+
+include/std
+
+  Files meant to be found by #include name directives
+  in standard-conforming user programs.
+  
+
 
-include/c
+
+include/c
+
   Headers intended to directly include standard C headers.
   [NB: this can be enabled via --enable-cheaders=c]
+  
+
 
-include/c_global
+
+include/c_global
+
   Headers intended to include standard C headers in
-  the global namespace, and put select names into the std::
+  the global namespace, and put select names into the std::
   namespace.  [NB: this is the default, and is the same as
   --enable-cheaders=c_global]
+  
+
 
-include/c_std
+
+include/c_std
+
   Headers intended to include standard C headers
-  already in namespace std, and put select names into the std::
+  already in namespace std, and put select names into the std::
   namespace.  [NB: this is the same as
   --enable-cheaders=c_std]
+  
+
 
-include/bits
+
+include/bits
+
   Files included by standard headers and by other files in
   the bits directory.
+  
+
 
-include/backward
-  Headers provided for backward compatibility, such as iostream.h.
+
+include/backward
+
+  Headers provided for backward compatibility, such as
+  backward/hash_map.
   They are not used in this library.
+
+
 
-include/ext
+
+include/ext
+
   Headers that define extensions to the standard library.  No
-  standard header refers to any of them.
+  standard header refers to any of them, in theory (there are some
+  exceptions).
+  
+
+
+  
+  
 
-  scripts
+  
+  scripts
+  
 Scripts that are used during the configure, build, make, or test
 process.
+
+  
 
-  src
+  
+  src
+  
 Files that are used in constructing the library, but are not
 installed.
 
-src/c++98
+
+
+src/c++98
+
 Source files compiled using -std=gnu++98.
+  
+
 
-src/c++11
+
+src/c++11
+
 Source files compiled using -std=gnu++11.
+  
+
 
-src/filesystem
+
+src/filesystem
+
 Source files for the Filesystem TS.
+  
+
 
-src/shared
+
+src/shared
+
 Source code included by other files under both
 src/c++98 and
 src/c++11
+  
+
+
+  
+  
 
-  testsuites/[backward, demangle, ext, performance, thread, 17_* to 30_*]
+  
+  testsuites/[backward, demangle, ext, performance, thread, 17_* to 30_*]
+  
 Test programs are here, and may be used to begin to exercise the
 library.  Support for "make check" and "make check-install" is
 complete, and runs through all the subdirectories here when this
 command is issued from the build directory.  Please note that
 "make check" requires DejaGNU 1.4 or later to be installed.
+
+  
+
 
+
 Other subdirectories contain variant versions of certain files
 that are meant to be copied or linked by the configure script.
 Currently these are:
+config/abi
+config/allocator
+config/cpu
+config/io
+config/locale
+config/os
+
+
 
-  config/abi
-  config/cpu
-  config/io
-  config/locale
-  config/os
-
+
 In addition, 

Re: [PATCH 4/5] shrink-wrap: Shrink-wrapping for separate components

2016-10-10 Thread Segher Boessenkool
On Mon, Oct 10, 2016 at 03:21:31PM -0600, Jeff Law wrote:
> On 09/30/2016 04:34 AM, Segher Boessenkool wrote:
> >[ whoops, message too big, resending with the attachment compressed ]
> >
> >On Tue, Sep 27, 2016 at 03:14:51PM -0600, Jeff Law wrote:
> >>With transposition issue addressed, the only blocker I see are some
> >>simple testcases we can add to the suite.  They don't have to be real
> >>extensive.  And one motivating example for the list archives, ideally
> >>the glibc malloc case.
> >
> >And here is the malloc testcase.
> >
> >A very important (for performance) function is _int_malloc, which starts
> >with
> [ ... ]
> Thanks.  What I think is important to note with this example is the bits
> that were pushed into the path with the sysmalloc/alloc_perturb calls. 
> That's an unlikely path.

alloc_perturb is a no-op, and inlined as such: as nothing :-)

> We have to extrapolate a bit from the assembly provided.  In the not 
> separately shrink-wrapped version, we have a full prologue of stores and 
> two instances of a full epilogue (though only one ever executes) provided.
> 
> With separate shrink wrapping the (presumably) very cold path where we 
> error has virtually no prologue/epilogue.  That's probably a nop from a 
> performance standpoint.
> 
> More interesting is the path where we call sysmalloc/alloc_perturb, it's 
> a cold path, but not as cold as the error path.  We save/restore 4 regs 
> in that case.  Rather than a full prologue/epilogue.  So there's clearly 
> a savings there, though again, via the expect it's a cold path.
> 
> Where we have to extrapolate is the hot path.  Presumably on the hot 
> path we're saving/restoring ~4 fewer registers.   I haven't verified 
> that, but that is kind of the whole point here.

We save/restore just four registers total on the hot path.  And yes,
that is the point :-)

The hot exit is

.L683:
ld 14,144(1)
ld 15,152(1)
ld 25,232(1)
ld 30,272(1)
addi 3,4,16
.L673:
addi 1,1,288
blr

so four GPR restores and no LR restore.  Without separate shrink-wrapping
this was

.L641:
addi 3,21,16
b .L631

[ ... ]

.L631:
addi 1,1,288
ld 29,16(1)
ld 14,-144(1)
ld 15,-136(1)
ld 16,-128(1)
ld 17,-120(1)
ld 18,-112(1)
ld 19,-104(1)
ld 20,-96(1)
ld 21,-88(1)
ld 22,-80(1)
ld 23,-72(1)
ld 24,-64(1)
mtlr 29
ld 25,-56(1)
ld 26,-48(1)
ld 27,-40(1)
ld 28,-32(1)
ld 29,-24(1)
ld 30,-16(1)
ld 31,-8(1)
blr

(18 GPRs as well as LR).

I didn't show this path because there is a whole bunch of branches with
inline asm in the way.

The sysmalloc path was

.L635:
li 4,0
.L761:
addi 1,1,288
mr 3,14
ld 14,16(1)
ld 15,-136(1)
ld 16,-128(1)
ld 17,-120(1)
ld 18,-112(1)
ld 19,-104(1)
ld 20,-96(1)
ld 21,-88(1)
ld 22,-80(1)
ld 23,-72(1)
ld 24,-64(1)
ld 25,-56(1)
mtlr 14
ld 26,-48(1)
ld 14,-144(1)
ld 27,-40(1)
ld 28,-32(1)
ld 29,-24(1)
ld 30,-16(1)
ld 31,-8(1)
b sysmalloc

and now is

.L677:
mr 3,14
ld 15,152(1)
ld 14,144(1)
ld 25,232(1)
ld 30,272(1)
li 4,0
addi 1,1,288
b sysmalloc

I attach malloc.s.{no,yes}, I hope you can stomach that.  Well you
can read HP-PA, heh.


Segher


malloc.s.no.gz
Description: GNU Zip compressed data


malloc.s.yes.gz
Description: GNU Zip compressed data


Re: [v3 PATCH] Make any's copy assignment operator exception-safe, don't copy the underlying value when any is moved, make in_place constructors explicit.

2016-10-10 Thread Jonathan Wakely

On 10/10/16 22:21 +0300, Ville Voutilainen wrote:

This code was all pretty carefully written to avoid any redundant
operations. Does this change buy us anything except simpler code?


As discussed, destroying the value but leaving the manager non-null will
do bad things.


Oops again on my part! Not so carefully written, or tested.


New patch attached, ok for trunk?


OK, thanks.
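
As background for the subject line, here is a minimal sketch of the copy-then-swap idiom that gives a copy assignment operator the strong exception guarantee, and of a move assignment that does not copy the held value. This is not the libstdc++ implementation of any; Payload and Box are hypothetical types made up for the illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <utility>

// Hypothetical payload whose copy can be made to fail on demand.
struct Payload {
  int value;
  static bool fail_next_copy;
  explicit Payload(int v) : value(v) {}
  Payload(const Payload& other) : value(other.value) {
    if (fail_next_copy) {
      fail_next_copy = false;
      throw std::runtime_error("copy failed");
    }
  }
};
bool Payload::fail_next_copy = false;

// Minimal owning wrapper; it stands in for any only in spirit.
class Box {
 public:
  Box() : ptr_(nullptr) {}
  explicit Box(int v) : ptr_(new Payload(v)) {}
  Box(const Box& rhs) : ptr_(rhs.ptr_ ? new Payload(*rhs.ptr_) : nullptr) {}
  Box(Box&& rhs) noexcept : ptr_(rhs.ptr_) { rhs.ptr_ = nullptr; }
  ~Box() { delete ptr_; }

  // Copy first, then swap: if the copy throws, *this is left untouched.
  Box& operator=(const Box& rhs) {
    Box(rhs).swap(*this);
    return *this;
  }

  // Move assignment steals the pointer; no Payload copy is made.
  Box& operator=(Box&& rhs) noexcept {
    Box(std::move(rhs)).swap(*this);
    return *this;
  }

  void swap(Box& other) noexcept { std::swap(ptr_, other.ptr_); }
  bool has_value() const { return ptr_ != nullptr; }
  int value() const {
    assert(ptr_ != nullptr);
    return ptr_->value;
  }

 private:
  Payload* ptr_;
};
```

The point of the ordering is that the old value is destroyed only after the new one exists, so a failed copy never leaves the object with a destroyed value and stale bookkeeping.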


Re: [PATCH, C++] Warn on redefinition of builtin functions (PR c++/71973)

2016-10-10 Thread Bernd Edlinger
On 10/06/16 22:37, Bernd Edlinger wrote:
> On 10/06/16 16:14, Kyrill Tkachov wrote:
>>
>> @@ -1553,7 +1588,7 @@ duplicate_decls (tree newdecl, tree olddecl, bool
>>
>> /* Whether or not the builtin can throw exceptions has no
>>bearing on this declarator.  */
>> -  TREE_NOTHROW (olddecl) = 0;
>> +  TREE_NOTHROW (olddecl) = TREE_NOTHROW (newdecl);
>>
>> The comment would need to be updated I think.
>
> Probably, yes.
>
> Actually the code did *not* do what the comment said, and
> effectively set the nothrow attribute to zero, thus
> the eh handlers were emitted when not needed.
>
> And IMHO the new code does now literally do what the comment
> said.
>
> At this point there follow 1000+ lines of code, in the same
> function that merge olddecl into newdecl and back again.
>
> The code is dependent on the types_match variable,
> and in the end newdecl is freed and olddecl returned.
>
> At some places the code is impossible to understand:
> Look for the memcpy :-/
>
> I *think* the intention is to merge the attributes from the
> builtin when the header file does not explicitly give
> some or all attributes, and the parameters match.
>
> But when the parameters do not match, the header file
> changes the builtin's signature, and overrides the
> builtin attributes more or less with defaults or
> whatever is in the header file.
>
>


A few more thoughts, that may help to clarify a few things here.

Regarding this hunk:

else if (! same_type_p (TREE_VALUE (t1), TREE_VALUE (t2)))
  break;
+ if (t1 || t2
+ || ! same_type_p (TREE_TYPE (TREE_TYPE (olddecl)),
+   TREE_TYPE (TREE_TYPE (newdecl))))
+   warning_at (DECL_SOURCE_LOCATION (newdecl),
+   OPT_Wbuiltin_function_redefined,
+   "declaration of %q+#D conflicts with built-in "
+   "declaration %q#D", newdecl, olddecl);
}
  else if ((DECL_EXTERN_C_P (newdecl)

Meanwhile, I have started to think that the "if" here is unnecessary:
if decls_match returns false, the declarations are certainly
different, so the warning is already justified at this point.
Removing the "if" changes nothing, because the condition is always satisfied.

Regarding this hunk:

/* Whether or not the builtin can throw exceptions has no
  bearing on this declarator.  */
-  TREE_NOTHROW (olddecl) = 0;
+  TREE_NOTHROW (olddecl) = TREE_NOTHROW (newdecl);

You may ask why the old code worked most of the time.
I think that usually, when types_match == true, another
assignment to TREE_NOTHROW happens later in that function, around line 2183:

   /* Merge the type qualifiers.  */
   if (TREE_READONLY (newdecl))
 TREE_READONLY (olddecl) = 1;
   if (TREE_THIS_VOLATILE (newdecl))
 TREE_THIS_VOLATILE (olddecl) = 1;
   if (TREE_NOTHROW (newdecl))
 TREE_NOTHROW (olddecl) = 1;

This is in a big "if (types_match)", so I think that explains
why the old code normally worked, and why it fails if the
parameters don't match. I still have no idea what to say
in the comment, except that the code should do exactly what
the comment above says.
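
The interaction described in this message can be modelled with a toy sketch. It is not GCC code: the function names are hypothetical, and only the two TREE_NOTHROW assignments quoted in this message are modelled, nothing else that duplicate_decls does.

```cpp
#include <cassert>

// Old code: clear the flag, then set it again only in the later
// "if (types_match)" merge.
bool nothrow_after_old_code(bool newdecl_nothrow, bool types_match) {
  bool olddecl_nothrow = false;        // TREE_NOTHROW (olddecl) = 0;
  if (types_match && newdecl_nothrow)  // if (TREE_NOTHROW (newdecl)) ...
    olddecl_nothrow = true;
  return olddecl_nothrow;
}

// Patched code: take the flag from newdecl; the later merge cannot
// change the result any more.
bool nothrow_after_new_code(bool newdecl_nothrow, bool types_match) {
  bool olddecl_nothrow = newdecl_nothrow;
  if (types_match && newdecl_nothrow)
    olddecl_nothrow = true;
  return olddecl_nothrow;
}

// The two versions agree whenever the types match.
void check_agreement_when_types_match() {
  assert(nothrow_after_old_code(true, true) == nothrow_after_new_code(true, true));
  assert(nothrow_after_old_code(false, true) == nothrow_after_new_code(false, true));
}
```

In this model the versions differ only for a nothrow newdecl whose types do not match the builtin, which is the case where the old code dropped the flag.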


Bernd.


Re: [patch] aarch64-*-freebsd* support for gcc.

2016-10-10 Thread Andreas Tobler

On 10.10.16 23:10, Jeff Law wrote:

On 10/10/2016 03:07 PM, Andreas Tobler wrote:

Hi all,

the attached patch brings support for the aarch64-*-freebsd* target.

Bootstrapped and tested, results are on the list. There are not that many
results due to board instabilities; I lack a Cavium ;)

Ok for main? And if yes, how far can I backport? Down to 5.4?

TIA,
Andreas

libgcc:

2016-10-10  Andreas Tobler  

* config.host: Add support for aarch64-*-freebsd*.

gcc:

2016-10-10  Andreas Tobler  

* config.gcc: Add aarch64-*-freebsd* support.
* config.host: Likewise.
* config/aarch64/aarch64-freebsd.h: New file.
* config/aarch64/t-aarch64-freebsd: Ditto.

toplevel:

2016-10-10  Andreas Tobler 

* configure.ac: Add aarch64-*-freebsd*.
* configure: Regenerate.

Certainly OK for the trunk.  Jakub, Richi & Joseph make the rules for
the release branches.


Thank you again Jeff.

Committed to trunk with r240949.

Andreas



Re: PATCH to introduce c-family/c-warn.c

2016-10-10 Thread Jeff Law

On 10/10/2016 10:36 AM, Marek Polacek wrote:

As outlined recently, this patch creates a new c-warn.c file, where various
diagnostic routines should reside, making c-common.c a little bit shorter.
There are no functional changes, though.  While at it, I fixed all tab/space
problems in the functions that I've moved.  Some functions are contentious
and could arguably be in either file.

Next step is probably to create c-attribs.c.

Bootstrapped/regtested on x86_64-linux and ppc64-linux, ok for trunk?

2016-10-10  Marek Polacek  

* Makefile.in (C_COMMON_OBJS): Add c-family/c-warn.o.

* c-common.c (fold_for_warn): No longer static.
(bool_promoted_to_int_p): Likewise.
(c_common_get_narrower): Likewise.
(constant_expression_warning): Move to c-warn.c.
(constant_expression_error): Likewise.
(overflow_warning): Likewise.
(warn_logical_operator): Likewise.
(find_array_ref_with_const_idx_r): Likewise.
(warn_tautological_cmp): Likewise.
(expr_has_boolean_operands_p): Likewise.
(warn_logical_not_parentheses): Likewise.
(warn_if_unused_value): Likewise.
(strict_aliasing_warning): Likewise.
(sizeof_pointer_memaccess_warning): Likewise.
(check_main_parameter_types): Likewise.
(conversion_warning): Likewise.
(warnings_for_convert_and_check): Likewise.
(match_case_to_enum_1): Likewise.
(match_case_to_enum): Likewise.
(c_do_switch_warnings): Likewise.
(warn_for_omitted_condop): Likewise.
(readonly_error): Likewise.
(lvalue_error): Likewise.
(invalid_indirection_error): Likewise.
(warn_array_subscript_with_type_char): Likewise.
(warn_about_parentheses): Likewise.
(warn_for_unused_label): Likewise.
(warn_for_div_by_zero): Likewise.
(warn_for_memset): Likewise.
(warn_for_sign_compare): Likewise.
(do_warn_double_promotion): Likewise.
(do_warn_unused_parameter): Likewise.
(record_locally_defined_typedef): Likewise.
(maybe_record_typedef_use): Likewise.
(maybe_warn_unused_local_typedefs): Likewise.
(maybe_warn_bool_compare): Likewise.
(maybe_warn_shift_overflow): Likewise.
(warn_duplicated_cond_add_or_warn): Likewise.
(diagnose_mismatched_attributes): Likewise.
* c-common.h: Move the declarations from c-warn.c to its own section.
* c-warn.c: New file.

OK and creating c-attribs.c is pre-approved as well.

jeff



Re: [PATCH 4/5] shrink-wrap: Shrink-wrapping for separate components

2016-10-10 Thread Jeff Law

On 09/30/2016 04:34 AM, Segher Boessenkool wrote:

[ whoops, message too big, resending with the attachment compressed ]

On Tue, Sep 27, 2016 at 03:14:51PM -0600, Jeff Law wrote:

With transposition issue addressed, the only blocker I see are some
simple testcases we can add to the suite.  They don't have to be real
extensive.  And one motivating example for the list archives, ideally
the glibc malloc case.


And here is the malloc testcase.

A very important (for performance) function is _int_malloc, which starts
with

[ ... ]
Thanks.  What I think is important to note with this example are the bits 
that were pushed into the path with the sysmalloc/alloc_perturb calls. 
That's an unlikely path.


We have to extrapolate a bit from the assembly provided.  In the not 
separately shrink-wrapped version, we have a full prologue of stores and 
two instances of a full epilogue (though only one ever executes) provided.


With separate shrink wrapping the (presumably) very cold path where we 
error has virtually no prologue/epilogue.  That's probably a nop from a 
performance standpoint.


More interesting is the path where we call sysmalloc/alloc_perturb.  It's 
a cold path, but not as cold as the error path.  We save/restore 4 regs 
in that case rather than a full prologue/epilogue.  So there's clearly 
a savings there, though again, via the expect it's a cold path.


Where we have to extrapolate is the hot path.  Presumably on the hot 
path we're saving/restoring ~4 fewer registers.  I haven't verified 
that, but that is kind of the whole point here.
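
As a rough illustration of the code shape being discussed (this is not the
glibc code, and the names alloc_sketch, slow_refill and pool are made up),
here is a function with a cheap hot path and an unlikely path that calls
out.  The call on the cold path is what forces registers to be saved; with
separate shrink-wrapping those saves could, in principle, be placed around
the cold path only.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins; none of these names come from glibc or GCC.
#if defined(__GNUC__)
#  define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#  define UNLIKELY(x) (x)
#endif

static int pool[4] = {1, 2, 3, 4};
static std::size_t pool_used = 0;
static int refill_calls = 0;

// Cold helper: calling it is what makes register saves necessary,
// which is the cost we would like to keep off the hot path.
static void slow_refill() {
  ++refill_calls;
  pool_used = 0;
}

// Hot path: hand out the next slot.  Cold path: refill first.
int *alloc_sketch() {
  if (UNLIKELY(pool_used == 4))
    slow_refill();  // unlikely; saves/restores are only needed here
  return &pool[pool_used++];
}
```

Whether a given target actually moves the saves depends on the backend, so
the generated assembly has to be inspected to confirm the effect.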




Thanks,
Jeff


Re: [patch] aarch64-*-freebsd* support for gcc.

2016-10-10 Thread Jeff Law

On 10/10/2016 03:07 PM, Andreas Tobler wrote:

Hi all,

the attached patch brings support for the aarch64-*-freebsd* target.

Bootstrapped and tested, results on the list. Not that many results due
to board instabilities; I lack a cavium ;)

Ok for main? And if yes, how far can I backport? Down to 5.4?

TIA,
Andreas

libgcc:

2016-10-10  Andreas Tobler  

* config.host: Add support for aarch64-*-freebsd*.

gcc:

2016-10-10  Andreas Tobler  

* config.gcc: Add aarch64-*-freebsd* support.
* config.host: Likewise.
* config/aarch64/aarch64-freebsd.h: New file.
* config/aarch64/t-aarch64-freebsd: Ditto.

toplevel:

2016-10-10  Andreas Tobler 

* configure.ac: Add aarch64-*-freebsd*.
* configure: Regenerate.
Certainly OK for the trunk.  Jakub, Richi & Joseph make the rules for 
the release branches.


jeff


[patch] aarch64-*-freebsd* support for gcc.

2016-10-10 Thread Andreas Tobler

Hi all,

the attached patch brings support for the aarch64-*-freebsd* target.

Bootstrapped and tested, results on the list. Not that many results due 
to board instabilities; I lack a cavium ;)


Ok for main? And if yes, how far can I backport? Down to 5.4?

TIA,
Andreas

libgcc:

2016-10-10  Andreas Tobler  

* config.host: Add support for aarch64-*-freebsd*.

gcc:

2016-10-10  Andreas Tobler  

* config.gcc: Add aarch64-*-freebsd* support.
* config.host: Likewise.
* config/aarch64/aarch64-freebsd.h: New file.
* config/aarch64/t-aarch64-freebsd: Ditto.

toplevel:

2016-10-10  Andreas Tobler 

* configure.ac: Add aarch64-*-freebsd*.
* configure: Regenerate.

Index: configure.ac
===
--- configure.ac  (revision 240948)
+++ configure.ac  (working copy)
@@ -727,6 +727,9 @@
   *-*-vxworks*)
 noconfigdirs="$noconfigdirs target-libffi"
 ;;
+  aarch64*-*-freebsd*)
+noconfigdirs="$noconfigdirs target-libffi"
+;;
   alpha*-*-*vms*)
 noconfigdirs="$noconfigdirs target-libffi"
 ;;
Index: gcc/config.gcc
===
--- gcc/config.gcc  (revision 240948)
+++ gcc/config.gcc  (working copy)
@@ -937,6 +937,11 @@
done
TM_MULTILIB_CONFIG=`echo $TM_MULTILIB_CONFIG | sed 's/^,//'`
;;
+aarch64*-*-freebsd*)
+   tm_file="${tm_file} dbxelf.h elfos.h ${fbsd_tm_file}"
+   tm_file="${tm_file} aarch64/aarch64-elf.h aarch64/aarch64-freebsd.h"
+   tmake_file="${tmake_file} aarch64/t-aarch64 aarch64/t-aarch64-freebsd"
+   ;;
 aarch64*-*-linux*)
tm_file="${tm_file} dbxelf.h elfos.h gnu-user.h linux.h glibc-stdint.h"
tm_file="${tm_file} aarch64/aarch64-elf.h aarch64/aarch64-linux.h"
Index: gcc/config.host
===
--- gcc/config.host (revision 240948)
+++ gcc/config.host (working copy)
@@ -99,7 +99,7 @@
 esac
 
 case ${host} in
-  aarch64*-*-linux*)
+  aarch64*-*-freebsd* | aarch64*-*-linux*)
 case ${target} in
   aarch64*-*-*)
host_extra_gcc_objs="driver-aarch64.o"
Index: gcc/config/aarch64/aarch64-freebsd.h
===
--- gcc/config/aarch64/aarch64-freebsd.h  (nonexistent)
+++ gcc/config/aarch64/aarch64-freebsd.h  (working copy)
@@ -0,0 +1,94 @@
+/* Definitions for AArch64 running FreeBSD
+   Copyright (C) 2016 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful, but
+   WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_AARCH64_FREEBSD_H
+#define GCC_AARCH64_FREEBSD_H
+
+#undef  SUBTARGET_CPP_SPEC
+#define SUBTARGET_CPP_SPEC FBSD_CPP_SPEC
+
+#if TARGET_BIG_ENDIAN_DEFAULT
+#define TARGET_LINKER_EMULATION  "aarch64fbsdb"
+#else
+#define TARGET_LINKER_EMULATION  "aarch64fbsd"
+#endif
+
+#undef  SUBTARGET_EXTRA_LINK_SPEC
+#define SUBTARGET_EXTRA_LINK_SPEC " -m" TARGET_LINKER_EMULATION
+
+#undef  FBSD_TARGET_LINK_SPEC
+#define FBSD_TARGET_LINK_SPEC " \
+%{p:%nconsider using `-pg' instead of `-p' with gprof (1) } \
+%{v:-V} \
+%{assert*} %{R*} %{rpath*} %{defsym*}   \
+%{shared:-Bshareable %{h*} %{soname*}}  \
+%{symbolic:-Bsymbolic}  \
+%{static:-Bstatic}  \
+%{!static:  \
+  %{rdynamic:-export-dynamic}   \
+  %{!shared:-dynamic-linker " FBSD_DYNAMIC_LINKER " }}  \
+-X" SUBTARGET_EXTRA_LINK_SPEC " \
+%{mbig-endian:-EB} %{mlittle-endian:-EL}"
+
+#if TARGET_FIX_ERR_A53_835769_DEFAULT
+#define CA53_ERR_835769_SPEC \
+  " %{!mno-fix-cortex-a53-835769:--fix-cortex-a53-835769}"
+#else
+#define CA53_ERR_835769_SPEC \
+  " %{mfix-cortex-a53-835769:--fix-cortex-a53-835769}"
+#endif
+
+#ifdef TARGET_FIX_ERR_A53_843419_DEFAULT
+#define CA53_ERR_843419_SPEC \
+  " %{!mno-fix-cortex-a53-843419:--fix-cortex-a53-843419}"
+#else
+#define CA53_ERR_843419_SPEC \
+  " %{mfix-cortex-a53-843419:--fix-cortex-a53-843419}"
+#endif
+

[tree-optimization/71947] Avoid unwanted propagations

2016-10-10 Thread Jeff Law



So if we have an equality conditional between A & B, we record into our 
const/copy tables A = B and B = A.


This helps us discover some of the more obscure equivalences. But it 
also creates problems with an expression like


A ^ B

Where we might cprop the first operand generating

B ^ B

Then the second generating

B ^ A

And we've lost the folding opportunity.  At first I'd tried folding 
after each propagation step, but that turns into a bit of a nightmare 
because of changes in the underlying structure of the gimple statement 
and cycles that may develop if we re-build the operand cache after folding.


This approach is simpler and should catch all these cases for binary 
operators.  We just track the last copy propagated argument and refuse 
to ping-pong propagations.
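
To make the idea concrete, here is a toy model of it.  This is only a
sketch under assumptions, not the tree-ssa-dom.c code: operands are plain
strings, the const/copy table is a std::map, and the names BinaryStmt,
cprop_sketch and folds_to_zero_under_xor are invented for the example.

```cpp
#include <cassert>
#include <initializer_list>
#include <map>
#include <string>

// Toy model: operands are names, and `copies` plays the role of the
// const/copy table after seeing "if (A == B)" (it holds A->B and B->A).
struct BinaryStmt { std::string lhs, rhs; };

BinaryStmt cprop_sketch(BinaryStmt s,
                        const std::map<std::string, std::string> &copies) {
  std::string last_propagated;  // empty: nothing propagated yet
  for (std::string *op : {&s.lhs, &s.rhs}) {
    // Refuse to ping-pong: do not replace the name we just propagated.
    if (!last_propagated.empty() && *op == last_propagated) continue;
    auto it = copies.find(*op);
    if (it == copies.end()) continue;
    *op = it->second;
    last_propagated = it->second;
  }
  return s;
}

// x ^ x (or x - x) folds to zero only when both operands are the same name.
bool folds_to_zero_under_xor(const BinaryStmt &s) { return s.lhs == s.rhs; }
```

With the table {A->B, B->A}, the statement A ^ B becomes B ^ B in this
model and stays that way, so the fold is still available.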


It fixes the tests from 71947 and 77647 without regressing (obviously). 
I've included an xfailed test for a more complex situation that we don't 
currently handle (would require backtracking from the equality 
comparison through the logicals that feed the equality comparison).


Bootstrapped and regression tested on x86_64.  Applied to the trunk.

commit 6223e6e425b6de916f0330b9dbe5698765d4a73c
Author: law 
Date:   Mon Oct 10 20:40:59 2016 +

PR tree-optimization/71947
* tree-ssa-dom.c (cprop_into_stmt): Avoid replacing A with B, then
B with A within a single statement.

PR tree-optimization/71947
* gcc.dg/tree-ssa/pr71947-1.c: New test.
* gcc.dg/tree-ssa/pr71947-2.c: New test.
* gcc.dg/tree-ssa/pr71947-3.c: New test.
* gcc.dg/tree-ssa/pr71947-4.c: New test.
* gcc.dg/tree-ssa/pr71947-5.c: New test.
* gcc.dg/tree-ssa/pr71947-6.c: New test.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@240947 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 1738bc7..16e25bf 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2016-10-10  Jeff Law  
+
+PR tree-optimization/71947
+   * tree-ssa-dom.c (cprop_into_stmt): Avoid replacing A with B, then
+   B with A within a single statement.
+
 2016-10-10  Bill Schmidt  
 
PR tree-optimization/77824
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 04966cf..e31bcc6 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,13 @@
+2016-10-10  Jeff Law  
+
+   PR tree-optimization/71947
+   * gcc.dg/tree-ssa/pr71947-1.c: New test.
+   * gcc.dg/tree-ssa/pr71947-2.c: New test.
+   * gcc.dg/tree-ssa/pr71947-3.c: New test.
+   * gcc.dg/tree-ssa/pr71947-4.c: New test.
+   * gcc.dg/tree-ssa/pr71947-5.c: New test.
+   * gcc.dg/tree-ssa/pr71947-6.c: New test.
+
 2016-10-10  Thomas Koenig  
 
PR fortran/77915
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-1.c
new file mode 100644
index 000..b033495
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-1.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */ 
+/* { dg-options "-O2 -fno-tree-vrp -fdump-tree-dom-details" } */
+
+
+int f(int x, int y)
+{
+   int ret;
+
+   if (x == y)
+ ret = x ^ y;
+   else
+ ret = 1;
+
+   return ret;
+}
+
+/* { dg-final { scan-tree-dump "Folded to: ret_\[0-9\]+ = 0;"  "dom2" } } */
+
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-2.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-2.c
new file mode 100644
index 000..de8f88b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-2.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-tree-vrp -fdump-tree-dom-details" } */
+
+
+int f(int x, int y)
+{
+  int ret;
+  if (x == y)
+ret = x - y;
+  else
+ret = 1;
+
+  return ret;
+}
+
+/* { dg-final { scan-tree-dump "Folded to: ret_\[0-9\]+ = 0;"  "dom2" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-3.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-3.c
new file mode 100644
index 000..e79847f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-3.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-tree-vrp -fdump-tree-dom-details" } */
+
+int f(int x, int y)
+{
+  int ret = 10;
+  if (x == y)
+ret = x  -  y;
+  return ret;
+}
+
+/* { dg-final { scan-tree-dump "Folded to: ret_\[0-9\]+ = 0;"  "dom2" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-4.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-4.c
new file mode 100644
index 000..a881f0d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-4.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-tree-vrp -fdump-tree-dom-details" } */
+
+
+
+static inline long load(long *p)
+{
+long ret;
+asm ("movq  %1,%0\n\t" : "=r" (ret) : "m" (*p));
+if (ret != *p)
+__builtin_unreachable();
+return ret;
+}
+
+long foo(long *mem)

Re: [PATCH] 77864 Fix noexcept conditions for map/set default constructors

2016-10-10 Thread Tim Song
Trying again...with a few edits.

> On Mon, Oct 10, 2016 at 3:24 PM, François Dumont 
> wrote:
>
> @@ -602,24 +612,32 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  struct _Rb_tree_impl : public _Node_allocator
>  {
>_Key_compare _M_key_compare;
> -  _Rb_tree_node_base _M_header;
> +  _Rb_header_node _M_header;
> +#if __cplusplus < 201103L
>size_type _M_node_count; // Keeps track of size of tree.
> +#else
> +  size_type _M_node_count = 0; // Keeps track of size of tree.
> +#endif
>
> +#if __cplusplus < 201103L
>_Rb_tree_impl()
> -  : _Node_allocator(), _M_key_compare(), _M_header(),
> -_M_node_count(0)
> -  { _M_initialize(); }
> +  : _M_node_count(0)
> +  { }
> +#else
> +  _Rb_tree_impl() = default;
> +#endif


The default constructor of the associative containers is required to
value-initialize the comparator (see their synopses in
[map/set/multimap/multiset.overview]).

 _Rb_tree_impl() = default; doesn't do that; it default-initializes the
 comparator instead.
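
A small sketch of the difference, using made-up types (BiasedLess,
ImplDefaultInit and ImplValueInit are hypothetical, not libstdc++ names).
The second struct shows one possible way to keep "= default" while still
value-initializing the comparator; it is an assumption for illustration,
not necessarily what the patch should do.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical comparator with state and an implicit (trivial) default ctor.
struct BiasedLess {
  int bias;
  bool operator()(int a, int b) const { return a + bias < b; }
};

// Mirrors the shape of "_Rb_tree_impl() = default;": the member is
// default-initialized, so cmp.bias is left indeterminate for an object
// created by default-initialization.
template <typename Compare>
struct ImplDefaultInit {
  Compare cmp;
  ImplDefaultInit() = default;
};

// One possible alternative: keep "= default" (and the deduced noexcept)
// but give the member a value-initializing default member initializer.
template <typename Compare>
struct ImplValueInit {
  Compare cmp = Compare();
  ImplValueInit() = default;
};

static_assert(std::is_nothrow_default_constructible<
                  ImplValueInit<BiasedLess> >::value,
              "noexcept is still deduced");
```

The trade-off is that a type with a default member initializer is no longer
trivially default-constructible.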

Tim


Fwd: C++ PATCH for c++/77890, 77912 (C++17 class deduction issues)

2016-10-10 Thread Jason Merrill
77890: we were losing the CLASS_PLACEHOLDER_TEMPLATE when reducing the
level of a TEMPLATE_TYPE_PARM.

77912: after 77890 was fixed, we were complaining about an undefined
deduction guide; set cp_unevaluated_operand to prevent that.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 927a7c6142c7e08bdc37e94e776201f801de0df8
Author: Jason Merrill 
Date:   Mon Oct 10 13:52:50 2016 -0400

C++17 class deduction issues

PR c++/77890
PR c++/77912
* pt.c (do_class_deduction): Set cp_unevaluated_operand.
(tsubst) [TEMPLATE_TYPE_PARM]: Copy CLASS_PLACEHOLDER_TEMPLATE.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index f6cd3ea..28b1c98 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -13233,11 +13233,15 @@ tsubst (tree t, tree args, tsubst_flags_t complain, 
tree in_decl)
TYPE_POINTER_TO (r) = NULL_TREE;
TYPE_REFERENCE_TO (r) = NULL_TREE;
 
-   /* Propagate constraints on placeholders.  */
 if (TREE_CODE (t) == TEMPLATE_TYPE_PARM)
-  if (tree constr = PLACEHOLDER_TYPE_CONSTRAINTS (t))
-   PLACEHOLDER_TYPE_CONSTRAINTS (r)
- = tsubst_constraint (constr, args, complain, in_decl);
+ {
+   /* Propagate constraints on placeholders.  */
+   if (tree constr = PLACEHOLDER_TYPE_CONSTRAINTS (t))
+ PLACEHOLDER_TYPE_CONSTRAINTS (r)
+   = tsubst_constraint (constr, args, complain, in_decl);
+   else if (tree pl = CLASS_PLACEHOLDER_TEMPLATE (t))
+ CLASS_PLACEHOLDER_TEMPLATE (r) = pl;
+ }
 
if (TREE_CODE (r) == TEMPLATE_TEMPLATE_PARM)
  /* We have reduced the level of the template
@@ -24431,9 +24435,10 @@ do_class_deduction (tree tmpl, tree init, 
tsubst_flags_t complain)
   return error_mark_node;
 }
 
+  ++cp_unevaluated_operand;
   tree t = build_new_function_call (cands, &args, /*koenig*/false,
complain|tf_decltype);
-
+  --cp_unevaluated_operand;
   release_tree_vector (args);
 
   return TREE_TYPE (t);
diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction19.C 
b/gcc/testsuite/g++.dg/cpp1z/class-deduction19.C
new file mode 100644
index 000..38327d1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction19.C
@@ -0,0 +1,20 @@
+// PR c++/77912
+// { dg-options -std=c++1z }
+
+template struct S{S(T){}}; 
+
+//error: invalid use of template type parameter 'S'
+template auto f(T t){return S(t);}
+
+int main()
+{
+  //fails
+  f(42);
+
+  //fails
+  //error: invalid use of template type parameter 'S'
+  [](auto a){return S(a);}(42); 
+
+  //works
+  [](int a){return S(a);}(42);
+}
diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction20.C 
b/gcc/testsuite/g++.dg/cpp1z/class-deduction20.C
new file mode 100644
index 000..58e8f7d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction20.C
@@ -0,0 +1,21 @@
+// PR c++/77890
+// { dg-options -std=c++1z }
+
+template struct S{S(F&){}}; 
+void f()
+{
+  S([]{});
+}
+
+template 
+struct scope_guard : TF
+{
+scope_guard(TF f) : TF{f} { }
+~scope_guard() { (*this)(); }
+};
+
+void g() 
+{
+struct K { void operator()() {} };
+scope_guard _{K{}};
+}


[PATCH], PR 77924, Fix PowerPC breakage on AIX

2016-10-10 Thread Michael Meissner
I accidentally broke AIX with my patch on October 6th.  That patch split
-mfloat128 into -mfloat128-type and -mfloat128 under PowerPC Linux.  This patch
fixes that issue.  I bootstrapped it on PowerPC Linux with no regressions, and
David Edelsohn reports that it fixes the problem on AIX.  Is it ok to apply the
patch?

2016-10-10  Michael Meissner  

PR target/77924
* config/rs6000/rs6000.c (rs6000_init_builtins): Only create the
distinct __ibm128 IBM extended double type if long doubles are
128-bits and the default format for long double is IEEE 128-bit.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 240941)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -16572,10 +16572,10 @@ rs6000_init_builtins (void)
  floating point, we need make sure the type is non-zero or else self-test
  fails during bootstrap.
 
- We don't register a built-in type for __ibm128 or __float128 if the type
- is the same as long double.  Instead we add a #define for __ibm128 or
- __float128 in rs6000_cpu_cpp_builtins to long double.  */
-  if (TARGET_IEEEQUAD || !TARGET_LONG_DOUBLE_128)
+ We don't register a built-in type for __ibm128 if the type is the same as
+ long double.  Instead we add a #define for __ibm128 in
+ rs6000_cpu_cpp_builtins to long double.  */
+  if (TARGET_LONG_DOUBLE_128 && FLOAT128_IEEE_P (TFmode))
 {
   ibm128_float_type_node = make_node (REAL_TYPE);
   TYPE_PRECISION (ibm128_float_type_node) = 128;

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



Re: PING! Re: [PATCH, Fortran] Extension: COTAN and degree-valued trig intrinsics with -fdec-math

2016-10-10 Thread Steve Kargl
On Mon, Oct 10, 2016 at 12:29:32PM -0700, Jerry DeLisle wrote:
> On 10/10/2016 08:06 AM, Fritz Reese wrote:
> > https://gcc.gnu.org/ml/fortran/2016-09/msg00163.html [original]
> > https://gcc.gnu.org/ml/fortran/2016-09/msg00183.html [latest]
> > 
> > On Wed, Sep 28, 2016 at 4:14 PM, Fritz Reese  wrote:
> >> Attached is a patch extending the GNU Fortran front-end to support
> >> some additional math intrinsics, enabled with a new compile flag
> >> -fdec-math. The flag adds the COTAN intrinsic (cotangent), as well as
> >> degree versions of all trigonometric intrinsics (SIND, TAND, ACOSD,
> >> etc...). This extension allows for further compatibility with legacy
> >> code that depends on the compiler to support such intrinsic functions.
> > 
> > Patch is still pending. Current draft of the patch is re-attached for
> > convenience, since it was amended twice since the original post. OK
> > for trunk?
> > 
> 
> OK, thanks for the work.
> 

Sorry about following behind. I did intend to review the patch, but
time got away from me.  There are a few small clean-ups that can be
done.  For example,

+static gfc_expr *
+get_radians (gfc_expr *deg)
+{
+  mpfr_t tmp;
...

+  /* Set factor = pi / 180.  */
+  factor = gfc_get_constant_expr (deg->ts.type, deg->ts.kind, &deg->where);
+  mpfr_const_pi (factor->value.real, GFC_RND_MODE);
+  mpfr_init (tmp);
+  mpfr_set_d (tmp, 180.0, GFC_RND_MODE);
+  mpfr_div (factor->value.real, factor->value.real, tmp, GFC_RND_MODE);
+  mpfr_clear (tmp);

the tmp variable is unneeded in the above.  Converting the double
precision 180.0 to mpfr_t and then dividing is probably slower
than just dividing by 180.

+  /* Set factor = pi / 180.  */
+  factor = gfc_get_constant_expr (deg->ts.type, deg->ts.kind, &deg->where);
+  mpfr_const_pi (factor->value.real, GFC_RND_MODE);
+  mpfr_div_ui (factor->value.real, factor->value.real, 180, GFC_RND_MODE);

Of course, the clean-up can be done post-commit by Fritz.
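
For readers unfamiliar with the intrinsics, here is a plain-double sketch of
what the conversion computes.  It only illustrates the arithmetic; the real
code operates on mpfr_t values, and the names radians_from_degrees,
sind_sketch and cotan_sketch are made up for the example.

```cpp
#include <cassert>
#include <cmath>

// Plain-double sketch of the conversion; the real code works on mpfr_t.
static const double kPi = std::acos(-1.0);

// factor = pi / 180, applied to an angle given in degrees.
double radians_from_degrees(double deg) { return deg * (kPi / 180.0); }

// Degree-valued sine: convert, then call the radian version.
double sind_sketch(double deg) { return std::sin(radians_from_degrees(deg)); }

// Cotangent of an angle in radians.
double cotan_sketch(double x) { return std::cos(x) / std::sin(x); }
```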

-- 
Steve


[libgo] Silence compiler error message

2016-10-10 Thread Eric Botcazou
Hi,

on Solaris the configuration of the library yields an ugly:

checking whether linker supports split/non-split linked together... cc1: 
error: '-fsplit-stack' is not supported by this compiler configuration
xgcc: error: conftest1.o: No such file or directory
no

Tested on x86-64/Linux and SPARC/Solaris, OK for the mainline?


2016-10-10  Eric Botcazou  

* configure.ac (libgo_cv_c_linker_split_non_split): Redirect compiler
output to /dev/null.
* configure: Regenerate.

-- 
Eric Botcazou
Index: configure.ac
===
--- configure.ac	(revision 240888)
+++ configure.ac	(working copy)
@@ -447,9 +447,9 @@ EOF
 cat > conftest2.c << EOF
 void f() {}
 EOF
-$CC -c -fsplit-stack $CFLAGS $CPPFLAGS conftest1.c
-$CC -c $CFLAGS $CPPFLAGS conftest2.c
-if $CC -o conftest conftest1.$ac_objext conftest2.$ac_objext; then
+$CC -c -fsplit-stack $CFLAGS $CPPFLAGS conftest1.c >/dev/null 2>&1
+$CC -c $CFLAGS $CPPFLAGS conftest2.c > /dev/null 2>&1
+if $CC -o conftest conftest1.$ac_objext conftest2.$ac_objext > /dev/null 2>&1; then
   libgo_cv_c_linker_split_non_split=yes
 else
   libgo_cv_c_linker_split_non_split=no


New Swedish PO file for 'gcc' (version 6.2.0)

2016-10-10 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Swedish team of translators.  The file is available at:

http://translationproject.org/latest/gcc/sv.po

(This file, 'gcc-6.2.0.sv.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




Move OVERRIDE/FINAL from gcc/coretypes.h to include/ansidecl.h (was: Re: [PATCH 1/2] Add OVERRIDE and FINAL macros to coretypes.h)

2016-10-10 Thread Pedro Alves
Please find below a patch moving the FINAL/OVERRIDE macros to
include/ansidecl.h, as I was suggesting in the earlier discussion:

On 05/06/2016 07:33 PM, Trevor Saunders wrote:
> On Fri, May 06, 2016 at 07:10:33PM +0100, Pedro Alves wrote:
>> On 05/06/2016 06:56 PM, Pedro Alves wrote:

>> I was going to suggest to put this in include/ansidecl.h,
>> so that all C++ libraries / programs in binutils-gdb use the same
>> thing, instead of each reinventing the wheel, and I found
>> there's already something there:
>>
>> /* This is used to mark a class or virtual function as final.  */
>> #if __cplusplus >= 201103L
>> #define GCC_FINAL final
>> #elif GCC_VERSION >= 4007
>> #define GCC_FINAL __final
>> #else
>> #define GCC_FINAL
>> #endif
>>
>> From:
>>
>>  https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00455.html
>>
>> Apparently the patch that actually uses that was reverted,
>> as I can't find any use.
> 
> Yeah, I wanted to use it to work around gdb not dealing well with stuff
> in the anon namespace, but somehow that broke aix, and some people
> objected and I haven't gotten back to it.
> 
>> I like your names without the GCC_ prefix better though,
>> for the same reason of standardizing binutils-gdb + gcc
>> on the same symbols.
> 
> I agree, though I'm not really sure when gdb / binutils stuff will
> support building as C++11.

Meanwhile, GDB master is C++-only nowadays, and we support building
with a C++11 compiler, provided there are C++03 fallbacks in place.
I'd like to start using FINAL/OVERRIDE, and it seems better to me to
standardize on the same symbol names across the trees.
This patch removes the existing GCC_FINAL macro, since nothing is
using it.
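
For reference, a minimal usage sketch.  The macro definitions are repeated
so the example stands alone, and the classes printer and narrow_printer are
hypothetical, not taken from GCC or GDB.

```cpp
#include <cassert>

// Same fallback scheme as the macros in the patch, repeated here so the
// sketch is self-contained.
#if __cplusplus >= 201103L
#  define OVERRIDE override
#  define FINAL final
#else
#  define OVERRIDE
#  define FINAL
#endif

// Hypothetical classes, not taken from GCC or GDB.
struct printer {
  virtual ~printer() {}
  virtual int width() const { return 80; }
};

struct narrow_printer : printer {
  // With C++11 the compiler rejects this if it does not override anything,
  // and rejects any further override in a subclass.  With C++03 both
  // macros expand to nothing and the code still builds.
  int width() const OVERRIDE FINAL { return 40; }
};
```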

OK to apply?

From: Pedro Alves 
Date: 2016-10-10 19:25:47 +0100

Move OVERRIDE/FINAL from gcc/coretypes.h to include/ansidecl.h

So that GDB and other projects that share the top level can use them.

Bootstrapped with all default languages on x86-64 Fedora 23.

gcc/ChangeLog:
yyyy-mm-dd  Pedro Alves  

* coretypes.h (OVERRIDE, FINAL): Delete, moved to
include/ansidecl.h.

include/ChangeLog:
yyyy-mm-dd  Pedro Alves  

* ansidecl.h (GCC_FINAL): Delete.
(OVERRIDE, FINAL): New, moved from gcc/coretypes.h.
---

 gcc/coretypes.h |   25 -
 1 file changed, 25 deletions(-)

diff --git a/gcc/coretypes.h b/gcc/coretypes.h
index fe1e984..a9c4df9 100644
--- a/gcc/coretypes.h
+++ b/gcc/coretypes.h
@@ -367,31 +367,6 @@ typedef void (*gt_pointer_operator) (void *, void *);
 typedef unsigned char uchar;
 #endif
 
-/* C++11 adds the ability to add "override" after an implementation of a
-   virtual function in a subclass, to:
- (A) document that this is an override of a virtual function
- (B) allow the compiler to issue a warning if it isn't (e.g. a mismatch
- of the type signature).
-
-   Similarly, it allows us to add a "final" to indicate that no subclass
-   may subsequently override the vfunc.
-
-   Provide OVERRIDE and FINAL as macros, allowing us to get these benefits
-   when compiling with C++11 support, but without requiring C++11.
-
-   For gcc, use "-std=c++11" to enable C++11 support; gcc 6 onwards enables
-   this by default (actually GNU++14).  */
-
-#if __cplusplus >= 201103
-/* C++11 claims to be available: use it: */
-#define OVERRIDE override
-#define FINAL final
-#else
-/* No C++11 support; leave the macros empty: */
-#define OVERRIDE
-#define FINAL
-#endif
-
 /* Most host source files will require the following headers.  */
 #if !defined (GENERATOR_FILE) && !defined (USED_FOR_TARGET)
 #include "machmode.h"
diff --git a/include/ansidecl.h b/include/ansidecl.h
index 6e4bfc2..ee93421 100644
--- a/include/ansidecl.h
+++ b/include/ansidecl.h
@@ -313,13 +313,29 @@ So instead we use the macro below and test it against 
specific values.  */
 #define ENUM_BITFIELD(TYPE) unsigned int
 #endif
 
-/* This is used to mark a class or virtual function as final.  */
-#if __cplusplus >= 201103L
-#define GCC_FINAL final
-#elif GCC_VERSION >= 4007
-#define GCC_FINAL __final
+/* C++11 adds the ability to add "override" after an implementation of a
+   virtual function in a subclass, to:
+ (A) document that this is an override of a virtual function
+ (B) allow the compiler to issue a warning if it isn't (e.g. a mismatch
+ of the type signature).
+
+   Similarly, it allows us to add a "final" to indicate that no subclass
+   may subsequently override the vfunc.
+
+   Provide OVERRIDE and FINAL as macros, allowing us to get these benefits
+   when compiling with C++11 support, but without requiring C++11.
+
+   For gcc, use "-std=c++11" to enable C++11 support; gcc 6 onwards enables
+   this by default (actually GNU++14).  */
+
+#if __cplusplus >= 201103
+/* C++11 claims to be available: use it: */
+#define OVERRIDE override
+#define FINAL final
 #else
-#define GCC_FINAL
+/* No C++11 support; leave the macros empty: */

Re: PING! Re: [PATCH, Fortran] Extension: COTAN and degree-valued trig intrinsics with -fdec-math

2016-10-10 Thread Jerry DeLisle
On 10/10/2016 08:06 AM, Fritz Reese wrote:
> https://gcc.gnu.org/ml/fortran/2016-09/msg00163.html [original]
> https://gcc.gnu.org/ml/fortran/2016-09/msg00183.html [latest]
> 
> On Wed, Sep 28, 2016 at 4:14 PM, Fritz Reese  wrote:
>> Attached is a patch extending the GNU Fortran front-end to support
>> some additional math intrinsics, enabled with a new compile flag
>> -fdec-math. The flag adds the COTAN intrinsic (cotangent), as well as
>> degree versions of all trigonometric intrinsics (SIND, TAND, ACOSD,
>> etc...). This extension allows for further compatibility with legacy
>> code that depends on the compiler to support such intrinsic functions.
> 
> Patch is still pending. Current draft of the patch is re-attached for
> convenience, since it was amended twice since the original post. OK
> for trunk?
> 

OK, thanks for the work.

Jerry


Re: [PATCH] 77864 Fix noexcept conditions for map/set default constructors

2016-10-10 Thread François Dumont

On 09/10/2016 17:14, Jonathan Wakely wrote:

On 08/10/16 22:55 +0200, François Dumont wrote:

On 06/10/2016 23:34, Jonathan Wakely wrote:

On 06/10/16 22:17 +0200, François Dumont wrote:
Another approach is to rely on existing compiler ability to compute 
conditional noexcept when defaulting implementations. This is what 
I have done in this patch.


The new default constructor on _Rb_tree_node_base is not a problem 
as it is not used to build _Rb_tree_node.


Why not?


_Rb_tree_node_base is used in two contexts: as a member of _Rb_tree_impl, 
in which case we need the new default constructor, and as the base 
class of _Rb_tree_node, which is never constructed. Nodes are 
allocated and then the associated value is constructed through the 
allocator; the node default constructor itself is never invoked.


In C++03 mode that is true, but it's only valid because the type is
trivially-constructible. If the type requires "non-vacuous
initialization" then it's not valid to allocate memory for it and
start using it without invoking a constructor.


  Good to know.


If you add a
non-trivial constructor then we can't do that any more.

In C++11 and later, see line 550 in 

   ::new(__node) _Rb_tree_node<_Val>;

This default-constructs a tree node. Currently there is no
user-provided default constructor, so default-construction does no
initialization. Adding your constructor would mean it is used for
every node.


I missed this call, indeed. I should have deleted the default constructor 
and run a compilation to be sure.




   If you think it is cleaner to create an intermediate type that 
will take care of this initialization through its default constructor 
I can do that.




I'll try to do the same for copy constructor/assignment and move 
constructor/assignment.


We need to make sure we don't change whether any of those operations
are trivial (which shouldn't be a problem for copy/move, because they
are definitely very non-trivial and will stay that way!)

Does this change the default constructors from non-trivial to trivial?
It would be a major compiler bug if making a constructor default was 
making it trivial.


I must be misunderstanding you, because this is not a bug:


No, my fault, I was misunderstanding you. Now that I know about validity 
of using a "non-constructed" type only if trivial, it is much clearer.


So here is the fixed patch with your proposed intermediate type 
containing the necessary default constructor.
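
The idea, as a simplified sketch (NodeBase, HeaderNode, Node and make_node
are stand-in names for illustration, not the actual libstdc++ types): only
the header type gets a user-provided constructor, so the node type stays
trivially default-constructible and placement-new of a node still does no
initialization.

```cpp
#include <cassert>
#include <new>
#include <type_traits>

// Hypothetical, simplified stand-ins for the libstdc++ tree types.
struct NodeBase {  // no user-provided constructor: stays trivial
  NodeBase *parent, *left, *right;
};

struct HeaderNode : NodeBase {  // only the header gets real initialization
  HeaderNode() {
    parent = nullptr;
    left = right = this;
  }
};

template <typename Val>
struct Node : NodeBase {
  Val value;
};

static_assert(std::is_trivially_default_constructible<Node<int> >::value,
              "placement-new of a node performs no initialization");

// Nodes are created in raw storage; default-construction is a no-op and
// the value would be constructed separately (via the allocator in the
// real container).
Node<int> *make_node(void *storage) { return ::new (storage) Node<int>; }
```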


Being tested; OK to commit if successful?

François

diff --git a/libstdc++-v3/include/bits/stl_map.h b/libstdc++-v3/include/bits/stl_map.h
index e5b2a1b..dea7d5b 100644
--- a/libstdc++-v3/include/bits/stl_map.h
+++ b/libstdc++-v3/include/bits/stl_map.h
@@ -167,11 +167,11 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   /**
*  @brief  Default constructor creates no elements.
*/
-  map()
-  _GLIBCXX_NOEXCEPT_IF(
-	  is_nothrow_default_constructible::value
-	  && is_nothrow_default_constructible::value)
-  : _M_t() { }
+#if __cplusplus < 201103L
+  map() : _M_t() { }
+#else
+  map() = default;
+#endif
 
   /**
*  @brief  Creates a %map with no elements.
diff --git a/libstdc++-v3/include/bits/stl_multimap.h b/libstdc++-v3/include/bits/stl_multimap.h
index d240427..7e86b76 100644
--- a/libstdc++-v3/include/bits/stl_multimap.h
+++ b/libstdc++-v3/include/bits/stl_multimap.h
@@ -164,11 +164,11 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   /**
*  @brief  Default constructor creates no elements.
*/
-  multimap()
-  _GLIBCXX_NOEXCEPT_IF(
-	  is_nothrow_default_constructible::value
-	  && is_nothrow_default_constructible::value)
-  : _M_t() { }
+#if __cplusplus < 201103L
+  multimap() : _M_t() { }
+#else
+  multimap() = default;
+#endif
 
   /**
*  @brief  Creates a %multimap with no elements.
diff --git a/libstdc++-v3/include/bits/stl_multiset.h b/libstdc++-v3/include/bits/stl_multiset.h
index cc068a9..7fe2fbd 100644
--- a/libstdc++-v3/include/bits/stl_multiset.h
+++ b/libstdc++-v3/include/bits/stl_multiset.h
@@ -144,11 +144,11 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   /**
*  @brief  Default constructor creates no elements.
*/
-  multiset()
-  _GLIBCXX_NOEXCEPT_IF(
-	  is_nothrow_default_constructible::value
-	  && is_nothrow_default_constructible::value)
-  : _M_t() { }
+#if __cplusplus < 201103L
+  multiset() : _M_t() { }
+#else
+  multiset() = default;
+#endif
 
   /**
*  @brief  Creates a %multiset with no elements.
diff --git a/libstdc++-v3/include/bits/stl_set.h b/libstdc++-v3/include/bits/stl_set.h
index 3938351..5ed9672 100644
--- a/libstdc++-v3/include/bits/stl_set.h
+++ b/libstdc++-v3/include/bits/stl_set.h
@@ -147,11 +147,11 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   /**
*  @brief  Default constructor creates no elements.
*/
-  set()
-  _GLIBCXX_NOEXCEPT_IF(
-	  is_nothrow_default_constructible::value
-	  && 

Re: [v3 PATCH] Make any's copy assignment operator exception-safe, don't copy the underlying value when any is moved, make in_place constructors explicit.

2016-10-10 Thread Ville Voutilainen
On 10 October 2016 at 21:19, Jonathan Wakely  wrote:
> I prefer to put "explicit" on a line of its own, as we do for return
> types, but I won't complain if you leave it like this.

Changed.

>> + any(__rhs).swap(*this);
>
>
> I was trying to avoid the "redundant" xfer operations that the swap
> does, but I don't think we can do that and be exception safe. This is
> simple and safe, and I think it's optimal. Thanks.

Right, as discussed, this is now just a move assignment from a temporary.

>
>> }
>>   return *this;
>> }
>> @@ -232,7 +228,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>   else if (this != &__rhs)
>> {
>>   if (has_value())
>> -   _M_manager(_Op_destroy, this, nullptr);
>> +   reset();
>
>
> If you're going to use reset() then you don't need the has_value()
> check first. I think the reason I didn't use reset() was to avoid the

I removed the check, works fine.

> dead store to _M_manager that reset() does, since the compiler might
> not detect it's dead (because the next store is done by the call
> through a function pointer).
> This code was all pretty carefully written to avoid any redundant
> operations. Does this change buy us anything except simpler code?

As discussed, destroying the value but leaving the manager non-null will
do bad things.

New patch attached, ok for trunk?

2016-10-10  Ville Voutilainen  

Make any's copy assignment operator exception-safe,
don't copy the underlying value when any is moved,
make in_place constructors explicit.
* include/std/any (any(in_place_type_t<_ValueType>, _Args&&...)):
Make explicit.
(any(in_place_type_t<_ValueType>, initializer_list<_Up>, _Args&&...)):
Likewise.
(operator=(const any&)): Make strongly exception-safe.
(operator=(any&&)): reset() unconditionally in the case where
rhs has a value.
(operator=(_ValueType&&)): Indent the return type.
(_Manager_internal<_Tp>::_S_manage): Move in _Op_xfer, don't copy.
* testsuite/20_util/any/assign/2.cc: Adjust.
* testsuite/20_util/any/assign/exception.cc: New.
* testsuite/20_util/any/cons/2.cc: Adjust.
* testsuite/20_util/any/cons/explicit.cc: New.
* testsuite/20_util/any/misc/any_cast_neg.cc: Adjust.
diff --git a/libstdc++-v3/include/std/any b/libstdc++-v3/include/std/any
index 9160035..45a2145 100644
--- a/libstdc++-v3/include/std/any
+++ b/libstdc++-v3/include/std/any
@@ -179,6 +179,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  typename _Tp = _Decay<_ValueType>,
  typename _Mgr = _Manager<_Tp>,
   __any_constructible_t<_Tp, _Args&&...> = false>
+  explicit
   any(in_place_type_t<_ValueType>, _Args&&... __args)
   : _M_manager(&_Mgr::_S_manage)
   {
@@ -192,6 +193,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  typename _Mgr = _Manager<_Tp>,
   __any_constructible_t<_Tp, initializer_list<_Up>,
_Args&&...> = false>
+  explicit
   any(in_place_type_t<_ValueType>,
  initializer_list<_Up> __il, _Args&&... __args)
   : _M_manager(&_Mgr::_S_manage)
@@ -207,16 +209,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 /// Copy the state of another object.
 any& operator=(const any& __rhs)
 {
-  if (!__rhs.has_value())
-   reset();
-  else if (this != &__rhs)
-   {
- if (has_value())
-   _M_manager(_Op_destroy, this, nullptr);
- _Arg __arg;
- __arg._M_any = this;
- __rhs._M_manager(_Op_clone, &__rhs, &__arg);
-   }
+  *this = any(__rhs);
   return *this;
 }
 
@@ -231,8 +224,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
reset();
   else if (this != &__rhs)
{
- if (has_value())
-   _M_manager(_Op_destroy, this, nullptr);
+ reset();
  _Arg __arg;
  __arg._M_any = this;
  __rhs._M_manager(_Op_xfer, &__rhs, &__arg);
@@ -243,7 +235,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 /// Store a copy of @p __rhs as the contained object.
 template>
-enable_if_t::value, any&>
+  enable_if_t::value, any&>
   operator=(_ValueType&& __rhs)
   {
*this = any(std::forward<_ValueType>(__rhs));
@@ -556,7 +548,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
__ptr->~_Tp();
break;
   case _Op_xfer:
-   ::new(&__arg->_M_any->_M_storage._M_buffer) _Tp(*__ptr);
+   ::new(&__arg->_M_any->_M_storage._M_buffer) _Tp
+ (std::move(*const_cast<_Tp*>(__ptr)));
__ptr->~_Tp();
__arg->_M_any->_M_manager = __any->_M_manager;
const_cast(__any)->_M_manager = nullptr;
diff --git a/libstdc++-v3/testsuite/20_util/any/assign/2.cc b/libstdc++-v3/testsuite/20_util/any/assign/2.cc
index b333e5d..28f06a0 100644
--- a/libstdc++-v3/testsuite/20_util/any/assign/2.cc
+++ 

[PATCH] Update docs on libstdc++ source-code layout

2016-10-10 Thread Jonathan Wakely

Self-explanatory updates to the docs, and regenerating after the
various recent changes.

* doc/xml/manual/appendix_contributing.xml (contrib.organization):
Describe other subdirectories and add markup. Remove outdated
reference to check-script target.
* doc/html/*: Regenerate.

Committed to trunk.

commit 40bed069fd9497174b398c683d684fc825867cb7
Author: Jonathan Wakely 
Date:   Mon Oct 10 19:54:50 2016 +0100

Update docs on libstdc++ source-code layout

* doc/xml/manual/appendix_contributing.xml (contrib.organization):
Describe other subdirectories and add markup. Remove outdated
reference to check-script target.
* doc/html/*: Regenerate.

diff --git a/libstdc++-v3/doc/xml/manual/appendix_contributing.xml b/libstdc++-v3/doc/xml/manual/appendix_contributing.xml
index d7df13c..ee35dd9 100644
--- a/libstdc++-v3/doc/xml/manual/appendix_contributing.xml
+++ b/libstdc++-v3/doc/xml/manual/appendix_contributing.xml
@@ -199,91 +199,104 @@
   
 
   
-The unpacked source directory of libstdc++ contains the files
-needed to create the GNU C++ Library.
+The libstdc++-v3 directory in the
+GCC sources contains the files needed to create the GNU C++ Library.
   
 
   
 It has subdirectories:
 
-  doc
+  doc
 Files in HTML and text format that document usage, quirks of the
 implementation, and contributor checklists.
 
-  include
+  include
 All header files for the C++ library are within this directory,
 modulo specific runtime-related files that are in the libsupc++
 directory.
 
-include/std
+include/std
   Files meant to be found by #include name directives in
   standard-conforming user programs.
 
-include/c
+include/c
   Headers intended to directly include standard C headers.
-  [NB: this can be enabled via --enable-cheaders=c]
+  [NB: this can be enabled via --enable-cheaders=c]
 
-include/c_global
+include/c_global
   Headers intended to include standard C headers in
   the global namespace, and put select names into the std::
   namespace.  [NB: this is the default, and is the same as
-  --enable-cheaders=c_global]
+  --enable-cheaders=c_global]
 
-include/c_std
+include/c_std
   Headers intended to include standard C headers
   already in namespace std, and put select names into the std::
-  namespace.  [NB: this is the same as --enable-cheaders=c_std]
+  namespace.  [NB: this is the same as
+  --enable-cheaders=c_std]
 
-include/bits
+include/bits
   Files included by standard headers and by other files in
   the bits directory.
 
-include/backward
+include/backward
   Headers provided for backward compatibility, such as iostream.h.
   They are not used in this library.
 
-include/ext
+include/ext
   Headers that define extensions to the standard library.  No
   standard header refers to any of them.
 
-  scripts
+  scripts
 Scripts that are used during the configure, build, make, or test
 process.
 
-  src
+  src
 Files that are used in constructing the library, but are not
 installed.
 
-  testsuites/[backward, demangle, ext, performance, thread, 17_* to 30_*]
+src/c++98
+Source files compiled using -std=gnu++98.
+
+src/c++11
+Source files compiled using -std=gnu++11.
+
+src/filesystem
+Source files for the Filesystem TS.
+
+src/shared
+Source code included by other files under both
+src/c++98 and
+src/c++11
+
+  testsuites/[backward, demangle, ext, 
performance, thread, 17_* to 30_*]
 Test programs are here, and may be used to begin to exercise the
 library.  Support for "make check" and "make check-install" is
 complete, and runs through all the subdirectories here when this
 command is issued from the build directory.  Please note that
-"make check" requires DejaGNU 1.4 or later to be installed.  Please
-note that "make check-script" calls the script mkcheck, which
-requires bash, and which may need the paths to bash adjusted to
-work properly, as /bin/bash is assumed.
+"make check" requires DejaGNU 1.4 or later to be installed.
 
 Other subdirectories contain variant versions of certain files
 that are meant to be copied or linked by the configure script.
 Currently these are:
 
-  config/abi
-  config/cpu
-  config/io
-  config/locale
-  config/os
+  config/abi
+  config/cpu
+  config/io
+  config/locale
+  config/os
 
 In addition, a subdirectory holds the convenience library libsupc++.
 
-  libsupc++
+  libsupc++
 Contains the runtime library for C++, including exception
 handling and memory allocation and deallocation, RTTI, terminate
 handlers, etc.
 
-Note that glibc also has a bits/ subdirectory.  We will either
-need to be careful not to collide with names in its bits/
-directory; or rename bits to (e.g.) cppbits/.
+Note that 

[PATCH] Fix PR77824

2016-10-10 Thread Bill Schmidt
Hi,

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77824 reports unreachable
code where MODIFY_EXPR is being tested instead of SSA_NAME to identify
RHS's for copies.  This patch corrects that.  I instrumented the
compiler to identify copies being added to the candidate table, and
found that this now occurs frequently in GCC's support libraries as
well as throughout SPEC CPU2006.  I spot-checked the SLSR dumps for a
number of code examples and found that, while copies often now appear
in the candidate table, and sometimes appear in candidate chains
representing potential opportunities, I have not yet found a place
where this changes code generation.

Bootstrapped and tested for powerpc64le-unknown-linux-gnu with no
regressions, committed.
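
As a rough illustration of why the old test was unreachable, here is a
toy model. The enum and struct are invented for this sketch and are not
GCC's tree codes or gimple API. It assumes only what the report above
says: for a plain copy the RHS code seen by the pass is SSA_NAME, so a
comparison against MODIFY_EXPR can never match.

```cpp
// Toy model only: these enumerators mimic, but are not, GCC's tree codes.
enum class code { ssa_name, modify_expr, negate_expr, plus_expr };

// Model of an assignment "lhs = rhs" that records only the RHS code.
struct assign { code rhs_code; };

// A plain copy "x_1 = y_2": the RHS is itself an SSA name.
constexpr assign make_copy() { return assign{code::ssa_name}; }

// The test used before the patch: never true for a copy in this model.
constexpr bool old_is_copy(assign s) { return s.rhs_code == code::modify_expr; }

// The test used after the patch: matches copies.
constexpr bool new_is_copy(assign s) { return s.rhs_code == code::ssa_name; }

static_assert(!old_is_copy(make_copy()), "old test misses copies");
static_assert(new_is_copy(make_copy()), "new test recognizes copies");
```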

Thanks,

Bill



2016-10-10  Bill Schmidt  

PR tree-optimization/77824
* gimple-ssa-strength-reduction.c (stmt_cost): Explicitly return
zero cost for copies.
(find_candidates_dom_walker::before_dom_children): Replace
MODIFY_EXPR with SSA_NAME.
(replace_mult_candidate): Likewise.
(replace_profitable_candidates): Likewise.


Index: gcc/gimple-ssa-strength-reduction.c
===
--- gcc/gimple-ssa-strength-reduction.c (revision 240924)
+++ gcc/gimple-ssa-strength-reduction.c (working copy)
@@ -688,6 +688,9 @@ stmt_cost (gimple *gs, bool speed)
 
 /* Note that we don't assign costs to copies that in most cases
will go away.  */
+case SSA_NAME:
+  return 0;
+  
 default:
   ;
 }
@@ -1693,7 +1696,7 @@ find_candidates_dom_walker::before_dom_children (b
  gcc_fallthrough ();
 
CASE_CONVERT:
-   case MODIFY_EXPR:
+   case SSA_NAME:
case NEGATE_EXPR:
  rhs1 = gimple_assign_rhs1 (gs);
  if (TREE_CODE (rhs1) != SSA_NAME)
@@ -1724,7 +1727,7 @@ find_candidates_dom_walker::before_dom_children (b
  slsr_process_cast (gs, rhs1, speed);
  break;
 
-   case MODIFY_EXPR:
+   case SSA_NAME:
  slsr_process_copy (gs, rhs1, speed);
  break;
 
@@ -2010,7 +2013,7 @@ replace_mult_candidate (slsr_cand_t c, tree basis_
   && bump.to_shwi () != HOST_WIDE_INT_MIN
   /* It is not useful to replace casts, copies, or adds of
 an SSA name and a constant.  */
-  && cand_code != MODIFY_EXPR
+  && cand_code != SSA_NAME
   && !CONVERT_EXPR_CODE_P (cand_code)
   && cand_code != PLUS_EXPR
   && cand_code != POINTER_PLUS_EXPR
@@ -3445,7 +3448,7 @@ replace_profitable_candidates (slsr_cand_t c)
 to a cast or copy.  */
   if (i >= 0
  && profitable_increment_p (i) 
- && orig_code != MODIFY_EXPR
+ && orig_code != SSA_NAME
  && !CONVERT_EXPR_CODE_P (orig_code))
{
  if (phi_dependent_cand_p (c))



Re: [PATCH 10/16] Introduce class function_reader (v3)

2016-10-10 Thread Richard Sandiford
David Malcolm  writes:
> On Wed, 2016-10-05 at 18:00 +0200, Bernd Schmidt wrote:
>> On 10/05/2016 06:15 PM, David Malcolm wrote:
>> >* errors.c: Use consistent pattern for bconfig.h vs config.h
>> >includes.
>> >(progname): Wrap with #ifdef GENERATOR_FILE.
>> >(error): Likewise.  Add "error: " to message.
>> >(fatal): Likewise.
>> >(internal_error): Likewise.
>> >(trim_filename): Likewise.
>> >(fancy_abort): Likewise.
>> >* errors.h (struct file_location): Move here from read-md.h.
>> >(file_location::file_location): Likewise.
>> >(error_at): New decl.
>> 
>> Can you split these out into a separate patch as well? I'll require
>> more 
>> explanation for them and they seem largely independent.
>
> [CCing Richard Sandiford]
>
> The gen* tools have their own diagnostics system, in errors.c:
>
> /* warning, error, and fatal.  These definitions are suitable for use
>in the generator programs; the compiler has a more elaborate suite
>of diagnostic printers, found in diagnostic.c.  */
>
> with file locations tracked using read-md.h's struct file_location,
> rather than location_t (aka libcpp's source_location).
>
> Implementing an RTL frontend by using the RTL reader from read-rtl.c
> means that we now need a diagnostics subsystem on the *host* for
> handling errors in RTL files, rather than just on the build machine.
>
> There seem to be two ways to do this:
>
>   (A) build the "light" diagnostics system (errors.c) for the host as
> well as build machine, and link it with the RTL reader there, so there
> are two parallel diagnostics subsystems.
>
>   (B) build the "real" diagnostics system (diagnostics*) for the
> *build* machine as well as the host, and use it from the gen* tools,
> eliminating the "light" system, and porting the gen* tools to use
> libcpp for location tracking.
>
> Approach (A) seems to be simpler, which is what this part of the patch
> does.
>
> I've experimented with approach (B).  I think it's doable, but it's
> much more invasive (perhaps needing a libdiagnostics.a and a
> build/libdiagnostics.a in gcc/Makefile.in), so I hope this can be
> followup work.
>
> I can split the relevant parts out into a separate patch, but I was
> wondering if either of you had a strong opinion on (A) vs (B) before I
> do so?

(A) sounds fine to me FWIW.  And sorry for the slow reply.

Thanks,
Richard


[PATCH] Correct C++11 implementation status docs

2016-10-10 Thread Jonathan Wakely

The std::list allocator status and the note about timed mutexes are
out of date, those are both completely implemented now (there's a
fallback timed mutex for targets without _POSIX_TIMEOUTS).

* doc/xml/manual/status_cxx2011.xml: Correct C++11 status.

Committed to trunk. I'll backport this to the branches as appropriate.


commit cdeed69de9aae70a15633a160378d84fbd03478c
Author: Jonathan Wakely 
Date:   Mon Oct 10 19:33:23 2016 +0100

Correct C++11 implementation status docs

* doc/xml/manual/status_cxx2011.xml: Correct C++11 status.

diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2011.xml b/libstdc++-v3/doc/xml/manual/status_cxx2011.xml
index 83a266f..705f2ee 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2011.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2011.xml
@@ -1340,12 +1340,10 @@ particular release.
   
 
 
-  
   23.2.1
   General container requirements
-  Partial
-  list does not meet the requirements
- relating to allocator use and propagation.
+  Y
+  
 
 
   23.2.2
@@ -1396,11 +1394,10 @@ particular release.
   
 
 
-  
   23.3.5
   Class template list
-  Partial
-  Incomplete allocator support.
+  Y
+  
 
 
   23.3.6
@@ -2349,8 +2346,7 @@ particular release.
   30.4.1.3
   Timed mutex types
   
-  On POSIX sytems these types are only defined if the OS
- supports the POSIX Timeouts option. 
+  
 
 
   30.4.1.3.1


[PATCH] Use noexcept instead of _GLIBCXX_USE_NOEXCEPT

2016-10-10 Thread Jonathan Wakely

This file is compiled with -std=gnu++11 so there's no need to use the
macro, we can use noexcept directly.

* libsupc++/eh_ptr.cc (exception_ptr): Replace _GLIBCXX_USE_NOEXCEPT
with noexcept.

Tested powerpc64le-linux, committed to trunk.
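
For readers unfamiliar with the macro, the following is a small sketch
of the idea. MY_USE_NOEXCEPT and handle are hypothetical names for this
example; the macro is only modelled on how such a compatibility macro
typically behaves (noexcept in C++11 and later, throw() before that).
When the translation unit is always built as C++11, the two spellings
below are equivalent, which is why the macro is unnecessary there.

```cpp
#include <utility>

// Hypothetical compatibility macro, modelled on the usual pattern.
#if __cplusplus >= 201103L
# define MY_USE_NOEXCEPT noexcept
#else
# define MY_USE_NOEXCEPT throw()
#endif

struct handle
{
  // Spelled through the macro, as needed in code shared with C++98.
  void* get_via_macro() const MY_USE_NOEXCEPT { return nullptr; }

  // Spelled directly, fine when the file is always built as C++11.
  void* get_direct() const noexcept { return nullptr; }
};

// In C++11 both declarations have the same exception specification.
static_assert(noexcept(std::declval<const handle&>().get_via_macro()),
              "macro form is noexcept in C++11");
static_assert(noexcept(std::declval<const handle&>().get_direct()),
              "direct form is noexcept");
```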


commit f80e14a01697a34b835638a303967c0a7ad194a1
Author: Jonathan Wakely 
Date:   Mon Oct 10 18:38:20 2016 +0100

Use noexcept instead of _GLIBCXX_USE_NOEXCEPT

* libsupc++/eh_ptr.cc (exception_ptr): Replace _GLIBCXX_USE_NOEXCEPT
with noexcept.

diff --git a/libstdc++-v3/libsupc++/eh_ptr.cc b/libstdc++-v3/libsupc++/eh_ptr.cc
index 3b8e0a01..f3c910b 100644
--- a/libstdc++-v3/libsupc++/eh_ptr.cc
+++ b/libstdc++-v3/libsupc++/eh_ptr.cc
@@ -63,33 +63,31 @@ static_assert( adjptr<__cxa_exception>()
 #endif
 }
 
-std::__exception_ptr::exception_ptr::exception_ptr() _GLIBCXX_USE_NOEXCEPT
+std::__exception_ptr::exception_ptr::exception_ptr() noexcept
 : _M_exception_object(0) { }
 
 
-std::__exception_ptr::exception_ptr::exception_ptr(void* obj)
-_GLIBCXX_USE_NOEXCEPT
+std::__exception_ptr::exception_ptr::exception_ptr(void* obj) noexcept
 : _M_exception_object(obj)  { _M_addref(); }
 
 
-std::__exception_ptr::exception_ptr::exception_ptr(__safe_bool)
-_GLIBCXX_USE_NOEXCEPT
+std::__exception_ptr::exception_ptr::exception_ptr(__safe_bool) noexcept
 : _M_exception_object(0) { }
 
 
 std::__exception_ptr::
-exception_ptr::exception_ptr(const exception_ptr& other) _GLIBCXX_USE_NOEXCEPT
+exception_ptr::exception_ptr(const exception_ptr& other) noexcept
 : _M_exception_object(other._M_exception_object)
 { _M_addref(); }
 
 
-std::__exception_ptr::exception_ptr::~exception_ptr() _GLIBCXX_USE_NOEXCEPT
+std::__exception_ptr::exception_ptr::~exception_ptr() noexcept
 { _M_release(); }
 
 
 std::__exception_ptr::exception_ptr&
 std::__exception_ptr::
-exception_ptr::operator=(const exception_ptr& other) _GLIBCXX_USE_NOEXCEPT
+exception_ptr::operator=(const exception_ptr& other) noexcept
 {
   exception_ptr(other).swap(*this);
   return *this;
@@ -97,7 +95,7 @@ exception_ptr::operator=(const exception_ptr& other) _GLIBCXX_USE_NOEXCEPT
 
 
 void
-std::__exception_ptr::exception_ptr::_M_addref() _GLIBCXX_USE_NOEXCEPT
+std::__exception_ptr::exception_ptr::_M_addref() noexcept
 {
   if (_M_exception_object)
 {
@@ -109,7 +107,7 @@ std::__exception_ptr::exception_ptr::_M_addref() _GLIBCXX_USE_NOEXCEPT
 
 
 void
-std::__exception_ptr::exception_ptr::_M_release() _GLIBCXX_USE_NOEXCEPT
+std::__exception_ptr::exception_ptr::_M_release() noexcept
 {
   if (_M_exception_object)
 {
@@ -128,13 +126,12 @@ std::__exception_ptr::exception_ptr::_M_release() _GLIBCXX_USE_NOEXCEPT
 
 
 void*
-std::__exception_ptr::exception_ptr::_M_get() const _GLIBCXX_USE_NOEXCEPT
+std::__exception_ptr::exception_ptr::_M_get() const noexcept
 { return _M_exception_object; }
 
 
 void
-std::__exception_ptr::exception_ptr::swap(exception_ptr &other)
-  _GLIBCXX_USE_NOEXCEPT
+std::__exception_ptr::exception_ptr::swap(exception_ptr &other) noexcept
 {
   void *tmp = _M_exception_object;
   _M_exception_object = other._M_exception_object;
@@ -144,27 +141,24 @@ std::__exception_ptr::exception_ptr::swap(exception_ptr &other)
 
 // Retained for compatibility with CXXABI_1.3.
 void
-std::__exception_ptr::exception_ptr::_M_safe_bool_dummy()
-  _GLIBCXX_USE_NOEXCEPT { }
+std::__exception_ptr::exception_ptr::_M_safe_bool_dummy() noexcept { }
 
 
 // Retained for compatibility with CXXABI_1.3.
 bool
-std::__exception_ptr::exception_ptr::operator!() const _GLIBCXX_USE_NOEXCEPT
+std::__exception_ptr::exception_ptr::operator!() const noexcept
 { return _M_exception_object == 0; }
 
 
 // Retained for compatibility with CXXABI_1.3.
-std::__exception_ptr::exception_ptr::operator __safe_bool() const
-_GLIBCXX_USE_NOEXCEPT
+std::__exception_ptr::exception_ptr::operator __safe_bool() const noexcept
 {
   return _M_exception_object ? &exception_ptr::_M_safe_bool_dummy : 0;
 }
 
 
 const std::type_info*
-std::__exception_ptr::exception_ptr::__cxa_exception_type() const
-  _GLIBCXX_USE_NOEXCEPT
+std::__exception_ptr::exception_ptr::__cxa_exception_type() const noexcept
 {
   __cxa_exception *eh = __get_exception_header_from_obj (_M_exception_object);
   return eh->exceptionType;
@@ -172,19 +166,17 @@ std::__exception_ptr::exception_ptr::__cxa_exception_type() const
 
 
 bool std::__exception_ptr::operator==(const exception_ptr& lhs,
- const exception_ptr& rhs)
-  _GLIBCXX_USE_NOEXCEPT
+ const exception_ptr& rhs) noexcept
 { return lhs._M_exception_object == rhs._M_exception_object; }
 
 
 bool std::__exception_ptr::operator!=(const exception_ptr& lhs,
- const exception_ptr& rhs)
-  _GLIBCXX_USE_NOEXCEPT
+ const exception_ptr& rhs) noexcept
 { return !(lhs == rhs);}
 
 
 std::exception_ptr
-std::current_exception() _GLIBCXX_USE_NOEXCEPT
+std::current_exception() 

Re: [v3 PATCH] Make any's copy assignment operator exception-safe, don't copy the underlying value when any is moved, make in_place constructors explicit.

2016-10-10 Thread Jonathan Wakely

On 10/10/16 19:19 +0100, Jonathan Wakely wrote:

On 08/10/16 16:07 +0300, Ville Voutilainen wrote:

Tested on Linux-x64.

2016-10-08  Ville Voutilainen  

  Make any's copy assignment operator exception-safe,
  don't copy the underlying value when any is moved,
  make in_place constructors explicit.
  * include/std/any (any(in_place_type_t<_ValueType>, _Args&&...)):
  Make explicit.
  (any(in_place_type_t<_ValueType>, initializer_list<_Up>, _Args&&...)):
  Likewise.
  (operator=(const any&)): Make strongly exception-safe.
  (operator=(any&&)): Reset the manager when resetting the value.
  This makes the state saner if an exception is thrown during the move.
  (_Manager_internal<_Tp>::_S_manage): Move in _Op_xfer, don't copy.
  * testsuite/20_util/any/assign/2.cc: Adjust.
  * testsuite/20_util/any/assign/exception.cc: New.
  * testsuite/20_util/any/cons/2.cc: Adjust.
  * testsuite/20_util/any/cons/explicit.cc: New.
  * testsuite/20_util/any/misc/any_cast_neg.cc: Adjust.



diff --git a/libstdc++-v3/include/std/any b/libstdc++-v3/include/std/any
index 9160035..78bdf89 100644
--- a/libstdc++-v3/include/std/any
+++ b/libstdc++-v3/include/std/any
@@ -179,7 +179,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  typename _Tp = _Decay<_ValueType>,
  typename _Mgr = _Manager<_Tp>,
 __any_constructible_t<_Tp, _Args&&...> = false>
-  any(in_place_type_t<_ValueType>, _Args&&... __args)
+  explicit any(in_place_type_t<_ValueType>, _Args&&... __args)
 : _M_manager(&_Mgr::_S_manage)
 {
   _Mgr::_S_create(_M_storage, std::forward<_Args>(__args)...);
@@ -192,8 +192,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  typename _Mgr = _Manager<_Tp>,
 __any_constructible_t<_Tp, initializer_list<_Up>,
_Args&&...> = false>
-  any(in_place_type_t<_ValueType>,
- initializer_list<_Up> __il, _Args&&... __args)
+  explicit any(in_place_type_t<_ValueType>,
+  initializer_list<_Up> __il, _Args&&... __args)
 : _M_manager(&_Mgr::_S_manage)
 {
   _Mgr::_S_create(_M_storage, __il, std::forward<_Args>(__args)...);


I prefer to put "explicit" on a line of its own, as we do for return
types, but I won't complain if you leave it like this.


@@ -211,11 +211,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
reset();
 else if (this != &__rhs)
{
- if (has_value())
-   _M_manager(_Op_destroy, this, nullptr);
- _Arg __arg;
- __arg._M_any = this;
- __rhs._M_manager(_Op_clone, &__rhs, &__arg);
+ any(__rhs).swap(*this);


I was trying to avoid the "redundant" xfer operations that the swap
does, but I don't think we can do that and be exception safe. This is
simple and safe, and I think it's optimal. Thanks.


As discussed on IRC, it can be:

 else if (this != &__rhs)
   *this = any(__rhs);

which does one clone, one xfer and one destroy.

This way the effort of avoiding an extra xfer op is in the move
assignment operator.
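
The idea can be sketched outside of std::any as follows. Box and
Fragile are hypothetical types invented for this example; Box is a toy
owner, not the real any implementation. The sketch assumes a
non-throwing move assignment, and shows copy assignment written as a
move from a temporary: the clone happens first, so if it throws, the
target is left untouched.

```cpp
#include <cassert>
#include <stdexcept>
#include <utility>

// Hypothetical value type whose copy constructor can be made to throw.
struct Fragile
{
  int value;
  static bool& fail() { static bool f = false; return f; }
  explicit Fragile(int v) : value(v) { }
  Fragile(const Fragile& other) : value(other.value)
  { if (fail()) throw std::runtime_error("copy failed"); }
};

// Toy owner (not std::any): holds at most one heap-allocated Fragile.
class Box
{
  Fragile* ptr_ = nullptr;

public:
  Box() = default;
  explicit Box(int v) : ptr_(new Fragile(v)) { }
  Box(const Box& rhs) : ptr_(rhs.ptr_ ? new Fragile(*rhs.ptr_) : nullptr) { }
  Box(Box&& rhs) noexcept : ptr_(rhs.ptr_) { rhs.ptr_ = nullptr; }
  ~Box() { delete ptr_; }

  // Move assignment: destroy the old value, then take ownership.
  Box& operator=(Box&& rhs) noexcept
  {
    if (this != &rhs)
      {
        delete ptr_;
        ptr_ = rhs.ptr_;
        rhs.ptr_ = nullptr;
      }
    return *this;
  }

  // Copy assignment: clone into a temporary first (may throw), then
  // move-assign from it (cannot throw), so *this is never half-updated.
  Box& operator=(const Box& rhs)
  {
    if (this != &rhs)
      *this = Box(rhs);
    return *this;
  }

  bool has_value() const { return ptr_ != nullptr; }
  int get() const { return ptr_->value; }
};

// Returns true if a throwing copy assignment leaves the target unchanged.
bool copy_assign_is_strong()
{
  Box a(1), b(2);
  Fragile::fail() = true;
  bool threw = false;
  try { a = b; } catch (const std::runtime_error&) { threw = true; }
  Fragile::fail() = false;
  return threw && a.has_value() && a.get() == 1;
}
```

In this toy version a successful copy assignment costs one clone, one
pointer transfer and one destroy, matching the count given above.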


As a drive-by fix, on operator=(_ValueType&& __rhs) please indent the
return type to line up with "operator".


Re: [PATCH] Implement new hook for max_align_t_align

2016-10-10 Thread John David Anglin
Attached is an updated version using the new builtin
__MAX_ALIGN_T_ALIGN__.  This simplifies the declaration of max_align_t
and ensures it is always the same as max_align_t_align().
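
As a rough sketch of the idea, a max_align_t-like type can be declared
from a single byte-alignment value. MY_MAX_ALIGN and my_max_align_t are
hypothetical names for this example; since the proposed
__MAX_ALIGN_T_ALIGN__ define may not exist in a given compiler, the
sketch falls back to alignof(std::max_align_t). This is not the
declaration used in ginclude/stddef.h.

```cpp
#include <cstddef>

// Stand-in for a compiler-provided alignment in bytes (assumption: the
// real define from the patch is not relied upon here).
#ifndef MY_MAX_ALIGN
# define MY_MAX_ALIGN alignof(std::max_align_t)
#endif

// A max_align_t-like type whose alignment comes from one value only,
// rather than from a list of member types.
struct my_max_align_t
{
  alignas(MY_MAX_ALIGN) char storage[MY_MAX_ALIGN];
};

static_assert(alignof(my_max_align_t) == MY_MAX_ALIGN,
              "alignment follows the single configured value");
static_assert(alignof(my_max_align_t) >= alignof(long long),
              "at least as strict as long long");
static_assert(alignof(my_max_align_t) >= alignof(long double),
              "at least as strict as long double");
```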

Tested on hppa-unknown-linux-gnu.  Okay for trunk?

Dave
--
John David Anglin   dave.ang...@bell.net


2016-10-10  John David Anglin  

gcc/c-family/
* c-common.c (c_stddef_cpp_builtins): Add __MAX_ALIGN_T_ALIGN__ builtin
define.
(max_align_t_align): Move to targhooks.c.
* c-common.h (max_align_t_align): Delete.
gcc/
* target.def (max_align_t_align): New target hook.
* targhooks.c (default_max_align_t_align): New.
* targhooks.h (default_max_align_t_align): Declare.
* config/pa/pa.c (pa_max_align_t_align): New.
(TARGET_MAX_ALIGN_T_ALIGN): Define.
* ginclude/stddef.h (max_align_t): Use __MAX_ALIGN_T_ALIGN__ builtin.
* doc/tm.texi.in (TARGET_MAX_ALIGN_T_ALIGN): Add documentation hook.
* doc/tm.texi: Update.
gcc/cp/
* decl.c (cxx_init_decl_processing): Use max_align_t_align target hook.
* init.c (build_new_1): Likewise.

Index: c-family/c-common.c
===
--- c-family/c-common.c (revision 240901)
+++ c-family/c-common.c (working copy)
@@ -6683,6 +6683,8 @@
 builtin_define_with_value ("__INTPTR_TYPE__", INTPTR_TYPE, 0);
   if (UINTPTR_TYPE)
 builtin_define_with_value ("__UINTPTR_TYPE__", UINTPTR_TYPE, 0);
+  builtin_define_with_int_value ("__MAX_ALIGN_T_ALIGN__",
+targetm.max_align_t_align () / BITS_PER_UNIT);
 }
 
 static void
@@ -12925,22 +12927,6 @@
   return stv_nothing;
 }
 
-/* Return the alignment of std::max_align_t.
-
-   [support.types.layout] The type max_align_t is a POD type whose alignment
-   requirement is at least as great as that of every scalar type, and whose
-   alignment requirement is supported in every context.  */
-
-unsigned
-max_align_t_align ()
-{
-  unsigned int max_align = MAX (TYPE_ALIGN (long_long_integer_type_node),
-   TYPE_ALIGN (long_double_type_node));
-  if (float128_type_node != NULL_TREE)
-max_align = MAX (max_align, TYPE_ALIGN (float128_type_node));
-  return max_align;
-}
-
 /* Return true iff ALIGN is an integral constant that is a fundamental
alignment, as defined by [basic.align] in the c++-11
specifications.
@@ -12954,7 +12940,7 @@
 bool
 cxx_fundamental_alignment_p (unsigned align)
 {
-  return (align <= max_align_t_align ());
+  return (align <= targetm.max_align_t_align ());
 }
 
 /* Return true if T is a pointer to a zero-sized aggregate.  */
Index: c-family/c-common.h
===
--- c-family/c-common.h (revision 240901)
+++ c-family/c-common.h (working copy)
@@ -866,7 +866,6 @@
 extern bool keyword_is_storage_class_specifier (enum rid);
 extern bool keyword_is_type_qualifier (enum rid);
 extern bool keyword_is_decl_specifier (enum rid);
-extern unsigned max_align_t_align (void);
 extern bool cxx_fundamental_alignment_p (unsigned);
 extern bool pointer_to_zero_sized_aggr_p (tree);
 extern bool diagnose_mismatched_attributes (tree, tree);
Index: config/pa/pa.c
===
--- config/pa/pa.c  (revision 240901)
+++ config/pa/pa.c  (working copy)
@@ -194,6 +194,7 @@
 static bool pa_legitimate_constant_p (machine_mode, rtx);
 static unsigned int pa_section_type_flags (tree, const char *, int);
 static bool pa_legitimate_address_p (machine_mode, rtx, bool);
+static unsigned int pa_max_align_t_align (void);
 
 /* The following extra sections are only used for SOM.  */
 static GTY(()) section *som_readonly_data_section;
@@ -400,6 +401,9 @@
 #undef TARGET_LRA_P
 #define TARGET_LRA_P hook_bool_void_false
 
+#undef TARGET_MAX_ALIGN_T_ALIGN
+#define TARGET_MAX_ALIGN_T_ALIGN pa_max_align_t_align
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 /* Parse the -mfixed-range= option string.  */
@@ -10719,4 +10723,16 @@
   return NULL_RTX;
 }
 
+/* The maximum alignment in bits for the POD type std::max_align_t.
+   This is set to 128 on 32-bit non HP-UX systems to suppress warnings
+   about new with extended alignment.  This arises because various POSIX
+   types such as pthread_mutex_t have for historical reasons 128-bit
+   alignment but the default alignment of std::max_align_t is 64 bits.  */
+
+static unsigned int
+pa_max_align_t_align (void)
+{
+  return TARGET_HPUX && !TARGET_64BIT ? 64 : 128;
+}
+
 #include "gt-pa.h"
Index: cp/decl.c
===
--- cp/decl.c   (revision 240901)
+++ cp/decl.c   (working copy)
@@ -4082,7 +4082,7 @@
   if (aligned_new_threshold == -1)
 aligned_new_threshold = (cxx_dialect >= cxx1z) ? 1 : 0;
   if (aligned_new_threshold == 1)
-aligned_new_threshold = max_align_t_align () / BITS_PER_UNIT;
+

Re: [v3 PATCH] Make any's copy assignment operator exception-safe, don't copy the underlying value when any is moved, make in_place constructors explicit.

2016-10-10 Thread Jonathan Wakely

On 08/10/16 16:07 +0300, Ville Voutilainen wrote:

Tested on Linux-x64.

2016-10-08  Ville Voutilainen  

   Make any's copy assignment operator exception-safe,
   don't copy the underlying value when any is moved,
   make in_place constructors explicit.
   * include/std/any (any(in_place_type_t<_ValueType>, _Args&&...)):
   Make explicit.
   (any(in_place_type_t<_ValueType>, initializer_list<_Up>, _Args&&...)):
   Likewise.
   (operator=(const any&)): Make strongly exception-safe.
   (operator=(any&&)): Reset the manager when resetting the value.
   This makes the state saner if an exception is thrown during the move.
   (_Manager_internal<_Tp>::_S_manage): Move in _Op_xfer, don't copy.
   * testsuite/20_util/any/assign/2.cc: Adjust.
   * testsuite/20_util/any/assign/exception.cc: New.
   * testsuite/20_util/any/cons/2.cc: Adjust.
   * testsuite/20_util/any/cons/explicit.cc: New.
   * testsuite/20_util/any/misc/any_cast_neg.cc: Adjust.



diff --git a/libstdc++-v3/include/std/any b/libstdc++-v3/include/std/any
index 9160035..78bdf89 100644
--- a/libstdc++-v3/include/std/any
+++ b/libstdc++-v3/include/std/any
@@ -179,7 +179,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  typename _Tp = _Decay<_ValueType>,
  typename _Mgr = _Manager<_Tp>,
  __any_constructible_t<_Tp, _Args&&...> = false>
-  any(in_place_type_t<_ValueType>, _Args&&... __args)
+  explicit any(in_place_type_t<_ValueType>, _Args&&... __args)
  : _M_manager(&_Mgr::_S_manage)
  {
_Mgr::_S_create(_M_storage, std::forward<_Args>(__args)...);
@@ -192,8 +192,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  typename _Mgr = _Manager<_Tp>,
  __any_constructible_t<_Tp, initializer_list<_Up>,
_Args&&...> = false>
-  any(in_place_type_t<_ValueType>,
- initializer_list<_Up> __il, _Args&&... __args)
+  explicit any(in_place_type_t<_ValueType>,
+  initializer_list<_Up> __il, _Args&&... __args)
  : _M_manager(&_Mgr::_S_manage)
  {
_Mgr::_S_create(_M_storage, __il, std::forward<_Args>(__args)...);


I prefer to put "explicit" on a line of its own, as we do for return
types, but I won't complain if you leave it like this.


@@ -211,11 +211,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
reset();
  else if (this != &__rhs)
{
- if (has_value())
-   _M_manager(_Op_destroy, this, nullptr);
- _Arg __arg;
- __arg._M_any = this;
- __rhs._M_manager(_Op_clone, &__rhs, &__arg);
+ any(__rhs).swap(*this);


I was trying to avoid the "redundant" xfer operations that the swap
does, but I don't think we can do that and be exception safe. This is
simple and safe, and I think it's optimal. Thanks.


}
  return *this;
}
@@ -232,7 +228,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  else if (this != &__rhs)
{
  if (has_value())
-   _M_manager(_Op_destroy, this, nullptr);
+   reset();


If you're going to use reset() then you don't need the has_value()
check first. I think the reason I didn't use reset() was to avoid the
dead store to _M_manager that reset() does, since the compiler might
not detect it's dead (because the next store is done by the call
through a function pointer).

This code was all pretty carefully written to avoid any redundant
operations. Does this change buy us anything except simpler code?



  _Arg __arg;
  __arg._M_any = this;
  __rhs._M_manager(_Op_xfer, &__rhs, &__arg);
@@ -556,7 +552,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
__ptr->~_Tp();
break;
  case _Op_xfer:
-   ::new(&__arg->_M_any->_M_storage._M_buffer) _Tp(*__ptr);
+   ::new(&__arg->_M_any->_M_storage._M_buffer) _Tp
+ (std::move(*const_cast<_Tp*>(__ptr)));


I was looking at this recently and wondering why I did a copy not a
move. *cough* no redundant operations *cough* Oops.
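
To illustrate the copy-and-swap point from the review above, here is a minimal sketch. It does not use std::any itself: Payload and Holder are hypothetical stand-ins, and the throwing copy constructor is only there to make the failure path visible. If the copy throws, the temporary is never swapped in, so the target keeps its old value.

```cpp
#include <stdexcept>
#include <utility>

// Hypothetical payload whose copy constructor can be made to throw.
struct Payload
{
  int value;

  // Switch used only by this sketch to force a copy failure.
  static bool& fail_next_copy() { static bool flag = false; return flag; }

  explicit Payload(int v) : value(v) { }

  Payload(const Payload& other) : value(other.value)
  {
    if (fail_next_copy())
      throw std::runtime_error("copy failed");
  }
};

// Hypothetical owner type standing in for a type-erased container.
class Holder
{
  Payload* ptr_ = nullptr;

public:
  Holder() = default;
  explicit Holder(int v) : ptr_(new Payload(v)) { }
  Holder(const Holder& rhs)
  : ptr_(rhs.ptr_ ? new Payload(*rhs.ptr_) : nullptr) { }
  ~Holder() { delete ptr_; }

  void swap(Holder& rhs) noexcept { std::swap(ptr_, rhs.ptr_); }

  // Copy first, then swap: *this is only modified after the copy succeeded.
  Holder& operator=(const Holder& rhs)
  {
    if (this != &rhs)
      Holder(rhs).swap(*this);
    return *this;
  }

  bool has_value() const { return ptr_ != nullptr; }
  int value() const { return ptr_->value; }
};
```

The cost is the extra transfer done by the swap, which is the trade-off mentioned in the review.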




Go patch committed: remove GCC-specific linemap usage

2016-10-10 Thread Ian Lance Taylor
This patch by Than McIntosh removes a GCC-specific use of the linemap
code to retrieve the line number.  Bootstrapped and ran Go testsuite
on x86_64-pc-linux-gnu.  Committed to mainline.
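
The shape of the change can be sketched roughly as follows, assuming a simplified Location type. The names Backend_linemap and set_instance are hypothetical, and Location here is just a struct with a line field, not the real frontend class. The frontend only calls the static wrapper, and the backend-specific lookup stays behind the virtual method.

```cpp
#include <cassert>

// Hypothetical stand-in for the frontend's Location class.
struct Location { int line; };

// Backend-independent interface, as seen by the frontend.
class Linemap
{
public:
  virtual ~Linemap() { }

  // Implemented by each backend.
  virtual int location_line(Location) = 0;

  // Static wrapper, so frontend code never touches backend macros directly.
  static int location_to_line(Location loc)
  {
    assert(instance_ != nullptr);
    return instance_->location_line(loc);
  }

  // Hypothetical registration hook for this sketch.
  static void set_instance(Linemap* lm) { instance_ = lm; }

private:
  static Linemap* instance_;
};

Linemap* Linemap::instance_ = nullptr;

// Hypothetical backend; a real one would consult its own line table.
class Backend_linemap : public Linemap
{
public:
  int location_line(Location loc) override { return loc.line; }
};
```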

Ian

2016-10-10  Than McIntosh  

* go-linemap.cc (Gcc_linemap::location_line): New method.
Index: gcc/go/go-linemap.cc
===
--- gcc/go/go-linemap.cc(revision 240755)
+++ gcc/go/go-linemap.cc(working copy)
@@ -32,6 +32,9 @@ class Gcc_linemap : public Linemap
   std::string
   to_string(Location);
 
+  int
+  location_line(Location);
+
  protected:
   Location
   get_predeclared_location();
@@ -88,6 +91,13 @@ Gcc_linemap::to_string(Location location
   return ss.str();
 }
 
+// Return the line number for a given location (for debugging dumps)
+int
+Gcc_linemap::location_line(Location loc)
+{
+  return LOCATION_LINE(loc.gcc_location());
+}
+
 // Stop getting locations.
 
 void
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 240941)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-9401e714d690e3907a64ac5c8cd5aed9e28f511b
+f3658aea2493c7f1c4a72502f9e7da562c7764c4
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/escape.cc
===
--- gcc/go/gofrontend/escape.cc (revision 240941)
+++ gcc/go/gofrontend/escape.cc (working copy)
@@ -145,7 +145,7 @@ Node::details() const
   std::stringstream details;
 
   if (!this->is_sink())
-details << " l(" << LOCATION_LINE(this->location().gcc_location()) << ")";
+details << " l(" << Linemap::location_to_line(this->location()) << ")";
 
   bool is_varargs = false;
   bool is_address_taken = false;
Index: gcc/go/gofrontend/go-linemap.h
===
--- gcc/go/gofrontend/go-linemap.h  (revision 240755)
+++ gcc/go/gofrontend/go-linemap.h  (working copy)
@@ -63,6 +63,10 @@ class Linemap
   virtual std::string
   to_string(Location) = 0;
 
+  // Return the line number for a given location (for debugging dumps)
+  virtual int
+  location_line(Location) = 0;
+
  protected:
   // Return a special Location used for predeclared identifiers.  This
   // Location should be different from that for any actual source
@@ -135,6 +139,14 @@ class Linemap
 go_assert(Linemap::instance_ != NULL);
 return Linemap::instance_->to_string(loc);
   }
+
+  // Return line number for location
+  static int
+  location_to_line(Location loc)
+  {
+go_assert(Linemap::instance_ != NULL);
+return Linemap::instance_->location_line(loc);
+  }
 };
 
 // The backend interface must define this function.  It should return


[PATCH] Minor simplification to std::_Bind_result helpers

2016-10-10 Thread Jonathan Wakely

We don't need to define new class templates for the SFINAE helpers in
_Bind_result, we can just use alias templates. This also moves where
the helpers are used to the return types, instead of as a defaulted
argument.

* include/std/functional (_Bind_result::__enable_if_void): Use alias
template instead of class template.
(_Bind_result::__disable_if_void): Likewise.
(_Bind_result::__call): Adjust uses of __enable_if_void and
__disable_if_void.

Tested powerpc64le-linux, committed to trunk.
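
For readers unfamiliar with the technique, here is a self-contained sketch of SFINAE helpers written as alias templates and used directly as return types. Caller, call and the alias names are hypothetical and only mirror the shape of the change; this is not the libstdc++ code.

```cpp
#include <type_traits>

template<typename Result>
struct Caller
{
  // Names void when Res is void, otherwise substitution fails.
  template<typename Res>
    using enable_if_void
      = typename std::enable_if<std::is_void<Res>::value>::type;

  // Names Result when Res is not void, otherwise substitution fails.
  template<typename Res>
    using disable_if_void
      = typename std::enable_if<!std::is_void<Res>::value, Result>::type;

  // Chosen for non-void results: forwards the value.
  template<typename Res, typename F>
    disable_if_void<Res>
    call(F f)
    { return f(); }

  // Chosen for void results: discards whatever f returns.
  template<typename Res, typename F>
    enable_if_void<Res>
    call(F f)
    { f(); }

  template<typename F>
    Result
    operator()(F f)
    { return this->template call<Result>(f); }
};
```

Because the constraint sits in the return type, the overloads no longer need a defaulted dummy parameter.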

commit 1330ba1b3b4ccddc64e532756aa2f571f27ae2ad
Author: Jonathan Wakely 
Date:   Mon Oct 10 17:00:49 2016 +0100

Minor simplification to std::_Bind_result helpers

* include/std/functional (_Bind_result::__enable_if_void): Use alias
template instead of class template.
(_Bind_result::__disable_if_void): Likewise.
(_Bind_result::__call): Adjust uses of __enable_if_void and
__disable_if_void.

diff --git a/libstdc++-v3/include/std/functional b/libstdc++-v3/include/std/functional
index 1c7523e..2587392 100644
--- a/libstdc++-v3/include/std/functional
+++ b/libstdc++-v3/include/std/functional
@@ -1000,15 +1000,17 @@ _GLIBCXX_MEM_FN_TRAITS(&&, false_type, true_type)
 
   // sfinae types
   template<typename _Res>
-   struct __enable_if_void : enable_if<is_void<_Res>::value, int> { };
+   using __enable_if_void
+ = typename enable_if<is_void<_Res>{}>::type;
+
   template<typename _Res>
-   struct __disable_if_void : enable_if<!is_void<_Res>::value, int> { };
+   using __disable_if_void
+ = typename enable_if<!is_void<_Res>{}, _Result>::type;
 
   // Call unqualified
   template
-   _Result
-   __call(tuple<_Args...>&& __args, _Index_tuple<_Indexes...>,
-   typename __disable_if_void<_Res>::type = 0)
+   __disable_if_void<_Res>
+   __call(tuple<_Args...>&& __args, _Index_tuple<_Indexes...>)
{
  return _M_f(_Mu<_Bound_args>()
  (std::get<_Indexes>(_M_bound_args), __args)...);
@@ -1016,9 +1018,8 @@ _GLIBCXX_MEM_FN_TRAITS(&&, false_type, true_type)
 
   // Call unqualified, return void
   template
-   void
-   __call(tuple<_Args...>&& __args, _Index_tuple<_Indexes...>,
-   typename __enable_if_void<_Res>::type = 0)
+   __enable_if_void<_Res>
+   __call(tuple<_Args...>&& __args, _Index_tuple<_Indexes...>)
{
  _M_f(_Mu<_Bound_args>()
   (std::get<_Indexes>(_M_bound_args), __args)...);
@@ -1026,9 +1027,8 @@ _GLIBCXX_MEM_FN_TRAITS(&&, false_type, true_type)
 
   // Call as const
   template
-   _Result
-   __call(tuple<_Args...>&& __args, _Index_tuple<_Indexes...>,
-   typename __disable_if_void<_Res>::type = 0) const
+   __disable_if_void<_Res>
+   __call(tuple<_Args...>&& __args, _Index_tuple<_Indexes...>) const
{
  return _M_f(_Mu<_Bound_args>()
  (std::get<_Indexes>(_M_bound_args), __args)...);
@@ -1036,9 +1036,8 @@ _GLIBCXX_MEM_FN_TRAITS(&&, false_type, true_type)
 
   // Call as const, return void
   template
-   void
-   __call(tuple<_Args...>&& __args, _Index_tuple<_Indexes...>,
-   typename __enable_if_void<_Res>::type = 0) const
+   __enable_if_void<_Res>
+   __call(tuple<_Args...>&& __args, _Index_tuple<_Indexes...>) const
{
  _M_f(_Mu<_Bound_args>()
   (std::get<_Indexes>(_M_bound_args),  __args)...);
@@ -1046,9 +1045,8 @@ _GLIBCXX_MEM_FN_TRAITS(&&, false_type, true_type)
 
   // Call as volatile
   template
-   _Result
-   __call(tuple<_Args...>&& __args, _Index_tuple<_Indexes...>,
-   typename __disable_if_void<_Res>::type = 0) volatile
+   __disable_if_void<_Res>
+   __call(tuple<_Args...>&& __args, _Index_tuple<_Indexes...>) volatile
{
  return _M_f(_Mu<_Bound_args>()
  (__volget<_Indexes>(_M_bound_args), __args)...);
@@ -1056,9 +1054,8 @@ _GLIBCXX_MEM_FN_TRAITS(&&, false_type, true_type)
 
   // Call as volatile, return void
   template
-   void
-   __call(tuple<_Args...>&& __args, _Index_tuple<_Indexes...>,
-   typename __enable_if_void<_Res>::type = 0) volatile
+   __enable_if_void<_Res>
+   __call(tuple<_Args...>&& __args, _Index_tuple<_Indexes...>) volatile
{
  _M_f(_Mu<_Bound_args>()
   (__volget<_Indexes>(_M_bound_args), __args)...);
@@ -1066,9 +1063,9 @@ _GLIBCXX_MEM_FN_TRAITS(&&, false_type, true_type)
 
   // Call as const volatile
   template
-   _Result
-   __call(tuple<_Args...>&& __args, _Index_tuple<_Indexes...>,
-   typename __disable_if_void<_Res>::type = 0) const volatile
+   __disable_if_void<_Res>
+   __call(tuple<_Args...>&& __args,
+  _Index_tuple<_Indexes...>) const volatile
{
  return _M_f(_Mu<_Bound_args>()
  (__volget<_Indexes>(_M_bound_args), 

Re: [PATCH] Improve performance of list::reverse

2016-10-10 Thread Jonathan Wakely

On 09/10/16 16:23 +0100, Elliot Goodrich wrote:

Hi,

If we unroll the loop so that we iterate both forwards and backwards,
we can take advantage of memory-level parallelism when chasing
pointers. This means that reverse takes 35% less time when nodes are
randomly scattered in memory and about the same time if nodes are
contiguous.

Further, as our node pointers will never alias, we can interleave the
swaps of the next and previous pointers to remove further data
dependencies. This takes another 5% off the time when nodes are
scattered in memory and takes 20% off when nodes are contiguous.

All in all we save 20%-40% depending on the memory layout.


Nice, thanks for the patch.

Do you have (or are you willing to sign) a copyright assignment for
GCC?

See https://gcc.gnu.org/contribute.html#legal for details.


For future improvement, by passing whether there is an odd or even
number of nodes in the list we can hoist one of the ifs out of the
loop and gain another 5-10% but most likely this is only possible when
_GLIBCXX_USE_CXX11_ABI is defined and size() is O(1). This would bring
the saving to 30%-45%. Is it worth writing a new overload of
_M_reverse which takes the size of the list?


That certainly seems worthwhile. Do we need an overload or can it just
be done with #if? It seems to me we'd either want to use the size, or
not use it, we wouldn't want both versions defined at once. That
suggests #if to me.
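
As a rough illustration of the idea under discussion (not the submitted patch), the sketch below reverses a circular doubly-linked list with a sentinel by walking from both ends at once. Node and reverse_from_both_ends are hypothetical names; the real list uses _List_node_base.

```cpp
#include <utility>

// Hypothetical node type for a circular list with a sentinel node.
struct Node
{
  Node* next;
  Node* prev;
  int value;
};

// Reverse the list by swapping next/prev in every node, advancing one
// cursor forwards from the head and one backwards from the tail.
inline void reverse_from_both_ends(Node* sentinel)
{
  Node* fwd = sentinel->next;
  Node* bwd = sentinel->prev;
  std::swap(sentinel->next, sentinel->prev);

  while (fwd != sentinel)        // an empty list skips the loop
    {
      if (fwd == bwd)            // odd count: cursors met on the middle node
        {
          std::swap(fwd->next, fwd->prev);
          break;
        }

      // Read both successors before touching either node.
      Node* fwd_next = fwd->next;
      Node* bwd_prev = bwd->prev;
      std::swap(fwd->next, fwd->prev);
      std::swap(bwd->next, bwd->prev);

      if (fwd_next == bwd)       // even count: the two nodes were adjacent
        break;

      fwd = fwd_next;
      bwd = bwd_prev;
    }
}
```

The two pointer chases in each iteration are independent of each other, which is where the memory-level parallelism would come from. Knowing the parity of the size up front would let one of the two checks move out of the loop.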



[hsa-branch 9/9] Fix another finalizer type complaint

2016-10-10 Thread Martin Jambor
Hi,

the subsequent patch deals with a finalizer error issued when we have a
register-register move of an HSAIL vector type.  Apparently, such a move
must obey the same rules as vector loads and stores.

Committed to the branch, queued for merge to trunk soon.
Thanks,

Martin

2016-10-03  Martin Jambor  

* hsa-gen.c (hsa_build_append_simple_mov): Use mem_type_for_type.
---
 gcc/hsa-gen.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index fd0dbcd..0b25f66 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -2227,8 +2227,10 @@ hsa_reg_or_immed_for_gimple_op (tree op, hsa_bb *hbb)
 void
 hsa_build_append_simple_mov (hsa_op_reg *dest, hsa_op_base *src, hsa_bb *hbb)
 {
-  hsa_insn_basic *insn = new hsa_insn_basic (2, BRIG_OPCODE_MOV, dest->m_type,
-dest, src);
+  /* Moves of packed data between registers need to adhere to the same type
+ rules like when dealing with memory.  */
+  BrigType16_t tp = mem_type_for_type (dest->m_type);
+  hsa_insn_basic *insn = new hsa_insn_basic (2, BRIG_OPCODE_MOV, tp, dest, src);
   if (hsa_op_reg *sreg = dyn_cast <hsa_op_reg *> (src))
 gcc_assert (hsa_type_bit_size (dest->m_type)
== hsa_type_bit_size (sreg->m_type));
-- 
2.10.0


[hsa-branch 8/9] Fail instead of calling an unknown GOMP builtin

2016-10-10 Thread Martin Jambor
Hi,

this patch is a bit of a hack to make sure we do not emit calls to
libgomp run-time functions which are not available at the HSA GPU side,
such as run-time loop scheduling routines.  If we fail at the caller
side, we avoid issues with finalizer looking at calls to non-existing
functions.
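
The name test that drives this can be sketched in isolation as below. The helpers has_prefix and is_gomp_builtin_name are hypothetical and are not part of hsa-gen.c. Since strncmp stops at the terminating NUL of the shorter string, the sketch needs no separate length guard.

```cpp
#include <cstring>

// Hypothetical helper: true if NAME starts with PREFIX.
static bool has_prefix(const char* name, const char* prefix)
{
  return std::strncmp(name, prefix, std::strlen(prefix)) == 0;
}

// Hypothetical classifier mirroring the prefix check in the patch.
static bool is_gomp_builtin_name(const char* name)
{
  return has_prefix(name, "__builtin_GOMP_");
}
```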

Committed to the branch, queued for merge to trunk soon.
Thanks,

Martin

2016-10-03  Martin Jambor  

* hsa-gen.c (gen_hsa_insns_for_call): Fail when encountering a
GOMP builtin that we cannot process ourselves.
---
 gcc/hsa-gen.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index 8893a28..fd0dbcd 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -5972,7 +5972,15 @@ gen_hsa_insns_for_call (gimple *stmt, hsa_bb *hbb)
   break;
 default:
   {
-   gen_hsa_insns_for_direct_call (stmt, hbb);
+   tree name_tree = DECL_NAME (fndecl);
+   const char *s = IDENTIFIER_POINTER (name_tree);
+   size_t len = strlen (s);
+   if (len > 4 && (strncmp (s, "__builtin_GOMP_", 15) == 0))
+ HSA_SORRY_ATV (gimple_location (stmt),
+"support for HSA does not implement GOMP function %s",
+s);
+   else
+ gen_hsa_insns_for_direct_call (stmt, hbb);
return;
   }
 }
-- 
2.10.0



[hsa-branch 7/9] Ignore prefetch builtin

2016-10-10 Thread Martin Jambor
Hi,

this patch makes HSAIL expansion ignore prefetch built-ins.  It is a bit
less straightforward because we also need to handle cases where the call
does not pass gimple_call_builtin_p test because of argument type
mismatches.

Committed to the branch, queued for merge to trunk soon.
Thanks,

Martin

2016-10-03  Martin Jambor  

* hsa-gen.c (gen_hsa_insns_for_call): Ignore prefetch builtin.
---
 gcc/hsa-gen.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index ad40087..8893a28 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -5530,6 +5530,12 @@ gen_hsa_insns_for_call (gimple *stmt, hsa_bb *hbb)
   if (!gimple_call_builtin_p (stmt, BUILT_IN_NORMAL))
 {
   tree function_decl = gimple_call_fndecl (stmt);
+  /* Prefetch pass can create type-mismatching prefetch builtin calls which
+fail the gimple_call_builtin_p test above.  Handle them here.  */
+  if (DECL_BUILT_IN_CLASS (function_decl)
+ && DECL_FUNCTION_CODE (function_decl) == BUILT_IN_PREFETCH)
+   return;
+
   if (function_decl == NULL_TREE)
{
  HSA_SORRY_AT (gimple_location (stmt),
@@ -5962,6 +5968,8 @@ gen_hsa_insns_for_call (gimple *stmt, hsa_bb *hbb)
gen_hsa_alloca (call, hbb);
break;
   }
+case BUILT_IN_PREFETCH:
+  break;
 default:
   {
gen_hsa_insns_for_direct_call (stmt, hbb);
-- 
2.10.0



[hsa-branch 5/9] Properly detect variadic arguments

2016-10-10 Thread Martin Jambor
Hi,

this patch from Martin properly detects some variadic calls which we have
failed to detect before during expansion to HSAIL.

Committed to the branch, queued for merge to trunk soon.
Thanks,

Martin

2016-10-03  Martin Liska  
Martin Jambor  

* hsa-gen.c (verify_function_arguments): Properly detect variadic
arguments.
---
 gcc/hsa-gen.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index efb87a0..ac83e9e 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -3444,13 +3444,14 @@ gen_hsa_insns_for_switch_stmt (gswitch *s, hsa_bb *hbb)
 static void
 verify_function_arguments (tree decl)
 {
+  tree type = TREE_TYPE (decl);
   if (DECL_STATIC_CHAIN (decl))
 {
   HSA_SORRY_ATV (EXPR_LOCATION (decl),
 "HSA does not support nested functions: %D", decl);
   return;
 }
-  else if (!TYPE_ARG_TYPES (TREE_TYPE (decl)))
+  else if (!TYPE_ARG_TYPES (type) || stdarg_p (type))
 {
   HSA_SORRY_ATV (EXPR_LOCATION (decl),
 "HSA does not support functions with variadic arguments "
-- 
2.10.0



[hsa-branch 3/9] Handle simds within gridified loops gracefully

2016-10-10 Thread Martin Jambor
Hi,

this patch deals with simd constructs in gridified OpenMP loops.
Standalone simds are dealt with by forcing the gridified copy to have
OMP_CLAUSE_SAFELEN_EXPR of one, while simds which are a part of a
combined construct with the gridified parallel loop are simply
discarded.

Committed to the branch, queued for merge to trunk soon.
Thanks,

Martin

2016-10-03  Martin Jambor  

* omp-low.c (grid_find_ungridifiable_statement): Do not bail out
for simd loops.
(grid_inner_loop_gridifiable_p): Likewise.
(grid_process_grid_body): New function.
(grid_eliminate_combined_simd_part): Likewise.
(grid_mark_tiling_loops): Use it. Walk body of the loop with
grid_process_grid_body.
(grid_process_kernel_body_copy): Likewise.
---
 gcc/omp-low.c | 137 +++---
 1 file changed, 122 insertions(+), 15 deletions(-)

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 05015bd..a51474b 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -17478,17 +17478,6 @@ grid_find_ungridifiable_statement (gimple_stmt_iterator *gsi,
   *handled_ops_p = true;
   wi->info = stmt;
   return error_mark_node;
-
-case GIMPLE_OMP_FOR:
-  if ((gimple_omp_for_kind (stmt) & GF_OMP_FOR_SIMD)
- && gimple_omp_for_combined_into_p (stmt))
-   {
- *handled_ops_p = true;
- wi->info = stmt;
- return error_mark_node;
-   }
-  break;
-
 default:
   break;
 }
@@ -17614,10 +17603,6 @@ grid_inner_loop_gridifiable_p (gomp_for *gfor, grid_prop *grid)
dump_printf_loc (MSG_MISSED_OPTIMIZATION, grid->target_loc,
   GRID_MISSED_MSG_PREFIX "the inner loop contains "
   "call to a noreturn function\n");
- else if (gimple_code (bad) == GIMPLE_OMP_FOR)
-   dump_printf_loc (MSG_MISSED_OPTIMIZATION, grid->target_loc,
-GRID_MISSED_MSG_PREFIX "the inner loop contains "
-"a simd construct\n");
  else
dump_printf_loc (MSG_MISSED_OPTIMIZATION, grid->target_loc,
 GRID_MISSED_MSG_PREFIX "the inner loop contains "
@@ -18212,6 +18197,113 @@ grid_copy_leading_local_assignments (gimple_seq src, gimple_stmt_iterator *dst,
   return NULL;
 }
 
+/* Statement walker function to make adjustments to statements within the
+   gridifed kernel copy.  */
+
+static tree
+grid_process_grid_body (gimple_stmt_iterator *gsi, bool *handled_ops_p,
+   struct walk_stmt_info *)
+{
+  *handled_ops_p = false;
+  gimple *stmt = gsi_stmt (*gsi);
+  if (gimple_code (stmt) == GIMPLE_OMP_FOR
+  && (gimple_omp_for_kind (stmt) & GF_OMP_FOR_SIMD))
+  {
+    gomp_for *loop = as_a <gomp_for *> (stmt);
+tree clauses = gimple_omp_for_clauses (loop);
+tree cl = find_omp_clause (clauses, OMP_CLAUSE_SAFELEN);
+if (cl)
+  OMP_CLAUSE_SAFELEN_EXPR (cl) = integer_one_node;
+else
+  {
+   tree c = build_omp_clause (UNKNOWN_LOCATION, OMP_CLAUSE_SAFELEN);
+   OMP_CLAUSE_SAFELEN_EXPR (c) = integer_one_node;
+   OMP_CLAUSE_CHAIN (c) = clauses;
+   gimple_omp_for_set_clauses (loop, c);
+  }
+  }
+  return NULL_TREE;
+}
+
+/* Given a PARLOOP that is a normal for looping construct but also a part of a
+   combined construct with a simd loop, eliminate the simd loop.  */
+
+static void
+grid_eliminate_combined_simd_part (gomp_for *parloop)
+{
+  struct walk_stmt_info wi;
+
+  memset (&wi, 0, sizeof (wi));
+  wi.val_only = true;
+  enum gf_mask msk = GF_OMP_FOR_SIMD;
+  wi.info = (void *) &msk;
+  walk_gimple_seq (gimple_omp_body (parloop), find_combined_for, NULL, &wi);
+  gimple *stmt = (gimple *) wi.info;
+  /* We expect that the SIMD id the only statement in the parallel loop.  */
+  gcc_assert (stmt
+ && gimple_code (stmt) == GIMPLE_OMP_FOR
+ && (gimple_omp_for_kind (stmt) == GF_OMP_FOR_SIMD)
+ && gimple_omp_for_combined_into_p (stmt)
+ && !gimple_omp_for_combined_p (stmt));
+  gomp_for *simd = as_a <gomp_for *> (stmt);
+
+  /* Copy over the iteration properties because the body refers to the index in
+ the bottmom-most loop.  */
+  unsigned i, collapse = gimple_omp_for_collapse (parloop);
+  gcc_checking_assert (collapse == gimple_omp_for_collapse (simd));
+  for (i = 0; i < collapse; i++)
+{
+  gimple_omp_for_set_index (parloop, i, gimple_omp_for_index (simd, i));
+  gimple_omp_for_set_initial (parloop, i, gimple_omp_for_initial (simd, i));
+  gimple_omp_for_set_final (parloop, i, gimple_omp_for_final (simd, i));
+  gimple_omp_for_set_incr (parloop, i, gimple_omp_for_incr (simd, i));
+}
+
+  tree *tgt= gimple_omp_for_clauses_ptr (parloop);
+  while (*tgt)
+tgt = &OMP_CLAUSE_CHAIN (*tgt);
+
+  /* Copy over all clauses, except for linaer clauses, which are turned into
+ private clauses, and all other simd-specificl clauses, which 

[hsa-branch 6/9] Expand FMA_EXPR to HSAIL

2016-10-10 Thread Martin Jambor
Hi,

the following patch adds expansion of fused multiply and add to HSAIL.
The scalar variant is straightforwardly converted to an HSAIL equivalent
while any vector instance is expanded into separate multiplication and
additions.
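
In plain C++ terms, the two strategies look roughly like the sketch below. The names scalar_fma and vector_mul_add are hypothetical, std::array stands in for a vector type, and none of this is HSAIL. The scalar case maps to one fused operation, while the vector case goes through a temporary and may therefore round twice.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Scalar case: a single fused multiply-add.
inline double scalar_fma(double a, double b, double c)
{
  return std::fma(a, b, c);
}

// Vector case: a separate multiplication into a temporary, then an addition.
template<std::size_t N>
std::array<double, N>
vector_mul_add(const std::array<double, N>& a,
               const std::array<double, N>& b,
               const std::array<double, N>& c)
{
  std::array<double, N> tmp{};
  std::array<double, N> dest{};
  for (std::size_t i = 0; i < N; ++i)
    tmp[i] = a[i] * b[i];
  for (std::size_t i = 0; i < N; ++i)
    dest[i] = tmp[i] + c[i];
  return dest;
}
```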

Committed to the branch, queued for merge to trunk soon.
Thanks,

Martin

2016-10-03  Martin Jambor  

* hsa-gen.c (gen_hsa_insns_for_operation_assignment): Handle
FMA_EXPR and ternary operators in general.  Remove obsolete
fallthrough comments.
---
 gcc/hsa-gen.c | 27 ---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index ac83e9e..ad40087 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -3076,6 +3076,23 @@ gen_hsa_insns_for_operation_assignment (gimple *assign, hsa_bb *hbb)
 case NEGATE_EXPR:
   opcode = BRIG_OPCODE_NEG;
   break;
+case FMA_EXPR:
+  /* There is a native HSA instruction for scalar FMAs but not for vector
+ones.  */
+  if (TREE_CODE (TREE_TYPE (lhs)) == VECTOR_TYPE)
+   {
+ hsa_op_reg *dest
+   = hsa_cfun->reg_for_gimple_ssa (gimple_assign_lhs (assign));
+ hsa_op_with_type *op1 = hsa_reg_or_immed_for_gimple_op (rhs1, hbb);
+ hsa_op_with_type *op2 = hsa_reg_or_immed_for_gimple_op (rhs2, hbb);
+ hsa_op_with_type *op3 = hsa_reg_or_immed_for_gimple_op (rhs3, hbb);
+ hsa_op_reg *tmp = new hsa_op_reg (dest->m_type);
+ gen_hsa_binary_operation (BRIG_OPCODE_MUL, tmp, op1, op2, hbb);
+ gen_hsa_binary_operation (BRIG_OPCODE_ADD, dest, tmp, op3, hbb);
+ return;
+   }
+  opcode = BRIG_OPCODE_MAD;
+  break;
 case MIN_EXPR:
   opcode = BRIG_OPCODE_MIN;
   break;
@@ -3275,14 +3292,18 @@ gen_hsa_insns_for_operation_assignment (gimple *assign, hsa_bb *hbb)
   switch (rhs_class)
 {
 case GIMPLE_TERNARY_RHS:
-  gcc_unreachable ();
+  {
+   hsa_op_with_type *op3 = hsa_reg_or_immed_for_gimple_op (rhs3, hbb);
+   hsa_insn_basic *insn = new hsa_insn_basic (4, opcode, dest->m_type, dest,
+  op1, op2, op3);
+   hbb->append_insn (insn);
+  }
   return;
 
-  /* Fall through */
 case GIMPLE_BINARY_RHS:
   gen_hsa_binary_operation (opcode, dest, op1, op2, hbb);
   break;
-  /* Fall through */
+
 case GIMPLE_UNARY_RHS:
   gen_hsa_unary_operation (opcode, dest, op1, hbb);
   break;
-- 
2.10.0



[hsa-branch 1/9] Builtins for gridsize and currentworkgroupsize

2016-10-10 Thread Martin Jambor
Hi,

the patch below makes the griddim and currentworkgroupsize special HSA
instructions available for omp lowering through a builtin.  They are
then used by subsequent patch to implement conditions determining the
last iteration for the lastprivate OpenMP sharing clause.

Committed to the branch, queued for merge to trunk soon.
Thanks,

Martin

2016-10-03  Martin Jambor  

* hsa-builtins.def (BUILT_IN_HSA_GRIDSIZE): New.
(BUILT_IN_HSA_CURRENTWORKGROUPSIZE): Likewise.
* hsa-gen.c (gen_hsa_insns_for_call): Handle BUILT_IN_HSA_GRIDSIZE.
---
 gcc/hsa-builtins.def | 4 
 gcc/hsa-gen.c| 6 ++
 2 files changed, 10 insertions(+)

diff --git a/gcc/hsa-builtins.def b/gcc/hsa-builtins.def
index dcd0c55..cc0409e 100644
--- a/gcc/hsa-builtins.def
+++ b/gcc/hsa-builtins.def
@@ -33,3 +33,7 @@ DEF_HSA_BUILTIN (BUILT_IN_HSA_WORKITEMID, "hsa_workitemid",
 BT_FN_UINT_UINT, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_HSA_BUILTIN (BUILT_IN_HSA_WORKITEMABSID, "hsa_workitemabsid",
 BT_FN_UINT_UINT, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_HSA_BUILTIN (BUILT_IN_HSA_GRIDSIZE, "hsa_gridsize",
+BT_FN_UINT_UINT, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_HSA_BUILTIN (BUILT_IN_HSA_CURRENTWORKGROUPSIZE, "hsa_currentworkgroupsize",
+BT_FN_UINT_UINT, ATTR_CONST_NOTHROW_LEAF_LIST)
diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index f63608c..deb2a07 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -5812,6 +5812,12 @@ gen_hsa_insns_for_call (gimple *stmt, hsa_bb *hbb)
 case BUILT_IN_HSA_WORKITEMABSID:
   query_hsa_grid_dim (stmt, BRIG_OPCODE_WORKITEMABSID, hbb);
   break;
+case BUILT_IN_HSA_GRIDSIZE:
+  query_hsa_grid_dim (stmt, BRIG_OPCODE_GRIDSIZE, hbb);
+  break;
+case BUILT_IN_HSA_CURRENTWORKGROUPSIZE:
+  query_hsa_grid_dim (stmt, BRIG_OPCODE_CURRENTWORKGROUPSIZE, hbb);
+  break;
 
 case BUILT_IN_GOMP_BARRIER:
       hbb->append_insn (new hsa_insn_br (0, BRIG_OPCODE_BARRIER, BRIG_TYPE_NONE,
-- 
2.10.0



[hsa-branch 2/9] Lastprivate lowering for gridified kernels

2016-10-10 Thread Martin Jambor
Hi,

this patch implements the lastprivate data sharing clauses of gridified
OpenMP looping constructs.  It adds code to construct a special
condition to identify the "last" loop iteration using special HSA
instructions, because that way we do not need information about all HSA
dimensions conveyed from callers and could modify only a small fraction
of the non-gridification code.

On the gridification side, it creates group-segment copies of internal
loop lastprivate variables as means to transfer the value from the
"last" work-item to all work-items that then continue working with the
value.

Committed to the branch, queued for merge to trunk soon.
Thanks,

Martin

2016-10-03  Martin Jambor  

* gimple.h (GF_OMP_FOR_GRID_PHONY): Added comment.
(GF_OMP_FOR_GRID_INTRA_GROUP): New.
(gimple_omp_for_grid_phony): Added checking assert.
(gimple_omp_for_set_grid_phony): Likewise.
(gimple_omp_for_grid_intra_group): New function.
(gimple_omp_for_set_grid_intra_group): Likewise.
(gimple_omp_for_grid_group_iter): Added checking assert.
(gimple_omp_for_set_grid_group_iter): Likewise.
* omp-low.c (lower_lastprivate_clauses): Also handle predicates
that are not simple comparisons.
(grid_lastprivate_predicate): New function.
(lower_omp_for_lastprivate): Generate conditions for gridified kernels.
(lower_omp_for): Adjust phony predicate call.
(grid_parallel_clauses_gridifiable): Allow lastprivate.
(grid_inner_loop_gridifiable_p): Likewise.
(grid_mark_tiling_loops): Generate copies of lastprivate variables
to group variables.
(grid_mark_tiling_parallels_and_loops): Create binds for bodies of
parallel statements.
(grid_process_kernel_body_copy): Avoid reusing variable name.
---
 gcc/gimple.h  |  36 +
 gcc/omp-low.c | 235 +-
 2 files changed, 187 insertions(+), 84 deletions(-)

diff --git a/gcc/gimple.h b/gcc/gimple.h
index ce3a161..3e84e6b0 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -162,7 +162,12 @@ enum gf_mask {
 GF_OMP_FOR_KIND_CILKSIMD   = GF_OMP_FOR_SIMD | 1,
 GF_OMP_FOR_COMBINED= 1 << 4,
 GF_OMP_FOR_COMBINED_INTO   = 1 << 5,
+/* The following flag must not be used on GF_OMP_FOR_KIND_GRID_LOOP loop
+   statements.  */
 GF_OMP_FOR_GRID_PHONY  = 1 << 6,
+/* The following two flags should only be set on GF_OMP_FOR_KIND_GRID_LOOP
+   loop statements.  */
+GF_OMP_FOR_GRID_INTRA_GROUP= 1 << 6,
 GF_OMP_FOR_GRID_GROUP_ITER  = 1 << 7,
 GF_OMP_TARGET_KIND_MASK= (1 << 4) - 1,
 GF_OMP_TARGET_KIND_REGION  = 0,
@@ -5123,6 +5128,8 @@ gimple_omp_for_set_pre_body (gimple *gs, gimple_seq pre_body)
 static inline bool
 gimple_omp_for_grid_phony (const gomp_for *omp_for)
 {
+  gcc_checking_assert (gimple_omp_for_kind (omp_for)
+  != GF_OMP_FOR_KIND_GRID_LOOP);
   return (gimple_omp_subcode (omp_for) & GF_OMP_FOR_GRID_PHONY) != 0;
 }
 
@@ -5131,18 +5138,45 @@ gimple_omp_for_grid_phony (const gomp_for *omp_for)
 static inline void
 gimple_omp_for_set_grid_phony (gomp_for *omp_for, bool value)
 {
+  gcc_checking_assert (gimple_omp_for_kind (omp_for)
+  != GF_OMP_FOR_KIND_GRID_LOOP);
   if (value)
 omp_for->subcode |= GF_OMP_FOR_GRID_PHONY;
   else
 omp_for->subcode &= ~GF_OMP_FOR_GRID_PHONY;
 }
 
+/* Return the kernel_intra_group of a GRID_LOOP OMP_FOR statement.  */
+
+static inline bool
+gimple_omp_for_grid_intra_group (const gomp_for *omp_for)
+{
+  gcc_checking_assert (gimple_omp_for_kind (omp_for)
+  == GF_OMP_FOR_KIND_GRID_LOOP);
+  return (gimple_omp_subcode (omp_for) & GF_OMP_FOR_GRID_INTRA_GROUP) != 0;
+}
+
+/* Set kernel_intra_group flag of OMP_FOR to VALUE.  */
+
+static inline void
+gimple_omp_for_set_grid_intra_group (gomp_for *omp_for, bool value)
+{
+  gcc_checking_assert (gimple_omp_for_kind (omp_for)
+  == GF_OMP_FOR_KIND_GRID_LOOP);
+  if (value)
+omp_for->subcode |= GF_OMP_FOR_GRID_INTRA_GROUP;
+  else
+omp_for->subcode &= ~GF_OMP_FOR_GRID_INTRA_GROUP;
+}
+
 /* Return true if iterations of a grid OMP_FOR statement correspond to HSA
groups.  */
 
 static inline bool
 gimple_omp_for_grid_group_iter (const gomp_for *omp_for)
 {
+  gcc_checking_assert (gimple_omp_for_kind (omp_for)
+  == GF_OMP_FOR_KIND_GRID_LOOP);
   return (gimple_omp_subcode (omp_for) & GF_OMP_FOR_GRID_GROUP_ITER) != 0;
 }
 
@@ -5151,6 +5185,8 @@ gimple_omp_for_grid_group_iter (const gomp_for *omp_for)
 static inline void
 gimple_omp_for_set_grid_group_iter (gomp_for *omp_for, bool value)
 {
+  gcc_checking_assert (gimple_omp_for_kind (omp_for)
+  == GF_OMP_FOR_KIND_GRID_LOOP);
   if (value)
 omp_for->subcode |= GF_OMP_FOR_GRID_GROUP_ITER;
   else
diff --git a/gcc/omp-low.c b/gcc/omp-low.c

[hsa-branch 4/9] Add expansion of reciprocal of square root

2016-10-10 Thread Martin Jambor
Hi,

this patch simply adds expansion of the reciprocal square root gimple
internal function into its HSAIL equivalent.

Committed to the branch, queued for merge to trunk soon.
Thanks,

Martin

2016-10-03  Martin Jambor  

* hsa-gen.c (gen_hsa_insn_for_internal_fn_call): Also handle IFN_RSQRT.
---
 gcc/hsa-gen.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index deb2a07..efb87a0 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -5386,6 +5386,10 @@ gen_hsa_insn_for_internal_fn_call (gcall *stmt, hsa_bb *hbb)
   gen_hsa_unaryop_for_builtin (BRIG_OPCODE_SQRT, stmt, hbb);
   break;
 
+case IFN_RSQRT:
+  gen_hsa_unaryop_for_builtin (BRIG_OPCODE_NRSQRT, stmt, hbb);
+  break;
+
 case IFN_TRUNC:
   gen_hsa_unaryop_for_builtin (BRIG_OPCODE_TRUNC, stmt, hbb);
   break;
-- 
2.10.0



Re: [PATCH 4/4][Ada,DJGPP] Ada support for DJGPP

2016-10-10 Thread Andris Pavenis

On 10/10/2016 06:22 PM, Arnaud Charlet wrote:



PS. What about last versions of other 2 not yet approved patches (1 and 3)?

There have been many back and forth and many updates, so I do not know where
we are on these. I'm pretty sure I OKed one of the other parts, but best
to resubmit them cleanly (so with latest patches, changelog, etc...).

There have been no changes since I submitted the last versions of patches 1 and 3,
so I have just pointed to the messages in the mail archives in separate e-mails.

Andris



[PING][PATCH 3/4][Ada,DJGPP] Ada support for DJGPP

2016-10-10 Thread Andris Pavenis

I'd like to ping patch

https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00164.html

Additional comments about using ZCX_By_Default := true are in

https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00845.html

Andris



Re: [Committed] S/390: Wrap more macro args into ()

2016-10-10 Thread Andreas Schwab
On Oct 10 2016, Andreas Krebbel  wrote:

> @@ -491,7 +491,7 @@ extern const char *s390_host_detect_local_cpu (int argc, const char **argv);
>s390_hard_regno_mode_ok ((REGNO), (MODE))
>  
>  #define HARD_REGNO_RENAME_OK(FROM, TO)  \
> -  s390_hard_regno_rename_ok (FROM, TO)
> +  s390_hard_regno_rename_ok ((FROM), (TO))

That should not be necessary.  The only way to get an error is if you
play dirty games with macros expanding to a bare comma.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


[PING][PATCH 1/4][Ada,DJGPP] Ada support for DJGPP

2016-10-10 Thread Andris Pavenis

I'd like to ping this patch.

The latest version of the patch, together with its ChangeLog entry, can be
found in the mailing list archive:

https://gcc.gnu.org/ml/gcc-patches/2016-08/msg01229.html

Andris



[Committed] S/390: Wrap more macro args into ()

2016-10-10 Thread Andreas Krebbel
It turned out that a few () around uses of macro args were missing.
One real problem caused by this was detected by the int-in-bool-context
warning in the definition of DBX_REGISTER_NUMBER. While at it, I've
also tried to fix other places where brackets might be missing.
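
For illustration only, here is a sketch of how the unparenthesized form can
go wrong. The two macro bodies are the old and new DBX_REGISTER_NUMBER from
the patch; the _OLD/_NEW suffixes and the callers map_old/map_new are made
up for this example so that both versions can be compared side by side.

```cpp
#include <cassert>

// Old and new macro bodies from the patch, renamed so both can coexist.
#define DBX_REGISTER_NUMBER_OLD(regno) \
  ((regno >= 38 && regno <= 53) ? regno + 30 : regno)
#define DBX_REGISTER_NUMBER_NEW(regno) \
  (((regno) >= 38 && (regno) <= 53) ? (regno) + 30 : (regno))

// Hypothetical callers passing a conditional expression as the argument.
// In the old form the ?: of the argument mixes with the macro's own
// operators, so the integer 40 ends up being used as a truth value.
inline int map_old (bool vec) { return DBX_REGISTER_NUMBER_OLD (vec ? 40 : 5); }
inline int map_new (bool vec) { return DBX_REGISTER_NUMBER_NEW (vec ? 40 : 5); }

inline void demo ()
{
  // Register 40 lies in the 38..53 range and should map to 70.
  assert (map_new (true) == 70);
  assert (map_old (true) != 70);
}
```

With the parenthesized macro the argument is evaluated as one unit, so 40
maps to 70 and 5 stays 5; the old form yields 40 and 35 for the same inputs.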

gcc/ChangeLog:

2016-10-10  Andreas Krebbel  

* config/s390/s390.h: Wrap more macros args in brackets and fix
some formatting.
---
 gcc/ChangeLog  |  4 +++
 gcc/config/s390/s390.h | 88 ++
 2 files changed, 49 insertions(+), 43 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 6d27102..abe0194 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,7 @@
+2016-10-10  Andreas Krebbel  
+
+   * config/s390/s390.h: Wrap more macros args in brackets and fix
+
 2016-10-10  Andreas Schwab  
 
PR target/77738
diff --git a/gcc/config/s390/s390.h b/gcc/config/s390/s390.h
index 3a7be1a..501c8e4 100644
--- a/gcc/config/s390/s390.h
+++ b/gcc/config/s390/s390.h
@@ -320,9 +320,9 @@ extern const char *s390_host_detect_local_cpu (int argc, const char **argv);
FUNCTION is VOIDmode because calling convention maintains SP.
BLOCK needs Pmode for SP.
NONLOCAL needs twice Pmode to maintain both backchain and SP.  */
-#define STACK_SAVEAREA_MODE(LEVEL)  \
-  (LEVEL == SAVE_FUNCTION ? VOIDmode\
-  : LEVEL == SAVE_NONLOCAL ? (TARGET_64BIT ? OImode : TImode) : Pmode)
+#define STACK_SAVEAREA_MODE(LEVEL) \
+  ((LEVEL) == SAVE_FUNCTION ? VOIDmode \
+   : (LEVEL) == SAVE_NONLOCAL ? (TARGET_64BIT ? OImode : TImode) : Pmode)
 
 
 /* Type layout.  */
@@ -491,7 +491,7 @@ extern const char *s390_host_detect_local_cpu (int argc, const char **argv);
   s390_hard_regno_mode_ok ((REGNO), (MODE))
 
 #define HARD_REGNO_RENAME_OK(FROM, TO)  \
-  s390_hard_regno_rename_ok (FROM, TO)
+  s390_hard_regno_rename_ok ((FROM), (TO))
 
 #define MODES_TIEABLE_P(MODE1, MODE2)  \
(((MODE1) == SFmode || (MODE1) == DFmode)   \
@@ -584,7 +584,7 @@ enum reg_class
reload can decide not to use the hard register because some
constant was forced to be in memory.  */
 #define IRA_HARD_REGNO_ADD_COST_MULTIPLIER(regno)  \
-  (regno != BASE_REGNUM ? 0.0 : 0.5)
+  ((regno) != BASE_REGNUM ? 0.0 : 0.5)
 
 /* Register -> class mapping.  */
 extern const enum reg_class regclass_map[FIRST_PSEUDO_REGISTER];
@@ -617,10 +617,10 @@ extern const enum reg_class regclass_map[FIRST_PSEUDO_REGISTER];
 
  FIXME: Should we try splitting it into two vlgvg's/vlvg's instead?  */
 #define SECONDARY_MEMORY_NEEDED(CLASS1, CLASS2, MODE)  \
-  (((reg_classes_intersect_p (CLASS1, VEC_REGS) \
- && reg_classes_intersect_p (CLASS2, GENERAL_REGS)) \
-|| (reg_classes_intersect_p (CLASS1, GENERAL_REGS) \
-   && reg_classes_intersect_p (CLASS2, VEC_REGS))) \
+  (((reg_classes_intersect_p ((CLASS1), VEC_REGS)  \
+ && reg_classes_intersect_p ((CLASS2), GENERAL_REGS))  \
+|| (reg_classes_intersect_p ((CLASS1), GENERAL_REGS)   \
+   && reg_classes_intersect_p ((CLASS2), VEC_REGS)))   \
&& (!TARGET_DFP || !TARGET_64BIT || GET_MODE_SIZE (MODE) != 8)  \
&& (!TARGET_VX || (SCALAR_FLOAT_MODE_P (MODE)   \
  && GET_MODE_SIZE (MODE) > 8)))
@@ -630,7 +630,7 @@ extern const enum reg_class regclass_map[FIRST_PSEUDO_REGISTER];
 #define SECONDARY_MEMORY_NEEDED_MODE(MODE) \
  (GET_MODE_BITSIZE (MODE) < 32 \
   ? mode_for_size (32, GET_MODE_CLASS (MODE), 0)   \
-  : MODE)
+  : (MODE))
 
 
 /* Stack layout and calling conventions.  */
@@ -720,8 +720,8 @@ extern const enum reg_class regclass_map[FIRST_PSEUDO_REGISTER];
 /* Define the dwarf register mapping.
v16-v31 -> 68-83
rX  -> X  otherwise  */
-#define DBX_REGISTER_NUMBER(regno) \
-  ((regno >= 38 && regno <= 53) ? regno + 30 : regno)
+#define DBX_REGISTER_NUMBER(regno) \
+  (((regno) >= 38 && (regno) <= 53) ? (regno) + 30 : (regno))
 
 /* Frame registers.  */
 
@@ -832,24 +832,25 @@ CUMULATIVE_ARGS;
operand.  If we find one, push the reload and jump to WIN.  This
macro is used in only one place: `find_reloads_address' in reload.c.  */
 #define LEGITIMIZE_RELOAD_ADDRESS(AD, MODE, OPNUM, TYPE, IND, WIN) \
-do {   \
-  rtx new_rtx = legitimize_reload_address (AD, MODE, OPNUM, (int)(TYPE));  \
-  if (new_rtx) \
-{  \
-  (AD) = new_rtx;  

[PATCH] Implement constexpr std::addressof for C++17

2016-10-10 Thread Jonathan Wakely

Thanks to the new __builtin_addressof that Jakub added, we can do this
now.

* doc/xml/manual/intro.xml: Document DR 2296 status.
* doc/xml/manual/status_cxx2017.xml: Update status.
* include/bits/move.h (__addressof): Add _GLIBCXX_CONSTEXPR and
call __builtin_addressof.
(addressof): Add _GLIBCXX17_CONSTEXPR.
* testsuite/20_util/addressof/requirements/constexpr.cc: New test.
* testsuite/20_util/forward/c_neg.cc: Adjust dg-error lineno.
* testsuite/20_util/forward/f_neg.cc: Likewise.

Tested powerpc64le-linux, committed to trunk.
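
As a rough illustration of what this enables (a sketch, not part of the
patch): the helper my_addressof and the type Odd below are hypothetical
names made up for the example, and the code uses the compiler built-in
directly so that it does not depend on the -std level needed for a
constexpr std::addressof.

```cpp
#include <cassert>

// A type whose unary operator& is overloaded, which is the case
// addressof exists to bypass.
struct Odd
{
  int value;
  constexpr Odd* operator& () const { return nullptr; }
};

// Hypothetical helper with the same shape as the patched __addressof.
// __builtin_addressof is a compiler built-in, not a library function.
template<typename T>
constexpr T* my_addressof (T& r) noexcept
{ return __builtin_addressof (r); }

Odd global_odd = { 42 };

// Usable in a constant expression because the object has static storage
// duration; the overloaded operator& is never consulted.
constexpr Odd* p = my_addressof (global_odd);
static_assert (p != nullptr, "built-in ignores the overloaded operator&");

inline void demo ()
{
  assert (&global_odd == nullptr);      // the overload lies
  assert (p->value == 42);              // the built-in does not
}
```

The old reinterpret_cast-based implementation could not be used this way,
since reinterpret_cast is not allowed in a constant expression.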


commit d450a86bff30b38464e440f6157c39b399b54cc1
Author: Jonathan Wakely 
Date:   Thu Oct 6 18:53:28 2016 +0100

Implement constexpr std::addressof for C++17

* doc/xml/manual/intro.xml: Document DR 2296 status.
* doc/xml/manual/status_cxx2017.xml: Update status.
* include/bits/move.h (__addressof): Add _GLIBCXX_CONSTEXPR and
call __builtin_addressof.
(addressof): Add _GLIBCXX17_CONSTEXPR.
* testsuite/20_util/addressof/requirements/constexpr.cc: New test.
* testsuite/20_util/forward/c_neg.cc: Adjust dg-error lineno.
* testsuite/20_util/forward/f_neg.cc: Likewise.

diff --git a/libstdc++-v3/doc/xml/manual/intro.xml b/libstdc++-v3/doc/xml/manual/intro.xml
index 4747851..265ef67 100644
--- a/libstdc++-v3/doc/xml/manual/intro.xml
+++ b/libstdc++-v3/doc/xml/manual/intro.xml
@@ -961,6 +961,13 @@ requirements of the license of GCC.
 is included by array.
 
 
+http://www.w3.org/1999/xlink; 
xlink:href="../ext/lwg-defects.html#2296">2296:
+   std::addressof should be constexpr
+
+Use __builtin_addressof and add
+constexpr to addressof for C++17 and later.
+
+
 http://www.w3.org/1999/xlink; 
xlink:href="../ext/lwg-defects.html#2313">2313:
tuple_size should always derive from 
integral_constantsize_t, N
 
diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
index c03978e..c6b8440 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
@@ -253,14 +253,13 @@ Feature-testing recommendations for C++.
 
 
 
-  
std::addressof should be constexpr 
   
http://www.w3.org/1999/xlink; 
xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0304r0.html#2296;>
LWG2296

   
-   No 
+   7 
__cpp_lib_addressof_constexpr >= 201603 
 
 
diff --git a/libstdc++-v3/include/bits/move.h b/libstdc++-v3/include/bits/move.h
index 9deec42..a5002fc 100644
--- a/libstdc++-v3/include/bits/move.h
+++ b/libstdc++-v3/include/bits/move.h
@@ -43,12 +43,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*  @ingroup utilities
*/
   template<typename _Tp>
-inline _Tp*
+inline _GLIBCXX_CONSTEXPR _Tp*
 __addressof(_Tp& __r) _GLIBCXX_NOEXCEPT
-{
-  return reinterpret_cast<_Tp*>
-   (&const_cast<char&>(reinterpret_cast<const volatile char&>(__r)));
-}
+{ return __builtin_addressof(__r); }
 
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
@@ -123,6 +120,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   // declval, from type_traits.
 
+#if __cplusplus > 201402L
+  // _GLIBCXX_RESOLVE_LIB_DEFECTS
+  // 2296. std::addressof should be constexpr
+# define __cpp_lib_addressof_constexpr 201603
+#endif
   /**
*  @brief Returns the actual address of the object or function
* referenced by r, even in the presence of an overloaded
@@ -131,7 +133,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*  @return   The actual address.
   */
   template<typename _Tp>
-inline _Tp*
+inline _GLIBCXX17_CONSTEXPR _Tp*
 addressof(_Tp& __r) noexcept
 { return std::__addressof(__r); }
 
diff --git a/libstdc++-v3/testsuite/20_util/addressof/requirements/constexpr.cc b/libstdc++-v3/testsuite/20_util/addressof/requirements/constexpr.cc
new file mode 100644
index 000..998d087
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/addressof/requirements/constexpr.cc
@@ -0,0 +1,55 @@
+// Copyright (C) 2016 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-options "-std=gnu++1z" }
+// { dg-do compile { target c++1z } }
+
+#include 
+
+// LWG 2296 std::addressof should be constexpr
+

Re: [PATCH 4/4][Ada,DJGPP] Ada support for DJGPP

2016-10-10 Thread Arnaud Charlet
> >>  int
> >>  __gnat_get_maximum_file_name_length (void)
> >>  {
> >>+#if defined (__DJGPP__)
> >>+  return (_use_lfn(".")) ? -1 : 8;
> >>+#else
> >>return -1;
> >>+#endif
> >>  }
> >Is the above change really necessary? Would be nice to get rid of this
> >extra code. The rest looks OK to me.
> 
> It is possible to leave this part out for now.

OK without this part then.

> PS. What about last versions of other 2 not yet approved patches (1 and 3)?

There has been a lot of back and forth and many updates, so I do not know where
we are on these. I'm pretty sure I OKed one of the other parts, but best
to resubmit them cleanly (so with latest patches, changelog, etc...).

Arno


Re: [PATCH 4/4][Ada,DJGPP] Ada support for DJGPP

2016-10-10 Thread Andris Pavenis

On 09/25/2016 07:25 PM, Arnaud Charlet wrote:

  int
  __gnat_get_maximum_file_name_length (void)
  {
+#if defined (__DJGPP__)
+  return (_use_lfn(".")) ? -1 : 8;
+#else
return -1;
+#endif
  }

Is the above change really necessary? Would be nice to get rid of this
extra code. The rest looks OK to me.


It is possible to leave this part out for now.

We could return to this part later separately.

Andris

PS. What about last versions of other 2 not yet approved patches (1 and 3)?

From bd1698bff232bdc4258c70f49add1869276184db Mon Sep 17 00:00:00 2001
From: Andris Pavenis 
Date: Mon, 10 Oct 2016 18:14:52 +0300
Subject: [PATCH 4/4] [DJGPP, Ada] Ada support

* ada/adaint.c: Include process.h, signal.h, dir.h and utime.h for DJGPP.
  ISALPHA: include <ctype.h> and define to isalpha for DJGPP when IN_RTS is defined.
  (DIR_SEPARATOR) define to '\\' for DJGPP.
  (__gnat_get_file_names_case_sensitive): return 0 for DJGPP unless
  overridden in environment
  (__gnat_is_absolute_path): Support MS-DOS style absolute paths for DJGPP.
  (__gnat_portable_spawn): Use spawnvp for DJGPP.
  (__gnat_portable_no_block_spawn): Use spawnvp for DJGPP.
  (__gnat_portable_wait): Return 0 for DJGPP.
---
 gcc/ada/adaint.c | 39 ---
 1 file changed, 32 insertions(+), 7 deletions(-)

diff --git a/gcc/ada/adaint.c b/gcc/ada/adaint.c
index f317865..17d6f1f 100644
--- a/gcc/ada/adaint.c
+++ b/gcc/ada/adaint.c
@@ -112,7 +112,18 @@
 extern "C" {
 #endif
 
-#if defined (__MINGW32__) || defined (__CYGWIN__)
+#if defined (__DJGPP__)
+
+/* For isalpha-like tests in the compiler, we're expected to resort to
+   safe-ctype.h/ISALPHA.  This isn't available for the runtime library
+   build, so we fallback on ctype.h/isalpha there.  */
+
+#ifdef IN_RTS
+#include <ctype.h>
+#define ISALPHA isalpha
+#endif
+
+#elif defined (__MINGW32__) || defined (__CYGWIN__)
 
 #include "mingw32.h"
 
@@ -165,11 +176,16 @@ UINT CurrentCCSEncoding;
 #include 
 #endif
 
-#if defined (_WIN32)
-
+#if defined (__DJGPP__)
 #include 
 #include 
 #include 
+#include 
+#undef DIR_SEPARATOR
+#define DIR_SEPARATOR '\\'
+
+#elif defined (_WIN32)
+
 #include 
 #include 
 #include 
@@ -560,7 +576,7 @@ __gnat_get_file_names_case_sensitive (void)
 	{
 	  /* By default, we suppose filesystems aren't case sensitive on
 	 Windows and Darwin (but they are on arm-darwin).  */
-#if defined (WINNT) \
+#if defined (WINNT) || defined (__DJGPP__) \
   || (defined (__APPLE__) && !(defined (__arm__) || defined (__arm64__)))
 	  file_names_case_sensitive_cache = 0;
 #else
@@ -576,7 +592,7 @@ __gnat_get_file_names_case_sensitive (void)
 int
 __gnat_get_env_vars_case_sensitive (void)
 {
-#if defined (WINNT)
+#if defined (WINNT) || defined (__DJGPP__)
  return 0;
 #else
  return 1;
@@ -1646,7 +1662,7 @@ __gnat_is_absolute_path (char *name, int length)
 #else
   return (length != 0) &&
  (*name == '/' || *name == DIR_SEPARATOR
-#if defined (WINNT)
+#if defined (WINNT) || defined(__DJGPP__)
   || (length > 1 && ISALPHA (name[0]) && name[1] == ':')
 #endif
 	  );
@@ -2234,7 +2250,7 @@ __gnat_portable_spawn (char *args[] ATTRIBUTE_UNUSED)
 #if defined (__vxworks) || defined(__PikeOS__)
   return -1;
 
-#elif defined (_WIN32)
+#elif defined (__DJGPP__) || defined (_WIN32)
   /* args[0] must be quotes as it could contain a full pathname with spaces */
   char *args_0 = args[0];
   args[0] = (char *)xmalloc (strlen (args_0) + 3);
@@ -2606,6 +2622,12 @@ __gnat_portable_no_block_spawn (char *args[] ATTRIBUTE_UNUSED)
   /* Not supported.  */
   return -1;
 
+#elif defined(__DJGPP__)
+  if (spawnvp (P_WAIT, args[0], args) != 0)
+return -1;
+  else
+return 0;
+
 #elif defined (_WIN32)
 
   HANDLE h = NULL;
@@ -2649,6 +2671,9 @@ __gnat_portable_wait (int *process_status)
 
   pid = win32_wait ();
 
+#elif defined (__DJGPP__)
+  /* Child process has already ended in case of DJGPP.
+ No need to do anything. Just return success. */
 #else
 
   pid = waitpid (-1, &status, 0);
-- 
2.7.4
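
For readers unfamiliar with the path rule touched in the
__gnat_is_absolute_path hunk above, here is a standalone sketch of the
check for the WINNT/DJGPP case. It is an illustration only: the function
name and the explicit dir_separator parameter (standing in for the
DIR_SEPARATOR macro) are assumptions made to keep the example
self-contained.

```cpp
#include <cassert>
#include <cctype>

// Sketch of the absolute-path test with MS-DOS style drive prefixes.
// dir_separator plays the role of DIR_SEPARATOR ('\\' for DJGPP).
inline bool is_absolute_path (const char *name, int length, char dir_separator)
{
  if (length == 0)
    return false;
  // A leading '/' or the target's directory separator is always absolute.
  if (*name == '/' || *name == dir_separator)
    return true;
  // Otherwise accept a drive prefix such as "C:".
  return length > 1
         && std::isalpha (static_cast<unsigned char> (name[0]))
         && name[1] == ':';
}

inline void demo ()
{
  assert (is_absolute_path ("C:\\GNAT", 7, '\\'));
  assert (!is_absolute_path ("src/a.adb", 9, '\\'));
}
```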



PING! Re: [PATCH, Fortran] Extension: COTAN and degree-valued trig intrinsics with -fdec-math

2016-10-10 Thread Fritz Reese
https://gcc.gnu.org/ml/fortran/2016-09/msg00163.html [original]
https://gcc.gnu.org/ml/fortran/2016-09/msg00183.html [latest]

On Wed, Sep 28, 2016 at 4:14 PM, Fritz Reese  wrote:
> Attached is a patch extending the GNU Fortran front-end to support
> some additional math intrinsics, enabled with a new compile flag
> -fdec-math. The flag adds the COTAN intrinsic (cotangent), as well as
> degree versions of all trigonometric intrinsics (SIND, TAND, ACOSD,
> etc...). This extension allows for further compatibility with legacy
> code that depends on the compiler to support such intrinsic functions.

Patch is still pending. Current draft of the patch is re-attached for
convenience, since it was amended twice since the original post. OK
for trunk?
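
For anyone unfamiliar with these intrinsics, their intended semantics can
be sketched as follows. This is only an illustration of the math written
in C++, not the front-end code from the patch; the namespace and helper
names are made up for the example.

```cpp
#include <cassert>
#include <cmath>

namespace sketch {

const double pi = 3.14159265358979323846;

// Unit conversions used by the degree-valued variants.
inline double radians (double deg) { return deg * pi / 180.0; }
inline double degrees (double rad) { return rad * 180.0 / pi; }

// Forward functions take degrees; inverse functions return degrees.
inline double sind (double x)   { return std::sin (radians (x)); }
inline double cosd (double x)   { return std::cos (radians (x)); }
inline double tand (double x)   { return std::tan (radians (x)); }
inline double asind (double x)  { return degrees (std::asin (x)); }
inline double atan2d (double y, double x) { return degrees (std::atan2 (y, x)); }

// COTAN takes radians; COTAND is the degree version.
inline double cotan (double x)  { return 1.0 / std::tan (x); }
inline double cotand (double x) { return 1.0 / tand (x); }

} // namespace sketch

inline void demo ()
{
  assert (std::fabs (sketch::sind (30.0) - 0.5) < 1e-12);
  assert (std::fabs (sketch::atan2d (1.0, 1.0) - 45.0) < 1e-12);
}
```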

---
Fritz Reese


2016-09-28  Fritz Reese  

 New flag -fdec-math for COTAN and degree trig intrinsics.

gcc/fortran/
* lang.opt: New flag -fdec-math.
* options.c (set_dec_flags): Enable with -fdec.
* invoke.texi, gfortran.texi, intrinsic.texi: Update documentation.
* intrinsics.c (add_functions, do_simplify): New intrinsics
with -fdec-math.
* gfortran.h (gfc_isym_id): New isym GFC_ISYM_COTAN.
* gfortran.h (gfc_resolve_atan2d, gfc_resolve_cotan,
gfc_resolve_trigd, gfc_resolve_atrigd): New prototypes.
* iresolve.c (resolve_trig_call, get_degrees, get_radians,
is_trig_resolved, gfc_resolve_cotan, gfc_resolve_trigd,
gfc_resolve_atrigd, gfc_resolve_atan2d): New functions.
* intrinsics.h (gfc_simplify_atan2d, gfc_simplify_atrigd,
gfc_simplify_cotan, gfc_simplify_trigd): New prototypes.
* simplify.c (simplify_trig_call, degrees_f, radians_f,
gfc_simplify_cotan, gfc_simplify_trigd, gfc_simplify_atrigd,
gfc_simplify_atan2d): New functions.

gcc/testsuite/gfortran.dg/
* dec_math.f90: New testsuite.
commit 126e89b660fad6b21f50c48e2af616225a727586
Author: Fritz Reese 
Date:   Wed Sep 28 16:11:23 2016 -0400

	New flag -fdec-math for COTAN and degree trig intrinsics.

	gcc/fortran/
	* lang.opt: New flag -fdec-math.
	* options.c (set_dec_flags): Enable with -fdec.
	* invoke.texi, gfortran.texi, intrinsic.texi: Update documentation.
	* intrinsics.c (add_functions, do_simplify): New intrinsics
	with -fdec-math.
	* gfortran.h (gfc_isym_id): New isym GFC_ISYM_COTAN.
	* gfortran.h (gfc_resolve_atan2d, gfc_resolve_cotan,
	gfc_resolve_trigd, gfc_resolve_atrigd): New prototypes.
	* iresolve.c (resolve_trig_call, get_degrees, get_radians,
	is_trig_resolved, gfc_resolve_cotan, gfc_resolve_trigd,
	gfc_resolve_atrigd, gfc_resolve_atan2d): New functions.
	* intrinsics.h (gfc_simplify_atan2d, gfc_simplify_atrigd,
	gfc_simplify_cotan, gfc_simplify_trigd): New prototypes.
	* simplify.c (simplify_trig_call, degrees_f, radians_f,
	gfc_simplify_cotan, gfc_simplify_trigd, gfc_simplify_atrigd,
	gfc_simplify_atan2d): New functions.

	gcc/testsuite/gfortran.dg/
	* dec_math.f90: New testsuite.

diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index d6b92a6..f8f3d4a 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -391,6 +391,7 @@ enum gfc_isym_id
   GFC_ISYM_CONVERSION,
   GFC_ISYM_COS,
   GFC_ISYM_COSH,
+  GFC_ISYM_COTAN,
   GFC_ISYM_COUNT,
   GFC_ISYM_CPU_TIME,
   GFC_ISYM_CSHIFT,
diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
index 3ebe3c7..a11eb84 100644
--- a/gcc/fortran/gfortran.texi
+++ b/gcc/fortran/gfortran.texi
@@ -1466,6 +1466,7 @@ without warning.
 * Form feed as whitespace::
 * TYPE as an alias for PRINT::
 * %LOC as an rvalue::
+* Extended math intrinsics::
 @end menu
 
 @node Old-style kind specifications
@@ -2519,6 +2520,42 @@ integer :: i
 call sub(%loc(i))
 @end smallexample
 
+@node Extended math intrinsics
+@subsection Extended math intrinsics
+@cindex intrinsics, math
+@cindex intrinsics, trigonometric functions
+
+GNU Fortran supports an extended list of mathematical intrinsics with the
+compile flag @option{-fdec-math} for compatibility with legacy code.
+These intrinsics are described fully in @ref{Intrinsic Procedures} where it is
+noted that they are extensions and should be avoided whenever possible.
+
+Specifically, @option{-fdec-math} enables the @ref{COTAN} intrinsic, and
+trigonometric intrinsics which accept or produce values in degrees instead of
+radians.  Here is a summary of the new intrinsics:
+
+@multitable @columnfractions .5 .5
+@headitem Radians @tab Degrees
+@item @code{@ref{ACOS}}   @tab @code{@ref{ACOSD}}*
+@item @code{@ref{ASIN}}   @tab @code{@ref{ASIND}}*
+@item @code{@ref{ATAN}}   @tab @code{@ref{ATAND}}*
+@item @code{@ref{ATAN2}}  @tab @code{@ref{ATAN2D}}*
+@item @code{@ref{COS}}    @tab @code{@ref{COSD}}*
+@item @code{@ref{COTAN}}* @tab @code{@ref{COTAND}}*
+@item @code{@ref{SIN}}@tab 

Re: [PATCH v4 0/6] Separate shrink-wrapping

2016-10-10 Thread Segher Boessenkool
Ping.

On Mon, Oct 03, 2016 at 01:48:17PM +, Segher Boessenkool wrote:
> I updated according to Jeff's latest comments (importantly, we cannot
> move a *logue in front of a move in general), and added some testcases.
> 
> Bootstrapping is in progress on today's trunk, powerpc64-linux and
> powerpc64le-linux.
> 
> Is this okay to commit now?
> 
> 
> Segher
> 
> 
> Segher Boessenkool (6):
>   separate shrink-wrap: New command-line flag, status flag, hooks, and
> doc
>   dce: Don't dead-code delete separately wrapped restores
>   regrename: Don't rename restores
>   shrink-wrap: Shrink-wrapping for separate components
>   rs6000: Separate shrink-wrapping
>   shrink-wrap: Testcases for separate shrink-wrapping
> 
>  gcc/common.opt |   4 +
>  gcc/config/rs6000/rs6000.c | 269 +++-
>  gcc/dce.c  |   9 +
>  gcc/doc/invoke.texi|  11 +-
>  gcc/doc/tm.texi|  63 ++
>  gcc/doc/tm.texi.in |  38 ++
>  gcc/emit-rtl.h |   4 +
>  gcc/function.c |  15 +-
>  gcc/regrename.c|   7 +
>  gcc/shrink-wrap.c  | 741 +
>  gcc/shrink-wrap.h  |   1 +
>  gcc/target.def |  57 ++
>  .../gcc.target/powerpc/shrink-wrap-separate-0.c|  22 +
>  .../gcc.target/powerpc/shrink-wrap-separate-1.c|  18 +
>  .../gcc.target/powerpc/shrink-wrap-separate-2.c|  26 +
>  15 files changed, 1265 insertions(+), 20 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-0.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-1.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-2.c
> 
> -- 
> 1.9.3


[patch, fortran, committed] Fix PR 77915

2016-10-10 Thread Thomas Koenig

Hello world,

I have committed the attached patch to trunk as obvious after
regression-testing. Will commit to gcc6 soon.

Regards

Thomas

2016-10-10  Thomas Koenig  

PR fortran/77915
* frontend-passes.c (inline_matmul_assign):  Return early if
inside a FORALL statement.

2016-10-10  Thomas Koenig  

PR fortran/77915
* gfortran.dg/matmul_11.f90:  New test.
Index: frontend-passes.c
===
--- frontend-passes.c	(Revision 240927)
+++ frontend-passes.c	(Arbeitskopie)
@@ -2857,6 +2857,11 @@ inline_matmul_assign (gfc_code **c, int *walk_subt
   if (in_where)
 return 0;
 
+  /* The BLOCKS generated for the temporary variables and FORALL don't
+ mix.  */
+  if (forall_level > 0)
+return 0;
+
   /* For now don't do anything in OpenMP workshare, it confuses
  its translation, which expects only the allowed statements in there.
  We should figure out how to parallelize this eventually.  */
! { dg-do compile }
! { dg-options "-ffrontend-optimize -fdump-tree-original" }
! PR 77915 - ICE of matmul with forall.
program x
  integer, parameter :: d = 3
  real,dimension(d,d,d) :: cube,xcube
  real, dimension(d,d) :: cmatrix
  integer :: i,j
  forall(i=1:d,j=1:d)
 xcube(i,j,:) = matmul(cmatrix,cube(i,j,:))
  end forall
end program x

! { dg-final { scan-tree-dump-times "_gfortran_matmul" 1 "original" } }


[avr,committed] Include string.h in gen-avr-mmcu-texi.c

2016-10-10 Thread Georg-Johann Lay

https://gcc.gnu.org/r240925
https://gcc.gnu.org/r240926
https://gcc.gnu.org/r240927

gen-avr-mmcu-texi.c was missing the inclusion of string.h (for strcmp).
Applied as obvious.



Johann

* config/avr/gen-avr-mmcu-texi.c (string.h): Include.

Index: config/avr/gen-avr-mmcu-texi.c
===
--- config/avr/gen-avr-mmcu-texi.c  (revision 240924)
+++ config/avr/gen-avr-mmcu-texi.c  (revision 240925)
@@ -19,6 +19,7 @@

 #include 
 #include 
+#include <string.h>

 #define IN_GEN_AVR_MMCU_TEXI



Re: Compile-time improvement for if conversion.

2016-10-10 Thread Yuri Rumyantsev
Richard,

If a "fake" exit or entry block is created in dominance.c, how can we
determine its only predecessor or successor without using
the notion of a loop?

2016-10-10 15:00 GMT+03:00 Richard Biener :
> On Mon, Oct 10, 2016 at 1:42 PM, Yuri Rumyantsev  wrote:
>> Thanks Richard for your comments.
>> I'd like to respond to your last comment about using split_edge()
>> instead of creating a fake post-header. I started with this splitting,
>> but it requires fixing up closed SSA form by creating additional phi
>> nodes, so I decided to use only a CFG change without updating SSA form.
>> The other changes look reasonable and I will make them.
>
> Ah.  In this case can you investigate what it takes to make the entry/exit
> edges rather than BBs?  That is, introduce those "fakes" only internally
> in dominance.c?
>
>> 2016-10-10 12:52 GMT+03:00 Richard Biener :
>>> On Wed, Oct 5, 2016 at 3:22 PM, Yuri Rumyantsev  wrote:
>>>> Hi All,
>>>>
>>>> Here is an implementation of Richard's proposal:
>>>>
>>>> < For general infrastructure it would be nice to expose a (post-)dominator
>>>> < compute for MESE (post-dominators) / SEME (dominators) regions.  I believe
>>>> < what makes if-conversion expensive is the post-dom compute which happens
>>>> < for each loop for the whole function.  It shouldn't be very difficult
>>>> < to write this,
>>>> < sharing as much as possible code with the current DOM code might need
>>>> < quite some refactoring though.
>>>>
>>>> I implemented this proposal by adding calculation of dominance info
>>>> for SESE regions and incorporating this change into the if-conversion pass.
>>>> The SESE region is built by adding the loop pre-header and possibly a fake
>>>> post-header block to the loop body. The fake post-header is deleted after
>>>> predication completes.
>>>>
>>>> Bootstrapping and regression testing did not show any new failures.
>>>>
>>>> Is it OK for trunk?
>>>
>>> It's mostly reasonable but I have a few comments.  First, re-using
>>> bb->dom[] for the dominator info is somewhat fragile but indeed
>>> a requirement to make the patch reasonably small.  Please,
>>> in calculate_dominance_info_for_region, make sure that
>>> !dom_info_available_p (dir).
>>>
>>> You pass loop * everywhere but require ->aux to be set up as
>>> an array of BBs forming the region with special BBs at array ends.
>>>
>>> Please instead pass in a vec<basic_block> which avoids using ->aux
>>> and also allows other non-loop-based SESE regions to be used
>>> (I couldn't spot anything that relies on this being a loop).
>>>
>>> Adding a convenience wrapper for loop  * would be of course nice,
>>> to cover the special pre/post-header code in tree-if-conv.c.
>>>
>>> In theory a SESE region is fully specified by its entry and exit _edge_,
>>> so you might want to see if it's possible to use such a pair of edges
>>> to guard the dfs/idom walks to avoid the need to create fake blocks.
>>>
>>> Btw, instead of using create_empty_bb, unchecked_make_edge, etc.
>>> please use split_edge() of the entry/exit edges.
>>>
>>> Richard.
>>>
>>>> ChangeLog:
>>>> 2016-10-05  Yuri Rumyantsev  
>>>>
>>>> * dominance.c : Include cfgloop.h for loop recognition.
>>>> (dom_info): Add new functions and add boolean argument to recognize
>>>> computation for loop region.
>>>> (dom_info::dom_info): New function.
>>>> (dom_info::calc_dfs_tree): Add boolean argument IN_REGION to not
>>>> handle unvisited blocks.
>>>> (dom_info::calc_idoms): Likewise.
>>>> (compute_dom_fast_query_in_region): New function.
>>>> (calculate_dominance_info): Invoke calc_dfs_tree and calc_idoms with
>>>> false argument.
>>>> (calculate_dominance_info_for_region): New function.
>>>> (free_dominance_info_for_region): Likewise.
>>>> (verify_dominators): Invoke calc_dfs_tree and calc_idoms with false
>>>> argument.
>>>> * dominance.h: Add prototype for introduced functions
>>>> calculate_dominance_info_for_region and
>>>> free_dominance_info_for_region.
>>>> * tree-if-conv.c: Add to local variables ifc_sese_bbs & fake_postheader.
>>>> (build_sese_region): New function.
>>>> (if_convertible_loop_p_1): Invoke local version of post-dominators
>>>> calculation, free it after basic block predication and delete created
>>>> fake post-header block if any.
>>>> (tree_if_conversion): Delete call of free_dominance_info for
>>>> post-dominators, free ifc_sese_bbs which represents SESE region.
>>>> (pass_if_conversion::execute): Delete detection of infinite loops
>>>> and fake edges to exit block since post-dominator calculation is
>>>> performed per if-converted loop only.
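
To make the idea in this thread concrete, here is a small standalone sketch
of computing dominators for a region given as a vector of blocks, ignoring
everything outside it. It is a toy under stated assumptions: the CFG
representation (integer block ids with predecessor lists) and all names
are hypothetical, and it uses the textbook iterative data-flow formulation
rather than the algorithm in dominance.c.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CFG: preds[b] lists the predecessor block ids of block b.
typedef std::vector<std::vector<int> > pred_lists;

// Compute dominator sets for the blocks in REGION only.  REGION[0] must be
// the single entry block.  In the result, dom[i][j] is true when REGION[j]
// dominates REGION[i].  Edges from blocks outside the region are ignored.
inline std::vector<std::vector<bool> >
region_dominators (const pred_lists &preds, const std::vector<int> &region)
{
  const std::size_t n = region.size ();

  // Map a block id to its position in REGION, or -1 if it is outside.
  std::vector<int> pos (preds.size (), -1);
  for (std::size_t i = 0; i < n; ++i)
    pos[region[i]] = static_cast<int> (i);

  // Start from "every block dominates every block", except for the entry,
  // which is dominated only by itself.
  std::vector<std::vector<bool> > dom (n, std::vector<bool> (n, true));
  dom[0].assign (n, false);
  dom[0][0] = true;

  // Iterate dom(b) = {b} + intersection of dom(p) over in-region preds p
  // until nothing changes.
  bool changed = true;
  while (changed)
    {
      changed = false;
      for (std::size_t i = 1; i < n; ++i)
        {
          std::vector<bool> meet (n, true);
          const std::vector<int> &ps = preds[region[i]];
          for (std::size_t k = 0; k < ps.size (); ++k)
            {
              const int p = pos[ps[k]];
              if (p < 0)
                continue;  // predecessor outside the region
              for (std::size_t j = 0; j < n; ++j)
                meet[j] = meet[j] && dom[p][j];
            }
          meet[i] = true;
          if (meet != dom[i])
            {
              dom[i] = meet;
              changed = true;
            }
        }
    }
  return dom;
}
```

Post-dominators for the same region would follow by passing successor
lists instead of predecessor lists and putting the exit block first.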


[patch,avr] Use avr-passes.def to register passes.

2016-10-10 Thread Georg-Johann Lay
This is a code clean-up using the new -passes.def feature in order to 
register avr target passes and to get -fdump-xxx etc. to work for such passes.


Ok for trunk?

Johann

* config/avr/avr-passes.def: New file.
* config/avr/t-avr (PASSES_EXTRA): Add avr-passes.def.
* config/avr/avr-protos.h (gcc::context, rtl_opt_pass): Declare.
(make_avr_pass_recompute_notes): New proto.
* config/avr/avr.c (make_avr_pass_recompute_notes): New function.
(avr_pass_recompute_notes): Use anonymous namespace.
(avr_register_passes): Remove function...
(avr_option_override): ...and its call.
Index: config/avr/avr-passes.def
===
--- config/avr/avr-passes.def	(nonexistent)
+++ config/avr/avr-passes.def	(working copy)
@@ -0,0 +1,28 @@
+/* Description of target passes for AVR.
+   Copyright (C) 2016 Free Software Foundation, Inc. */
+
+/* This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it under
+   the terms of the GNU General Public License as published by the Free
+   Software Foundation; either version 3, or (at your option) any later
+   version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or
+   FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+   for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   <http://www.gnu.org/licenses/>.  */
+
+/* This avr-specific pass (re)computes insn notes, in particular REG_DEAD
+   notes which are used by `avr.c::reg_unused_after' and branch offset
+   computations.  These notes must be correct, i.e. there must be no
+   dangling REG_DEAD notes; otherwise wrong code might result, cf. PR64331.
+
+   DF needs (correct) CFG, hence right before free_cfg is the last
+   opportunity to rectify notes.  */
+
+INSERT_PASS_BEFORE (pass_free_cfg, 1, avr_pass_recompute_notes);
Index: config/avr/avr-protos.h
===
--- config/avr/avr-protos.h	(revision 240915)
+++ config/avr/avr-protos.h	(working copy)
@@ -154,6 +154,11 @@ extern void asm_output_float (FILE *file
 
 extern bool avr_have_dimode;
 
+namespace gcc { class context; }
+class rtl_opt_pass;
+
+extern rtl_opt_pass *make_avr_pass_recompute_notes (gcc::context *);
+
 /* From avr-log.c */
 
 #define avr_dump(...) avr_vdump (NULL, __FUNCTION__, __VA_ARGS__)
Index: config/avr/avr.c
===
--- config/avr/avr.c	(revision 240915)
+++ config/avr/avr.c	(working copy)
@@ -295,6 +295,7 @@ avr_to_int_mode (rtx x)
 : simplify_gen_subreg (int_mode_for_mode (mode), x, mode, 0);
 }
 
+namespace {
 
 static const pass_data avr_pass_data_recompute_notes =
 {
@@ -328,20 +329,12 @@ public:
   }
 }; // avr_pass_recompute_notes
 
+} // anon namespace
 
-static void
-avr_register_passes (void)
+rtl_opt_pass*
+make_avr_pass_recompute_notes (gcc::context *ctxt)
 {
-  /* This avr-specific pass (re)computes insn notes, in particular REG_DEAD
- notes which are used by `avr.c::reg_unused_after' and branch offset
- computations.  These notes must be correct, i.e. there must be no
- dangling REG_DEAD notes; otherwise wrong code might result, cf. PR64331.
-
- DF needs (correct) CFG, hence right before free_cfg is the last
- opportunity to rectify notes.  */
-
-  register_pass (new avr_pass_recompute_notes (g, "avr-notes-free-cfg"),
- PASS_POS_INSERT_BEFORE, "*free_cfg", 1);
+  return new avr_pass_recompute_notes (ctxt, "avr-notes-free-cfg");
 }
 
 
@@ -464,11 +457,6 @@ avr_option_override (void)
   init_machine_status = avr_init_machine_status;
 
   avr_log_set_avr_log();
-
-  /* Register some avr-specific pass(es).  There is no canonical place for
- pass registration.  This function is convenient.  */
-
-  avr_register_passes ();
 }
 
 /* Function to set up the backend function structure.  */
Index: config/avr/t-avr
===
--- config/avr/t-avr	(revision 240915)
+++ config/avr/t-avr	(working copy)
@@ -16,6 +16,8 @@
 # along with GCC; see the file COPYING3.  If not see
 # <http://www.gnu.org/licenses/>.
 
+PASSES_EXTRA += $(srcdir)/config/avr/avr-passes.def
+
 driver-avr.o: $(srcdir)/config/avr/driver-avr.c \
   $(CONFIG_H) $(SYSTEM_H) coretypes.h \
   $(srcdir)/config/avr/avr-arch.h $(TM_H)


[PATCH] [ARC] New option handling, refurbish multilib support.

2016-10-10 Thread Claudiu Zissulescu
Hi Andrew,

This is an updated version of the patch originally sent to the mailing list
a while ago.

What is new:
 - Do not use MULTILIB_REUSE, as its semantics changed and the old usage was
causing issues while building.
 - Update invoke.texi documentation, adding the nps400 option to mcpu.

This patch is important as it changes the way we handle CPU
variations and multilib support. It would be great if you could include
this patch on your review list before any other one.

Thanks,
Claudiu

gcc/
2016-05-09  Claudiu Zissulescu  

* config/arc/arc-arch.h: New file.
* config/arc/arc-arches.def: Likewise.
* config/arc/arc-cpus.def: Likewise.
* config/arc/arc-options.def: Likewise.
* config/arc/t-multilib: Likewise.
* config/arc/genmultilib.awk: Likewise.
* config/arc/genoptions.awk: Likewise.
* config/arc/arc-tables.opt: Likewise.
* config/arc/driver-arc.c: Likewise.
* common/config/arc/arc-common.c (arc_handle_option): Trace
toggled options.
* config.gcc (arc*-*-*): Add arc-tables.opt to arc's extra
options; check for supported cpu against arc-cpus.def file.
(arc*-*-elf*, arc*-*-linux-uclibc*): Use new make fragment; define
TARGET_CPU_BUILD macro; add driver-arc.o as an extra object.
* config/arc/arc-c.def: Add emacs local variables.
* config/arc/arc-opts.h (processor_type): Use arc-cpus.def file.
(FPU_FPUS, FPU_FPUD, FPU_FPUDA, FPU_FPUDA_DIV, FPU_FPUDA_FMA)
(FPU_FPUDA_ALL, FPU_FPUS_DIV, FPU_FPUS_FMA, FPU_FPUS_ALL)
(FPU_FPUD_DIV, FPU_FPUD_FMA, FPU_FPUD_ALL): New defines.
(DEFAULT_arc_fpu_build): Define.
(DEFAULT_arc_mpy_option): Define.
* config/arc/arc-protos.h (arc_init): Delete.
* config/arc/arc.c (arc_cpu_name): New variable.
(arc_selected_cpu, arc_selected_arch, arc_arcem, arc_archs)
(arc_arc700, arc_arc600, arc_arc601): New variable.
(arc_init): Add static; remove selection of default tune value,
cleanup obsolete error messages.
(arc_override_options): Make use of .def files for selecting the
right cpu and option configurations.
* config/arc/arc.h (stdbool.h): Include.
(TARGET_CPU_DEFAULT): Define.
(CPP_SPEC): Remove mcpu=NPS400 handling.
(arc_cpu_to_as): Declare.
(EXTRA_SPEC_FUNCTIONS): Define.
(OPTION_DEFAULT_SPECS): Likewise.
(ASM_DEFAULT): Remove.
(ASM_SPEC): Use arc_cpu_to_as.
(DRIVER_SELF_SPECS): Remove deprecated options.
(arc_arc700, arc_arc600, arc_arc601, arc_arcem, arc_archs):
Declare.
(TARGET_ARC600, TARGET_ARC601, TARGET_ARC700, TARGET_EM)
(TARGET_HS, TARGET_V2, TARGET_ARC600): Make them use arc_arc*
variables.
(MULTILIB_DEFAULTS): Use ARC_MULTILIB_CPU_DEFAULT.
* config/arc/arc.md (attr_cpu): Remove.
* config/arc/arc.opt (arc_mpy_option): Make it target variable.
(mno-mpy): Deprecate.
(mcpu=ARC600, mcpu=ARC601, mcpu=ARC700, mcpu=NPS400, mcpu=ARCEM)
(mcpu=ARCHS): Remove.
(mcrc, mdsp-packa, mdvbf, mmac-d16, mmac-24, mtelephony, mrtsc):
Deprecate.
(mbarrel_shifte, mspfp_, mdpfp_, mdsp_pack, mmac_): Remove.
(arc_fpu): Use new defines.
(arc_seen_options): New target variable.
* config/arc/t-arc (driver-arc.o): New target.
(arc-cpus, t-multilib, arc-tables.opt): Likewise.
* config/arc/t-arc-newlib: Delete.
* config/arc/t-arc-uClibc: Renamed to t-uClibc.
* doc/invoke.texi (ARC): Update arc options.
---
 gcc/common/config/arc/arc-common.c | 162 -
 gcc/config.gcc |  47 +
 gcc/config/arc/arc-arch.h  | 120 ++
 gcc/config/arc/arc-arches.def  |  35 +++
 gcc/config/arc/arc-c.def   |   4 +
 gcc/config/arc/arc-cpus.def|  47 +
 gcc/config/arc/arc-options.def |  69 +
 gcc/config/arc/arc-opts.h  |  47 +++--
 gcc/config/arc/arc-protos.h|   1 -
 gcc/config/arc/arc-tables.opt  |  90 
 gcc/config/arc/arc.c   | 186 ++---
 gcc/config/arc/arc.h   |  91 -
 gcc/config/arc/arc.md  |   5 -
 gcc/config/arc/arc.opt | 109 ++--
 gcc/config/arc/driver-arc.c|  80 +++
 gcc/config/arc/genmultilib.awk | 203 +
 gcc/config/arc/genoptions.awk  |  86 
 gcc/config/arc/t-arc   |  19 
 gcc/config/arc/t-arc-newlib|  46 -
 gcc/config/arc/t-arc-uClibc|  20 
 gcc/config/arc/t-multilib  |  34 +++
 gcc/config/arc/t-uClibc|  20 
 gcc/doc/invoke.texi|  90 +---
 23 files changed, 1235 insertions(+), 376 

Re: [PATCH, ARM 5/7] Add support for MOVT/MOVW to ARMv8-M Baseline

2016-10-10 Thread Christophe Lyon
Hi Thomas,

On 13 July 2016 at 17:34, Thomas Preudhomme
 wrote:
> On Wednesday 13 July 2016 17:14:52 Christophe Lyon wrote:
>> Hi Thomas,
>
> Hi Christophe,
>
>>
>> I'm seeing:
>> gcc.target/arm/pr42574.c: syntax error in target selector
>> "arm_thumb1_ok && { ! arm_thumb1_movt_ok }" for " dg-do 1 compile {
>> arm_thumb1_ok && { ! arm_thumb1_movt_ok } } "
>
> Oops. I remember the trial and error to find the right amount of curly braces
> yet I can indeed reproduce the error now. The target keyword is missing. I'll
> submit a patch asap.
>
> Best regards,
>
> Thomas

I've noticed that the new test
  gcc.target/arm/movdi_movw.c scan-assembler-times movw\tr0, #61680 1
fails on armeb-none-linux-gnueabihf
--with-mode=thumb --with-cpu=cortex-a9 --with-fpu=neon-fp16

The other new tests pass, and using --with-mode=arm makes all three
of them unsupported.

Sorry I missed it when I reported the other error.

Can you have a look?

Thanks

Christophe


[PATCH] Implement LWG 2192 and LWG 2294 for std::abs

2016-10-10 Thread Jonathan Wakely

It looks like I forgot to send this patch to the lists last month.

This implements the requirement that all overloads of std::abs are
declared by either of <cmath> or <cstdlib>. This ensures that
including only one of those headers and calling std::abs doesn't cause
conversions from integers to floating point types, or vice versa.
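
As a rough illustration of the user-visible effect (not part of the patch), the sketch below includes only <cstdlib> and checks that a double argument still selects the floating-point overload. The function name magnitude is hypothetical, and the checks assume a standard library that implements LWG 2192.

```cpp
#include <cassert>
#include <cstdlib>      // deliberately the only header here that declares std::abs
#include <type_traits>

// With every overload visible, a double argument selects abs(double)
// instead of being converted to an integer type.
static_assert(std::is_same<decltype(std::abs(-2.5)), double>::value,
              "abs(double) is declared by <cstdlib>");
static_assert(std::is_same<decltype(std::abs(-3L)), long>::value,
              "abs(long) is declared by <cstdlib>");

// Hypothetical helper used only for this illustration.
inline double magnitude(double x)
{
  return std::abs(x);
}
```

Calling magnitude(-2.5) is expected to give 2.5 rather than a truncated integer value.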

* doc/xml/manual/intro.xml: Document LWG 2192 changes.
* doc/html/*: Regenerate.
* include/Makefile.am: Add bits/std_abs.h.
* include/Makefile.in: Regenerate.
* include/bits/std_abs.h: New header defining all required overloads
of std::abs in one place (LWG 2294).
* include/c_global/cmath (abs(double), abs(float), abs(long double)):
Move to bits/std_abs.h.
(abs<_Tp>(_Tp)): Remove.
* include/c_global/cstdlib (abs(long), abs(long long), abs(__int)):
Move to bits/std_abs.h.
* testsuite/26_numerics/headers/cmath/dr2192.cc: New test.
* testsuite/26_numerics/headers/cmath/dr2192_neg.cc: New test.
* testsuite/26_numerics/headers/cstdlib/dr2192.cc: New test.
* testsuite/26_numerics/headers/cstdlib/dr2192_neg.cc: New test.

Tested on ppc64le and x86_64 GNU/Linux, and committed to trunk on 30
September.

commit 9e441fcfca8a90e72a3cdbed42303dc2353b3da2
Author: redi 
Date:   Fri Sep 30 16:07:43 2016 +

Implement LWG 2192 and LWG 2294 for std::abs

* doc/xml/manual/intro.xml: Document LWG 2192 changes.
* doc/html/*: Regenerate.
* include/Makefile.am: Add bits/std_abs.h.
* include/Makefile.in: Regenerate.
* include/bits/std_abs.h: New header defining all required overloads
of std::abs in one place (LWG 2294).
* include/c_global/cmath (abs(double), abs(float), abs(long double)):
Move to bits/std_abs.h.
(abs<_Tp>(_Tp)): Remove.
* include/c_global/cstdlib (abs(long), abs(long long), abs(__int)):
Move to bits/std_abs.h.
* testsuite/26_numerics/headers/cmath/dr2192.cc: New test.
* testsuite/26_numerics/headers/cmath/dr2192_neg.cc: New test.
* testsuite/26_numerics/headers/cstdlib/dr2192.cc: New test.
* testsuite/26_numerics/headers/cstdlib/dr2192_neg.cc: New test.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@240660 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/libstdc++-v3/doc/xml/manual/intro.xml 
b/libstdc++-v3/doc/xml/manual/intro.xml
index 238ab24..4747851 100644
--- a/libstdc++-v3/doc/xml/manual/intro.xml
+++ b/libstdc++-v3/doc/xml/manual/intro.xml
@@ -940,6 +940,13 @@ requirements of the license of GCC.
 Add emplace and emplace_back 
member functions.
 
 
+http://www.w3.org/1999/xlink; 
xlink:href="../ext/lwg-defects.html#2192">2192:
+   Validity and return type of std::abs(0u) is 
unclear
+
+Move all declarations to a common header and remove the
+generic abs which accepted unsigned arguments.
+
+
 http://www.w3.org/1999/xlink; 
xlink:href="../ext/lwg-defects.html#2196">2196:
Specification of 
is_*[copy/move]_[constructible/assignable] unclear for 
non-referencable types
 
diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index 7782258..4e63fbb 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Makefile.am
@@ -159,6 +159,7 @@ bits_headers = \
${bits_srcdir}/shared_ptr_base.h \
${bits_srcdir}/slice_array.h \
${bits_srcdir}/sstream.tcc \
+   ${bits_srcdir}/std_abs.h \
${bits_srcdir}/std_mutex.h \
${bits_srcdir}/stl_algo.h \
${bits_srcdir}/stl_algobase.h \
diff --git a/libstdc++-v3/include/bits/std_abs.h 
b/libstdc++-v3/include/bits/std_abs.h
new file mode 100644
index 000..ab0f980
--- /dev/null
+++ b/libstdc++-v3/include/bits/std_abs.h
@@ -0,0 +1,107 @@
+// -*- C++ -*- C library enhancements header.
+
+// Copyright (C) 2016 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// .
+
+/** 

[PATCH] LWG 2733, LWG 2759 reject bool in gcd and lcm

2016-10-10 Thread Jonathan Wakely

These DRs are only in Tentatively Ready status, but they're not
controversial so implementing them immediately seems sensible.

The deleted function is sufficient, but the static assertions are more
user-friendly (and are only tested once, not in every recursive call
to __gcd or __lcm).
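
A minimal sketch of the static-assertion approach, written as a free-standing function rather than the libstdc++ implementation. The name checked_gcd is hypothetical, and the sketch assumes non-negative arguments to keep the Euclidean loop short (C++14 or later for the constexpr loop).

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for gcd: the assertions fire once, at the public
// entry point, and produce a readable diagnostic for bool arguments.
template<typename M, typename N>
constexpr std::common_type_t<M, N>
checked_gcd(M m, N n)
{
  static_assert(std::is_integral<M>::value, "gcd arguments are integers");
  static_assert(std::is_integral<N>::value, "gcd arguments are integers");
  static_assert(!std::is_same<M, bool>::value, "gcd arguments are not bools");
  static_assert(!std::is_same<N, bool>::value, "gcd arguments are not bools");

  using R = std::common_type_t<M, N>;
  R a = static_cast<R>(m);   // assumption: m >= 0
  R b = static_cast<R>(n);   // assumption: n >= 0
  while (b != 0)
    {
      R t = a % b;
      a = b;
      b = t;
    }
  return a;
}

static_assert(checked_gcd(12, 18) == 6, "usable in constant expressions");
```

With this in place, a call such as checked_gcd(true, 4) should be rejected at compile time with the "not bools" message.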

* include/experimental/numeric (gcd, lcm): Make bool arguments
ill-formed.
* include/std/numeric (gcd, lcm): Likewise.
* testsuite/26_numerics/gcd/gcd_neg.cc: New test.
* testsuite/26_numerics/lcm/lcm_neg.cc: New test.

Tested x86_64-linux, committed to trunk.

commit a785026d8d928a1492daf6919a57d6cda714f714
Author: Jonathan Wakely 
Date:   Mon Oct 10 11:58:27 2016 +0100

LWG 2733, LWG 2759 reject bool in gcd and lcm

* include/experimental/numeric (gcd, lcm): Make bool arguments
ill-formed.
* include/std/numeric (gcd, lcm): Likewise.
* testsuite/26_numerics/gcd/gcd_neg.cc: New test.
* testsuite/26_numerics/lcm/lcm_neg.cc: New test.

diff --git a/libstdc++-v3/include/experimental/numeric 
b/libstdc++-v3/include/experimental/numeric
index 6d1dc21..0ce4bda 100644
--- a/libstdc++-v3/include/experimental/numeric
+++ b/libstdc++-v3/include/experimental/numeric
@@ -57,8 +57,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 constexpr common_type_t<_Mn, _Nn>
 gcd(_Mn __m, _Nn __n)
 {
-  static_assert(is_integral<_Mn>::value, "arguments to gcd are integers");
-  static_assert(is_integral<_Nn>::value, "arguments to gcd are integers");
+  static_assert(is_integral<_Mn>::value, "gcd arguments are integers");
+  static_assert(is_integral<_Nn>::value, "gcd arguments are integers");
+  static_assert(!is_same<_Mn, bool>::value, "gcd arguments are not bools");
+  static_assert(!is_same<_Nn, bool>::value, "gcd arguments are not bools");
   return std::__detail::__gcd(__m, __n);
 }
 
@@ -67,8 +69,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 constexpr common_type_t<_Mn, _Nn>
 lcm(_Mn __m, _Nn __n)
 {
-  static_assert(is_integral<_Mn>::value, "arguments to lcm are integers");
-  static_assert(is_integral<_Nn>::value, "arguments to lcm are integers");
+  static_assert(is_integral<_Mn>::value, "lcm arguments are integers");
+  static_assert(is_integral<_Nn>::value, "lcm arguments are integers");
+  static_assert(!is_same<_Mn, bool>::value, "lcm arguments are not bools");
+  static_assert(!is_same<_Nn, bool>::value, "lcm arguments are not bools");
   return std::__detail::__lcm(__m, __n);
 }
 
diff --git a/libstdc++-v3/include/std/numeric b/libstdc++-v3/include/std/numeric
index 7b1ab98..4414081 100644
--- a/libstdc++-v3/include/std/numeric
+++ b/libstdc++-v3/include/std/numeric
@@ -96,6 +96,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 __abs_integral(_Tp __val)
 { return __val; }
 
+  void __abs_integral(bool) = delete;
+
   template<typename _Mn, typename _Nn>
 constexpr common_type_t<_Mn, _Nn>
 __gcd(_Mn __m, _Nn __n)
@@ -129,8 +131,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 constexpr common_type_t<_Mn, _Nn>
 gcd(_Mn __m, _Nn __n)
 {
-  static_assert(is_integral<_Mn>::value, "arguments to gcd are integers");
-  static_assert(is_integral<_Nn>::value, "arguments to gcd are integers");
+  static_assert(is_integral<_Mn>::value, "gcd arguments are integers");
+  static_assert(is_integral<_Nn>::value, "gcd arguments are integers");
+  static_assert(!is_same<_Mn, bool>::value, "gcd arguments are not bools");
+  static_assert(!is_same<_Nn, bool>::value, "gcd arguments are not bools");
   return __detail::__gcd(__m, __n);
 }
 
@@ -140,8 +144,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 constexpr common_type_t<_Mn, _Nn>
 lcm(_Mn __m, _Nn __n)
 {
-  static_assert(is_integral<_Mn>::value, "arguments to lcm are integers");
-  static_assert(is_integral<_Nn>::value, "arguments to lcm are integers");
+  static_assert(is_integral<_Mn>::value, "lcm arguments are integers");
+  static_assert(is_integral<_Nn>::value, "lcm arguments are integers");
+  static_assert(!is_same<_Mn, bool>::value, "lcm arguments are not bools");
+  static_assert(!is_same<_Nn, bool>::value, "lcm arguments are not bools");
   return __detail::__lcm(__m, __n);
 }
 
diff --git a/libstdc++-v3/testsuite/26_numerics/gcd/gcd_neg.cc 
b/libstdc++-v3/testsuite/26_numerics/gcd/gcd_neg.cc
new file mode 100644
index 000..231ce8d
--- /dev/null
+++ b/libstdc++-v3/testsuite/26_numerics/gcd/gcd_neg.cc
@@ -0,0 +1,39 @@
+// Copyright (C) 2016 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; 

[PATCH] Define std::allocator::is_always_equal

2016-10-10 Thread Jonathan Wakely

I somehow only added the is_always_equal nested typedef to the
allocator<void> specialization, not the primary template. All the
containers still do the right thing, because they use
allocator_traits<allocator<T>>::is_always_equal which gives the right
answer, but we still need to provide allocator<T>::is_always_equal to
be conforming.
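
To show why the containers were unaffected, here is a small sketch (C++17) of how allocator_traits answers the question even when the allocator has no such member. StatelessAlloc and ArenaAlloc are hypothetical allocators invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>

// Hypothetical allocator with no is_always_equal member: allocator_traits
// falls back to std::is_empty, which is true for this empty class.
template<typename T>
struct StatelessAlloc
{
  using value_type = T;
  T* allocate(std::size_t n)
  { return static_cast<T*>(::operator new(n * sizeof(T))); }
  void deallocate(T* p, std::size_t) noexcept
  { ::operator delete(p); }
};

// Hypothetical stateful allocator that declares the member explicitly,
// like uneq_allocator in the testsuite.
template<typename T>
struct ArenaAlloc
{
  using value_type = T;
  using is_always_equal = std::false_type;
  int arena_id = 0;
  T* allocate(std::size_t n)
  { return static_cast<T*>(::operator new(n * sizeof(T))); }
  void deallocate(T* p, std::size_t) noexcept
  { ::operator delete(p); }
};

template<typename Alloc>
constexpr bool always_equal()
{ return std::allocator_traits<Alloc>::is_always_equal::value; }
```

The query goes through allocator_traits rather than the allocator itself, which is what the containers do.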

* include/bits/allocator.h (allocator::is_always_equal): Define.
* testsuite/20_util/allocator/requirements/typedefs.cc: Test for
is_always_equal.
* testsuite/util/testsuite_allocator.h
(uneq_allocator::is_always_equal): Define as false_type.

Tested powerpc64le-linux, committed to trunk.


commit f5020f0fa1dc815eda37d8b1040e7c16f1554114
Author: Jonathan Wakely 
Date:   Mon Oct 10 12:04:24 2016 +0100

Define std::allocator::is_always_equal

* include/bits/allocator.h (allocator::is_always_equal): Define.
* testsuite/20_util/allocator/requirements/typedefs.cc: Test for
is_always_equal.
* testsuite/util/testsuite_allocator.h
(uneq_allocator::is_always_equal): Define as false_type.

diff --git a/libstdc++-v3/include/bits/allocator.h 
b/libstdc++-v3/include/bits/allocator.h
index 984d800..8e78165 100644
--- a/libstdc++-v3/include/bits/allocator.h
+++ b/libstdc++-v3/include/bits/allocator.h
@@ -50,6 +50,9 @@
 #endif
 
 #define __cpp_lib_incomplete_container_elements 201505
+#if __cplusplus >= 201103L
+# define __cpp_lib_allocator_is_always_equal 201411
+#endif
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
@@ -80,7 +83,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // 2103. std::allocator propagate_on_container_move_assignment
   typedef true_type propagate_on_container_move_assignment;
 
-#define __cpp_lib_allocator_is_always_equal 201411
   typedef true_type is_always_equal;
 #endif
 };
@@ -113,6 +115,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // _GLIBCXX_RESOLVE_LIB_DEFECTS
   // 2103. std::allocator propagate_on_container_move_assignment
   typedef true_type propagate_on_container_move_assignment;
+
+  typedef true_type is_always_equal;
 #endif
 
   allocator() throw() { }
diff --git a/libstdc++-v3/testsuite/20_util/allocator/requirements/typedefs.cc 
b/libstdc++-v3/testsuite/20_util/allocator/requirements/typedefs.cc
index 028daa9..1b3f14f 100644
--- a/libstdc++-v3/testsuite/20_util/allocator/requirements/typedefs.cc
+++ b/libstdc++-v3/testsuite/20_util/allocator/requirements/typedefs.cc
@@ -48,3 +48,6 @@ static_assert( is_same::value,
"propagate_on_container_move_assignment" );
+
+static_assert( is_same::value,
+   "is_always_equal" );
diff --git a/libstdc++-v3/testsuite/util/testsuite_allocator.h 
b/libstdc++-v3/testsuite/util/testsuite_allocator.h
index 8537a83..dd7e22d 100644
--- a/libstdc++-v3/testsuite/util/testsuite_allocator.h
+++ b/libstdc++-v3/testsuite/util/testsuite_allocator.h
@@ -297,6 +297,7 @@ namespace __gnu_test
 
 #if __cplusplus >= 201103L
   typedef std::true_type   propagate_on_container_swap;
+  typedef std::false_type  is_always_equal;
 #endif
 
   template


Re: Compile-time improvement for if conversion.

2016-10-10 Thread Richard Biener
On Mon, Oct 10, 2016 at 1:42 PM, Yuri Rumyantsev  wrote:
> Thanks Richard for your comments.
> I'd like to respond to your last comment about using split_edge()
> instead of creating a fake post-header. I started with this splitting,
> but it requires fixing up closed SSA form by creating additional phi
> nodes, so I decided to change only the CFG without updating SSA form.
> The other suggested changes look reasonable and I will make them.

Ah.  In this case can you investigate what it takes to make the entry/exit
edges rather than BBs?  That is, introduce those "fakes" only internally
in dominance.c?

> 2016-10-10 12:52 GMT+03:00 Richard Biener :
>> On Wed, Oct 5, 2016 at 3:22 PM, Yuri Rumyantsev  wrote:
>>> Hi All,
>>>
>>> Here is implementation of Richard proposal:
>>>
>>> < For general infrastructure it would be nice to expose a (post-)dominator
>>> < compute for MESE (post-dominators) / SEME (dominators) regions.  I believe
>>> < what makes if-conversion expensive is the post-dom compute which happens
>>> < for each loop for the whole function.  It shouldn't be very difficult
>>> < to write this,
>>> < sharing as much as possible code with the current DOM code might need
>>> < quite some refactoring though.
>>>
>>> I implemented this proposal by adding calculation of dominance info
>>> for SESE regions and incorporate this change to if conversion pass.
>>> SESE region is built by adding loop pre-header and possibly fake
>>> post-header blocks to loop body. Fake post-header is deleted after
>>> predication completion.
>>>
>>> Bootstrapping and regression testing did not show any new failures.
>>>
>>> Is it OK for trunk?
>>
>> It's mostly reasonable but I have a few comments.  First, re-using
>> bb->dom[] for the dominator info is somewhat fragile but indeed
>> a requirement to make the patch reasonably small.  Please,
>> in calculate_dominance_info_for_region, make sure that
>> !dom_info_available_p (dir).
>>
>> You pass loop * everywhere but require ->aux to be set up as
>> an array of BBs forming the region with special BBs at array ends.
>>
>> Please instead pass in a vec which avoids using ->aux
>> and also allows other non-loop-based SESE regions to be used
>> (I couldn't spot anything that relies on this being a loop).
>>
>> Adding a convenience wrapper for loop  * would be of course nice,
>> to cover the special pre/post-header code in tree-if-conv.c.
>>
>> In theory a SESE region is fully specified by its entry and exit _edge_,
>> so you might want to see if it's possible to use such a pair of edges
>> to guard the dfs/idom walks to avoid the need to create fake blocks.
>>
>> Btw, instead of using create_empty_bb, unchecked_make_edge, etc.
>> please use split_edge() of the entry/exit edges.
>>
>> Richard.
>>
>>> ChangeLog:
>>> 2016-10-05  Yuri Rumyantsev  
>>>
>>> * dominance.c : Include cfgloop.h for loop recognition.
>>> (dom_info): Add new functions and add boolean argument to recognize
>>> computation for loop region.
>>> (dom_info::dom_info): New function.
>>> (dom_info::calc_dfs_tree): Add boolean argument IN_REGION to not
>>> handle unvisited blocks.
>>> (dom_info::calc_idoms): Likewise.
>>> (compute_dom_fast_query_in_region): New function.
>>> (calculate_dominance_info): Invoke calc_dfs_tree and calc_idoms with
>>> false argument.
>>> (calculate_dominance_info_for_region): New function.
>>> (free_dominance_info_for_region): Likewise.
>>> (verify_dominators): Invoke calc_dfs_tree and calc_idoms with false
>>> argument.
>>> * dominance.h: Add prototype for introduced functions
>>> calculate_dominance_info_for_region and
>>> free_dominance_info_for_region.
>>> tree-if-conv.c: Add to local variables ifc_sese_bbs & fake_postheader.
>>> (build_sese_region): New function.
>>> (if_convertible_loop_p_1): Invoke local version of post-dominators
>>> calculation, free it after basic block predication and delete created
>>> fake post-header block if any.
>>> (tree_if_conversion): Delete call of free_dominance_info for
>>> post-dominators, free ifc_sese_bbs which represents SESE region.
>>> (pass_if_conversion::execute): Delete detection of infinite loops
>>> and fake edges to exit block since post-dominator calculation is
>>> performed per if-converted loop only.


[PATCH] Add noexcept to enable_shared_from_this::weak_from_this

2016-10-10 Thread Jonathan Wakely

I missed out the "noexcept" on these new functions.
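
For reference, a small usage sketch (C++17) of what the added exception-specification guarantees to callers. The type Session and the helper function are hypothetical names, not taken from the testsuite.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical type used only for this illustration.
struct Session : std::enable_shared_from_this<Session> { };

// Both overloads are expected to be non-throwing.
static_assert(noexcept(std::declval<Session&>().weak_from_this()),
              "non-const overload is noexcept");
static_assert(noexcept(std::declval<const Session&>().weak_from_this()),
              "const overload is noexcept");

// Returns true if the weak handle observes the owner while it is alive
// and reports expiry once the last shared_ptr is gone.
inline bool weak_handle_tracks_owner()
{
  auto owner = std::make_shared<Session>();
  std::weak_ptr<Session> weak = owner->weak_from_this();
  const bool alive = (weak.lock() == owner);
  owner.reset();
  return alive && weak.expired();
}
```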

* include/bits/shared_ptr.h (enable_shared_from_this::weak_from_this):
Add noexcept.
* include/bits/shared_ptr_base.h
(__enable_shared_from_this::weak_from_this): Likewise.
* testsuite/20_util/enable_shared_from_this/members/weak_from_this.cc:
Test exception-specification of weak_from_this.

Tested powerpc64le-linux, committing to trunk.

commit 3f386e54098cb01df83a131ed4a8e22c0b0b52bd
Author: Jonathan Wakely 
Date:   Mon Oct 10 11:42:00 2016 +0100

Add noexcept to enable_shared_from_this::weak_from_this

* include/bits/shared_ptr.h (enable_shared_from_this::weak_from_this):
Add noexcept.
* include/bits/shared_ptr_base.h
(__enable_shared_from_this::weak_from_this): Likewise.
* testsuite/20_util/enable_shared_from_this/members/weak_from_this.cc:
Test exception-specification of weak_from_this.

diff --git a/libstdc++-v3/include/bits/shared_ptr.h 
b/libstdc++-v3/include/bits/shared_ptr.h
index b2523b8..cbcb3b3 100644
--- a/libstdc++-v3/include/bits/shared_ptr.h
+++ b/libstdc++-v3/include/bits/shared_ptr.h
@@ -593,11 +593,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #if __cplusplus > 201402L || !defined(__STRICT_ANSI__) // c++1z or gnu++11
 #define __cpp_lib_enable_shared_from_this 201603
   weak_ptr<_Tp>
-  weak_from_this()
+  weak_from_this() noexcept
   { return this->_M_weak_this; }
 
   weak_ptr<const _Tp>
-  weak_from_this() const
+  weak_from_this() const noexcept
   { return this->_M_weak_this; }
 #endif
 
diff --git a/libstdc++-v3/include/bits/shared_ptr_base.h 
b/libstdc++-v3/include/bits/shared_ptr_base.h
index 4ae2668..e8820a1 100644
--- a/libstdc++-v3/include/bits/shared_ptr_base.h
+++ b/libstdc++-v3/include/bits/shared_ptr_base.h
@@ -1562,11 +1562,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 #if __cplusplus > 201402L || !defined(__STRICT_ANSI__) // c++1z or gnu++11
   __weak_ptr<_Tp, _Lp>
-  weak_from_this()
+  weak_from_this() noexcept
   { return this->_M_weak_this; }
 
   __weak_ptr<const _Tp, _Lp>
-  weak_from_this() const
+  weak_from_this() const noexcept
   { return this->_M_weak_this; }
 #endif
 
diff --git 
a/libstdc++-v3/testsuite/20_util/enable_shared_from_this/members/weak_from_this.cc
 
b/libstdc++-v3/testsuite/20_util/enable_shared_from_this/members/weak_from_this.cc
index b5ebb81..9c33396 100644
--- 
a/libstdc++-v3/testsuite/20_util/enable_shared_from_this/members/weak_from_this.cc
+++ 
b/libstdc++-v3/testsuite/20_util/enable_shared_from_this/members/weak_from_this.cc
@@ -26,6 +26,9 @@
 
 struct X : public std::enable_shared_from_this<X> { };
 
+static_assert( noexcept(std::declval().weak_from_this()) );
+static_assert( noexcept(std::declval().weak_from_this()) );
+
 void
 test01()
 {


Re: Compile-time improvement for if conversion.

2016-10-10 Thread Yuri Rumyantsev
Thanks Richard for your comments.
I'd like to respond to your last comment about using split_edge()
instead of creating a fake post-header. I started with this splitting,
but it requires fixing up closed SSA form by creating additional phi
nodes, so I decided to change only the CFG without updating SSA form.
The other suggested changes look reasonable and I will make them.
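
For readers following the thread, here is a rough sketch of what computing post-dominators for a single region (rather than the whole function) amounts to. It is a simplified iterative data-flow version, not the algorithm in dominance.c, and every name in it is hypothetical. It assumes the region's blocks are numbered 0..n-1, that every block reaches the exit block, and that the region has at most 64 blocks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// succs[b] lists the successors of block b that lie inside the region.
// Returns, for each block, its immediate post-dominator (-1 for the exit).
inline std::vector<int>
region_immediate_post_dominators(const std::vector<std::vector<int>>& succs,
                                 int exit_block)
{
  const std::size_t n = succs.size();
  assert(n >= 1 && n <= 64);   // one 64-bit mask per block
  const std::uint64_t all = (n == 64) ? ~std::uint64_t{0}
                                      : ((std::uint64_t{1} << n) - 1);
  auto bit = [](std::size_t b) { return std::uint64_t{1} << b; };
  const std::size_t exit_index = static_cast<std::size_t>(exit_block);

  // pdom[b] is the set of blocks that post-dominate b, b included.
  std::vector<std::uint64_t> pdom(n, all);
  pdom[exit_index] = bit(exit_index);
  for (bool changed = true; changed; )
    {
      changed = false;
      for (std::size_t b = 0; b < n; ++b)
        {
          if (b == exit_index)
            continue;
          std::uint64_t meet = all;
          for (int s : succs[b])
            meet &= pdom[static_cast<std::size_t>(s)];
          const std::uint64_t next = meet | bit(b);
          if (next != pdom[b])
            {
              pdom[b] = next;
              changed = true;
            }
        }
    }

  // The immediate post-dominator of b is the strict post-dominator whose
  // own set equals the strict post-dominator set of b.
  std::vector<int> ipdom(n, -1);
  for (std::size_t b = 0; b < n; ++b)
    {
      const std::uint64_t strict = pdom[b] & ~bit(b);
      for (std::size_t c = 0; c < n; ++c)
        if (c != b && (strict & bit(c)) && pdom[c] == strict)
          ipdom[b] = static_cast<int>(c);
    }
  return ipdom;
}
```

The work is proportional to the size of the region, which is the point of restricting the computation to one if-converted loop at a time.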

2016-10-10 12:52 GMT+03:00 Richard Biener :
> On Wed, Oct 5, 2016 at 3:22 PM, Yuri Rumyantsev  wrote:
>> Hi All,
>>
>> Here is implementation of Richard proposal:
>>
>> < For general infrastructure it would be nice to expose a (post-)dominator
>> < compute for MESE (post-dominators) / SEME (dominators) regions.  I believe
>> < what makes if-conversion expensive is the post-dom compute which happens
>> < for each loop for the whole function.  It shouldn't be very difficult
>> < to write this,
>> < sharing as much as possible code with the current DOM code might need
>> < quite some refactoring though.
>>
>> I implemented this proposal by adding calculation of dominance info
>> for SESE regions and incorporate this change to if conversion pass.
>> SESE region is built by adding loop pre-header and possibly fake
>> post-header blocks to loop body. Fake post-header is deleted after
>> predication completion.
>>
>> Bootstrapping and regression testing did not show any new failures.
>>
>> Is it OK for trunk?
>
> It's mostly reasonable but I have a few comments.  First, re-using
> bb->dom[] for the dominator info is somewhat fragile but indeed
> a requirement to make the patch reasonably small.  Please,
> in calculate_dominance_info_for_region, make sure that
> !dom_info_available_p (dir).
>
> You pass loop * everywhere but require ->aux to be set up as
> an array of BBs forming the region with special BBs at array ends.
>
> Please instead pass in a vec which avoids using ->aux
> and also allows other non-loop-based SESE regions to be used
> (I couldn't spot anything that relies on this being a loop).
>
> Adding a convenience wrapper for loop  * would be of course nice,
> to cover the special pre/post-header code in tree-if-conv.c.
>
> In theory a SESE region is fully specified by its entry and exit _edge_,
> so you might want to see if it's possible to use such a pair of edges
> to guard the dfs/idom walks to avoid the need to create fake blocks.
>
> Btw, instead of using create_empty_bb, unchecked_make_edge, etc.
> please use split_edge() of the entry/exit edges.
>
> Richard.
>
>> ChangeLog:
>> 2016-10-05  Yuri Rumyantsev  
>>
>> * dominance.c : Include cfgloop.h for loop recognition.
>> (dom_info): Add new functions and add boolean argument to recognize
>> computation for loop region.
>> (dom_info::dom_info): New function.
>> (dom_info::calc_dfs_tree): Add boolean argument IN_REGION to not
>> handle unvisited blocks.
>> (dom_info::calc_idoms): Likewise.
>> (compute_dom_fast_query_in_region): New function.
>> (calculate_dominance_info): Invoke calc_dfs_tree and calc_idoms with
>> false argument.
>> (calculate_dominance_info_for_region): New function.
>> (free_dominance_info_for_region): Likewise.
>> (verify_dominators): Invoke calc_dfs_tree and calc_idoms with false
>> argument.
>> * dominance.h: Add prototype for introduced functions
>> calculate_dominance_info_for_region and
>> free_dominance_info_for_region.
>> tree-if-conv.c: Add to local variables ifc_sese_bbs & fake_postheader.
>> (build_sese_region): New function.
>> (if_convertible_loop_p_1): Invoke local version of post-dominators
>> calculation, free it after basic block predication and delete created
>> fake post-header block if any.
>> (tree_if_conversion): Delete call of free_dominance_info for
>> post-dominators, free ifc_sese_bbs which represents SESE region.
>> (pass_if_conversion::execute): Delete detection of infinite loops
>> and fake edges to exit block since post-dominator calculation is
>> performed per if-converted loop only.


RE: [PATCH] [ARC] Disable compact casesi patterns for arcv2

2016-10-10 Thread Claudiu Zissulescu

> > gcc/
> > 2016-05-09  Claudiu Zissulescu  
> >
> > * common/config/arc/arc-common.c
> (arc_option_optimization_table):
> > Remove compact casesi option.
> > * config/arc/arc.c (arc_override_options): Use compact casesi
> > option only for pre-ARCv2 cores.
> > * doc/invoke.texi (mcompact-casesi): Update text.
> 
> Looks good to me.
> 

Committed r240916.

Thank you for your review,
Claudiu


Re: [PATCH 2/3] Fold __builtin_memchr (version 2)

2016-10-10 Thread Martin Liška
On 10/10/2016 01:28 PM, Wilco Dijkstra wrote:
> Martin Liška  wrote:
>> On 10/07/2016 01:21 PM, Wilco Dijkstra wrote:
>>
>>> I believe target_char_cast is incorrect if the host/target chars are not
>>> identical (depending on how constant strings are created there may be
>>> signed/unsigned mismatches too). I recently added target_char_cst_p to
>>> gimple-fold.c to avoid char representation mismatches, so it would be
>>> better to use that instead.
>>
>> Thank you for the predicate, I'm going to use it.
>>
>> I have one additional question whether also c_getstr should be guarded
>> with a similar guard? Or is it always safe to grab a char* by
>> TREE_STRING_POINTER and use it by a host string functions (strcmp, ...)?
> 
> Yes I guess that one is incorrect too. I can't find the internal
> implementation of tree strings, but it may well be that GCC just doesn't
> support any mismatches in host/target character size. In any case an
> explicit check won't do any harm as it isn't possible to use host string
> functions if there is a mismatch in character size.

I will dig into this. I'll build a cross-compiler that has a different
character size.

> 
> Another thing, what happens with:
> 
> memchr ("abc", 225, 10);
> 
> It seems your new code will call memchr with the given size (and
> potentially crash) rather than report the obvious bug and set a
> consistent return value that doesn't rely on reading random memory on
> the host.

I asked Jakub about that on IRC already:

<marxin> Hi. Just thinking whether we should fold a case like
         __builtin_memchr ("a", 'x', 2), which is ubsan?
<jakub> marxin: what do you mean by that?  That is NULL, without undefined
        behavior
<marxin> jakub: sry, s/2/3
<jakub> marxin: don't fold that in that case
<marxin> jakub: good, I thought that

It's an opportunity for a warning. I talked to Martin Sebor, and he's aware
of it as a possible improvement to the sprintf warnings he's currently
working on.

Martin

> 
> Wilco
> 
> 
> 



Re: [PATCH 2/3] Fold __builtin_memchr (version 2)

2016-10-10 Thread Wilco Dijkstra
Martin Liška  wrote:
> On 10/07/2016 01:21 PM, Wilco Dijkstra wrote:
>
> > I believe target_char_cast is incorrect if the host/target chars are not
> > identical (depending on how constant strings are created there may be
> > signed/unsigned mismatches too). I recently added target_char_cst_p to
> > gimple-fold.c to avoid char representation mismatches, so it would be
> > better to use that instead.
>
> Thank you for the predicate, I'm going to use it.
>
> I have one additional question whether also c_getstr should be guarded
> with a similar guard? Or is it always safe to grab a char* by
> TREE_STRING_POINTER and use it by a host string functions (strcmp, ...)?

Yes I guess that one is incorrect too. I can't find the internal
implementation of tree strings, but it may well be that GCC just doesn't
support any mismatches in host/target character size. In any case an
explicit check won't do any harm as it isn't possible to use host string
functions if there is a mismatch in character size.

Another thing, what happens with:

memchr ("abc", 225, 10);

It seems your new code will call memchr with the given size (and
potentially crash) rather than report the obvious bug and set a
consistent return value that doesn't rely on reading random memory on
the host.
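
To make the concern concrete, here is a sketch of the kind of guard being asked for. It is a stand-alone model, not GCC's folding code: fold_memchr, FoldKind and FoldResult are hypothetical names, and the rule it encodes (fold only when the length stays within the string object, including its terminating NUL) is an assumption drawn from this discussion.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>

enum class FoldKind { NotFolded, Null, Found };

struct FoldResult
{
  FoldKind kind;
  std::size_t offset;   // meaningful only when kind == FoldKind::Found
};

// Model of folding memchr (literal, ch, len) at compile time.
inline FoldResult fold_memchr(std::string_view literal, int ch, std::size_t len)
{
  // The string object is the literal's characters plus the terminating NUL.
  const std::size_t object_size = literal.size() + 1;
  if (len > object_size)
    return {FoldKind::NotFolded, 0};   // would read past the object

  const unsigned char needle = static_cast<unsigned char>(ch);
  const std::size_t in_chars = std::min(len, literal.size());
  const void* hit = std::memchr(literal.data(), needle, in_chars);
  if (hit != nullptr)
    {
      const char* p = static_cast<const char*>(hit);
      return {FoldKind::Found, static_cast<std::size_t>(p - literal.data())};
    }
  // The terminating NUL is searchable only when len covers it.
  if (needle == 0 && len == object_size)
    return {FoldKind::Found, literal.size()};
  return {FoldKind::Null, 0};
}
```

Under this model memchr ("abc", 225, 10) is left alone instead of being evaluated on the host, while a search that stays within the object still folds.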

Wilco





Re: [PATCH] Implement C++17 node extraction and insertion (P0083R5)

2016-10-10 Thread Jonathan Wakely

On 21/09/16 14:48 +0100, Jonathan Wakely wrote:

This implements container node extraction/insertion, and merging. The
patch includes Debug Mode support and pretty printers for the node
handles.
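
As a reminder of what the feature looks like from the user's side, here is a short sketch using the standard C++17 interface (extract, insert of a node handle, and merge). The helper names rekey and merged are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

using Table = std::map<int, std::string>;

// Change the key of one element without copying or reallocating its value.
inline Table rekey(Table m, int old_key, int new_key)
{
  auto node = m.extract(old_key);       // node handle; empty if key is absent
  if (!node.empty())
    {
      node.key() = new_key;             // the key is modifiable in the handle
      m.insert(std::move(node));
    }
  return m;
}

// Splice the nodes of src into dst; elements whose keys already exist in
// dst stay behind in src.
inline Table merged(Table dst, Table src)
{
  dst.merge(src);
  return dst;
}
```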

Most of the changes are fairly straightforward, with two things worth
pointing out.

There's a FIXME in bits/hashtable.h due to an exception-safety issue.
If the hash function or equality predicate throws then the node is
destroyed and deallocated. It would be better to leave it unchanged in
the node_handle argument.

I didn't want to make all map and multimap specializations friends of
each other (and similarly for all sets and multisets, and again for
the unordered ones). That would make it too easy to accidentally
access the internals of one map specialization from another. So I defined the
_Rb_tree_merge_helper and _Hash_merge_helper class templates to
mediate access, so that any access to a "foreign" container type must
be done through that type, and only certain internals can be obtained.
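
For readers who have not used the feature yet, a small sketch of the user-facing side, using only the standard C++17 members extract, insert and merge. The helper names `rename_key` and `merge_into` are made up for illustration and are not part of the patch.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Move the element with key `from` to key `to` without reallocating the node.
// Returns false and leaves the map unchanged if `from` is absent or `to`
// is already present.
bool
rename_key (std::map<int, std::string> &m, int from, int to)
{
  if (m.count (to) != 0)
    return false;
  auto node = m.extract (from);  // Empty node handle if `from` is absent.
  if (node.empty ())
    return false;
  node.key () = to;              // The handle exposes a non-const key.
  m.insert (std::move (node));
  return true;
}

// Splice every element of `src` whose key is not already in `dst`.
// The two maps differ in their comparison object, which merge allows;
// this is the "foreign" container case the merge helpers mediate.
void
merge_into (std::map<int, std::string> &dst,
            std::map<int, std::string, std::greater<int>> &src)
{
  dst.merge (src);
}
```

Elements whose keys collide stay behind in the source map, so merge never destroys a node.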


I forgot to mention another thing worth calling out.

I spoke to Richi and Michael Matz at the Cauldron and they assured me
that the middle end won't do any optimizations that would cause the
"magic happens here" part to do the wrong thing. Specifically, the
node handle for maps does this to get a non-const pointer to the
key_type in the pair:

 auto& __key = const_cast<_Key&>(__ptr->_M_valptr()->first);
 _M_pkey = _S_pointer_to(__key);
 _M_pmapped = _S_pointer_to(__ptr->_M_valptr()->second);

Where _S_pointer_to() is:

 template<typename _Tp>
   using __pointer = __ptr_rebind;

 template<typename _Tp>
   __pointer<_Tp>
   _S_pointer_to(_Tp& __obj)
   { return pointer_traits<__pointer<_Tp>>::pointer_to(__obj); }

The potentially worrying part of this is the const_cast, but as we
know that the pair is inside a non-const node allocated on the heap,
it will never be in read-only memory or actually non-modifiable. Once
we have std::launder we could consider using that, i.e.

 _M_pkey = _S_pointer_to(std::launder(__key));



Re: [PATCH][store merging][RFA] Re-implement merging code

2016-10-10 Thread Richard Biener
On Mon, 10 Oct 2016, Richard Biener wrote:

> On Mon, 10 Oct 2016, Kyrill Tkachov wrote:
> 
> > 
> > On 10/10/16 11:22, Richard Biener wrote:
> > > On Mon, 10 Oct 2016, Kyrill Tkachov wrote:
> > > 
> > > > Hi Richard,
> > > > 
> > > > As I mentioned, here is the patch applying to the main store merging 
> > > > patch
> > > > to
> > > > re-implement encode_tree_to_bitpos
> > > > to operate on the bytes directly.
> > > > 
> > > > This works fine on little-endian but breaks on big-endian, even for
> > > > merging
> > > > bitfields within a single byte.
> > > > Consider the code snippet from gcc.dg/store_merging_6.c:
> > > > 
> > > > struct bar {
> > > >int a : 3;
> > > >unsigned char b : 4;
> > > >unsigned char c : 1;
> > > >char d;
> > > >char e;
> > > >char f;
> > > >char g;
> > > > };
> > > > 
> > > > void
> > > > foo1 (struct bar *p)
> > > > {
> > > >p->b = 3;
> > > >p->a = 2;
> > > >p->c = 1;
> > > >p->d = 4;
> > > >p->e = 5;
> > > > }
> > > > 
> > > > The correct GIMPLE for these merged stores on big-endian is:
> > > >MEM[(voidD.49 *)p_2(D)] = 18180;
> > > >MEM[(charD.8 *)p_2(D) + 2B] = 5;
> > > > 
> > > > whereas with this patch we emit:
> > > >MEM[(voidD.49 *)p_2(D)] = 39428;
> > > >MEM[(charD.8 *)p_2(D) + 2B] = 5;
> > > > 
> > > > The dump for merging the individual stores without this patch (using the
> > > > correct but costly wide_int approach in the base patch) is:
> > > > After writing 3 of size 4 at position 3 the merged region contains:
> > > > 6 0 0 0 0 0
> > > > After writing 2 of size 3 at position 0 the merged region contains:
> > > > 46 0 0 0 0 0
> > > > After writing 1 of size 1 at position 7 the merged region contains:
> > > > 47 0 0 0 0 0
> > > > After writing 4 of size 8 at position 8 the merged region contains:
> > > > 47 4 0 0 0 0
> > > > After writing 5 of size 8 at position 16 the merged region contains:
> > > > 47 4 5 0 0 0
> > > > 
> > > > 
> > > > And with this patch it is:
> > > > After writing 3 of size 4 at position 3 the merged region contains:
> > > > 18 0 0 0 0 0
> > > > After writing 2 of size 3 at position 0 the merged region contains:
> > > > 1a 0 0 0 0 0
> > > > After writing 1 of size 1 at position 7 the merged region contains:
> > > > 9a 0 0 0 0 0
> > > > After writing 4 of size 8 at position 8 the merged region contains:
> > > > 9a 4 0 0 0 0
> > > > After writing 5 of size 8 at position 16 the merged region contains:
> > > > 9a 4 5 0 0 0
> > > > 
> > > > (Note the dump just dumps the byte array from index 0 to  so the
> > > > first
> > > > thing printed is the lowest numbered byte.
> > > > Also, each byte is dumped in hex.)
> > > > 
> > > > The code as included here doesn't do any byte swapping for big-endian 
> > > > but
> > > > as
> > > > seen from the dump even writing a sub-byte
> > > > bitfield goes wrong so it would be nice to resolve that before going
> > > > forward.
> > > > Any help with debugging this is hugely appreciated. I've included an 
> > > > ASCII
> > > > diagram of the steps in the algorithm
> > > > in the patch itself.
> > > Ah, I think you need to account for BITS_BIG_ENDIAN in
> > > shift_bytes_in_array.  You have to shift towards MSB which means changing
> > > left to right shifts for BITS_BIG_ENDIAN.
> > 
> > Thanks, I'll try it out. But this is on aarch64 where
> > BITS_BIG_ENDIAN is 0 even when BYTES_BIG_ENDIAN is 1
> > so there's something else bad here.
> 
> Maybe I'm confusing all the macros, so maybe it's BYTES_BIG_ENDIAN
> (vs. WORDS_BIG_ENDIAN -- in theory this approach should work for
> pdp11 as well).

Or maybe I'm confusing how get_inner_reference numbers "bits" when
it returns bitpos... (and how a multi-byte value in target memory
representation has to be "shifted" by bitpos).
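
To check my understanding of the numbering, here is a small sketch (hypothetical helpers, not code from the patch, assuming 8-bit bytes) that packs the three sub-byte stores from foo1 into one byte under both conventions. Numbering bits from the most significant end reproduces the 6, 46, 47 sequence of the wide_int dump, while numbering from the least significant end reproduces the 18, 1a, 9a sequence of the new code, which suggests the position is being interpreted the wrong way round on big-endian.

```cpp
#include <cstdint>

// Hypothetical illustration.  Place `val` (of `size` bits) at bit offset
// `pos` inside one byte, under two bit-numbering schemes.

// Bit position 0 is the least significant bit of the byte.
std::uint8_t
pack_lsb_first (std::uint8_t byte, unsigned val, unsigned size, unsigned pos)
{
  unsigned mask = (1u << size) - 1;
  return static_cast<std::uint8_t> (byte | ((val & mask) << pos));
}

// Bit position 0 is the most significant bit of the byte.
std::uint8_t
pack_msb_first (std::uint8_t byte, unsigned val, unsigned size, unsigned pos)
{
  unsigned mask = (1u << size) - 1;
  return static_cast<std::uint8_t> (byte | ((val & mask) << (8 - pos - size)));
}

// The three sub-byte stores from foo1, in program order.
std::uint8_t
first_byte (bool msb_first)
{
  auto pack = msb_first ? pack_msb_first : pack_lsb_first;
  std::uint8_t byte = 0;
  byte = pack (byte, 3, 4, 3);  // p->b = 3, size 4 at position 3
  byte = pack (byte, 2, 3, 0);  // p->a = 2, size 3 at position 0
  byte = pack (byte, 1, 1, 7);  // p->c = 1, size 1 at position 7
  return byte;
}
```

Combining the first byte with the following byte 4 as a big-endian 16-bit value gives 0x4704 = 18180 in the first case and 0x9a04 = 39428 in the second, matching the two GIMPLE constants quoted below.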

I really thought BITS_BIG_ENDIAN is the only thing that matters...

Btw, I reproduced on ppc64-linux (which has BITS_BIG_ENDIAN).

Richard.

> Richard.
> 
> > > You also seem to miss to account for amnt / BITS_PER_UNIT != 0.
> > > Independently of BYTES_BIG_ENDIAN it would be
> > > 
> > >ptr[i + (amnt / BITS_PER_UNIT)] = ptr[i] << amnt;
> > > ...
> > 
> > doh, yes. I'll fix that.
> > 
> > > (so best use a single load / store and operate on a temporary).
> > 
> > Thanks,
> > Kyrill
> > 
> > > Richard.
> > > 
> > > > Thanks,
> > > > Kyrill
> > > > 
> > 
> > 
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: [PATCH][store merging][RFA] Re-implement merging code

2016-10-10 Thread Richard Biener
On Mon, 10 Oct 2016, Kyrill Tkachov wrote:

> 
> On 10/10/16 11:22, Richard Biener wrote:
> > On Mon, 10 Oct 2016, Kyrill Tkachov wrote:
> > 
> > > Hi Richard,
> > > 
> > > As I mentioned, here is the patch applying to the main store merging patch
> > > to
> > > re-implement encode_tree_to_bitpos
> > > to operate on the bytes directly.
> > > 
> > > This works fine on little-endian but breaks on big-endian, even for
> > > merging
> > > bitfields within a single byte.
> > > Consider the code snippet from gcc.dg/store_merging_6.c:
> > > 
> > > struct bar {
> > >int a : 3;
> > >unsigned char b : 4;
> > >unsigned char c : 1;
> > >char d;
> > >char e;
> > >char f;
> > >char g;
> > > };
> > > 
> > > void
> > > foo1 (struct bar *p)
> > > {
> > >p->b = 3;
> > >p->a = 2;
> > >p->c = 1;
> > >p->d = 4;
> > >p->e = 5;
> > > }
> > > 
> > > The correct GIMPLE for these merged stores on big-endian is:
> > >MEM[(voidD.49 *)p_2(D)] = 18180;
> > >MEM[(charD.8 *)p_2(D) + 2B] = 5;
> > > 
> > > whereas with this patch we emit:
> > >MEM[(voidD.49 *)p_2(D)] = 39428;
> > >MEM[(charD.8 *)p_2(D) + 2B] = 5;
> > > 
> > > The dump for merging the individual stores without this patch (using the
> > > correct but costly wide_int approach in the base patch) is:
> > > After writing 3 of size 4 at position 3 the merged region contains:
> > > 6 0 0 0 0 0
> > > After writing 2 of size 3 at position 0 the merged region contains:
> > > 46 0 0 0 0 0
> > > After writing 1 of size 1 at position 7 the merged region contains:
> > > 47 0 0 0 0 0
> > > After writing 4 of size 8 at position 8 the merged region contains:
> > > 47 4 0 0 0 0
> > > After writing 5 of size 8 at position 16 the merged region contains:
> > > 47 4 5 0 0 0
> > > 
> > > 
> > > And with this patch it is:
> > > After writing 3 of size 4 at position 3 the merged region contains:
> > > 18 0 0 0 0 0
> > > After writing 2 of size 3 at position 0 the merged region contains:
> > > 1a 0 0 0 0 0
> > > After writing 1 of size 1 at position 7 the merged region contains:
> > > 9a 0 0 0 0 0
> > > After writing 4 of size 8 at position 8 the merged region contains:
> > > 9a 4 0 0 0 0
> > > After writing 5 of size 8 at position 16 the merged region contains:
> > > 9a 4 5 0 0 0
> > > 
> > > (Note the dump just dumps the byte array from index 0 to  so the
> > > first
> > > thing printed is the lowest numbered byte.
> > > Also, each byte is dumped in hex.)
> > > 
> > > The code as included here doesn't do any byte swapping for big-endian but
> > > as
> > > seen from the dump even writing a sub-byte
> > > bitfield goes wrong so it would be nice to resolve that before going
> > > forward.
> > > Any help with debugging this is hugely appreciated. I've included an ASCII
> > > diagram of the steps in the algorithm
> > > in the patch itself.
> > Ah, I think you need to account for BITS_BIG_ENDIAN in
> > shift_bytes_in_array.  You have to shift towards MSB which means changing
> > left to right shifts for BITS_BIG_ENDIAN.
> 
> Thanks, I'll try it out. But this is on aarch64 where
> BITS_BIG_ENDIAN is 0 even when BYTES_BIG_ENDIAN is 1
> so there's something else bad here.

Maybe I'm confusing all the macros, so maybe it's BYTES_BIG_ENDIAN
(vs. WORDS_BIG_ENDIAN -- in theory this approach should work for
pdp11 as well).

Richard.

> > You also seem to miss to account for amnt / BITS_PER_UNIT != 0.
> > Independently of BYTES_BIG_ENDIAN it would be
> > 
> >ptr[i + (amnt / BITS_PER_UNIT)] = ptr[i] << amnt;
> > ...
> 
> doh, yes. I'll fix that.
> 
> > (so best use a single load / store and operate on a temporary).
> 
> Thanks,
> Kyrill
> 
> > Richard.
> > 
> > > Thanks,
> > > Kyrill
> > > 
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: [PATCH][store merging][RFA] Re-implement merging code

2016-10-10 Thread Kyrill Tkachov


On 10/10/16 12:06, Kyrill Tkachov wrote:


On 10/10/16 11:22, Richard Biener wrote:

On Mon, 10 Oct 2016, Kyrill Tkachov wrote:


Hi Richard,

As I mentioned, here is the patch applying to the main store merging patch to
re-implement encode_tree_to_bitpos
to operate on the bytes directly.

This works fine on little-endian but breaks on big-endian, even for merging
bitfields within a single byte.
Consider the code snippet from gcc.dg/store_merging_6.c:

struct bar {
   int a : 3;
   unsigned char b : 4;
   unsigned char c : 1;
   char d;
   char e;
   char f;
   char g;
};

void
foo1 (struct bar *p)
{
   p->b = 3;
   p->a = 2;
   p->c = 1;
   p->d = 4;
   p->e = 5;
}

The correct GIMPLE for these merged stores on big-endian is:
   MEM[(voidD.49 *)p_2(D)] = 18180;
   MEM[(charD.8 *)p_2(D) + 2B] = 5;

whereas with this patch we emit:
   MEM[(voidD.49 *)p_2(D)] = 39428;
   MEM[(charD.8 *)p_2(D) + 2B] = 5;

The dump for merging the individual stores without this patch (using the
correct but costly wide_int approach in the base patch) is:
After writing 3 of size 4 at position 3 the merged region contains:
6 0 0 0 0 0
After writing 2 of size 3 at position 0 the merged region contains:
46 0 0 0 0 0
After writing 1 of size 1 at position 7 the merged region contains:
47 0 0 0 0 0
After writing 4 of size 8 at position 8 the merged region contains:
47 4 0 0 0 0
After writing 5 of size 8 at position 16 the merged region contains:
47 4 5 0 0 0


And with this patch it is:
After writing 3 of size 4 at position 3 the merged region contains:
18 0 0 0 0 0
After writing 2 of size 3 at position 0 the merged region contains:
1a 0 0 0 0 0
After writing 1 of size 1 at position 7 the merged region contains:
9a 0 0 0 0 0
After writing 4 of size 8 at position 8 the merged region contains:
9a 4 0 0 0 0
After writing 5 of size 8 at position 16 the merged region contains:
9a 4 5 0 0 0

(Note the dump just dumps the byte array from index 0 to  so the first
thing printed is the lowest numbered byte.
Also, each byte is dumped in hex.)

The code as included here doesn't do any byte swapping for big-endian but as
seen from the dump even writing a sub-byte
bitfield goes wrong so it would be nice to resolve that before going forward.
Any help with debugging this is hugely appreciated. I've included an ASCII
diagram of the steps in the algorithm
in the patch itself.

Ah, I think you need to account for BITS_BIG_ENDIAN in
shift_bytes_in_array.  You have to shift towards MSB which means changing
left to right shifts for BITS_BIG_ENDIAN.


Thanks, I'll try it out. But this is on aarch64 where
BITS_BIG_ENDIAN is 0 even when BYTES_BIG_ENDIAN is 1
so there's something else bad here.


You also seem to miss to account for amnt / BITS_PER_UNIT != 0.
Independently of BYTES_BIG_ENDIAN it would be

   ptr[i + (amnt / BITS_PER_UNIT)] = ptr[i] << amnt;
...


doh, yes. I'll fix that.



Scratch that, just read your other reply.
The precondition for that function is that the shift amount is less than 
BITS_PER_UNIT.
I'll clarify that in the comment.
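
For reference, a minimal sketch of a byte-array shift under that precondition. This is not the code from the patch: `shift_bytes_left` is a hypothetical name, it assumes 8-bit bytes, and it carries towards higher array indices, which is only the right direction for a little-endian layout.

```cpp
#include <cstddef>

// A sketch, not the patch's implementation: shift the `sz`-byte array at
// `ptr` left by `amnt` bits, carrying the bits that fall off the top of
// byte i into the bottom of byte i + 1.  Requires amnt < 8.
void
shift_bytes_left (unsigned char *ptr, std::size_t sz, unsigned amnt)
{
  if (amnt == 0)
    return;
  unsigned char carry = 0;
  for (std::size_t i = 0; i < sz; i++)
    {
      // Single load of the original byte, then operate on a temporary.
      unsigned cur = ptr[i];
      ptr[i] = static_cast<unsigned char> ((cur << amnt) | carry);
      carry = static_cast<unsigned char> (cur >> (8 - amnt));
    }
}
```

Bits carried out of the last byte are dropped, so the caller has to size the buffer with one spare byte if they matter.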

Kyrill


(so best use a single load / store and operate on a temporary).


Thanks,
Kyrill


Richard.


Thanks,
Kyrill







Re: [PATCH][store merging][RFA] Re-implement merging code

2016-10-10 Thread Kyrill Tkachov


On 10/10/16 11:22, Richard Biener wrote:

On Mon, 10 Oct 2016, Kyrill Tkachov wrote:


Hi Richard,

As I mentioned, here is the patch applying to the main store merging patch to
re-implement encode_tree_to_bitpos
to operate on the bytes directly.

This works fine on little-endian but breaks on big-endian, even for merging
bitfields within a single byte.
Consider the code snippet from gcc.dg/store_merging_6.c:

struct bar {
   int a : 3;
   unsigned char b : 4;
   unsigned char c : 1;
   char d;
   char e;
   char f;
   char g;
};

void
foo1 (struct bar *p)
{
   p->b = 3;
   p->a = 2;
   p->c = 1;
   p->d = 4;
   p->e = 5;
}

The correct GIMPLE for these merged stores on big-endian is:
   MEM[(voidD.49 *)p_2(D)] = 18180;
   MEM[(charD.8 *)p_2(D) + 2B] = 5;

whereas with this patch we emit:
   MEM[(voidD.49 *)p_2(D)] = 39428;
   MEM[(charD.8 *)p_2(D) + 2B] = 5;

The dump for merging the individual stores without this patch (using the
correct but costly wide_int approach in the base patch) is:
After writing 3 of size 4 at position 3 the merged region contains:
6 0 0 0 0 0
After writing 2 of size 3 at position 0 the merged region contains:
46 0 0 0 0 0
After writing 1 of size 1 at position 7 the merged region contains:
47 0 0 0 0 0
After writing 4 of size 8 at position 8 the merged region contains:
47 4 0 0 0 0
After writing 5 of size 8 at position 16 the merged region contains:
47 4 5 0 0 0


And with this patch it is:
After writing 3 of size 4 at position 3 the merged region contains:
18 0 0 0 0 0
After writing 2 of size 3 at position 0 the merged region contains:
1a 0 0 0 0 0
After writing 1 of size 1 at position 7 the merged region contains:
9a 0 0 0 0 0
After writing 4 of size 8 at position 8 the merged region contains:
9a 4 0 0 0 0
After writing 5 of size 8 at position 16 the merged region contains:
9a 4 5 0 0 0

(Note the dump just dumps the byte array from index 0 to  so the first
thing printed is the lowest numbered byte.
Also, each byte is dumped in hex.)

The code as included here doesn't do any byte swapping for big-endian but as
seen from the dump even writing a sub-byte
bitfield goes wrong so it would be nice to resolve that before going forward.
Any help with debugging this is hugely appreciated. I've included an ASCII
diagram of the steps in the algorithm
in the patch itself.

Ah, I think you need to account for BITS_BIG_ENDIAN in
shift_bytes_in_array.  You have to shift towards MSB which means changing
left to right shifts for BITS_BIG_ENDIAN.


Thanks, I'll try it out. But this is on aarch64 where
BITS_BIG_ENDIAN is 0 even when BYTES_BIG_ENDIAN is 1
so there's something else bad here.


You also seem to miss to account for amnt / BITS_PER_UNIT != 0.
Independently of BYTES_BIG_ENDIAN it would be

   ptr[i + (amnt / BITS_PER_UNIT)] = ptr[i] << amnt;
...


doh, yes. I'll fix that.


(so best use a single load / store and operate on a temporary).


Thanks,
Kyrill


Richard.


Thanks,
Kyrill





Re: Fix invalid doloop setup on ia64 (PR target/77738)

2016-10-10 Thread Bernd Schmidt

On 10/10/2016 12:51 PM, Andreas Schwab wrote:

On ia64 the doloop pattern can only work with DImode, so it should
reject any other mode.  Bootstrapped and regtested on ia64-suse-linux.

Andreas.

PR target/77738
* config/ia64/ia64.md ("doloop_end"): Reject if mode of loop
pseudo is not DImode.


Ok. Same issue as on every target that uses doloop.


Bernd



Re: [PATCH 2/3] Fold __builtin_memchr (version 2)

2016-10-10 Thread Martin Liška
On 10/07/2016 01:21 PM, Wilco Dijkstra wrote:
> Hi,
> 
>> -static int
>> +int
>> target_char_cast (tree cst, char *p)
> 
>> +  if (target_char_cast (arg2, ))
>> +return false;
> 
> I believe target_char_cast is incorrect if the host/target chars are not 
> identical
> (depending on how constant strings are created there may be signed/unsigned
> mismatches too). I recently added target_char_cst_p to gimple-fold.c to avoid
> char representation mismatches, so it would be better to use that instead.
> 
> Wilco
> 

Thank you for the predicate, I'm going to use it.

I have one additional question: should c_getstr also be protected by a
similar guard? Or is it always safe to grab a char* via TREE_STRING_POINTER
and pass it to host string functions (strcmp, ...)?

Martin


Re: PING: [PATCH] Be more conservative in early inliner if FDO is enabled

2016-10-10 Thread Yuan, Pengfei
> On Mon, Oct 10, 2016 at 4:23 AM, Yuan, Pengfei  wrote:
> > Hi,
> >
> > What is the decision on this patch?
> > https://gcc.gnu.org/ml/gcc-patches/2016-09/msg01041.html
> 
> Honza approved the patch already.
> 
> Richard.

Do I need to sign a copyright assignment for the patch?
Moreover, I do not have the permission to commit it.

Regards,
Yuan, Pengfei



Fix invalid doloop setup on ia64 (PR target/77738)

2016-10-10 Thread Andreas Schwab
On ia64 the doloop pattern can only work with DImode, so it should
reject any other mode.  Bootstrapped and regtested on ia64-suse-linux.

Andreas.

PR target/77738
* config/ia64/ia64.md ("doloop_end"): Reject if mode of loop
pseudo is not DImode.

---
 gcc/config/ia64/ia64.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/config/ia64/ia64.md b/gcc/config/ia64/ia64.md
index 7bc21fd8ca..afde75aa74 100644
--- a/gcc/config/ia64/ia64.md
+++ b/gcc/config/ia64/ia64.md
@@ -3959,6 +3959,9 @@
(use (match_operand 1 "" ""))]  ; label
   ""
 {
+  if (GET_MODE (operands[0]) != DImode)
+FAIL;
+
   emit_jump_insn (gen_doloop_end_internal (gen_rtx_REG (DImode, AR_LC_REGNUM),
   operands[1]));
   DONE;
-- 
2.10.1

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [patch] Fix GC issue triggered by arithmetic overflow checking

2016-10-10 Thread Richard Biener
On Mon, Oct 10, 2016 at 12:38 PM, Eric Botcazou  wrote:
>> I believe the rule is that you might only depend on the order of objects
>> with respect to their DECL_UID, not the actual value of the DECL_UID.
>> As var-tracking shouldn't look at TYPE_DECLs (?) it's probably a latent
>> var-tracking bug as well.
>
> It presumably doesn't look at TYPE_DECLs, simply the DECL_UID of variables is
> also different so this changes some hashing.

Yes.  But that's not the only source for DECL_UID differences.  Btw,
I see lots of FOR_EACH_HASH_TABLE_ELEMENT in var-tracking.c
but they don't look like their outcome is supposed to be dependent on
element ordering.
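
To illustrate the rule, a small sketch with a hypothetical `decl` struct standing in for a declaration (nothing here is GCC code). Sorting by UID only looks at relative order, so it is unaffected when every UID moves up by one because some extra decl was created earlier. A bucket index derived from the UID value does change, and with it the order in which a walk over the table visits the elements.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a declaration; `uid` plays the role of DECL_UID.
struct decl
{
  unsigned uid;
  char name;
};

// Safe: depends only on the relative order of the UIDs.
std::vector<char>
names_in_uid_order (std::vector<decl> decls)
{
  std::sort (decls.begin (), decls.end (),
             [] (const decl &a, const decl &b) { return a.uid < b.uid; });
  std::vector<char> out;
  for (const decl &d : decls)
    out.push_back (d.name);
  return out;
}

// Fragile: depends on the actual UID value.
std::size_t
bucket_of (const decl &d, std::size_t nbuckets)
{
  return d.uid % nbuckets;
}
```

With four buckets, UIDs 10 and 11 land in buckets 2 and 3, but UIDs 11 and 12 land in buckets 3 and 0, so a walk in bucket order sees the two decls in the opposite order.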

Did you track down where exactly the code-gen difference appeared?

>> I'd prefer the named parameter to be defaulted to false and the few
>> places in the FEs fixed (eventually that name business should be
>> handled like names for nodes like integer_type_node -- I see no
>> reason why build_complex_type should have this special-case at all!
>> That is, why are the named variants in the type hash in the first place?)
>
> I think that the calls in build_common_tree_nodes need to be changed too then:
>
>   complex_integer_type_node = build_complex_type (integer_type_node);
>   complex_float_type_node = build_complex_type (float_type_node);
>   complex_double_type_node = build_complex_type (double_type_node);
>   complex_long_double_type_node = build_complex_type (long_double_type_node);
>
> in addition to:
>
> ./ada/gcc-interface/decl.c: = build_complex_type
> ./ada/gcc-interface/decl.c:  return build_complex_type (nt);
> ./ada/gcc-interface/trans.c:  tree gnu_ctype = build_complex_type
> (gnu_type);
> ./c/c-decl.c: specs->type = build_complex_type (specs->type);
> ./c/c-decl.c: specs->type = build_complex_type (specs->type);
> ./c/c-decl.c: specs->type = build_complex_type (specs->type);
> ./c/c-parser.c:  build_complex_type
> ./c/c-typeck.c: return build_complex_type (subtype);
> ./c-family/c-common.c:  return build_complex_type (inner_type);
> ./c-family/c-lex.c:   type = build_complex_type (type);
> ./cp/decl.c:type = build_complex_type (type);
> ./cp/typeck.c:  return build_type_attribute_variant (build_complex_type
> (subtype),
> ./fortran/trans-types.c:gfc_build_complex_type (tree scalar_type)
> ./fortran/trans-types.c:  type = gfc_build_complex_type (type);
> ./go/go-gcc.cc:
> build_complex_type(TREE_TYPE(real_tree)),
> ./go/go-gcc.cc:  type = build_complex_type(type);
> ./lto/lto-lang.c:   return build_complex_type (inner_type);
>
> Or perhaps *only* the calls in build_common_tree_nodes need to be changed?
>
> It's certainly old code (r29604, September 1999).
>
> --
> Eric Botcazou


Re: [patch] Fix GC issue triggered by arithmetic overflow checking

2016-10-10 Thread Richard Biener
On Mon, Oct 10, 2016 at 12:38 PM, Eric Botcazou  wrote:
>> I believe the rule is that you might only depend on the order of objects
>> with respect to their DECL_UID, not the actual value of the DECL_UID.
>> As var-tracking shouldn't look at TYPE_DECLs (?) it's probably a latent
>> var-tracking bug as well.
>
> It presumably doesn't look at TYPE_DECLs, simply the DECL_UID of variables is
> also different so this changes some hashing.
>
>> I'd prefer the named parameter to be defaulted to false and the few
>> places in the FEs fixed (eventually that name business should be
>> handled like names for nodes like integer_type_node -- I see no
>> reason why build_complex_type should have this special-case at all!
>> That is, why are the named variants in the type hash in the first place?)
>
> I think that the calls in build_common_tree_nodes need to be changed too then:
>
>   complex_integer_type_node = build_complex_type (integer_type_node);
>   complex_float_type_node = build_complex_type (float_type_node);
>   complex_double_type_node = build_complex_type (double_type_node);
>   complex_long_double_type_node = build_complex_type (long_double_type_node);
>
> in addition to:
>
> ./ada/gcc-interface/decl.c: = build_complex_type
> ./ada/gcc-interface/decl.c:  return build_complex_type (nt);
> ./ada/gcc-interface/trans.c:  tree gnu_ctype = build_complex_type
> (gnu_type);
> ./c/c-decl.c: specs->type = build_complex_type (specs->type);
> ./c/c-decl.c: specs->type = build_complex_type (specs->type);
> ./c/c-decl.c: specs->type = build_complex_type (specs->type);
> ./c/c-parser.c:  build_complex_type
> ./c/c-typeck.c: return build_complex_type (subtype);
> ./c-family/c-common.c:  return build_complex_type (inner_type);
> ./c-family/c-lex.c:   type = build_complex_type (type);
> ./cp/decl.c:type = build_complex_type (type);
> ./cp/typeck.c:  return build_type_attribute_variant (build_complex_type
> (subtype),
> ./fortran/trans-types.c:gfc_build_complex_type (tree scalar_type)
> ./fortran/trans-types.c:  type = gfc_build_complex_type (type);
> ./go/go-gcc.cc:
> build_complex_type(TREE_TYPE(real_tree)),
> ./go/go-gcc.cc:  type = build_complex_type(type);
> ./lto/lto-lang.c:   return build_complex_type (inner_type);
>
> Or perhaps *only* the calls in build_common_tree_nodes need to be changed?

I think only the calls in build_common_tree_nodes -- those are the ones
built early and that survive GC.  The patch is ok if it passes testing
with that.

Richard.

> It's certainly old code (r29604, September 1999).
> --
> Eric Botcazou


Re: [AArch64][0/14] ARMv8.2-A FP16 extension support

2016-10-10 Thread James Greenhalgh
On Wed, Oct 05, 2016 at 05:44:08PM +0100, Jiong Wang wrote:
> On 27/09/16 17:03, Jiong Wang wrote:
> >
> > Now as ARM patches have gone in around r240427, I have done a
> quick confirmation
> > on the status of these four pending testsuite patches:
> >
> >   https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00337.html
> >   https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00338.html
> >   https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00339.html
> >   https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00340.html
> >
> > The result is they apply cleanly on gcc trunk, and there is no
> regression on
> > AArch64 native regression test.  Testcases enabled without
> requirement of FP16
> > all passed.
> >
> > I will give a final run on ARM native board and AArch64 emulation
> environment
> > with ARMv8.2-A FP16 enabled. (Have done this before, just in case
> something
> > changed during these days)
> >
> > OK for trunk if there is no regression?
> >
> > Thanks
> 
> Finished the final tests on emulator with FP16 enabled.
> 
>   * No regression on AARCH64, all new testcases passed.
>   * No regression on AARCH32, part of these new testcases UNRESOLVED
> because
> they should be skipped on AARCH32, fixed by the attached trivial patch
> which I will merge into the 4th patch (no effect on changelog).
> 
> OK to commit these patches?

And to be explicit, this is OK too.

Thanks for the tests!

Cheers,
James



Re: [patch] Fix GC issue triggered by arithmetic overflow checking

2016-10-10 Thread Eric Botcazou
> I believe the rule is that you might only depend on the order of objects
> with respect to their DECL_UID, not the actual value of the DECL_UID.
> As var-tracking shouldn't look at TYPE_DECLs (?) it's probably a latent
> var-tracking bug as well.

It presumably doesn't look at TYPE_DECLs, simply the DECL_UID of variables is 
also different so this changes some hashing.

> I'd prefer the named parameter to be defaulted to false and the few
> places in the FEs fixed (eventually that name business should be
> handled like names for nodes like integer_type_node -- I see no
> reason why build_complex_type should have this special-case at all!
> That is, why are the named variants in the type hash in the first place?)

I think that the calls in build_common_tree_nodes need to be changed too then:

  complex_integer_type_node = build_complex_type (integer_type_node);
  complex_float_type_node = build_complex_type (float_type_node);
  complex_double_type_node = build_complex_type (double_type_node);
  complex_long_double_type_node = build_complex_type (long_double_type_node);

in addition to:

./ada/gcc-interface/decl.c: = build_complex_type
./ada/gcc-interface/decl.c:  return build_complex_type (nt);
./ada/gcc-interface/trans.c:  tree gnu_ctype = build_complex_type 
(gnu_type);
./c/c-decl.c: specs->type = build_complex_type (specs->type);
./c/c-decl.c: specs->type = build_complex_type (specs->type);
./c/c-decl.c: specs->type = build_complex_type (specs->type);
./c/c-parser.c:  build_complex_type
./c/c-typeck.c: return build_complex_type (subtype);
./c-family/c-common.c:  return build_complex_type (inner_type);
./c-family/c-lex.c:   type = build_complex_type (type);
./cp/decl.c:type = build_complex_type (type);
./cp/typeck.c:  return build_type_attribute_variant (build_complex_type 
(subtype),
./fortran/trans-types.c:gfc_build_complex_type (tree scalar_type)
./fortran/trans-types.c:  type = gfc_build_complex_type (type);
./go/go-gcc.cc: 
build_complex_type(TREE_TYPE(real_tree)),
./go/go-gcc.cc:  type = build_complex_type(type);
./lto/lto-lang.c:   return build_complex_type (inner_type);

Or perhaps *only* the calls in build_common_tree_nodes need to be changed?

It's certainly old code (r29604, September 1999).

-- 
Eric Botcazou


Re: [PATCH][store merging][RFA] Re-implement merging code

2016-10-10 Thread Richard Biener
On Mon, 10 Oct 2016, Richard Biener wrote:

> On Mon, 10 Oct 2016, Kyrill Tkachov wrote:
> 
> > Hi Richard,
> > 
> > As I mentioned, here is the patch applying to the main store merging patch 
> > to
> > re-implement encode_tree_to_bitpos
> > to operate on the bytes directly.
> > 
> > This works fine on little-endian but breaks on big-endian, even for merging
> > bitfields within a single byte.
> > Consider the code snippet from gcc.dg/store_merging_6.c:
> > 
> > struct bar {
> >   int a : 3;
> >   unsigned char b : 4;
> >   unsigned char c : 1;
> >   char d;
> >   char e;
> >   char f;
> >   char g;
> > };
> > 
> > void
> > foo1 (struct bar *p)
> > {
> >   p->b = 3;
> >   p->a = 2;
> >   p->c = 1;
> >   p->d = 4;
> >   p->e = 5;
> > }
> > 
> > The correct GIMPLE for these merged stores on big-endian is:
> >   MEM[(voidD.49 *)p_2(D)] = 18180;
> >   MEM[(charD.8 *)p_2(D) + 2B] = 5;
> > 
> > whereas with this patch we emit:
> >   MEM[(voidD.49 *)p_2(D)] = 39428;
> >   MEM[(charD.8 *)p_2(D) + 2B] = 5;
> > 
> > The dump for merging the individual stores without this patch (using the
> > correct but costly wide_int approach in the base patch) is:
> > After writing 3 of size 4 at position 3 the merged region contains:
> > 6 0 0 0 0 0
> > After writing 2 of size 3 at position 0 the merged region contains:
> > 46 0 0 0 0 0
> > After writing 1 of size 1 at position 7 the merged region contains:
> > 47 0 0 0 0 0
> > After writing 4 of size 8 at position 8 the merged region contains:
> > 47 4 0 0 0 0
> > After writing 5 of size 8 at position 16 the merged region contains:
> > 47 4 5 0 0 0
> > 
> > 
> > And with this patch it is:
> > After writing 3 of size 4 at position 3 the merged region contains:
> > 18 0 0 0 0 0
> > After writing 2 of size 3 at position 0 the merged region contains:
> > 1a 0 0 0 0 0
> > After writing 1 of size 1 at position 7 the merged region contains:
> > 9a 0 0 0 0 0
> > After writing 4 of size 8 at position 8 the merged region contains:
> > 9a 4 0 0 0 0
> > After writing 5 of size 8 at position 16 the merged region contains:
> > 9a 4 5 0 0 0
> > 
> > (Note the dump just dumps the byte array from index 0 to  so the first
> > thing printed is the lowest numbered byte.
> > Also, each byte is dumped in hex.)
> > 
> > The code as included here doesn't do any byte swapping for big-endian but as
> > seen from the dump even writing a sub-byte
> > bitfield goes wrong so it would be nice to resolve that before going 
> > forward.
> > Any help with debugging this is hugely appreciated. I've included an ASCII
> > diagram of the steps in the algorithm
> > in the patch itself.
> 
> Ah, I think you need to account for BITS_BIG_ENDIAN in 
> shift_bytes_in_array.  You have to shift towards MSB which means changing
> left to right shifts for BITS_BIG_ENDIAN.
> 
> You also seem to miss to account for amnt / BITS_PER_UNIT != 0.
> Independently of BYTES_BIG_ENDIAN it would be

Ok, that would matter only if you'd merge shift_bytes_in_array,
clear_bit_region and the |-ring of that into the final buffer
(which should be possible).

Richard.


Re: [PATCH][store merging][RFA] Re-implement merging code

2016-10-10 Thread Richard Biener
On Mon, 10 Oct 2016, Kyrill Tkachov wrote:

> Hi Richard,
> 
> As I mentioned, here is the patch applying to the main store merging patch to
> re-implement encode_tree_to_bitpos
> to operate on the bytes directly.
> 
> This works fine on little-endian but breaks on big-endian, even for merging
> bitfields within a single byte.
> Consider the code snippet from gcc.dg/store_merging_6.c:
> 
> struct bar {
>   int a : 3;
>   unsigned char b : 4;
>   unsigned char c : 1;
>   char d;
>   char e;
>   char f;
>   char g;
> };
> 
> void
> foo1 (struct bar *p)
> {
>   p->b = 3;
>   p->a = 2;
>   p->c = 1;
>   p->d = 4;
>   p->e = 5;
> }
> 
> The correct GIMPLE for these merged stores on big-endian is:
>   MEM[(voidD.49 *)p_2(D)] = 18180;
>   MEM[(charD.8 *)p_2(D) + 2B] = 5;
> 
> whereas with this patch we emit:
>   MEM[(voidD.49 *)p_2(D)] = 39428;
>   MEM[(charD.8 *)p_2(D) + 2B] = 5;
> 
> The dump for merging the individual stores without this patch (using the
> correct but costly wide_int approach in the base patch) is:
> After writing 3 of size 4 at position 3 the merged region contains:
> 6 0 0 0 0 0
> After writing 2 of size 3 at position 0 the merged region contains:
> 46 0 0 0 0 0
> After writing 1 of size 1 at position 7 the merged region contains:
> 47 0 0 0 0 0
> After writing 4 of size 8 at position 8 the merged region contains:
> 47 4 0 0 0 0
> After writing 5 of size 8 at position 16 the merged region contains:
> 47 4 5 0 0 0
> 
> 
> And with this patch it is:
> After writing 3 of size 4 at position 3 the merged region contains:
> 18 0 0 0 0 0
> After writing 2 of size 3 at position 0 the merged region contains:
> 1a 0 0 0 0 0
> After writing 1 of size 1 at position 7 the merged region contains:
> 9a 0 0 0 0 0
> After writing 4 of size 8 at position 8 the merged region contains:
> 9a 4 0 0 0 0
> After writing 5 of size 8 at position 16 the merged region contains:
> 9a 4 5 0 0 0
> 
> (Note the dump just dumps the byte array from index 0 to  so the first
> thing printed is the lowest numbered byte.
> Also, each byte is dumped in hex.)
> 
> The code as included here doesn't do any byte swapping for big-endian but as
> seen from the dump even writing a sub-byte
> bitfield goes wrong so it would be nice to resolve that before going forward.
> Any help with debugging this is hugely appreciated. I've included an ASCII
> diagram of the steps in the algorithm
> in the patch itself.
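
For reference, the expected values in the first dump can be reproduced with
a small standalone sketch.  It is not part of the patch: the names
write_bits_msb_first and merged_foo1_halfword are hypothetical, and it
assumes that the dump positions count bits from the most significant bit of
byte 0, which is what the big-endian dump suggests.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the patch): store the low SIZE bits of VAL
   into BUF starting at bit position POS, where position 0 is the most
   significant bit of byte 0.  */
void
write_bits_msb_first (uint8_t *buf, unsigned pos, unsigned size, unsigned val)
{
  assert (size >= 1 && size <= 8);
  for (unsigned i = 0; i < size; i++)
    {
      unsigned bit = (val >> (size - 1 - i)) & 1;  /* MSB of VAL first.  */
      unsigned p = pos + i;
      uint8_t mask = (uint8_t) (0x80u >> (p % 8));
      if (bit)
        buf[p / 8] |= mask;
      else
        buf[p / 8] &= (uint8_t) ~mask;
    }
}

/* Replay the five stores of foo1 in the order they appear in the dump and
   return bytes 0 and 1 combined as a 16-bit big-endian value.  */
unsigned
merged_foo1_halfword (void)
{
  uint8_t buf[6] = { 0 };
  write_bits_msb_first (buf, 3, 4, 3);   /* p->b = 3  */
  write_bits_msb_first (buf, 0, 3, 2);   /* p->a = 2  */
  write_bits_msb_first (buf, 7, 1, 1);   /* p->c = 1  */
  write_bits_msb_first (buf, 8, 8, 4);   /* p->d = 4  */
  write_bits_msb_first (buf, 16, 8, 5);  /* p->e = 5  */
  return ((unsigned) buf[0] << 8) | buf[1];
}
```

Under that assumption byte 0 goes 0x06, 0x46, 0x47 as in the wide_int dump,
and the first two bytes read as a halfword give 0x4704 = 18180, the constant
in the correct GIMPLE.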

Ah, I think you need to account for BITS_BIG_ENDIAN in 
shift_bytes_in_array.  You have to shift towards MSB which means changing
left to right shifts for BITS_BIG_ENDIAN.

You also seem to have missed accounting for amnt / BITS_PER_UNIT != 0.
Independently of BYTES_BIG_ENDIAN it would be

  ptr[i + (amnt / BITS_PER_UNIT)] = ptr[i] << amnt;
...

(so best use a single load / store and operate on a temporary).
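
A minimal sketch of the whole-byte part of that remark, under stated
assumptions: shift_bytes_left is a hypothetical name rather than the
function in the patch, CHAR_BIT stands in for BITS_PER_UNIT, and the array
is treated as one little-endian value (index 0 least significant).  It does
not attempt the BITS_BIG_ENDIAN direction change.

```c
#include <assert.h>
#include <limits.h>

/* Shift the SZ-byte value held in PTR (least significant byte at index 0)
   left by AMNT bits, handling AMNT >= CHAR_BIT.  Bits shifted past the
   last byte are dropped.  */
void
shift_bytes_left (unsigned char *ptr, unsigned sz, unsigned amnt)
{
  unsigned byte_shift = amnt / CHAR_BIT;
  unsigned bit_shift = amnt % CHAR_BIT;

  assert (ptr != 0);

  /* Walk from the most significant byte down so that every source byte is
     read before it is overwritten.  */
  for (unsigned i = sz; i-- > 0;)
    {
      unsigned char val = 0;
      if (i >= byte_shift)
        {
          unsigned src = i - byte_shift;
          /* Single load into a temporary, then one store below.  */
          unsigned tmp = (unsigned) ptr[src] << bit_shift;
          if (bit_shift != 0 && src > 0)
            tmp |= ptr[src - 1] >> (CHAR_BIT - bit_shift);
          val = (unsigned char) tmp;
        }
      ptr[i] = val;
    }
}
```

For example, shifting the three bytes { 0x81, 0x01, 0x00 } by 9 moves every
byte up one index and carries the top bit of 0x81 into the next byte, giving
{ 0x00, 0x02, 0x03 }.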

Richard.

> Thanks,
> Kyrill
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


[Ada] Fix type checking failure with pragma Volatile_Full_Access

2016-10-10 Thread Eric Botcazou
The problem is that we put an alias set on a variant that is not the main one.

Tested on x86_64-suse-linux, applied on the mainline.


2016-10-10  Eric Botcazou  

* gcc-interface/decl.c (gnat_to_gnu_entity): Put volatile qualifier
on types at the very end of the processing.
(gnat_to_gnu_param): Remove redundant test.
(change_qualified_type): Do nothing for unconstrained array types.


2016-10-10  Eric Botcazou  

* gnat.dg/specs/vfa.ads: New test.

-- 
Eric Botcazou
Index: gcc-interface/decl.c
===
--- gcc-interface/decl.c	(revision 240890)
+++ gcc-interface/decl.c	(working copy)
@@ -4728,14 +4728,6 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
 	  && AGGREGATE_TYPE_P (gnu_type)
 	  && TYPE_BY_REFERENCE_P (gnu_type))
 	SET_TYPE_MODE (gnu_type, BLKmode);
-
-	  if (Treat_As_Volatile (gnat_entity))
-	{
-	  const int quals
-		= TYPE_QUAL_VOLATILE
-		  | (Is_Atomic_Or_VFA (gnat_entity) ? TYPE_QUAL_ATOMIC : 0);
-	  gnu_type = change_qualified_type (gnu_type, quals);
-	}
 	}
 
   /* If this is a derived type, relate its alias set to that of its parent
@@ -4816,6 +4808,14 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
 			 ? ALIAS_SET_COPY : ALIAS_SET_SUPERSET);
 	}
 
+  if (Treat_As_Volatile (gnat_entity))
+	{
+	  const int quals
+	= TYPE_QUAL_VOLATILE
+	  | (Is_Atomic_Or_VFA (gnat_entity) ? TYPE_QUAL_ATOMIC : 0);
+	  gnu_type = change_qualified_type (gnu_type, quals);
+	}
+
   if (!gnu_decl)
 	gnu_decl = create_type_decl (gnu_entity_name, gnu_type,
  artificial_p, debug_info_p,
@@ -5386,12 +5386,9 @@ gnat_to_gnu_param (Entity_Id gnat_param,
 }
 
   /* If this is a read-only parameter, make a variant of the type that is
- read-only.  ??? However, if this is an unconstrained array, that type
- can be very complex, so skip it for now.  Likewise for any other
- self-referential type.  */
-  if (ro_param
-  && TREE_CODE (gnu_param_type) != UNCONSTRAINED_ARRAY_TYPE
-  && !CONTAINS_PLACEHOLDER_P (TYPE_SIZE (gnu_param_type)))
+ read-only.  ??? However, if this is a self-referential type, the type
+ can be very complex, so skip it for now.  */
+  if (ro_param && !CONTAINS_PLACEHOLDER_P (TYPE_SIZE (gnu_param_type)))
 gnu_param_type = change_qualified_type (gnu_param_type, TYPE_QUAL_CONST);
 
   /* For foreign conventions, pass arrays as pointers to the element type.
@@ -6254,6 +6251,10 @@ gnu_ext_name_for_subprog (Entity_Id gnat
 static tree
 change_qualified_type (tree type, int type_quals)
 {
+  /* Qualifiers must be put on the associated array type.  */
+  if (TREE_CODE (type) == UNCONSTRAINED_ARRAY_TYPE)
+return type;
+
   return build_qualified_type (type, TYPE_QUALS (type) | type_quals);
 }
 
-- { dg-do compile }
-- { dg-options "-g" }

package VFA is

  type Rec is record
A : Short_Integer;
B : Short_Integer;
  end record;

  type Rec_VFA is new Rec;
  pragma Volatile_Full_Access (Rec_VFA);

end VFA;


Re: [AArch64][14/14] ARMv8.2-A testsuite for new scalar intrinsics

2016-10-10 Thread James Greenhalgh
On Thu, Jul 07, 2016 at 05:19:37PM +0100, Jiong Wang wrote:
> This patch contains testcases for those new scalar intrinsics which are only
> available for AArch64.

OK.

Thanks,
James

> gcc/testsuite/
> 2016-07-07  Jiong Wang 
> 
> * gcc.target/aarch64/advsimd-intrinsics/unary_scalar_op.inc:
> Support FMT64.
> * gcc.target/aarch64/advsimd-intrinsics/vabdh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcageh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcagth_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcaleh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcalth_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vceqh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vceqzh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcgeh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcgezh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcgth_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcgtzh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcleh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vclezh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vclth_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcltzh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvtah_s16_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvtah_s64_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvtah_u16_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvtah_u64_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvth_f16_s16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvth_f16_s64_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvth_f16_u16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvth_f16_u64_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvth_n_f16_s16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvth_n_f16_s64_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvth_n_f16_u16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvth_n_f16_u64_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvth_n_s16_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvth_n_s64_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvth_n_u16_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvth_n_u64_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvth_s16_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvth_s64_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvth_u16_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvth_u64_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvtmh_s16_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvtmh_s64_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvtmh_u16_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvtmh_u64_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvtnh_s16_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvtnh_s64_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvtnh_u16_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvtnh_u64_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvtph_s16_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvtph_s64_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvtph_u16_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vcvtph_u64_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vfmash_lane_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vmaxh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vminh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vmulh_lane_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vmulxh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vmulxh_lane_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vrecpeh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vrecpsh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vrecpxh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vrsqrteh_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vrsqrtsh_f16_1.c: New.




Re: [AArch64][13/14] ARMv8.2-A testsuite for new vector intrinsics

2016-10-10 Thread James Greenhalgh
On Thu, Jul 07, 2016 at 05:19:25PM +0100, Jiong Wang wrote:
> This patch contains testcases for those new vector intrinsics which are only
> available for AArch64.


OK.

Thanks,
James

> gcc/testsuite/
> 2016-07-07  Jiong Wang 
> 
> * gcc.target/aarch64/advsimd-intrinsics/vdiv_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vfmas_lane_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vfmas_n_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vmaxnmv_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vmaxv_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vminnmv_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vminv_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vmul_lane_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vmulx_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vmulx_lane_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vmulx_n_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vpminmaxnm_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vrndi_f16_1.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vsqrt_f16_1.c: New.
> 




Re: PING: [PATCH] Be more conservative in early inliner if FDO is enabled

2016-10-10 Thread Richard Biener
On Mon, Oct 10, 2016 at 4:23 AM, Yuan, Pengfei  wrote:
> Hi,
>
> What is the decision on this patch?
> https://gcc.gnu.org/ml/gcc-patches/2016-09/msg01041.html

Honza approved the patch already.

Richard.

> Regards,
> Yuan, Pengfei
>
>> A new patch for trunk is attached.
>>
>> Regards,
>> Yuan, Pengfei
>>
>>
>> 2016-09-16  Yuan Pengfei  
>>
>>   * doc/invoke.texi (--param early-inlining-insns-feedback): New.
>>   * ipa-inline.c (want_early_inline_function_p): Use
>>   PARAM_EARLY_INLINING_INSNS_FEEDBACK when FDO is enabled.
>>   * params.def (PARAM_EARLY_INLINING_INSNS_FEEDBACK): Define.
>>   (PARAM_EARLY_INLINING_INSNS): Change help string accordingly.
>>
>>
>> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>> index 8eb5eff..6e7659a 100644
>> --- a/gcc/doc/invoke.texi
>> +++ b/gcc/doc/invoke.texi
>> @@ -9124,12 +9124,18 @@ given call expression.  This parameter limits 
>> inlining only to call expressions
>>  whose probability exceeds the given threshold (in percents).
>>  The default value is 10.
>>
>>  @item early-inlining-insns
>> +@itemx early-inlining-insns-feedback
>>  Specify growth that the early inliner can make.  In effect it increases
>>  the amount of inlining for code having a large abstraction penalty.
>>  The default value is 14.
>>
>> +The @option{early-inlining-insns-feedback} parameter is used only when
>> +profile feedback-directed optimizations are enabled (by
>> +@option{-fprofile-generate} or @option{-fprofile-use}).
>> +The default value is 2.
>> +
>>  @item max-early-inliner-iterations
>>  Limit of iterations of the early inliner.  This basically bounds
>>  the number of nested indirect calls the early inliner can resolve.
>>  Deeper chains are still handled by late inlining.
>> diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
>> index 5c9366a..e028c08 100644
>> --- a/gcc/ipa-inline.c
>> +++ b/gcc/ipa-inline.c
>> @@ -594,10 +594,17 @@ want_early_inline_function_p (struct cgraph_edge *e)
>>  }
>>else
>>  {
>>int growth = estimate_edge_growth (e);
>> +  int growth_limit;
>>int n;
>>
>> +  if ((profile_arc_flag && !flag_test_coverage)
>> +   || (flag_branch_probabilities && !flag_auto_profile))
>> + growth_limit = PARAM_VALUE (PARAM_EARLY_INLINING_INSNS_FEEDBACK);
>> +  else
>> + growth_limit = PARAM_VALUE (PARAM_EARLY_INLINING_INSNS);
>> +
>>if (growth <= 0)
>>   ;
>>else if (!e->maybe_hot_p ()
>>  && growth > 0)
>> @@ -610,9 +617,9 @@ want_early_inline_function_p (struct cgraph_edge *e)
>>xstrdup_for_dump (callee->name ()), callee->order,
>>growth);
>> want_inline = false;
>>   }
>> -  else if (growth > PARAM_VALUE (PARAM_EARLY_INLINING_INSNS))
>> +  else if (growth > growth_limit)
>>   {
>> if (dump_file)
>>   fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
>>"growth %i exceeds --param early-inlining-insns\n",
>> @@ -622,9 +629,9 @@ want_early_inline_function_p (struct cgraph_edge *e)
>>growth);
>> want_inline = false;
>>   }
>>else if ((n = num_calls (callee)) != 0
>> -&& growth * (n + 1) > PARAM_VALUE (PARAM_EARLY_INLINING_INSNS))
>> +&& growth * (n + 1) > growth_limit)
>>   {
>> if (dump_file)
>>   fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
>>"growth %i exceeds --param early-inlining-insns "
>> diff --git a/gcc/params.def b/gcc/params.def
>> index 79b7dd4..91ea513 100644
>> --- a/gcc/params.def
>> +++ b/gcc/params.def
>> @@ -199,12 +199,20 @@ DEFPARAM(PARAM_INLINE_UNIT_GROWTH,
>>  DEFPARAM(PARAM_IPCP_UNIT_GROWTH,
>>"ipcp-unit-growth",
>>"How much can given compilation unit grow because of the 
>> interprocedural constant propagation (in percent).",
>>10, 0, 0)
>> -DEFPARAM(PARAM_EARLY_INLINING_INSNS,
>> -  "early-inlining-insns",
>> -  "Maximal estimated growth of function body caused by early inlining 
>> of single call.",
>> -  14, 0, 0)
>> +DEFPARAM (PARAM_EARLY_INLINING_INSNS_FEEDBACK,
>> +   "early-inlining-insns-feedback",
>> +   "Maximal estimated growth of function body caused by early "
>> +   "inlining of single call.  Used when profile feedback-directed "
>> +   "optimizations are enabled.",
>> +   2, 0, 0)
>> +DEFPARAM (PARAM_EARLY_INLINING_INSNS,
>> +   "early-inlining-insns",
>> +   "Maximal estimated growth of function body caused by early "
>> +   "inlining of single call.  Used when profile feedback-directed "
>> +   "optimizations are not enabled.",
>> +   14, 0, 0)
>>  DEFPARAM(PARAM_LARGE_STACK_FRAME,
>>"large-stack-frame",
>>"The size of stack frame to be considered large.",
>>256, 0, 0)
>
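
The threshold selection in the ipa-inline.c hunk can be read in isolation as
the sketch below.  The struct and function names are hypothetical; the four
fields only mirror the GCC globals tested in the patch, and the limits are
passed in rather than read through PARAM_VALUE.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the globals tested in the patch.  */
struct fdo_flags
{
  bool profile_arc;           /* profile_arc_flag  */
  bool test_coverage;         /* flag_test_coverage  */
  bool branch_probabilities;  /* flag_branch_probabilities  */
  bool auto_profile;          /* flag_auto_profile  */
};

/* Return the growth limit the early inliner would compare against:
   LIMIT_FEEDBACK when profile feedback is active, LIMIT_DEFAULT otherwise.
   The condition is copied from the quoted hunk.  */
int
early_inline_growth_limit (const struct fdo_flags *f,
                           int limit_default, int limit_feedback)
{
  assert (f != 0);
  if ((f->profile_arc && !f->test_coverage)
      || (f->branch_probabilities && !f->auto_profile))
    return limit_feedback;
  return limit_default;
}
```

With the defaults from params.def (14 and 2), a build with branch
probabilities and no auto-profile would get the limit 2, while the same
flags with auto-profile set fall back to 14.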


Re: [AArch64][12/14] ARMv8.2-A testsuite for new data movement intrinsics

2016-10-10 Thread James Greenhalgh
On Thu, Jul 07, 2016 at 05:19:09PM +0100, Jiong Wang wrote:
> This patch contains testcases for those new scalar intrinsics which are only
> available for AArch64.

OK.

Thanks,
James

> 
> gcc/testsuite/
> 2016-07-07  Jiong Wang 
> 
> * gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h
> (FP16_SUPPORTED):
> Enable AArch64.
> * gcc.target/aarch64/advsimd-intrinsics/vdup_lane.c: Add
> support for
> vdup*_laneq.
> * gcc.target/aarch64/advsimd-intrinsics/vduph_lane.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vtrn_half.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vuzp_half.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vzip_half.c: New.
> 



Re: [PATCH, PR77558] Remove RECORD_TYPE special-casing in std_canonical_va_list_type

2016-10-10 Thread Richard Biener
On Sun, Sep 25, 2016 at 11:08 AM, Tom de Vries  wrote:
> Hi,
>
> this patch fixes PR77558, an ice-on-invalid-code 6/7 regression.
>
> The fix for PR71602 introduced the invalid-code test-case
> c-c++-common/va-arg-va-list-type.c:
> ...
> __builtin_va_list *pap;
>
> void
> fn1 (void)
> {
>   __builtin_va_arg (pap, double); /* { dg-error "first argument to 'va_arg'
> not of type 'va_list'" } */
> }
> ...
>
> The test-case passes for x86_64, but fails for aarch64 and ICEs for arm.
>
> The ICE happens because the patch for PR71602 is incomplete. The patch tries
> to be more strict about returning a canonical va_list only for actual
> va_lists, but doesn't implement this for structure va_list types, such as we
> have for arm, aarch64 and alpha.
>
> This patch adds the missing part, and fixes the ICE.
>
> OK for trunk, 6-branch?

Ok.

Richard.

> Thanks,
> - Tom


Re: Compile-time improvement for if conversion.

2016-10-10 Thread Richard Biener
On Wed, Oct 5, 2016 at 3:22 PM, Yuri Rumyantsev  wrote:
> Hi All,
>
> Here is an implementation of Richard's proposal:
>
> < For general infrastructure it would be nice to expose a (post-)dominator
> < compute for MESE (post-dominators) / SEME (dominators) regions.  I believe
> < what makes if-conversion expensive is the post-dom compute which happens
> < for each loop for the whole function.  It shouldn't be very difficult
> < to write this,
> < sharing as much as possible code with the current DOM code might need
> < quite some refactoring though.
>
> I implemented this proposal by adding calculation of dominance info
> for SESE regions and incorporate this change to if conversion pass.
> SESE region is built by adding loop pre-header and possibly fake
> post-header blocks to loop body. Fake post-header is deleted after
> predication completion.
>
> Bootstrapping and regression testing did not show any new failures.
>
> Is it OK for trunk?

It's mostly reasonable but I have a few comments.  First, re-using
bb->dom[] for the dominator info is somewhat fragile but indeed
a requirement to make the patch reasonably small.  Please,
in calculate_dominance_info_for_region, make sure that
!dom_info_available_p (dir).

You pass loop * everywhere but require ->aux to be set up as
an array of BBs forming the region with special BBs at array ends.

Please instead pass in a vec which avoids using ->aux
and also allows other non-loop-based SESE regions to be used
(I couldn't spot anything that relies on this being a loop).

Adding a convenience wrapper for loop  * would be of course nice,
to cover the special pre/post-header code in tree-if-conv.c.

In theory a SESE region is fully specified by its entry and exit _edge_,
so you might want to see if it's possible to use such a pair of edges
to guard the dfs/idom walks to avoid the need to create fake blocks.
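
To illustrate the edge-pair idea, here is a toy sketch.  It does not use
GCC's CFG types: struct toy_cfg, struct toy_edge and collect_sese_region are
hypothetical, and the graph is a plain adjacency matrix.  The walk starts at
the destination of the entry edge and refuses to follow the exit edge, so no
fake blocks are needed to delimit the region.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_BLOCKS 16

/* Toy CFG: succ[a][b] is true when there is an edge a -> b.  */
struct toy_cfg
{
  int n_blocks;
  bool succ[MAX_BLOCKS][MAX_BLOCKS];
};

struct toy_edge
{
  int src, dest;
};

/* Depth-first walk from BB that never crosses EXIT_EDGE.  */
void
walk_region (const struct toy_cfg *cfg, int bb, struct toy_edge exit_edge,
             bool *visited)
{
  visited[bb] = true;
  for (int s = 0; s < cfg->n_blocks; s++)
    {
      if (!cfg->succ[bb][s])
        continue;
      if (bb == exit_edge.src && s == exit_edge.dest)
        continue;  /* The exit edge bounds the region.  */
      if (!visited[s])
        walk_region (cfg, s, exit_edge, visited);
    }
}

/* Mark the blocks between ENTRY and EXIT_EDGE in VISITED (which must have
   room for MAX_BLOCKS entries) and return how many there are.  */
int
collect_sese_region (const struct toy_cfg *cfg, struct toy_edge entry,
                     struct toy_edge exit_edge, bool *visited)
{
  int count = 0;

  assert (cfg->n_blocks > 0 && cfg->n_blocks <= MAX_BLOCKS);
  for (int i = 0; i < MAX_BLOCKS; i++)
    visited[i] = false;
  walk_region (cfg, entry.dest, exit_edge, visited);
  for (int i = 0; i < cfg->n_blocks; i++)
    count += visited[i];
  return count;
}
```

A real implementation would run the DFS numbering and idom computation over
the blocks collected this way; the sketch only shows how the two edges bound
the walk.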

Btw, instead of using create_empty_bb, unchecked_make_edge, etc.
please use split_edge() of the entry/exit edges.

Richard.

> ChangeLog:
> 2016-10-05  Yuri Rumyantsev  
>
> * dominance.c : Include cfgloop.h for loop recognition.
> (dom_info): Add new functions and add boolean argument to recognize
> computation for loop region.
> (dom_info::dom_info): New function.
> (dom_info::calc_dfs_tree): Add boolean argument IN_REGION to not
> handle unvisited blocks.
> (dom_info::calc_idoms): Likewise.
> (compute_dom_fast_query_in_region): New function.
> (calculate_dominance_info): Invoke calc_dfs_tree and calc_idoms with
> false argument.
> (calculate_dominance_info_for_region): New function.
> (free_dominance_info_for_region): Likewise.
> (verify_dominators): Invoke calc_dfs_tree and calc_idoms with false
> argument.
> * dominance.h: Add prototype for introduced functions
> calculate_dominance_info_for_region and
> free_dominance_info_for_region.
> * tree-if-conv.c: Add to local variables ifc_sese_bbs & fake_postheader.
> (build_sese_region): New function.
> (if_convertible_loop_p_1): Invoke local version of post-dominators
> calculation, free it after basic block predication and delete created
> fake post-header block if any.
> (tree_if_conversion): Delete call of free_dominance_info for
> post-dominators, free ifc_sese_bbs which represents SESE region.
> (pass_if_conversion::execute): Delete detection of infinite loops
> and fake edges to exit block since post-dominator calculation is
> performed per if-converted loop only.


[Ada] Fix inter-unit inlining failure

2016-10-10 Thread Eric Botcazou
This is a regression present on the mainline and 6 branch: the compiler fails 
to inline across units a function declared with pragma Inline_Always because 
the middle-end detects a type mismatch for an argument, after gimplification 
removed a conversion.  The fix is to make the conversion more robust.

Tested on x86_64-suse-linux, applied on mainline and 6 branch.


2016-10-10  Eric Botcazou  

* gcc-interface/utils2.c (find_common_type): Do not return the LHS
type if it's an array with non-constant lower bound and the RHS type
is an array with a constant one.


2016-10-10  Eric Botcazou  

* gnat.dg/inline13.ad[sb]: New test.
* gnat.dg/inline13_pkg.ad[sb]: New helper.

-- 
Eric Botcazou
-- { dg-do compile }
-- { dg-options "-O -gnatn" }

package body Inline13 is

  function F (L : Arr) return String is
Local : Arr (1 .. L'Length);
Ret : String (1 .. L'Length);
Pos : Natural := 1;
  begin
Local (1 .. L'Length) := L;
for I in 1 .. Integer (L'Length) loop
   Ret (Pos .. Pos + 8) := " " & Inline13_Pkg.Padded (Local (I));
   Pos := Pos + 9;
end loop;
return Ret;
  end;

end Inline13;
with Inline13_Pkg;

package Inline13 is

  type Arr is array (Positive range <>) of Inline13_Pkg.T;

  function F (L : Arr) return String;

end Inline13;
package body Inline13_Pkg is

  function Padded (Value : T) return Padded_T is
  begin
return Padded_T(Value);
  end Padded;

end Inline13_Pkg;
package Inline13_Pkg is

  subtype Padded_T is String (1..8);

  type T is new Padded_T;

  function Padded (Value : T) return Padded_T;
  pragma Inline_Always (Padded);

end Inline13_Pkg;
Index: gcc-interface/utils2.c
===
--- gcc-interface/utils2.c	(revision 240890)
+++ gcc-interface/utils2.c	(working copy)
@@ -215,27 +215,40 @@ find_common_type (tree t1, tree t2)
  calling into build_binary_op), some others are really expected and we
  have to be careful.  */
 
+  const bool variable_record_on_lhs
+= (TREE_CODE (t1) == RECORD_TYPE
+   && TREE_CODE (t2) == RECORD_TYPE
+   && get_variant_part (t1)
+   && !get_variant_part (t2));
+
+  const bool variable_array_on_lhs
+= (TREE_CODE (t1) == ARRAY_TYPE
+   && TREE_CODE (t2) == ARRAY_TYPE
+   && !TREE_CONSTANT (TYPE_MIN_VALUE (TYPE_DOMAIN (t1)))
+   && TREE_CONSTANT (TYPE_MIN_VALUE (TYPE_DOMAIN (t2))));
+
   /* We must avoid writing more than what the target can hold if this is for
  an assignment and the case of tagged types is handled in build_binary_op
  so we use the lhs type if it is known to be smaller or of constant size
  and the rhs type is not, whatever the modes.  We also force t1 in case of
  constant size equality to minimize occurrences of view conversions on the
- lhs of an assignment, except for the case of record types with a variant
- part on the lhs but not on the rhs to make the conversion simpler.  */
+ lhs of an assignment, except for the case of types with a variable part
+ on the lhs but not on the rhs to make the conversion simpler.  */
   if (TREE_CONSTANT (TYPE_SIZE (t1))
   && (!TREE_CONSTANT (TYPE_SIZE (t2))
 	  || tree_int_cst_lt (TYPE_SIZE (t1), TYPE_SIZE (t2))
 	  || (TYPE_SIZE (t1) == TYPE_SIZE (t2)
-	  && !(TREE_CODE (t1) == RECORD_TYPE
-		   && TREE_CODE (t2) == RECORD_TYPE
-		   && get_variant_part (t1)
-		   && !get_variant_part (t2)))))
+	  && !variable_record_on_lhs
+	  && !variable_array_on_lhs)))
 return t1;
 
-  /* Otherwise, if the lhs type is non-BLKmode, use it.  Note that we know
- that we will not have any alignment problems since, if we did, the
- non-BLKmode type could not have been used.  */
-  if (TYPE_MODE (t1) != BLKmode)
+  /* Otherwise, if the lhs type is non-BLKmode, use it, except for the case of
+ a non-BLKmode rhs and array types with a variable part on the lhs but not
+ on the rhs to make sure the conversion is preserved during gimplification.
+ Note that we know that we will not have any alignment problems since, if
+ we did, the non-BLKmode type could not have been used.  */
+  if (TYPE_MODE (t1) != BLKmode
+  && (TYPE_MODE (t2) == BLKmode || !variable_array_on_lhs))
 return t1;
 
   /* If the rhs type is of constant size, use it whatever the modes.  At


Re: [RFC][VRP] Improve intersect_ranges

2016-10-10 Thread Richard Biener
On Sat, Oct 8, 2016 at 9:38 PM, kugan  wrote:
> Hi Richard,
>
> Thanks for the review.
> On 07/10/16 20:11, Richard Biener wrote:
>>
>> On Fri, Oct 7, 2016 at 12:00 AM, kugan
>>  wrote:
>>>
>>> Hi,
>>>
>>> In vrp intersect_ranges, Richard recently changed it to create integer
>>> value
>>> ranges when it is integer singleton.
>>>
>>> Maybe we should do the same when the other range is a complex range with
>>> SSA_NAME (like [x+2, +INF])?
>>>
>>> Attached patch tries to do this. There are cases where it will be
>>> beneficial
>>> as in the testcase in the patch. (For this testcase to work with Early VRP,
>>> we
>>> need the patch posted at
>>> https://gcc.gnu.org/ml/gcc-patches/2016-10/msg00413.html)
>>>
>>> Bootstrapped and regression tested on x86_64-linux-gnu with no new
>>> regressions.
>>
>>
>> This is not clearly a win, in fact it can completely lose an ASSERT_EXPR
>> because there is no way to add its effect back as an equivalence.  The
>> current choice of always using the "left" keeps the ASSERT_EXPR range
>> and is able to record the other range via an equivalence.
>
>
> How about changing the order in Early VRP when we are dealing with the same
> SSA_NAME in inner and outer scope. Here is a patch that does this. Is this
> OK if no new regressions?

I'm not sure if this is a good way forward.  The failure with the testcase is
that we don't extract a range for k from if (j < k) which I believe another
patch from you addresses?

As said the issue is with the equivalence / value-range representation so
you can't do something like

  /* Discover VR when condition is true.  */
  extract_range_for_var_from_comparison_expr (op0, code, op0, op1, );
  if (old_vr->type == VR_RANGE || old_vr->type == VR_ANTI_RANGE)
vrp_intersect_ranges (, old_vr);

  /* If we found any usable VR, set the VR to ssa_name and create a
 PUSH old value in the stack with the old VR.  */
  if (vr.type == VR_RANGE || vr.type == VR_ANTI_RANGE)
{
  new_vr = vrp_value_range_pool.allocate ();
  *new_vr = vr;
  push_value_range (op0, new_vr);
  ->>>  add equivalence to old_vr for new_vr.

because old_vr and new_vr are the 'same' (they are associated with SSA name op0)

Richard.

> Thanks,
> Kugan
>
>
>
>
>
>> My thought on this was that we need to separate "ranges" and associated
>> SSA names so we can introduce new ranges w/o the need for an SSA name
>> (and thus we can create an equivalence to the ASSERT_EXPR range).
>> IIRC I started on this at some point but never finished it ...
>>
>> Richard.
>>
>>> Thanks,
>>> Kugan
>>>
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> 2016-10-07  Kugan Vivekanandarajah  
>>>
>>> * gcc.dg/tree-ssa/evrp6.c: New test.
>>>
>>> gcc/ChangeLog:
>>>
>>> 2016-10-07  Kugan Vivekanandarajah  
>>>
>>> * tree-vrp.c (intersect_ranges): If we failed to handle
>>> the intersection and the other range involves computation with
>>> symbolic values, choose integer range if available.
>>>
>>>
>>>
>


Re: [VRP] Allocate bitmap before copying

2016-10-10 Thread Richard Biener
On Sat, Oct 8, 2016 at 9:34 PM, kugan  wrote:
> Hi,
>
> In vrp_intersect_ranges_1, when !vr0->equiv, we are copying vr1->equiv
> without allocating bitmap. This patch fixes this.
>
> Bootstrap and regression testing are ongoing. Is this OK if no new
> regressions?

Ok for trunk and branches.

Richard.

> Thanks,
> Kugan
>
> gcc/ChangeLog:
>
> 2016-10-09  Kugan Vivekanandarajah  
>
> * tree-vrp.c (vrp_intersect_ranges_1): Allocate bitmap before
>   copying.


Re: [patch] Fix GC issue triggered by arithmetic overflow checking

2016-10-10 Thread Richard Biener
On Sat, Oct 8, 2016 at 8:56 PM, Eric Botcazou  wrote:
> Hi,
>
> adding patterns for unsigned arithmetic overflow checking in a back-end can
> have unexpected fallout because of a latent GC issue: when they are present,
> GIMPLE optimization passes can create complex (math. sense) types at will by
> invoking build_complex_type.  Now build_complex_type goes through the type
> canonicalization hashtable, which is GC-ed, so its behavior depends on the
> actual collection points.
>
> The other type-building functions present in tree.c do the same so no big deal
> but build_complex_type is special because it also does:
>
>   /* We need to create a name, since complex is a fundamental type.  */
>   if (! TYPE_NAME (t))
> {
>   const char *name;
>   if (component_type == char_type_node)
> name = "complex char";
>   else if (component_type == signed_char_type_node)
> name = "complex signed char";
>   else if (component_type == unsigned_char_type_node)
> name = "complex unsigned char";
>   else if (component_type == short_integer_type_node)
> name = "complex short int";
>   else if (component_type == short_unsigned_type_node)
> name = "complex short unsigned int";
>   else if (component_type == integer_type_node)
> name = "complex int";
>   else if (component_type == unsigned_type_node)
> name = "complex unsigned int";
>   else if (component_type == long_integer_type_node)
> name = "complex long int";
>   else if (component_type == long_unsigned_type_node)
> name = "complex long unsigned int";
>   else if (component_type == long_long_integer_type_node)
> name = "complex long long int";
>   else if (component_type == long_long_unsigned_type_node)
> name = "complex long long unsigned int";
>   else
> name = 0;
>
>   if (name != 0)
> TYPE_NAME (t) = build_decl (UNKNOWN_LOCATION, TYPE_DECL,
> get_identifier (name), t);
> }
>
> so it creates a DECL node every time a new canonical complex type is created,
> bumping the DECL_UID counter in the process.  Which means that the DECL_UID
> counter is sensitive to the collection points, which in turn means that the
> result of algorithms depending on the DECL_UID counter also is.
>
> This for example resulted in a bootstrap comparison failure on a SPARC/Solaris
> machine doing a strict stage2/stage3 comparison because the contents of the
> .debug_loc section were different: location lists computed by var-tracking
> were slightly different because of a different hashing.
>
> I'm not sure whether the hashing done by var-tracking should be sensitive to
> the DECL_UID of nodes or not, but I think that having the DECL_UID counter
> depend on the collection points is highly undesirable, so the attached patch
> attempts to prevent it; it at least fixed the bootstrap comparison failure.

I believe the rule is that you might only depend on the order of objects
with respect to their DECL_UID, not the actual value of the DECL_UID.
As var-tracking shouldn't look at TYPE_DECLs (?) it's probably a latent
var-tracking bug as well.

> Tested on x86_64-suse-linux, OK for the mainline?

I'd prefer the named parameter to be defaulted to false and the few
places in the FEs fixed (eventually that name business should be
handled like names for nodes like integer_type_node -- I see no
reason why build_complex_type should have this special-case at all!
That is, why are the named variants in the type hash in the first place?)

Richard.

>
> 2016-10-08  Eric Botcazou  
>
> * tree.h (build_complex_type): Add second parameter with default.
> * builtins.c (expand_builtin_cexpi): Pass false in call to above.
> (fold_builtin_sincos): Likewise.
> (fold_builtin_arith_overflow): Likewise.
> * gimple-fold.c (fold_builtin_atomic_compare_exchange): Likewise.
> (gimple_fold_call): Likewise.
> * stor-layout.c (bitwise_type_for_mode): Likewise.
> * tree-ssa-dce.c (maybe_optimize_arith_overflow): Likewise.
> * tree-ssa-math-opts.c (match_uaddsub_overflow): Likewise.
> * tree.c (build_complex): Likewise.
> (build_complex_type): Add NAMED second parameter and adjust recursive
> call.  Create a TYPE_DECL only if NAMED is true.
>
> --
> Eric Botcazou


Re: [AArch64][11/14] ARMv8.2-A FP16 testsuite selector

2016-10-10 Thread James Greenhalgh
On Thu, Jul 07, 2016 at 05:18:41PM +0100, Jiong Wang wrote:
> ARMv8.2-A adds support for scalar and vector FP16 instructions to ARM and
> AArch64. This patch adds support for testing code for AArch64 targets
> using the new instructions. It is based on the target-support code for
> ARMv8.2-A added for ARM (AArch32).

OK.

Thanks,
James

> gcc/testsuite/
> 2016-07-07  Matthew Wahab 
> Jiong Wang 
> 
> * target-supports.exp (add_options_for_arm_v8_2a_fp16_scalar):
> Mention AArch64 support.
> (add_options_for_arm_v8_2a_fp16_neon): Likewise.
> (check_effective_target_arm_v8_2a_fp16_scalar_ok_nocache): Support
> AArch64 targets.
> (check_effective_target_arm_v8_2a_fp16_neon_ok_nocache): Support
> AArch64 targets.
> (check_effective_target_arm_v8_2a_fp16_scalar_hw): Support AArch64
> targets.
> (check_effective_target_arm_v8_2a_fp16_neon_hw): Likewise.
> 



[Ada] Fix wrong code with biased subtype

2016-10-10 Thread Eric Botcazou
This is a regression present on all active branches for a subtype of a biased 
type declared with explicit constraints, which is itself biased.  The compiler 
generates code that computes a wrong value for the conversion to an integer.

Tested on x86_64-suse-linux, applied on all active branches.


2016-10-10  Eric Botcazou  

* gcc-interface/utils.c (convert): For a biased input type, convert
the bias itself to the base type before adding it.


2016-10-10  Eric Botcazou  

* gnat.dg/biased_subtype.adb: New test.

-- 
Eric Botcazou

Index: gcc-interface/utils.c
===
--- gcc-interface/utils.c	(revision 240890)
+++ gcc-interface/utils.c	(working copy)
@@ -4193,12 +4193,15 @@ convert (tree type, tree expr)
   return convert (type, unpadded);
 }
 
-  /* If the input is a biased type, adjust first.  */
+  /* If the input is a biased type, convert first to the base type and add
+ the bias.  Note that the bias must go through a full conversion to the
+ base type, lest it is itself a biased value; this happens for subtypes
+ of biased types.  */
   if (ecode == INTEGER_TYPE && TYPE_BIASED_REPRESENTATION_P (etype))
 return convert (type, fold_build2 (PLUS_EXPR, TREE_TYPE (etype),
    fold_convert (TREE_TYPE (etype), expr),
-   fold_convert (TREE_TYPE (etype),
-		 TYPE_MIN_VALUE (etype))));
+   convert (TREE_TYPE (etype),
+		TYPE_MIN_VALUE (etype))));
 
   /* If the input is a justified modular type, we need to extract the actual
  object before converting it to any other type with the exceptions of an
@@ -4502,7 +4505,12 @@ convert (tree type, tree expr)
 	  && (ecode == ARRAY_TYPE || ecode == UNCONSTRAINED_ARRAY_TYPE
 	  || (ecode == RECORD_TYPE && TYPE_CONTAINS_TEMPLATE_P (etype))))
 	return unchecked_convert (type, expr, false);
-  else if (TYPE_BIASED_REPRESENTATION_P (type))
+
+  /* If the output is a biased type, convert first to the base type and
+	 subtract the bias.  Note that the bias itself must go through a full
+	 conversion to the base type, lest it is a biased value; this happens
+	 for subtypes of biased types.  */
+  if (TYPE_BIASED_REPRESENTATION_P (type))
 	return fold_convert (type,
 			 fold_build2 (MINUS_EXPR, TREE_TYPE (type),
 	  convert (TREE_TYPE (type), expr),
-- { dg-do run }
-- { dg-options "-gnatws" }

procedure Biased_Subtype is

   CIM_Max_AA : constant := 9_999_999;
   CIM_Min_AA : constant := -999_999;

   type TIM_AA is range CIM_Min_AA..CIM_Max_AA + 1;
   for TIM_AA'Size use 24;

   subtype STIM_AA is TIM_AA range TIM_AA(CIM_Min_AA)..TIM_AA(CIM_Max_AA);

   SAA : STIM_AA := 1;

begin
   if Integer(SAA) /= 1 then
      raise Program_Error;
   end if;
end;
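
For readers unfamiliar with biased representation, the following is a rough sketch of the encoding the test exercises, assuming the stored bits hold the value minus the lower bound of the type. The `struct biased_type` model and the two helpers are hypothetical and are not taken from gcc-interface/utils.c; only the bounds come from the test.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a biased integer type: the stored bits hold
   (value - lower bound), so the stored field is never negative.  */
struct biased_type {
  int32_t lo;   /* lower bound, which is also the bias */
  int32_t hi;   /* upper bound */
};

/* Bounds of TIM_AA in the test: -999_999 .. 9_999_999 + 1.  */
static const struct biased_type tim_aa = { -999999, 10000000 };

/* Encode VALUE into its stored (biased) form.  */
static uint32_t biased_encode (const struct biased_type *t, int32_t value)
{
  assert (value >= t->lo && value <= t->hi);
  return (uint32_t) (value - t->lo);
}

/* Decode stored bits: convert to the base type first, then add the bias,
   which must itself be an ordinary base-type value.  */
static int32_t biased_decode (const struct biased_type *t, uint32_t stored)
{
  return (int32_t) stored + t->lo;
}
```

With these bounds the largest stored value is 10_999_999, which fits in the 24 bits requested by the size clause even though 10_000_000 does not fit in a signed 24-bit field. That is why the type needs a bias in this model.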

