date:20180405

Re: [PATCH v2] RISC-V: Support for FreeBSD

2018-04-05 Thread Kito Cheng

Hi Jim:

Theodore Teah sent an mail say "Your assignment/disclaimer process
with the FSF is currently complete.".

Could you help us to commit that?

Thanks :)

On Fri, Feb 23, 2018 at 2:33 AM, Jim Wilson  wrote:
> On Wed, Feb 21, 2018 at 10:34 PM, Kito Cheng  wrote:
>> I don't family with copyright matters, so we can't commit this patch yet
>> until Ruslan send the signed copy and FSF signed it? right?
>
> Yes, I'd prefer that the FSF sign it and add it to the copyright list
> before we commit the patch.
>
> Jim

Re: [PATCH][GCC][mid-end] Fix PR85123 incorrect copies

2018-04-05 Thread Alan Modra

On Thu, Apr 05, 2018 at 01:29:06PM +0100, Tamar Christina wrote:
> diff --git a/gcc/expr.c b/gcc/expr.c
> index 
> 00660293f72e5441a6421a280b04c57fca2922b8..7daeb8c91d758edf0b3dc37f6927380b6f3df877
>  100644
> --- a/gcc/expr.c
> +++ b/gcc/expr.c
> @@ -2749,7 +2749,7 @@ copy_blkmode_to_reg (machine_mode mode_in, tree src)
>  {
>int i, n_regs;
>unsigned HOST_WIDE_INT bitpos, xbitpos, padding_correction = 0, bytes;
> -  unsigned int bitsize;
> +   unsigned int bitsize = 0;
>rtx *dst_words, dst, x, src_word = NULL_RTX, dst_word = NULL_RTX;
>/* No current ABI uses variable-sized modes to pass a BLKmnode type.  */
>fixed_size_mode mode = as_a  (mode_in);
> @@ -2782,7 +2782,7 @@ copy_blkmode_to_reg (machine_mode mode_in, tree src)
>  
>n_regs = (bytes + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
>dst_words = XALLOCAVEC (rtx, n_regs);
> -  bitsize = BITS_PER_WORD;
> +
>if (targetm.slow_unaligned_access (word_mode, TYPE_ALIGN (TREE_TYPE 
> (src
>  bitsize = MIN (TYPE_ALIGN (TREE_TYPE (src)), BITS_PER_WORD);
>  

You calculate bitsize here, then override it in the loop?  Doesn't
that mean strict align targets will use mis-aligned loads and stores?

> @@ -2791,6 +2791,17 @@ copy_blkmode_to_reg (machine_mode mode_in, tree src)
> bitpos < bytes * BITS_PER_UNIT;
> bitpos += bitsize, xbitpos += bitsize)
>  {
> +  /* Find the largest integer mode that can be used to copy all or as
> +  many bits as possible of the structure.  */
> +  opt_scalar_int_mode mode_iter;
> +  FOR_EACH_MODE_IN_CLASS (mode_iter, MODE_INT)
> +  if (GET_MODE_BITSIZE (mode_iter.require ())
> + <= ((bytes * BITS_PER_UNIT) - bitpos)
> +   && GET_MODE_BITSIZE (mode_iter.require ()) <= BITS_PER_WORD)
> + bitsize = GET_MODE_BITSIZE (mode_iter.require ());
> +  else
> + break;
> +

This isn't correct.  Consider a 6 byte struct on a 4 byte word, 8 bit
byte, big-endian target when targetm.calls.return_in_msb is false.

In this scenario, copy_blkmode_to_reg should return two registers, set
as if they had been loaded from two words in memory laid out as
follows (left to right increasing byte addresses):
 ___ ___
|  0  |  0  |  s0 |  s1 |   |  s2 |  s3 |  s4 |  s5 |
 ~~~ ~~~

So we will have xbitpos=16 first time around the loop.  That means
your new code will attempt to store 32 bits into a bit-field starting
at bit 16 in the first 32-bit register, and of course fail.

This scenario used to be handled correctly, at least when the struct
wasn't over-aligned.

-- 
Alan Modra
Australia Development Lab, IBM

[PATCH v2] C++: suggest missing headers for implicit use of "std" (PR c++/85021)

2018-04-05 Thread David Malcolm

On Thu, 2018-03-29 at 15:25 -0400, Jason Merrill wrote:
> On Thu, Mar 22, 2018 at 7:44 PM, David Malcolm 
> wrote:
> > We provide fix-it hints for the most common "std" names when an
> > explicit
> > "std::" prefix is present, however we don't yet provide fix-it
> > hints for
> > this implicit case:
> > 
> >   using namespace std;
> >   void f() {  cout << "test"; }
> > 
> > for which we emit:
> > 
> >   t.cc: In function 'void f()':
> >   t.cc:2:13: error: 'cout' was not declared in this scope
> >void f() {  cout << "test"; }
> >^~~~
> > 
> > This patch detects if a "using namespace std;" directive is present
> > in the current namespace, and if so, offers a suggestion for
> > unrecognized names that are in our list of common "std" names:
> > 
> >   t.cc: In function 'void f()':
> >   t.cc:2:13: error: 'cout' was not declared in this scope
> >void f() {  cout << "test"; }
> >^~~~
> >   t.cc:2:13: note: 'std::cout' is defined in header '';
> > did you forget to '#include '?
> >   +#include 
> >using namespace std;
> >void f() {  cout << "test"; }
> >^~~~
> > 
> > Is there a better way to test for the using directive?
> > 
> > Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
> > 
> > OK for trunk?
> > 
> > gcc/cp/ChangeLog:
> > PR c++/85021
> > * name-lookup.c (has_using_namespace_std_directive_p): New
> > function.
> > (suggest_alternatives_for): Simplify if/else logic using
> > early
> > returns.  If no candidates were found, and there's a
> > "using namespace std;" directive, call
> > maybe_suggest_missing_std_header.
> > (maybe_suggest_missing_header): Split later part of the
> > function
> > into..
> > (maybe_suggest_missing_std_header): New.
> > 
> > gcc/testsuite/ChangeLog:
> > PR c++/85021
> > * g++.dg/lookup/missing-std-include-7.C: New test.
> > ---
> >  gcc/cp/name-lookup.c   | 68
> > +-
> >  .../g++.dg/lookup/missing-std-include-7.C  | 16 +
> >  2 files changed, 70 insertions(+), 14 deletions(-)
> >  create mode 100644 gcc/testsuite/g++.dg/lookup/missing-std-
> > include-7.C
> > 
> > diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
> > index 061729a..4eb980e 100644
> > --- a/gcc/cp/name-lookup.c
> > +++ b/gcc/cp/name-lookup.c
> > @@ -41,6 +41,7 @@ static cxx_binding *cxx_binding_make (tree value,
> > tree type);
> >  static cp_binding_level *innermost_nonclass_level (void);
> >  static void set_identifier_type_value_with_scope (tree id, tree
> > decl,
> >   cp_binding_level
> > *b);
> > +static bool maybe_suggest_missing_std_header (location_t location,
> > tree name);
> > 
> >  /* Create an overload suitable for recording an artificial
> > TYPE_DECL
> > and another decl.  We use this machanism to implement the
> > struct
> > @@ -5327,6 +5328,23 @@ qualify_lookup (tree val, int flags)
> >return true;
> >  }
> > 
> > +/* Is there a "using namespace std;" directive within the current
> > +   namespace?  */
> > +
> > +static bool
> > +has_using_namespace_std_directive_p ()
> > +{
> > +  vec *usings = DECL_NAMESPACE_USING
> > (current_namespace);
> 
> Checking in just the current namespace won't find a "using namespace
> std" in an inner or outer scope; I think you want to add something to
> name-lookup.c that iterates over the current enclosing scopes like
> name_lookup::search_unqualified.  Nathan can probably be more
> specific.
> 
> Jason

Thanks.

Here's an updated version of the patch.

Rather than just search in DECL_NAMESPACE_USING (current_namespace),
this version mimics name_lookup::search_unqualified by searching for local
using-directives in the current_binding_level until it reaches a sk_namespace,
and then searching in current_namespace and its ancestors.
(and builds out the test coverage accordingly)

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu; adds
57 PASS results to g++.sum.

OK for trunk?

gcc/cp/ChangeLog:
PR c++/85021
* name-lookup.c (using_directives_contain_std_p): New function.
(has_using_namespace_std_directive_p): New function.
(suggest_alternatives_for): Simplify if/else logic using early
returns.  If no candidates were found, and there's a
"using namespace std;" directive, call
maybe_suggest_missing_std_header.
(maybe_suggest_missing_header): Split later part of the function
into..
(maybe_suggest_missing_std_header): New.

gcc/testsuite/ChangeLog:
PR c++/85021
* g++.dg/lookup/missing-std-include-7.C: New test.
---
 gcc/cp/name-lookup.c   |  93 ---
 .../g++.dg/lookup/missing-std-include-7.C  | 100 +
 2 files changed, 179 insertions(+), 14 deletions(-)
 create mode

[doc PATCH] fix up C++ option references (PR 71283)

2018-04-05 Thread Martin Sebor


Attached is the final version of the patch to adjust the lists
of options (C++ Language Options and -Wall) to include missing
C++ options, reference the forms of options that aren't
the default, and use TexInfo tables for the lists of options
in -Wall and -Wextra to address Nathan's comment.  The patch
also fixes bug 71283.

Martin
PR c++/71283 - Inconsistent location for C++ warning options in the manual

gcc/ChangeLog:

	PR c++/71283
	* doc/invoke.texi (C++ Language Options): Add -Wno-terminate.
	Reference -Wno-conversion-null, -Wno-delete-incomplete, and
	-Wno-subobject-linkage instead of the default positive forms.
	(-Wall): Use a table for options.
	Mention -Wclass-memaccess.
	(-Wextra): Use a table for options.

Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi	(revision 259040)
+++ gcc/doc/invoke.texi	(working copy)
@@ -213,12 +213,12 @@ in the following sections.
 -fvisibility-inlines-hidden @gol
 -fvisibility-ms-compat @gol
 -fext-numeric-literals @gol
--Wabi=@var{n}  -Wabi-tag  -Wconversion-null  -Wctor-dtor-privacy @gol
+-Wabi=@var{n}  -Wabi-tag  -Wno-conversion-null  -Wctor-dtor-privacy @gol
 -Wdelete-non-virtual-dtor  -Wliteral-suffix  -Wmultiple-inheritance @gol
 -Wnamespaces  -Wnarrowing @gol
 -Wnoexcept  -Wnoexcept-type  -Wclass-memaccess @gol
 -Wnon-virtual-dtor  -Wreorder  -Wregister @gol
--Weffc++  -Wstrict-null-sentinel  -Wtemplates @gol
+-Weffc++  -Wstrict-null-sentinel  -Wtemplates  -Wno-terminate @gol
 -Wno-non-template-friend  -Wold-style-cast @gol
 -Woverloaded-virtual  -Wno-pmf-conversions @gol
 -Wsign-promo  -Wvirtual-inheritance}
@@ -270,9 +270,9 @@ Objective-C and Objective-C++ Dialects}.
 -Wc++-compat  -Wc++11-compat  -Wc++14-compat  @gol
 -Wcast-align  -Wcast-align=strict  -Wcast-function-type  -Wcast-qual  @gol
 -Wchar-subscripts  -Wchkp  -Wcatch-value  -Wcatch-value=@var{n} @gol
--Wclobbered  -Wcomment  -Wconditionally-supported @gol
+-Wclass-memaccess -Wclobbered  -Wcomment  -Wconditionally-supported @gol
 -Wconversion  -Wcoverage-mismatch  -Wno-cpp  -Wdangling-else  -Wdate-time @gol
--Wdelete-incomplete @gol
+-Wno-delete-incomplete @gol
 -Wno-deprecated  -Wno-deprecated-declarations  -Wno-designated-init @gol
 -Wdisabled-optimization @gol
 -Wno-discarded-qualifiers  -Wno-discarded-array-qualifiers @gol
@@ -318,7 +318,7 @@ Objective-C and Objective-C++ Dialects}.
 -Wstringop-overflow=@var{n} -Wstringop-truncation @gol
 -Wsuggest-attribute=@r{[}pure@r{|}const@r{|}noreturn@r{|}format@r{|}malloc@r{]} @gol
 -Wsuggest-final-types @gol  -Wsuggest-final-methods  -Wsuggest-override @gol
--Wmissing-format-attribute  -Wsubobject-linkage @gol
+-Wmissing-format-attribute  -Wno-subobject-linkage @gol
 -Wswitch  -Wswitch-bool  -Wswitch-default  -Wswitch-enum @gol
 -Wswitch-unreachable  -Wsync-nand @gol
 -Wsystem-headers  -Wtautological-compare  -Wtrampolines  -Wtrigraphs @gol
@@ -3006,8 +3006,20 @@ void h() @{ f(g); @}
 In C++14, @code{f} calls calls @code{f}, but in
 C++17 it calls @code{f}.
 
-@item -Wclass-memaccess @r{(C++ and Objective-C++ only)}
+@item -Wcatch-value
+@itemx -Wcatch-value=@var{n} @r{(C++ and Objective-C++ only)}
+@opindex Wcatch-value
+@opindex Wno-catch-value
+Warn about catch handlers that do not catch via reference.
+With @option{-Wcatch-value=1} (or @option{-Wcatch-value} for short)
+warn about polymorphic class types that are caught by value.
+With @option{-Wcatch-value=2} warn about all class types that are caught
+by value. With @option{-Wcatch-value=3} warn about all types that are
+not caught by reference. @option{-Wcatch-value} is enabled by @option{-Wall}.
+
+@item -Wclass-memaccess @r{(C++ only)}
 @opindex Wclass-memaccess
+@opindex Wno-class-memaccess
 Warn when the destination of a call to a raw memory function such as
 @code{memset} or @code{memcpy} is an object of class type, and when writing
 into such an object might bypass the class non-trivial or deleted constructor
@@ -3907,57 +3919,79 @@ Options} and @ref{Objective-C and Objective-C++ Di
 
 @option{-Wall} turns on the following warning flags:
 
-@gccoptlist{-Waddress   @gol
--Warray-bounds=1 @r{(only with} @option{-O2}@r{)}  @gol
--Wbool-compare  @gol
--Wbool-operation  @gol
--Wc++11-compat  -Wc++14-compat  @gol
--Wcatch-value @r{(C++ and Objective-C++ only)}  @gol
--Wchar-subscripts  @gol
--Wcomment  @gol
--Wduplicate-decl-specifier @r{(C and Objective-C only)} @gol
--Wenum-compare @r{(in C/ObjC; this is on by default in C++)} @gol
--Wformat   @gol
--Wint-in-bool-context  @gol
--Wimplicit @r{(C and Objective-C only)} @gol
--Wimplicit-int @r{(C and Objective-C only)} @gol
--Wimplicit-function-declaration @r{(C and Objective-C only)} @gol
--Winit-self @r{(only for C++)} @gol
--Wlogical-not-parentheses @gol
--Wmain @r{(only for C/ObjC and unless} @option{-ffreestanding}@r{)}  @gol
--Wmaybe-uninitialized @gol
--Wmemset-elt-size @gol
--Wmemset-transposed-args @gol
--Wmisleading-indentation @r{(only

Re: [C++ PATCH] Fix ICE with structured binding initialized from lambda in template (PR c++/85209)

2018-04-05 Thread Jason Merrill

OK.

On Thu, Apr 5, 2018 at 5:09 PM, Jakub Jelinek  wrote:
> Hi!
>
> For cp_finish_decomp we require that the id decl for the structured binding
> are chained together, because we pass just the artificial underlying decl
> and first id and cp_finish_decomp walks DECL_CHAIN to find other ids.
> tsubst_decomp_names was actually requiring not just that, but also that
> the artificial undering decl follows the last id, which is unnecessary and
> isn't actually true if the initializer has some lambdas in it, because those
> might appear in between.
>
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?
>
> 2018-04-05  Jakub Jelinek  
>
> PR c++/85209
> * pt.c (tsubst_decomp_names): Don't fail or ICE if DECL_CHAIN (decl3)
> is not prev, if prev == decl.
>
> * g++.dg/cpp1z/decomp39.C: New test.
> * g++.dg/cpp1z/decomp40.C: New test.
>
> --- gcc/cp/pt.c.jj  2018-04-05 09:49:06.155459292 +0200
> +++ gcc/cp/pt.c 2018-04-05 15:02:40.290261785 +0200
> @@ -16230,7 +16230,8 @@ tsubst_decomp_names (tree decl, tree pat
>if (error_operand_p (decl3))
> decl = error_mark_node;
>else if (decl != error_mark_node
> -  && DECL_CHAIN (decl3) != prev)
> +  && DECL_CHAIN (decl3) != prev
> +  && decl != prev)
> {
>   gcc_assert (errorcount);
>   decl = error_mark_node;
> --- gcc/testsuite/g++.dg/cpp1z/decomp39.C.jj2018-04-05 15:10:17.053265429 
> +0200
> +++ gcc/testsuite/g++.dg/cpp1z/decomp39.C   2018-04-05 15:10:56.099265281 
> +0200
> @@ -0,0 +1,16 @@
> +// PR c++/85209
> +// { dg-do compile { target c++11 } }
> +// { dg-options "" }
> +
> +template 
> +void
> +foo ()
> +{
> +  auto [a] = []{}; // { dg-error "cannot decompose lambda closure type" }
> +}  // { dg-warning "structured bindings only available 
> with" "" { target c++14_down } .-1 }
> +
> +void
> +bar ()
> +{
> +  foo<0> ();
> +}
> --- gcc/testsuite/g++.dg/cpp1z/decomp40.C.jj2018-04-05 15:11:58.630265722 
> +0200
> +++ gcc/testsuite/g++.dg/cpp1z/decomp40.C   2018-04-05 15:11:49.402265657 
> +0200
> @@ -0,0 +1,18 @@
> +// PR c++/85209
> +// { dg-do compile { target c++11 } }
> +// { dg-options "" }
> +
> +struct S { int a; } s;
> +
> +template 
> +void
> +foo ()
> +{
> +  auto [a] = []{ return s; } ();   // { dg-warning "structured bindings 
> only available with" "" { target c++14_down } }
> +};
> +
> +void
> +bar ()
> +{
> +  foo<0> ();
> +}
>
> Jakub

Re: [C++ PATCH] Fix ICE with #pragma weak and structured bindings (PR c++/85208)

2018-04-05 Thread Jason Merrill

OK.

On Thu, Apr 5, 2018 at 5:06 PM, Jakub Jelinek  wrote:
> Hi!
>
> We can't call maybe_apply_pragma_weak for structured binding decls
> at start_decl time, as that is too early, we don't have DECL_ASSEMBLER_NAME
> computed yet for them and because we don't have the corresponding decls
> around yet, we can't even compute it.
>
> This patch defers the maybe_apply_pragma_weak call from start_decl to
> cp_maybe_mangle_decomp where we just added the mangling the call needs.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2018-04-05  Jakub Jelinek  
>
> PR c++/85208
> * decl.c (start_decl): For DECL_DECOMPOSITION_P decls, don't call
> maybe_apply_pragma_weak here...
> (cp_maybe_mangle_decomp): ... but call it here instead.
>
> * g++.dg/cpp1z/decomp42.C: New test.
>
> --- gcc/cp/decl.c.jj2018-04-04 21:33:20.532639396 +0200
> +++ gcc/cp/decl.c   2018-04-05 13:29:39.128215719 +0200
> @@ -5090,7 +5090,7 @@ start_decl (const cp_declarator *declara
>  }
>
>/* If #pragma weak was used, mark the decl weak now.  */
> -  if (!processing_template_decl)
> +  if (!processing_template_decl && !DECL_DECOMPOSITION_P (decl))
>  maybe_apply_pragma_weak (decl);
>
>if (TREE_CODE (decl) == FUNCTION_DECL
> @@ -7483,6 +7483,7 @@ cp_maybe_mangle_decomp (tree decl, tree
>for (unsigned int i = 0; i < count; i++, d = DECL_CHAIN (d))
> v[count - i - 1] = d;
>SET_DECL_ASSEMBLER_NAME (decl, mangle_decomp (decl, v));
> +  maybe_apply_pragma_weak (decl);
>  }
>  }
>
> --- gcc/testsuite/g++.dg/cpp1z/decomp42.C.jj2018-04-05 13:45:39.521220008 
> +0200
> +++ gcc/testsuite/g++.dg/cpp1z/decomp42.C   2018-04-05 13:45:22.507220268 
> +0200
> @@ -0,0 +1,9 @@
> +// PR c++/85208
> +// { dg-do compile { target c++11 } }
> +// { dg-require-weak "" }
> +// { dg-options "" }
> +
> +#pragma weak _ZDC1d1e1fE
> +struct A { int i, j, k; };
> +auto [a, b, c] = A (); // { dg-warning "structured bindings only available 
> with" "" { target c++14_down } }
> +auto [d, e, f] = A (); // { dg-warning "structured bindings only available 
> with" "" { target c++14_down } }
>
> Jakub

[C++ PATCH] Fix ICE with structured binding initialized from lambda in template (PR c++/85209)

2018-04-05 Thread Jakub Jelinek

Hi!

For cp_finish_decomp we require that the id decl for the structured binding
are chained together, because we pass just the artificial underlying decl
and first id and cp_finish_decomp walks DECL_CHAIN to find other ids.
tsubst_decomp_names was actually requiring not just that, but also that
the artificial undering decl follows the last id, which is unnecessary and
isn't actually true if the initializer has some lambdas in it, because those
might appear in between.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2018-04-05  Jakub Jelinek  

PR c++/85209
* pt.c (tsubst_decomp_names): Don't fail or ICE if DECL_CHAIN (decl3)
is not prev, if prev == decl.

* g++.dg/cpp1z/decomp39.C: New test.
* g++.dg/cpp1z/decomp40.C: New test.

--- gcc/cp/pt.c.jj  2018-04-05 09:49:06.155459292 +0200
+++ gcc/cp/pt.c 2018-04-05 15:02:40.290261785 +0200
@@ -16230,7 +16230,8 @@ tsubst_decomp_names (tree decl, tree pat
   if (error_operand_p (decl3))
decl = error_mark_node;
   else if (decl != error_mark_node
-  && DECL_CHAIN (decl3) != prev)
+  && DECL_CHAIN (decl3) != prev
+  && decl != prev)
{
  gcc_assert (errorcount);
  decl = error_mark_node;
--- gcc/testsuite/g++.dg/cpp1z/decomp39.C.jj2018-04-05 15:10:17.053265429 
+0200
+++ gcc/testsuite/g++.dg/cpp1z/decomp39.C   2018-04-05 15:10:56.099265281 
+0200
@@ -0,0 +1,16 @@
+// PR c++/85209
+// { dg-do compile { target c++11 } }
+// { dg-options "" }
+
+template 
+void
+foo ()
+{
+  auto [a] = []{}; // { dg-error "cannot decompose lambda closure type" }
+}  // { dg-warning "structured bindings only available 
with" "" { target c++14_down } .-1 }
+
+void
+bar ()
+{
+  foo<0> ();
+}
--- gcc/testsuite/g++.dg/cpp1z/decomp40.C.jj2018-04-05 15:11:58.630265722 
+0200
+++ gcc/testsuite/g++.dg/cpp1z/decomp40.C   2018-04-05 15:11:49.402265657 
+0200
@@ -0,0 +1,18 @@
+// PR c++/85209
+// { dg-do compile { target c++11 } }
+// { dg-options "" }
+
+struct S { int a; } s;
+
+template 
+void
+foo ()
+{
+  auto [a] = []{ return s; } ();   // { dg-warning "structured bindings 
only available with" "" { target c++14_down } }
+};
+
+void
+bar ()
+{
+  foo<0> ();
+}

Jakub

[C++ PATCH] Fix ICE with #pragma weak and structured bindings (PR c++/85208)

2018-04-05 Thread Jakub Jelinek

Hi!

We can't call maybe_apply_pragma_weak for structured binding decls
at start_decl time, as that is too early, we don't have DECL_ASSEMBLER_NAME
computed yet for them and because we don't have the corresponding decls
around yet, we can't even compute it.

This patch defers the maybe_apply_pragma_weak call from start_decl to
cp_maybe_mangle_decomp where we just added the mangling the call needs.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-04-05  Jakub Jelinek  

PR c++/85208
* decl.c (start_decl): For DECL_DECOMPOSITION_P decls, don't call
maybe_apply_pragma_weak here...
(cp_maybe_mangle_decomp): ... but call it here instead.

* g++.dg/cpp1z/decomp42.C: New test.

--- gcc/cp/decl.c.jj2018-04-04 21:33:20.532639396 +0200
+++ gcc/cp/decl.c   2018-04-05 13:29:39.128215719 +0200
@@ -5090,7 +5090,7 @@ start_decl (const cp_declarator *declara
 }
 
   /* If #pragma weak was used, mark the decl weak now.  */
-  if (!processing_template_decl)
+  if (!processing_template_decl && !DECL_DECOMPOSITION_P (decl))
 maybe_apply_pragma_weak (decl);
 
   if (TREE_CODE (decl) == FUNCTION_DECL
@@ -7483,6 +7483,7 @@ cp_maybe_mangle_decomp (tree decl, tree
   for (unsigned int i = 0; i < count; i++, d = DECL_CHAIN (d))
v[count - i - 1] = d;
   SET_DECL_ASSEMBLER_NAME (decl, mangle_decomp (decl, v));
+  maybe_apply_pragma_weak (decl);
 }
 }
 
--- gcc/testsuite/g++.dg/cpp1z/decomp42.C.jj2018-04-05 13:45:39.521220008 
+0200
+++ gcc/testsuite/g++.dg/cpp1z/decomp42.C   2018-04-05 13:45:22.507220268 
+0200
@@ -0,0 +1,9 @@
+// PR c++/85208
+// { dg-do compile { target c++11 } }
+// { dg-require-weak "" }
+// { dg-options "" }
+
+#pragma weak _ZDC1d1e1fE
+struct A { int i, j, k; };
+auto [a, b, c] = A (); // { dg-warning "structured bindings only available 
with" "" { target c++14_down } }
+auto [d, e, f] = A (); // { dg-warning "structured bindings only available 
with" "" { target c++14_down } }

Jakub

C++ PATCH for c++/85136, ICE with designated init in template

2018-04-05 Thread Jason Merrill

The issue here was that check_array_designated_initializer did
constant folding to produce a constant index, but then left the
un-folded value behind to confuse complete_array_type.  Fixed by
updating ce->value with the folded value.  We should also use
fold_non_dependent_expr, and handle dependent designators.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 095a2dc5c8523e01367d9604f1e47849f7b4dcd9
Author: Jason Merrill 
Date:   Thu Apr 5 13:10:01 2018 -0400

PR c++/85136 - ICE with designated init in template.

* decl.c (maybe_deduce_size_from_array_init): Handle dependent
designated initializer.
(check_array_designated_initializer): Update ce->index with the
constant value.

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 489dcc0a8ed..86251f51eb4 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -5415,12 +5415,15 @@ check_array_designated_initializer (constructor_elt *ce,
 		  ce->index, true);
   if (ce_index
 	  && INTEGRAL_OR_UNSCOPED_ENUMERATION_TYPE_P (TREE_TYPE (ce_index))
-	  && (TREE_CODE (ce_index = maybe_constant_value (ce_index))
+	  && (TREE_CODE (ce_index = fold_non_dependent_expr (ce_index))
 	  == INTEGER_CST))
 	{
 	  /* A C99 designator is OK if it matches the current index.  */
 	  if (wi::to_wide (ce_index) == index)
-	return true;
+	{
+	  ce->index = ce_index;
+	  return true;
+	}
 	  else
 	sorry ("non-trivial designated initializers not supported");
 	}
@@ -5463,8 +5466,12 @@ maybe_deduce_size_from_array_init (tree decl, tree init)
 	  constructor_elt *ce;
 	  HOST_WIDE_INT i;
 	  FOR_EACH_VEC_SAFE_ELT (v, i, ce)
-	if (!check_array_designated_initializer (ce, i))
-	  failure = 1;
+	{
+	  if (instantiation_dependent_expression_p (ce->index))
+		return;
+	  if (!check_array_designated_initializer (ce, i))
+		failure = 1;
+	}
 	}
 
   if (failure)
diff --git a/gcc/testsuite/g++.dg/ext/desig11.C b/gcc/testsuite/g++.dg/ext/desig11.C
new file mode 100644
index 000..34bfbe1044e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/desig11.C
@@ -0,0 +1,15 @@
+// PR c++/85136
+// { dg-options "" }
+
+enum { e };
+
+template void f()
+{
+  const int x[] = { [e] = 0 };
+  const int y[] = { [I] = 0 };
+}
+
+int main()
+{
+  f<0>();
+}

[PATCH] Fix ICE with statement frontiers (PR sanitizer/85213)

2018-04-05 Thread Jakub Jelinek

Hi!

As mentioned in the PR, on the following testcase we cp_save_expr in order
to avoid evaluating -1 * __builtin_expect (({ ... }), ...) twice and use
the SAVE_EXPR in the original shift as well as in a comparison, but then
we attempt to optimize the comparison by grabbing parts of the SAVE_EXPR,
create comparison out of it and building another SAVE_EXPR around it.
As unshare_expr/mostly_copy_tree_r doesn't unshare STATEMENT_LISTs, we end
up with the STATEMENT_LIST containing DEBUG_BEGIN_STMT and x != 0 multiple
times in the IL and as gimplification is destructive, we destroy the
STATEMENT_LIST the first time and second time ICE on it.

I don't see other way how to resolve this than to avoid this weirdo
optimization, I believe the optimization is from pre-GIMPLE era where we
didn't have hundreds of passes that can optimize it before expansion.

Bootstrapped/regtested on x86_64-linux and i686-linux, haven't seen any
regressions, ok for trunk?

2018-04-05  Jakub Jelinek  

PR sanitizer/85213
* fold-const.c (twoval_comparison_p): Remove SAVE_P argument and don't
look through SAVE_EXPRs with non-side-effects argument.  Adjust
recursive calls.
(fold_comparison): Adjust twoval_comparison_p caller, don't handle
save_p here.

* c-c++-common/ubsan/pr85213.c: New test.

--- gcc/fold-const.c.jj 2018-03-29 12:37:23.887476367 +0200
+++ gcc/fold-const.c2018-04-05 11:30:28.996962481 +0200
@@ -115,7 +115,7 @@ static tree negate_expr (tree);
 static tree associate_trees (location_t, tree, tree, enum tree_code, tree);
 static enum comparison_code comparison_to_compcode (enum tree_code);
 static enum tree_code compcode_to_comparison (enum comparison_code);
-static int twoval_comparison_p (tree, tree *, tree *, int *);
+static int twoval_comparison_p (tree, tree *, tree *);
 static tree eval_subst (location_t, tree, tree, tree, tree, tree);
 static tree optimize_bit_field_compare (location_t, enum tree_code,
tree, tree, tree);
@@ -3549,13 +3549,12 @@ operand_equal_for_comparison_p (tree arg
two different values, which will be stored in *CVAL1 and *CVAL2; if
they are nonzero it means that some operands have already been found.
No variables may be used anywhere else in the expression except in the
-   comparisons.  If SAVE_P is true it means we removed a SAVE_EXPR around
-   the expression and save_expr needs to be called with CVAL1 and CVAL2.
+   comparisons.  
 
If this is true, return 1.  Otherwise, return zero.  */
 
 static int
-twoval_comparison_p (tree arg, tree *cval1, tree *cval2, int *save_p)
+twoval_comparison_p (tree arg, tree *cval1, tree *cval2)
 {
   enum tree_code code = TREE_CODE (arg);
   enum tree_code_class tclass = TREE_CODE_CLASS (code);
@@ -3568,39 +3567,23 @@ twoval_comparison_p (tree arg, tree *cva
   || code == COMPOUND_EXPR))
 tclass = tcc_binary;
 
-  else if (tclass == tcc_expression && code == SAVE_EXPR
-  && ! TREE_SIDE_EFFECTS (TREE_OPERAND (arg, 0)))
-{
-  /* If we've already found a CVAL1 or CVAL2, this expression is
-two complex to handle.  */
-  if (*cval1 || *cval2)
-   return 0;
-
-  tclass = tcc_unary;
-  *save_p = 1;
-}
-
   switch (tclass)
 {
 case tcc_unary:
-  return twoval_comparison_p (TREE_OPERAND (arg, 0), cval1, cval2, save_p);
+  return twoval_comparison_p (TREE_OPERAND (arg, 0), cval1, cval2);
 
 case tcc_binary:
-  return (twoval_comparison_p (TREE_OPERAND (arg, 0), cval1, cval2, save_p)
- && twoval_comparison_p (TREE_OPERAND (arg, 1),
- cval1, cval2, save_p));
+  return (twoval_comparison_p (TREE_OPERAND (arg, 0), cval1, cval2)
+ && twoval_comparison_p (TREE_OPERAND (arg, 1), cval1, cval2));
 
 case tcc_constant:
   return 1;
 
 case tcc_expression:
   if (code == COND_EXPR)
-   return (twoval_comparison_p (TREE_OPERAND (arg, 0),
-cval1, cval2, save_p)
-   && twoval_comparison_p (TREE_OPERAND (arg, 1),
-   cval1, cval2, save_p)
-   && twoval_comparison_p (TREE_OPERAND (arg, 2),
-   cval1, cval2, save_p));
+   return (twoval_comparison_p (TREE_OPERAND (arg, 0), cval1, cval2)
+   && twoval_comparison_p (TREE_OPERAND (arg, 1), cval1, cval2)
+   && twoval_comparison_p (TREE_OPERAND (arg, 2), cval1, cval2));
   return 0;
 
 case tcc_comparison:
@@ -8781,9 +8764,8 @@ fold_comparison (location_t loc, enum tr
   if (TREE_CODE (arg1) == INTEGER_CST && TREE_CODE (arg0) != INTEGER_CST)
 {
   tree cval1 = 0, cval2 = 0;
-  int save_p = 0;
 
-  if (twoval_comparison_p (arg0, , , _p)
+  if (twoval_comparison_p (arg0, , )
  /* Don't handle degenerate cases here; they should already
 have

Re: [PATCH][RFC] Fix PR85180, find_base_term is exponential

2018-04-05 Thread Jakub Jelinek

On Thu, Apr 05, 2018 at 02:32:08PM +0200, Richard Biener wrote:
> 
> This fixes exponential time spent in find_base_term by extending the
> existing fix of NULL-ifying VALUE locations during their processing
> to keep them NULL-ified during the whole recursion.
> 
> As Jakub figured this has the chance of returning NULL prematurely
> for the case of the plus/minus case rejecting a found base on the
> ground of
> 
> if (base != NULL_RTX
> && ((REG_P (tmp1) && REG_POINTER (tmp1))
>  || known_base_value_p (base)))
>   return base;
> 
> so any VALUE visited during a such rejected base will not be
> visited again (but we'll get a NULL result).
> 
> Quoting him from IRC: still no differences, out of 850mil calls

The final statistics across full simultaneous x86_64-linux and i686-linux
bootstraps + regtests is 5 cases where the patch changed result from
4.6 billion find_base_term toplevel calls, all like:
64 /home/jakub/src/gcc/gcc/testsuite/gcc.target/i386/pr53698.c test2 symbol_ref 
0x7f765b2cc840 0 symbol_ref 0x7f765b2cc858 0
64 /home/jakub/src/gcc/gcc/testsuite/gcc.target/i386/pr53698.c test2 symbol_ref 
0x7f765b2cc870 0 symbol_ref 0x7f765b2cc888 0
64 /home/jakub/src/gcc/gcc/testsuite/gcc.target/i386/pr53698.c test2 symbol_ref 
0x7f765b2cd2e8 0 symbol_ref 0x7f765b2cd300 0
64 /home/jakub/src/gcc/gcc/testsuite/gcc.target/i386/pr53698.c test2 symbol_ref 
0x7f765b2cd378 0 symbol_ref 0x7f765b2cd390 0
64 /home/jakub/src/gcc/gcc/testsuite/gcc.target/i386/pr53698.c test2 symbol_ref 
0x7f765b2cd3a8 0 symbol_ref 0x7f765b2cd3c0 0
but when looking at it under debugger, both result from old find_base_term
and new find_base_term look like:
(gdb) p debug_rtx (res)
(symbol_ref:DI ("foo") [flags 0x40] )
(gdb) p debug_rtx (res2)
(symbol_ref:DI ("foo") [flags 0x40] )
just they don't compare pointer equal.  Because it is -mx32, guess it is
just result of
rtx temp = find_base_value (XEXP (src, 0));

if (temp != 0 && CONSTANT_P (temp))
  temp = convert_memory_address (Pmode, temp);
on SImode SYMBOL_REF and thus not a real difference we'd care about.

Jakub

C++ PATCH for c++/83808, ICE with VLA initialization

2018-04-05 Thread Jason Merrill

Since my patch for 69847 stopped us from giving a STRING_CST VLA type,
the check in process_init_constructor_array to make sure our
initializer has the right type breaks.  Let's just adjust the sanity
check to look through arrays rather than try to be more specific.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 44cef54536ebec8b054767235330bfb86bea912f
Author: Jason Merrill 
Date:   Thu Apr 5 12:01:35 2018 -0400

PR c++/83808 - ICE with VLA initialization.

* typeck2.c (process_init_constructor_array): Don't require a VLA
initializer to have VLA type.

diff --git a/gcc/cp/typeck2.c b/gcc/cp/typeck2.c
index 3bdeae1501f..e5f9a68ec58 100644
--- a/gcc/cp/typeck2.c
+++ b/gcc/cp/typeck2.c
@@ -1319,9 +1319,11 @@ process_init_constructor_array (tree type, tree init, int nested,
   ce->value
 	= massage_init_elt (TREE_TYPE (type), ce->value, nested, complain);
 
-  if (ce->value != error_mark_node)
-	gcc_assert (same_type_ignoring_top_level_qualifiers_p
-		  (TREE_TYPE (type), TREE_TYPE (ce->value)));
+  gcc_checking_assert
+	(ce->value == error_mark_node
+	 || (same_type_ignoring_top_level_qualifiers_p
+	 (strip_array_types (TREE_TYPE (type)),
+	  strip_array_types (TREE_TYPE (ce->value);
 
   flags |= picflag_from_initializer (ce->value);
 }
diff --git a/gcc/testsuite/g++.dg/ext/vla19.C b/gcc/testsuite/g++.dg/ext/vla19.C
new file mode 100644
index 000..287a0d5a381
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/vla19.C
@@ -0,0 +1,16 @@
+// PR c++/83808
+// { dg-additional-options "-Wno-vla" }
+
+struct R { int r; };
+void baz (char *, char *, char *, char *);
+
+void
+foo ()
+{
+  const R a = { 12 };
+  char b[1][a.r] = { { "12345678901" } };
+  char c[a.r] = { "12345678901" };
+  char d[1][a.r] = { { '1', '2', '3', '4', '5', '6', '7', '8', '9', '0', '1', '\0' } };
+  char e[a.r] = { '1', '2', '3', '4', '5', '6', '7', '8', '9', '0', '1', '\0' };
+  baz (b[0], c, d[0], e);
+}

Re: [PATCH] Use dlsym to check if libdl is needed for plugin

2018-04-05 Thread DJ Delorie


"H.J. Lu"  writes:
> config/
>
>   * plugins.m4 (AC_PLUGINS): Use dlsym to check if libdl is needed.

This is OK.

Re: [PATCH] rs6000: Enable -fasynchronous-unwind-tables by default

2018-04-05 Thread David Edelsohn

On Thu, Apr 5, 2018 at 5:50 AM, Jakub Jelinek  wrote:
> On Thu, Apr 05, 2018 at 09:45:54AM +, Segher Boessenkool wrote:
>> To find out where on-entry register values live at any point in a
>> program, GDB currently tries to parse to parse the executable code.
>> This does not work very well, for example it gets confused if some
>> accesses to the stack use the frame pointer (r31) and some use the
>> stack pointer (r1).  A symptom is that backtraces can be cut short.
>>
>> This patch enables -fasynchronous-unwind-tables by default for rs6000,
>> which causes us to emit DWARF unwind tables for all functions, solving
>> these problems.
>>
>> This not do anything for sub-targets without DWARF.
>>
>> It increases executable size, but only modestly, and does not change
>> memory use, only the disk image.
>>
>> Various other targets already do this (x86, s390, tile*).
>
> aarch64-linux* too (since r258871).
>
>> Tested on powerpc64-linux {-m32,-m64}.  David, I'd like to commit this
>> to current trunk; does that seem too dangerous to you?
>
> If David is ok with it, it is fine for trunk even in stage4.

I don't have a fundamental objection.  AIX does use DWARF EH. I'm not
worried about the EH processing itself, but this may trigger paths in
dwarf2out, etc. that aren't prepared for AIX XCOFF. I'm concerned that
it may assume ELF section or insert additional section names or
generally output code that is incompatible with the AIX assembler or
linker.

Also, AIX XCOFF does not have a special section for DWARF EH
information, so it is stuffed in data section and will increase the
memory image size.

We need to test it on AIX and see what happens.

- David

Re: Add workaround to std::variant for Clang bug 31852

2018-04-05 Thread Jonathan Wakely

On 26 March 2018 at 14:10, Jonathan Wakely wrote:
> This makes it possible to use our std::variant with Clang, as well as
> some minor tweaks to avoid ADL (so the compiler doesn't waste time
> looking in associated namespaces) and adjust whitespace.
>
>* include/std/variant (__get): Qualify calls to avoid ADL.
>(__select_index): Adjust whitespace.
>(variant): Add using-declaration to workaround Clang bug.
>
> Tested powerpc64le-linux, committed to trunk.. I'll backport to
> gcc-7-branch too.

The Clang bug means that __get is ambiguous (because it thinks the
friend declaration is a separate overload) so we also need this
change.

Tested powerpc64le-linux, committed to trunk.
commit c09f881053d98370306098b7bad5bb829255379c
Author: Jonathan Wakely 
Date:   Thu Apr 5 14:03:13 2018 +0100

Add another workaround to std::variant for Clang bug 31852

* include/std/variant (_VARIANT_RELATION_FUNCTION_TEMPLATE): Qualify
__get calls to avoid ADL and avoid ambiguity due to Clang bug.

diff --git a/libstdc++-v3/include/std/variant b/libstdc++-v3/include/std/variant
index 028d3064272..48bec528406 100644
--- a/libstdc++-v3/include/std/variant
+++ b/libstdc++-v3/include/std/variant
@@ -272,8 +272,8 @@ namespace __variant
 constexpr bool \
 __erased_##__NAME(const _Variant& __lhs, const _Variant& __rhs) \
 { \
-  return __get<_Np>(std::forward<_Variant>(__lhs)) \
- __OP __get<_Np>(std::forward<_Variant>(__rhs)); \
+  return __variant::__get<_Np>(std::forward<_Variant>(__lhs)) \
+ __OP __variant::__get<_Np>(std::forward<_Variant>(__rhs)); \
 }
 
   _VARIANT_RELATION_FUNCTION_TEMPLATE(<, less)

[PATCH, i386]: FIx PR85193, ICE: SIGSEGV in memory_operand at recog.c:1361 with -O2 -fno-tree-ccp -fno-tree-fre -mno-sse

2018-04-05 Thread Uros Bizjak

Hello!

We have to prevent memory attribute calculation to access non-existent
operand[2] for rotate1 type insns in the same way as for ishift1 type
insns.

2018-04-05  Uros Bizjak  

PR target/85193
* config/i386/i386.md (define_attr "memory"): Handle rotate1 type.

testsuite/ChangeLog:

2018-04-05  Uros Bizjak  

PR target/85193
* gcc.target/i386/pr85193.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline, will be backported to release branches.

Uros.
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 259082)
+++ config/i386/i386.md (working copy)
@@ -752,7 +752,7 @@
   (if_then_else (match_operand 1 "constant_call_address_operand")
 (const_string "none")
 (const_string "load"))
-(and (eq_attr "type" "alu1,negnot,ishift1,sselog1,sseshuf1")
+(and (eq_attr "type" "alu1,negnot,ishift1,rotate1,sselog1,sseshuf1")
  (match_operand 1 "memory_operand"))
   (const_string "both")
 (and (match_operand 0 "memory_operand")
@@ -763,7 +763,7 @@
 (match_operand 1 "memory_operand")
   (const_string "load")
 (and (eq_attr "type"
-"!alu1,negnot,ishift1,
+"!alu1,negnot,ishift1,rotate1,
   imov,imovx,icmp,test,bitmanip,
   fmov,fcmp,fsgn,
   sse,ssemov,ssecmp,ssecomi,ssecvt,ssecvt1,sseicvt,
Index: testsuite/gcc.target/i386/pr85193.c
===
--- testsuite/gcc.target/i386/pr85193.c (nonexistent)
+++ testsuite/gcc.target/i386/pr85193.c (working copy)
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-Wno-psabi -O2 -fno-tree-ccp -fno-tree-fre -mno-sse" } */
+
+typedef unsigned char U __attribute__((vector_size(16)));
+typedef unsigned int V __attribute__((vector_size(16)));
+typedef unsigned long long W __attribute__((vector_size(16)));
+
+extern void bar(U, U);
+
+V v;
+
+void
+foo(U f)
+{
+  f[0] = f[0] << (unsigned char)~v[0] | f[~((W)(U){0, 0, 0, 0, 0, 0, 0, 0, 
5})[1] & 5] >> (-(unsigned char)~v[0] & 7);
+  bar(f, (U){});
+}

C++ PATCH for c++/82152, ICE with class deduction and inherited ctor

2018-04-05 Thread Jason Merrill

Building a deduction guide for a constructor inherited from a
non-dependent base got confused because it has no template parameters.

Since there's no way an inherited constructor could be useful for
class template argument deduction, we can just skip them.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit ea6f7a867106c563c0eb0c419d5f167d05426a9d
Author: Jason Merrill 
Date:   Thu Apr 5 12:23:14 2018 -0400

PR c++/82152 - ICE with class deduction and inherited ctor.

* pt.c (do_class_deduction): Ignore inherited ctors.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index dc74635876e..dc2310aefa8 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -26217,6 +26217,10 @@ do_class_deduction (tree ptype, tree tmpl, tree init, int flags,
   // FIXME cache artificial deduction guides
   for (ovl_iterator iter (CLASSTYPE_CONSTRUCTORS (type)); iter; ++iter)
 {
+  /* Skip inherited constructors.  */
+  if (iter.using_p ())
+	continue;
+
   tree guide = build_deduction_guide (*iter, outer_args, complain);
   if (guide == error_mark_node)
 	return error_mark_node;
diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction54.C b/gcc/testsuite/g++.dg/cpp1z/class-deduction54.C
new file mode 100644
index 000..e513980
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction54.C
@@ -0,0 +1,15 @@
+// PR c++/82152
+// { dg-additional-options -std=c++17 }
+
+struct Base {};
+
+template
+struct Derived : public Base {
+  using Base::Base;
+};
+
+Derived() -> Derived< void >;
+
+int main() {
+  Derived x;
+}

Re: [PATCH] c/55976 -Werror=return-type should error on returning a value from a void function

2018-04-05 Thread dave . pagan


On 04/04/2018 07:44 PM, Martin Sebor wrote:

On 04/04/2018 05:50 PM, dave.pa...@oracle.com wrote:

On 04/04/2018 10:58 AM, Martin Sebor wrote:

On 04/04/2018 11:15 AM, Jakub Jelinek wrote:

On Tue, Apr 03, 2018 at 01:36:13PM -0600, Martin Sebor wrote:

On 04/03/2018 10:26 AM, dave.pa...@oracle.com wrote:
This patch fixes handlng of -Werror=return-type. Currently, even 
with

the flag specified, return type errors remain warnings.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55976

Function c_finish_return (), when calling pedwarn, should specifiy
OPT_Wreturn_type as the diagnostic index if warn_return_type is 
set so

that the given diagnostic is generated as an error, not a warning.

Patch was successfully bootstrapped and tested on x86_64-linux.


I would expect the option to control the warning consistently so
that when the test case is compiled with just -Wno-return-type
(and no other options) the warning is not issued.  But that's not
what happens because pedwarn() is called with a zero argument as
the option.


I think we need to make sure we error out even with -Wno-return-type
when -pedantic-errors.


That would seem surprising to me.  Is there an existing
precedent for this in GCC? (Any other warnings or options
that are treated this way?)

It would also diverge from Clang, which would be particularly
unhelpful to users of both compilers.  I would suggest to
follow what Clang does in terms of controlling the warning
(though not necessarily in its severity).  It's consistent
and intuitive.  (Clang has -Werror=return-type by default;
that may be overly strict.)


I think these are both good points. While I tend to lean toward
consistency (both within GCC and with clang), if this sort of problem is
potentially worse in GCC 8 (as Jakub suggests) then perhaps it's worth
thinking about how to help prevent it. If we do choose to go this
direction with -pedantic-errors, and there isn't a precedent for it,
then the documentation would require an update to reflect the new 
behavior.


I actually don't think -Wpedantic is the appropriate option
to control the warning in the case Jakub is concerned about
(it may be appropriate for warning about returning a value
from a void function because that's a constraint violation).
-Wpedantic is meant for diagnostics that are required by
the C/C++ standards but that GCC may otherwise silently
accept as extensions.  Defining a non-void function that
doesn't return a value is valid in C is not incorrect and
does not require a diagnostic.  (Using the return value
is undefined, but when it isn't used there is no problem.)

(This is different in recent versions of C++ where a return
statement with no operand in a non-void function requires
a diagnostic, so the C++ handling may need to be different.)


Also, thoughts on this question from my last email?


Since this patch does fix the original problem, what do you recommend?
Scrap this patch? Or let it proceed and submit a new bug noting the
(existing) incorrect behavior of -Wno-return-type?


We could add the discussion in this email to any new bug we create for
-Wno-return-type.


I think for C, handling it under the same bug and in the same
patch would be best.  The C++ bits could come later if needed.


Ok. Thanks, Martin. I'll add proper handling of -Wno-return-type to this 
patch and resubmit it.


--Dave



Martin



--Dave


Martin


Especially when issues this pedwarn warns about are
very hard to debug show stoppers for anybody calling such functions
in GCC 8
(because we turn such spots into __builtin_unreachable () and thus
randomly
can execute completely unrelated code).  So, I think consistency
isn't that
important here.

    Jakub

Re: [og7] vector_length extension part 5: libgomp and tests

2018-04-05 Thread Tom de Vries


On 03/02/2018 09:47 PM, Cesar Philippidis wrote:

libgomp/
* plugin/plugin-nvptx.c (nvptx_exec): Adjust calculations of
workers and vectors.


I wrote a test case that triggers this code, and added it to this code.

Build x86_64 with nvptx accelerator and tested libgomp.

Committed.

Thanks,
- Tom

[nvptx] Handle large vectors in libgomp

2018-04-05  Cesar Philippidis  
	Tom de Vries  

	* plugin/plugin-nvptx.c (nvptx_exec): Adjust calculations of
	workers and vectors.
	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-7.c: New test.

---
 libgomp/plugin/plugin-nvptx.c  | 10 +++---
 .../vector-length-128-7.c  | 41 ++
 2 files changed, 47 insertions(+), 4 deletions(-)

diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index bdc0c30..9b4768f 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -734,8 +734,6 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs,
   int threads_per_block = threads_per_sm > block_size
 ? block_size : threads_per_sm;
 
-  threads_per_block /= warp_size;
-
   if (threads_per_sm > cpu_size)
 threads_per_sm = cpu_size;
 
@@ -802,6 +800,10 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs,
 
   if (seen_zero)
 {
+  int vectors = dims[GOMP_DIM_VECTOR] > 0
+	? dims[GOMP_DIM_VECTOR] : warp_size;
+  int workers = threads_per_block / vectors;
+
   for (i = 0; i != GOMP_DIM_MAX; i++)
 	if (!dims[i])
 	  {
@@ -819,10 +821,10 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs,
 		  : 2 * dev_size;
 		break;
 	  case GOMP_DIM_WORKER:
-		dims[i] = threads_per_block;
+		dims[i] = workers;
 		break;
 	  case GOMP_DIM_VECTOR:
-		dims[i] = warp_size;
+		dims[i] = vectors;
 		break;
 	  default:
 		abort ();
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-7.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-7.c
new file mode 100644
index 000..60c264c
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-7.c
@@ -0,0 +1,41 @@
+/* { dg-do run { target openacc_nvidia_accel_selected } } */
+/* { dg-additional-options "-foffload=-fdump-tree-oaccdevlow" } */
+/* { dg-additional-options "-foffload=-mlong-vector-in-workers" } */
+/* { dg-set-target-env-var "GOMP_DEBUG" "1" } */
+
+#include 
+
+#define N 1024
+
+unsigned int a[N];
+unsigned int b[N];
+unsigned int c[N];
+unsigned int n = N;
+
+int
+main (void)
+{
+  for (unsigned int i = 0; i < n; ++i)
+{
+  a[i] = i % 3;
+  b[i] = i % 5;
+}
+
+#pragma acc parallel vector_length (128) copyin (a,b) copyout (c)
+  {
+#pragma acc loop worker
+for (unsigned int i = 0; i < 4; i++)
+#pragma acc loop vector
+  for (unsigned int j = 0; j < n / 4; j++)
+	c[(i * N / 4) + j] = a[(i * N / 4) + j] + b[(i * N / 4) + j];
+  }
+
+  for (unsigned int i = 0; i < n; ++i)
+if (c[i] != (i % 3) + (i % 5))
+  abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 0, 128\\)" "oaccdevlow" } } */
+/* { dg-output "nvptx_exec: kernel main\\\$_omp_fn\\\$0: launch gangs=1, workers=8, vectors=128" } */

Re: [og7] vector_length extension part 2: Generalize state propagation and synchronization

2018-04-05 Thread Tom de Vries


On 03/30/2018 05:14 PM, Tom de Vries wrote:

On 03/30/2018 05:00 PM, Cesar Philippidis wrote:

I should
have checked that patch with the vector length fallback disabled.


Right. The patch series introduces a lot of code that is not exercised.

I've added an -mlong-vector-in-workers option in my local branch and 
added 3 test-cases to exercise the code with fallback disabled everytime 
I run the libgomp tests.




This patch adds that option.

Build x86_64 with nvptx accelerator and tested libgomp.

Committed.

Thanks,
- Tom
[nvptx] Add -mlong-vector-in-workers

2018-04-05  Tom de Vries  

	* config/nvptx/nvptx.c (nvptx_adjust_parallelism): Handle
	nvptx_long_vectors_in_workers.
	* config/nvptx/nvptx.opt (mlong-vector-in-workers): Add option.

	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-4.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-5.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-6.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-8.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-9.c: New test.

---
 gcc/config/nvptx/nvptx.c   |  3 +-
 gcc/config/nvptx/nvptx.opt |  3 ++
 .../vector-length-128-4.c  | 41 
 .../vector-length-128-5.c  | 42 +
 .../vector-length-128-6.c  | 42 +
 .../vector-length-128-8.c  | 44 ++
 .../vector-length-128-9.c  | 44 ++
 7 files changed, 218 insertions(+), 1 deletion(-)

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 595413a..b5e6dce 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -5397,7 +5397,8 @@ nvptx_adjust_parallelism (unsigned inner_mask, unsigned outer_mask)
  worker loop.  Therefore, fallback to setting vector_length to
  PTX_WARP_SIZE.  Hopefully this condition may be relaxed for
  sm_70+ targets.  */
-  if ((inner_mask & GOMP_DIM_MASK (GOMP_DIM_VECTOR))
+  if (nvptx_long_vectors_in_workers == 0
+  && (inner_mask & GOMP_DIM_MASK (GOMP_DIM_VECTOR))
   && (outer_mask & GOMP_DIM_MASK (GOMP_DIM_WORKER)))
 {
   tree attr = tree_cons (get_identifier (NVPTX_GOACC_VL_WARP), NULL_TREE,
diff --git a/gcc/config/nvptx/nvptx.opt b/gcc/config/nvptx/nvptx.opt
index e2d64bd..f7f37ec 100644
--- a/gcc/config/nvptx/nvptx.opt
+++ b/gcc/config/nvptx/nvptx.opt
@@ -62,3 +62,6 @@ Enum(ptx_isa) String(sm_35) Value(PTX_ISA_SM35)
 misa=
 Target RejectNegative ToLower Joined Enum(ptx_isa) Var(ptx_isa_option) Init(PTX_ISA_SM30)
 Specify the version of the ptx ISA to use.
+
+mlong-vector-in-workers
+Target Var(nvptx_long_vectors_in_workers) Undocumented Init(0)
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-4.c
new file mode 100644
index 000..6d43f82
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-4.c
@@ -0,0 +1,41 @@
+/* { dg-do run { target openacc_nvidia_accel_selected } } */
+/* { dg-additional-options "-foffload=-fdump-tree-oaccdevlow" } */
+/* { dg-additional-options "-foffload=-mlong-vector-in-workers" } */
+/* { dg-set-target-env-var "GOMP_DEBUG" "1" } */
+
+#include 
+
+#define N 1024
+
+unsigned int a[N];
+unsigned int b[N];
+unsigned int c[N];
+unsigned int n = N;
+
+int
+main (void)
+{
+  for (unsigned int i = 0; i < n; ++i)
+{
+  a[i] = i % 3;
+  b[i] = i % 5;
+}
+
+#pragma acc parallel num_workers (2) vector_length (128) copyin (a,b) copyout (c)
+  {
+#pragma acc loop worker
+for (unsigned int i = 0; i < 4; i++)
+#pragma acc loop vector
+  for (unsigned int j = 0; j < n / 4; j++)
+	c[(i * N / 4) + j] = a[(i * N / 4) + j] + b[(i * N / 4) + j];
+  }
+
+  for (unsigned int i = 0; i < n; ++i)
+if (c[i] != (i % 3) + (i % 5))
+  abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 2, 128\\)" "oaccdevlow" } } */
+/* { dg-output "nvptx_exec: kernel main\\\$_omp_fn\\\$0: launch gangs=1, workers=2, vectors=128" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-5.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-5.c
new file mode 100644
index 000..661fdc7
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-5.c
@@ -0,0 +1,42 @@
+/* { dg-do run { target openacc_nvidia_accel_selected } } */
+/* { dg-additional-options "-fopenacc-dim=-:2:128" } */
+/* { dg-additional-options "-foffload=-fdump-tree-oaccdevlow" } */
+/* { dg-additional-options "-foffload=-mlong-vector-in-workers" } */
+/* { dg-set-target-env-var "GOMP_DEBUG" "1" } */
+
+#include 
+
+#define N 1024
+
+unsigned int a[N];
+unsigned int b[N];
+unsigned int c[N];
+unsigned int n = N;

Re: [og7] vector_length extension part 4: target hooks and automatic parallelism

2018-04-05 Thread Tom de Vries


On 03/02/2018 08:18 PM, Cesar Philippidis wrote:

The attached patch adjusts the existing goacc validate_dims target hook
and introduces a new goacc adjust_parallelism target hook.


The attached patch now just introduces the nvptx_adjust_parallelism 
target hook implementation, which enables test-cases to start using the 
feature.


Build x86_64 with nvptx accelerator and tested libgomp.

Committed.

Thanks,
- Tom
[nvptx] Enable large vectors

2018-04-05  Cesar Philippidis  
	Tom de Vries  

	* omp-offload.c (oacc_get_default_dim): New function.
	* omp-offload.h (oacc_get_default_dim): Declare.
	* config/nvptx/nvptx.c (NVPTX_GOACC_VL_WARP): Define.
	(nvptx_goacc_needs_vl_warp): New function.
	(nvptx_goacc_validate_dims): Take larger vector lengths into
	account.
	(nvptx_adjust_parallelism): New function.
	(TARGET_GOACC_ADJUST_PARALLELISM): Define.
	(populate_offload_attrs): Handle the situation where the default
	runtime geometry has not been initialized yet for reductions.

	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-1.c: Expect
	vector length to be 128.
	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-10.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-2.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/vred2d-128.c: Same.
	* testsuite/libgomp.oacc-fortran/gemm.f90: Same.

---
 gcc/config/nvptx/nvptx.c   | 148 +++--
 gcc/omp-offload.c  |   7 +
 gcc/omp-offload.h  |   2 +
 .../vector-length-128-1.c  |   5 +-
 .../vector-length-128-10.c |   1 -
 .../vector-length-128-2.c  |   5 +-
 .../libgomp.oacc-c-c++-common/vred2d-128.c |   2 -
 libgomp/testsuite/libgomp.oacc-fortran/gemm.f90|   1 -
 8 files changed, 153 insertions(+), 18 deletions(-)

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 51bd69d..595413a 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -71,6 +71,7 @@
 #include "fold-const.h"
 #include "intl.h"
 #include "tree-hash-traits.h"
+#include "omp-offload.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -4634,15 +4635,20 @@ populate_offload_attrs (offload_attrs *oa)
   if (oa->vector_length == 0)
 {
   /* FIXME: Need a more graceful way to handle large vector
-	 lengths in OpenACC routines.  */
+	 lengths in OpenACC routines and also -fopenacc-dims.  */
   if (!lookup_attribute ("omp target entrypoint",
 			 DECL_ATTRIBUTES (current_function_decl)))
 	oa->vector_length = PTX_WARP_SIZE;
-  else
+  else if (PTX_VECTOR_LENGTH != PTX_WARP_SIZE)
 	oa->vector_length = PTX_VECTOR_LENGTH;
 }
   if (oa->num_workers == 0)
-oa->max_workers = PTX_CTA_SIZE / oa->vector_length;
+{
+  if (oa->vector_length == 0)
+	oa->max_workers = PTX_WORKER_LENGTH;
+  else
+	oa->max_workers = PTX_CTA_SIZE / oa->vector_length;
+}
   else
 oa->max_workers = oa->num_workers;
 }
@@ -5193,6 +5199,19 @@ nvptx_simt_vf ()
   return PTX_WARP_SIZE;
 }
 
+#define NVPTX_GOACC_VL_WARP "nvptx vl warp"
+
+/* Return true of the offloaded function needs a vector_length of
+   PTX_WARP_SIZE.  */
+
+static bool
+nvptx_goacc_needs_vl_warp ()
+{
+  tree attr = lookup_attribute (NVPTX_GOACC_VL_WARP,
+DECL_ATTRIBUTES (current_function_decl));
+  return attr != NULL_TREE;
+}
+
 /* Validate compute dimensions of an OpenACC offload or routine, fill
in non-unity defaults.  FN_LEVEL indicates the level at which a
routine might spawn a loop.  It is negative for non-routines.  If
@@ -5201,6 +5220,14 @@ nvptx_simt_vf ()
 static bool
 nvptx_goacc_validate_dims (tree decl, int dims[], int fn_level)
 {
+  int default_vector_length = PTX_VECTOR_LENGTH;
+
+  /* For capability reasons, fallback to vl = 32 for runtime values.  */
+  if (dims[GOMP_DIM_VECTOR] == 0)
+default_vector_length = PTX_WARP_SIZE;
+  else if (decl)
+default_vector_length = oacc_get_default_dim (GOMP_DIM_VECTOR);
+
   /* Detect if a function is unsuitable for offloading.  */
   if (!flag_offload_force && decl)
 {
@@ -5225,18 +5252,20 @@ nvptx_goacc_validate_dims (tree decl, int dims[], int fn_level)
 
   bool changed = false;
 
-  /* The vector size must be 32, unless this is a SEQ routine.  */
+  /* The vector size must be a positive multiple of the warp size,
+ unless this is a SEQ routine.  */
   if (fn_level <= GOMP_DIM_VECTOR && fn_level >= -1
   && dims[GOMP_DIM_VECTOR] >= 0
-  && dims[GOMP_DIM_VECTOR] != PTX_VECTOR_LENGTH)
+  && (dims[GOMP_DIM_VECTOR] % 32 != 0
+	  || dims[GOMP_DIM_VECTOR] == 0))
 {
   if (fn_level < 0 && dims[GOMP_DIM_VECTOR] >= 0)
 	warning_at (decl ? DECL_SOURCE_LOCATION (decl) : UNKNOWN_LOCATION, 0,
 		dims[GOMP_DIM_VECTOR]
 		? G_("using vector_length (%d), ignoring %d")
 		: G_("using vector_length (%d),

Re: [og7] vector_length extension part 3: reductions

2018-04-05 Thread Tom de Vries


On 03/02/2018 06:51 PM, Cesar Philippidis wrote:

This patch teaches the nvptx BE how to process vector reductions with
large vector lengths.


As with the "[nvptx] Generalize state propagation and synchronization" 
patch":

- added use of MAX and ROUND_UP
- added missing initialization of vector_red_partition
- added assert checking vector_red_partition and vector_red_size

Also:
- added FIXME for hack in nvptx_declare_function_name


Build x86_64 with nvptx accelerator and tested libgomp.

Committed.

Thanks,
- Tom
[nvptx] Handle large vector reductions

2018-04-05  Cesar Philippidis  
	Tom de Vries  

	* config/nvptx/nvptx-protos.h (nvptx_output_red_partition): Declare.
	* config/nvptx/nvptx.c (vector_red_size, vector_red_align,
	vector_red_partition, vector_red_sym): New global variables.
	(nvptx_option_override): Initialize vector_red_sym.
	(nvptx_declare_function_name): Restore red_partition register.
	(nvptx_file_end): Emit code to declare the vector reduction variables.
	(nvptx_output_red_partition): New function.
	(nvptx_expand_shared_addr): Add vector argument. Use it to handle
	large vector reductions.
	(enum nvptx_builtins): Add NVPTX_BUILTIN_VECTOR_ADDR.
	(nvptx_init_builtins): Add VECTOR_ADDR.
	(nvptx_expand_builtin): Update call to nvptx_expand_shared_addr.
	Handle nvptx_expand_shared_addr.
	(nvptx_get_shared_red_addr): Add vector argument and handle large
	vectors.
	(nvptx_goacc_reduction_setup): Add offload_attrs argument and handle
	large vectors.
	(nvptx_goacc_reduction_init): Likewise.
	(nvptx_goacc_reduction_fini): Likewise.
	(nvptx_goacc_reduction_teardown): Likewise.
	(nvptx_goacc_reduction): Update calls to nvptx_goacc_reduction_{setup,
	init,fini,teardown}.
	(nvptx_init_axis_predicate): Initialize vector_red_partition.
	(nvptx_set_current_function): Init vector_red_partition.
	* config/nvptx/nvptx.md (UNSPECV_RED_PART): New unspecv.
	(nvptx_red_partition): New insn.
	* config/nvptx/nvptx.h (struct machine_function): Add red_partition.

---
 gcc/config/nvptx/nvptx-protos.h |   1 +
 gcc/config/nvptx/nvptx.c| 154 
 gcc/config/nvptx/nvptx.h|   2 +
 gcc/config/nvptx/nvptx.md   |  12 
 4 files changed, 140 insertions(+), 29 deletions(-)

diff --git a/gcc/config/nvptx/nvptx-protos.h b/gcc/config/nvptx/nvptx-protos.h
index 16b316f..326c38c 100644
--- a/gcc/config/nvptx/nvptx-protos.h
+++ b/gcc/config/nvptx/nvptx-protos.h
@@ -55,5 +55,6 @@ extern const char *nvptx_output_return (void);
 extern const char *nvptx_output_set_softstack (unsigned);
 extern const char *nvptx_output_simt_enter (rtx, rtx, rtx);
 extern const char *nvptx_output_simt_exit (rtx);
+extern const char *nvptx_output_red_partition (rtx, rtx);
 #endif
 #endif
diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 009ca59..51bd69d 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -143,6 +143,14 @@ static unsigned worker_red_size;
 static unsigned worker_red_align;
 static GTY(()) rtx worker_red_sym;
 
+/* Buffer needed for vector reductions, when vector_length >
+   PTX_WARP_SIZE.  This has to be distinct from the worker broadcast
+   array, as both may be live concurrently.  */
+static unsigned vector_red_size;
+static unsigned vector_red_align;
+static unsigned vector_red_partition;
+static GTY(()) rtx vector_red_sym;
+
 /* Shared memory block for gang-private variables.  */
 static unsigned gangprivate_shared_size;
 static unsigned gangprivate_shared_align;
@@ -219,6 +227,11 @@ nvptx_option_override (void)
   SET_SYMBOL_DATA_AREA (worker_red_sym, DATA_AREA_SHARED);
   worker_red_align = GET_MODE_ALIGNMENT (SImode) / BITS_PER_UNIT;
 
+  vector_red_sym = gen_rtx_SYMBOL_REF (Pmode, "__vector_red");
+  SET_SYMBOL_DATA_AREA (vector_red_sym, DATA_AREA_SHARED);
+  vector_red_align = GET_MODE_ALIGNMENT (SImode) / BITS_PER_UNIT;
+  vector_red_partition = 0;
+
   gangprivate_shared_sym = gen_rtx_SYMBOL_REF (Pmode, "__gangprivate_shared");
   SET_SYMBOL_DATA_AREA (gangprivate_shared_sym, DATA_AREA_SHARED);
   gangprivate_shared_align = GET_MODE_ALIGNMENT (SImode) / BITS_PER_UNIT;
@@ -1096,8 +1109,25 @@ nvptx_init_axis_predicate (FILE *file, int regno, const char *name)
 {
   fprintf (file, "\t{\n");
   fprintf (file, "\t\t.reg.u32\t%%%s;\n", name);
+  if (strcmp (name, "x") == 0 && cfun->machine->red_partition)
+{
+  fprintf (file, "\t\t.reg.u64\t%%t_red;\n");
+  fprintf (file, "\t\t.reg.u64\t%%y64;\n");
+}
   fprintf (file, "\t\tmov.u32\t%%%s, %%tid.%s;\n", name, name);
   fprintf (file, "\t\tsetp.ne.u32\t%%r%d, %%%s, 0;\n", regno, name);
+  if (strcmp (name, "x") == 0 && cfun->machine->red_partition)
+{
+  fprintf (file, "\t\tcvt.u64.u32\t%%y64, %%tid.y;\n");
+  fprintf (file, "\t\tcvta.shared.u64\t%%t_red, __vector_red;\n");
+  fprintf (file, "\t\tmad.lo.u64\t%%r%d, %%y64, %d, %%t_red; "
+	   "// vector reduction buffer\n",
+	   REGNO

C++ PATCH for c++/84665, ICE with array of empty class

2018-04-05 Thread Jason Merrill

Another place we need to use fold_non_dependent_expr so that constant
evaluation happens in non-template context.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 1f7ec60303854ae6c79034223aba5ad996b86843
Author: Jason Merrill 
Date:   Thu Apr 5 10:46:27 2018 -0400

PR c++/84665 - ICE with array of empty class.

* decl2.c (cp_check_const_attributes): Use fold_non_dependent_expr.

diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index 6078fb668a7..b0bf8241f71 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -1416,7 +1416,7 @@ cp_check_const_attributes (tree attributes)
 	{
 	  tree expr = TREE_VALUE (arg);
 	  if (EXPR_P (expr))
-	TREE_VALUE (arg) = maybe_constant_value (expr);
+	TREE_VALUE (arg) = fold_non_dependent_expr (expr);
 	}
 }
 }
diff --git a/gcc/testsuite/g++.dg/ext/attr-noinline-4.C b/gcc/testsuite/g++.dg/ext/attr-noinline-4.C
new file mode 100644
index 000..27c7ae80fec
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/attr-noinline-4.C
@@ -0,0 +1,10 @@
+// PR c++/84665
+
+struct S {} a[1];
+
+template 
+void
+foo ()
+{
+  __attribute__ ((noinline (a[0]))) int c = 0; // { dg-error "wrong number of arguments" }
+}

Re: [PATCH, GCC/ARM] Fix PR85203: cmse_nonsecure_caller returns wrong result

2018-04-05 Thread Kyrill Tkachov



On 05/04/18 16:13, Thomas Preudhomme wrote:

Hi Kyrill,

On 04/04/18 18:20, Thomas Preudhomme wrote:

Hi Kyrill,

On 04/04/18 18:19, Kyrill Tkachov wrote:

Hi Thomas,

On 04/04/18 18:03, Thomas Preudhomme wrote:

Hi,

__builtin_cmse_nonsecure_caller implementation returns true in almost
all cases due to 2 separate bugs:

* gen_addsi is used instead of gen_andsi to retrieve the lsb
* the lsb boolean value is not negated but the specification [1] says
   the intrinsic should return true for a nonsecure caller and a
   nonsecure caller is characterized with LR's lsb being 0

This was not caught due to (1) lack of runtime test and (2) the existing
RTL scan not taking into account that '.' matches newline in Tcl regular
expressions.

This patch fixes the implementation issues and improves testing of
cmse_nonsecure_caller by (1) adding a runtime test for the secure caller
case and (2) looking for an SET insn of an AND expression in the right
function. This leaves the nonsecure caller case only partly tested
since the exact value being AND and the negation are not covered by the
scan and the existing test infrastructure does not allow 2 separate
compilation and link to be performed. It is enough though to catch the
current incorrect behavior.

The patch also reorganize the scan directives in cmse-1.c to more easily
identify what function they are intended to test in the file.

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2018-04-04  Thomas Preud'homme 

PR target/85203
* config/arm/arm-builtins.c (arm_expand_builtin): Change
expansion to perform a bitwise AND of the argument followed by a
boolean negation of the result.

*** gcc/testsuite/ChangeLog ***

2018-04-04  Thomas Preud'homme 

PR target/85203
* gcc.target/arm/cmse/cmse-1.c: Tighten cmse_nonsecure_caller RTL scan
to match a single insn of the baz function.  Move scan directives at
the end of the file below the functions they are trying to test for
better readability.
* gcc.target/arm/cmse/cmse-16.c: New testcase.

Testing: No bootstrap since only M profile builtin code has been changed
but regression testing for arm-none-eabi targeting Arm Cortex-M23 and
Cortex-M33 shows no regression.

Is this ok for stage4?



Ok, thanks for fixing this.
Does this need backporting to the branches?


Yes to gcc-7-branch only.


The patch applies cleanly on gcc-7-branch and the same testing shows no 
regression. Ok to apply to gcc-7-branch once the patch has baked for 7 days in 
trunk?


Yes, thanks.
Kyrill


Best regards,

Thomas

Re: [C++ PATCH] Implement P0969

2018-04-05 Thread Jakub Jelinek

On Thu, Apr 05, 2018 at 09:53:41AM -0400, Jason Merrill wrote:
> OK.

Is this something that should go into cxx-status.html?  Is that a DR,
applying to C++17 too?

> On Thu, Apr 5, 2018 at 5:06 AM, Ville Voutilainen
>  wrote:
> > Tested on Linux-PPC64.
> >
> > 2018-04-05  Ville Voutilainen  
> >
> > gcc/cp
> >
> > Implement P0969
> > * decl.c (find_decomp_class_base): Check accessibility instead
> > of declared access, adjust diagnostic.
> >
> > testsuite/
> >
> > Implement P0969
> > * g++.dg/cpp1z/decomp4.C: Adjust.
> > * g++.dg/cpp1z/decomp38.C: New.

Jakub

Re: [PATCH, GCC/ARM] Fix PR85203: cmse_nonsecure_caller returns wrong result

2018-04-05 Thread Thomas Preudhomme


Hi Kyrill,

On 04/04/18 18:20, Thomas Preudhomme wrote:

Hi Kyrill,

On 04/04/18 18:19, Kyrill Tkachov wrote:

Hi Thomas,

On 04/04/18 18:03, Thomas Preudhomme wrote:

Hi,

__builtin_cmse_nonsecure_caller implementation returns true in almost
all cases due to 2 separate bugs:

* gen_addsi is used instead of gen_andsi to retrieve the lsb
* the lsb boolean value is not negated but the specification [1] says
   the intrinsic should return true for a nonsecure caller and a
   nonsecure caller is characterized with LR's lsb being 0

This was not caught due to (1) lack of runtime test and (2) the existing
RTL scan not taking into account that '.' matches newline in Tcl regular
expressions.

This patch fixes the implementation issues and improves testing of
cmse_nonsecure_caller by (1) adding a runtime test for the secure caller
case and (2) looking for an SET insn of an AND expression in the right
function. This leaves the nonsecure caller case only partly tested
since the exact value being AND and the negation are not covered by the
scan and the existing test infrastructure does not allow 2 separate
compilation and link to be performed. It is enough though to catch the
current incorrect behavior.

The patch also reorganize the scan directives in cmse-1.c to more easily
identify what function they are intended to test in the file.

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2018-04-04  Thomas Preud'homme 

    PR target/85203
    * config/arm/arm-builtins.c (arm_expand_builtin): Change
    expansion to perform a bitwise AND of the argument followed by a
    boolean negation of the result.

*** gcc/testsuite/ChangeLog ***

2018-04-04  Thomas Preud'homme 

    PR target/85203
    * gcc.target/arm/cmse/cmse-1.c: Tighten cmse_nonsecure_caller RTL scan
    to match a single insn of the baz function.  Move scan directives at
    the end of the file below the functions they are trying to test for
    better readability.
    * gcc.target/arm/cmse/cmse-16.c: New testcase.

Testing: No bootstrap since only M profile builtin code has been changed
but regression testing for arm-none-eabi targeting Arm Cortex-M23 and
Cortex-M33 shows no regression.

Is this ok for stage4?



Ok, thanks for fixing this.
Does this need backporting to the branches?


Yes to gcc-7-branch only.


The patch applies cleanly on gcc-7-branch and the same testing shows no 
regression. Ok to apply to gcc-7-branch once the patch has baked for 7 days in 
trunk?


Best regards,

Thomas

C++ PATCH for c++/85228, ICE with lambda in enumerator

2018-04-05 Thread Jason Merrill

Now that we instantiate lambdas with their enclosing context, the
closure type doesn't have CLASSTYPE_TEMPLATE_INFO.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 7c51c3a6ef0b137fc124ba65f686a607370f2c3a
Author: Jason Merrill 
Date:   Thu Apr 5 10:26:20 2018 -0400

PR c++/85228 - ICE with lambda in enumerator.

* pt.c (bt_instantiate_type_proc): Don't assume
CLASSTYPE_TEMPLATE_INFO is non-null.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index ed6e62c2364..dc74635876e 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -22841,6 +22841,7 @@ bt_instantiate_type_proc (binding_entry entry, void *data)
   tree storage = *(tree *) data;
 
   if (MAYBE_CLASS_TYPE_P (entry->type)
+  && CLASSTYPE_TEMPLATE_INFO (entry->type)
   && !uses_template_parms (CLASSTYPE_TI_ARGS (entry->type)))
 do_type_instantiation (TYPE_MAIN_DECL (entry->type), storage, 0);
 }
diff --git a/gcc/testsuite/g++.dg/cpp1z/constexpr-lambda21.C b/gcc/testsuite/g++.dg/cpp1z/constexpr-lambda21.C
new file mode 100644
index 000..8b0c95b37f3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/constexpr-lambda21.C
@@ -0,0 +1,9 @@
+// PR c++/85228
+// { dg-additional-options -std=c++17 }
+
+template struct A
+{
+  enum E { e = []{ return 0; }() };
+};
+
+template class A<0>;

Re: [PATCH] Add -C when using -Wimplicit-fallthrough and --save-temps (PR preprocessor/78497).

2018-04-05 Thread Martin Liška

On 04/05/2018 03:00 PM, Jakub Jelinek wrote:
> On Thu, Apr 05, 2018 at 02:51:22PM +0200, Martin Liška wrote:
>> On 04/04/2018 09:31 PM, Marek Polacek wrote:
>>> On Tue, Apr 03, 2018 at 02:29:37PM +0200, Martin Liška wrote:
 Hi.

 This helps the warning with --save-temps. Doing that one needs to preserve 
 comments
 in preprocessed source file.
>>>
>>> Do we really want to only use -C when -Wimplicit-fallthrough is in effect?  
>>> I
>>> mean, shouldn't we always use -C when -save-temps?
>>
>> Why not, Jakub what do you think? Note that it was originally Jakub's idea 
>> to do that.
> 
> I'd prefer to do that only when we actually care about the comments, it is a
> behavior change in any case, and might be undesirable to some people.
> 
> Note that we do not care about the comments if -Wimplicit-fallthrough=0
> or -Wimplicit-fallthrough=5, but do care for:
> -Wimplicit-fallthrough
> -Wimplicit-fallthrough=1
> -Wimplicit-fallthrough=2
> -Wimplicit-fallthrough=3
> -Wimplicit-fallthrough=4
> -Wextra
> -W

Or you can trigger the warning via
-Werror=implicit-fallthrough*

which would complicate the option matching.

> (unless -Wno-implicit-fallthrough).  So, it would be desirable to:
> 1) swap the order, put save-temps to the outer level
> 2) use
> {Wimplicit-fallthrough*:{!Wimplicit-fallthrough=0:{!Wimplicit-fallthrough=5:...}}}
> 3) verify (including adding testcases) that it doesn't emit comments for the
> =0, =5 or -W -Wno-implicit-fallthrough cases, but does for -W etc.

That would also complicate the exclusion of negative forms.
Sigh. I'm thinking if really worth it..

Martin

> 
>   Jakub
>

Re: [C++ Patch] PR 80956 ("[7/8 Regression] ICE with abstract class vector")

2018-04-05 Thread Jason Merrill

On Thu, Apr 5, 2018 at 10:07 AM, Paolo Carlini  wrote:
> Hi,
>
> On 05/04/2018 15:56, Jason Merrill wrote:
>>
>> On Thu, Apr 5, 2018 at 8:27 AM, Paolo Carlini 
>> wrote:
>>>
>>> Hi,
>>>
>>> the main issue is already fixed in trunk but we still ICE on the reduced
>>> testcase attached by Jakub which has a broken std::initializer_list
>>> missing
>>> the definition. I think we can handle this case similarly to the existing
>>> check in do_pushtag, which would be also consistent with the plain error
>>> we
>>> give for, eg:
>>>
>>> namespace std { template  class initializer_list; }
>>>
>>> template class std::initializer_list;
>>>
>>> However, we still have the option of issuing a fatal_error, like we do in
>>> finish_struct.
>>
>> How about using complete_type_or_maybe_complain instead of a custom error?
>
> Yes, I was about to send a message about that option, I already had it in
> the audit trail, tested too, then started fiddling with fatal_errors and
> forgot to mention it ;) Anyway, would be the below.

OK, thanks.

Jason

Re: [C++ PATCH] Implement P0961

2018-04-05 Thread Jason Merrill

OK, thanks.

On Thu, Apr 5, 2018 at 10:06 AM, Ville Voutilainen
 wrote:
> On 5 April 2018 at 16:37, Jason Merrill  wrote:
>> On Wed, Apr 4, 2018 at 6:26 PM, Ville Voutilainen
>>  wrote:
>>> +   tree parm = TREE_VEC_ELT (TREE_VALUE (tparms), 0);
>>
>> I think you want to use INNERMOST_TEMPLATE_PARMS here rather than
>> TREE_VALUE directly.
>>
>> Please also add a comment quoting the rule you're implementing.
>
> Alright, here's a new patch.

[PATCH][explow] PR target/85173: validize memory before passing it on to target probe_stack

2018-04-05 Thread Kyrill Tkachov


Hi all,

In this PR the expansion code emits an invalid memory address for the stack 
probe, which the backend fails to recognise.
The address is created explicitly in anti_adjust_stack_and_probe_stack_clash in 
explow.c and passed down to gen_probe_stack
without any validation in emit_stack_probe.

This patch fixes the ICE by calling validize_mem on the memory location before 
passing it down to the target.
Jakub pointed out that we also want to create valid addresses for the 
probe_stack_address case, so this patch
creates an expand operand and legitimizes it before passing it down to the 
probe_stack_address expander.

This patch passes bootstrap and testing on arm-none-linux-gnueabihf and 
aarch64-none-linux-gnu
and ppc64le-redhat-linux on gcc112 in the compile farm.

Is this ok for trunk?

Thanks,
Kyrill

P.S. Uros, the alpha probe_stack expander in alpha.md seems incompatible with 
the way the probe_stack name is
used in the midend. It accepts a const_int operand that is used as an offset 
from the stack pointer, rather than accepting
a full memory operand like other targets. Do you think it's better to rename 
the probe_stack pattern there to something
that doesn't conflict with the name the midend assumes?

2018-04-05  Kyrylo Tkachov  

PR target/85173
* explow.c (emit_stack_probe): Call validize_mem on memory location
before passing it to gen_probe_stack.  Create address operand and
legitimize it for the probe_stack_address case.

2018-04-05  Kyrylo Tkachov  

PR target/85173
* gcc.target/arm/pr85173.c: New test.
commit dc4e225eb394eaba8765425c0c7076c13d107580
Author: Kyrylo Tkachov 
Date:   Tue Apr 3 16:46:15 2018 +0100

[explow] PR target/85173: validize memory before passing it on to target probe_stack

diff --git a/gcc/explow.c b/gcc/explow.c
index 042e719..fb2b7ff 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -1626,18 +1626,25 @@ void
 emit_stack_probe (rtx address)
 {
   if (targetm.have_probe_stack_address ())
-emit_insn (targetm.gen_probe_stack_address (address));
+{
+  struct expand_operand ops[1];
+  insn_code icode = targetm.code_for_probe_stack_address;
+  create_address_operand (ops, address);
+  maybe_legitimize_operands (icode, 0, 1, ops);
+  expand_insn (icode, 1, ops);
+}
   else
 {
   rtx memref = gen_rtx_MEM (word_mode, address);
 
   MEM_VOLATILE_P (memref) = 1;
+  memref = validize_mem (memref);
 
   /* See if we have an insn to probe the stack.  */
   if (targetm.have_probe_stack ())
-emit_insn (targetm.gen_probe_stack (memref));
+	emit_insn (targetm.gen_probe_stack (memref));
   else
-emit_move_insn (memref, const0_rtx);
+	emit_move_insn (memref, const0_rtx);
 }
 }
 
diff --git a/gcc/testsuite/gcc.target/arm/pr85173.c b/gcc/testsuite/gcc.target/arm/pr85173.c
new file mode 100644
index 000..36105c9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr85173.c
@@ -0,0 +1,20 @@
+/* PR target/85173.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -fstack-clash-protection --param stack-clash-protection-probe-interval=14" } */
+/* { dg-require-effective-target arm_thumb2_ok } */
+
+__attribute__((noinline, noclone)) void
+foo (char *p)
+{
+  asm volatile ("" : : "r" (p) : "memory");
+}
+
+/* Nonconstant alloca, small local frame.  */
+__attribute__((noinline, noclone)) void
+f5 (int x)
+{
+  char locals[128];
+  char *vla = __builtin_alloca (x);
+  foo (vla);
+}

Re: C++ PATCH for c++/85200, ICE with constexpr if in generic lambda

2018-04-05 Thread Jason Merrill

On Wed, Apr 4, 2018 at 2:57 PM, Jason Merrill  wrote:
> Since, during partial instantiation of a generic lambda, we don't
> substitute into the body of a constexpr if, we don't do lambda capture
> as part of substitution.  But extract_locals_r is supposed to do it as
> part of remembering the substitution context to be used later.
>
> As it turns out, this was failing because walk_tree_1 expects the
> DECL_INITIAL of a local variable to be walked in the context of the
> BIND_EXPR, but we don't build BIND_EXPRs for compound-statements in a
> template.  So in a template, let's walk the fields of a VAR_DECL from
> its DECL_EXPR.

...but I forgot to also check the original testcase, which was also
failing for a different reason: we were remembering the mapping from
fully general to partially instantiated even for the condition
variable of the constexpr if, which is wrong because we are discarding
that instantiation at this point, so using it in a later instantiation
breaks.

I also considered specifically removing the condition variable from
local_instantiations, but this more general approach seems safer.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit a4b63e3bd89af40e316b1f16f0bba9001cf8b351
Author: Jason Merrill 
Date:   Thu Apr 5 09:12:05 2018 -0400

PR c++/85200 - ICE with constexpr if in generic lambda.

* pt.c (extract_locals_r): Don't record the local specs of variables
declared within the pattern.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 741c578b65b..204e390353c 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -11597,8 +11597,12 @@ tsubst_binary_right_fold (tree t, tree args, tsubst_flags_t complain,
 
 struct el_data
 {
+  hash_set internal;
   tree extra;
   tsubst_flags_t complain;
+
+  el_data (tsubst_flags_t c)
+: extra(NULL_TREE), complain(c) {}
 };
 static tree
 extract_locals_r (tree *tp, int */*walk_subtrees*/, void *data_)
@@ -11606,8 +11610,13 @@ extract_locals_r (tree *tp, int */*walk_subtrees*/, void *data_)
   el_data  = *reinterpret_cast(data_);
   tree *extra = 
   tsubst_flags_t complain = data.complain;
-  if (tree spec = retrieve_local_specialization (*tp))
+  if (TREE_CODE (*tp) == DECL_EXPR)
+data.internal.add (DECL_EXPR_DECL (*tp));
+  else if (tree spec = retrieve_local_specialization (*tp))
 {
+  if (data.internal.contains (*tp))
+	/* Don't mess with variables declared within the pattern.  */
+	return NULL_TREE;
   if (TREE_CODE (spec) == NONTYPE_ARGUMENT_PACK)
 	{
 	  /* Maybe pull out the PARM_DECL for a partial instantiation.  */
@@ -11658,7 +11667,7 @@ extract_locals_r (tree *tp, int */*walk_subtrees*/, void *data_)
 static tree
 extract_local_specs (tree pattern, tsubst_flags_t complain)
 {
-  el_data data = { NULL_TREE, complain };
+  el_data data (complain);
   cp_walk_tree_without_duplicates (, extract_locals_r, );
   return data.extra;
 }
diff --git a/gcc/testsuite/g++.dg/cpp1z/constexpr-if19.C b/gcc/testsuite/g++.dg/cpp1z/constexpr-if19.C
new file mode 100644
index 000..40016a5b7e1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/constexpr-if19.C
@@ -0,0 +1,17 @@
+// PR c++/85200
+// { dg-additional-options -std=c++17 }
+
+struct A{
+constexpr operator int(){ return 0; }
+};
+
+template < typename >
+void f(){
+[](auto known){
+if constexpr(constexpr int k = known);
+}(A{});
+}
+
+int main(){
+f();
+}

Re: [wwwdocs] gcc-8/changes.html

2018-04-05 Thread Martin Liška

On 04/04/2018 08:51 AM, Gerald Pfeifer wrote:
> On Tue, 3 Apr 2018, Martin Liška wrote:
>> I would like to document features I prepared in time frame of GCC 8.
> 
> A few observations in addition what Martin (S.) provided (Thanks!):
> 
> Index: htdocs/gcc-8/changes.html
> ===
> +  GCOV tool can use TERM colors to provide more readable output.
> 
> Similar to Martin's note.  "GCOV" or the "The GCOV tool", though in the
> latter case probably "The gcov tool"?
> 
> +  AddressSanitizer gained a new pair of sanitization options,
> +  -fsanitize=pointer-compare and -fsanitize=pointer-subtract, which
> 
> -fsanitize..., twice.
> 
> This looks fine with those changes and the others pointed out, thanks!
> 
> Gerald
> 

Thank you both for the review. Final version of the patch that I've just
installed is attached.

Martin
Index: htdocs/gcc-8/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-8/changes.html,v
retrieving revision 1.52
diff --unified -r1.52 changes.html
--- htdocs/gcc-8/changes.html	4 Apr 2018 17:43:03 -	1.52
+++ htdocs/gcc-8/changes.html	5 Apr 2018 14:16:26 -
@@ -113,6 +113,94 @@
 possible for the user to have a finer-grained control over the loop
 unrolling optimization.
   
+  
+The gcov tool can distinguish functions that begin
+on a same line in a source file.  This can be a different template
+instantiation or a class constructor:
+
+File 'ins.C'
+Lines executed:100.00% of 8
+Creating 'ins.C.gcov'
+
+-:0:Source:ins.C
+-:0:Graph:ins.gcno
+-:0:Data:ins.gcda
+-:0:Runs:1
+-:0:Programs:1
+-:1:template
+-:2:class Foo
+-:3:{
+-:4: public:
+2:5:   Foo(): b (1000) {}
+--
+Foo::Foo():
+1:5:   Foo(): b (1000) {}
+--
+Foo::Foo():
+1:5:   Foo(): b (1000) {}
+--
+2:6:   void inc () { b++; }
+--
+Foo::inc():
+1:6:   void inc () { b++; }
+--
+Foo::inc():
+1:6:   void inc () { b++; }
+--
+-:7:
+-:8:  private:
+-:9:   int b;
+-:   10:};
+-:   11:
+1:   12:int main(int argc, char **argv)
+-:   13:{
+1:   14:  Foo a;
+1:   15:  Foo b;
+-:   16:
+1:   17:  a.inc ();
+1:   18:  b.inc ();
+1:   19:}
+
+  
+  The gcov tool has more accurate numbers for execution of lines
+  in a source file.
+  The gcov tool can use TERM colors to provide more readable output.
+  AddressSanitizer gained a new pair of sanitization options,
+  -fsanitize=pointer-compare and -fsanitize=pointer-subtract, which
+  warn about subtraction (or comparison) of pointers that point to
+  a different memory object:
+  
+int
+main ()
+{
+  /* Heap allocated memory.  */
+  char *heap1 = (char *)__builtin_malloc (42);
+  char *heap2 = (char *)__builtin_malloc (42);
+  if (heap1 > heap2)
+  return 1;
+
+  return 0;
+}
+
+==17465==ERROR: AddressSanitizer: invalid-pointer-pair: 0x60400010 0x60400050
+#0 0x40070f in main /tmp/pointer-compare.c:7
+#1 0x76a72a86 in __libc_start_main (/lib64/libc.so.6+0x21a86)
+#2 0x400629 in _start (/tmp/a.out+0x400629)
+
+0x60400010 is located 0 bytes inside of 42-byte region [0x60400010,0x6040003a)
+allocated by thread T0 here:
+#0 0x76efb390 in __interceptor_malloc ../../../../libsanitizer/asan/asan_malloc_linux.cc:86
+#1 0x4006ea in main /tmp/pointer-compare.c:5
+#2 0x76a72a86 in __libc_start_main (/lib64/libc.so.6+0x21a86)
+
+0x60400050 is located 0 bytes inside of 42-byte region [0x60400050,0x6040007a)
+allocated by thread T0 here:
+#0 0x76efb390 in __interceptor_malloc ../../../../libsanitizer/asan/asan_malloc_linux.cc:86
+#1 0x4006f8 in main /tmp/pointer-compare.c:6
+#2 0x76a72a86 in __libc_start_main (/lib64/libc.so.6+0x21a86)
+
+SUMMARY: AddressSanitizer: invalid-pointer-pair /tmp/pointer-compare.c:7 in main
+  
 
 
 
@@ -420,6 +508,9 @@
 
 
   
+  A return statement without an operand in a non-void function
+  is considered unreachable and may be subject to optimization
+  on that basis.
 
 
 Fortran

Re: [og7] vector_length extension part 2: Generalize state propagation and synchronization

2018-04-05 Thread Tom de Vries


On 04/03/2018 05:00 PM, Tom de Vries wrote:

+  unsigned int psize = ROUND_UP (data.offset, oacc_bcast_align);
+  unsigned int pnum = (nvptx_mach_vector_length () > PTX_WARP_SIZE
+  ? nvptx_mach_max_workers () + 1
+  : 1);


This claims too much space for a simple long vector loop. Filed as 
PR85231 - "[og7, openacc, nvptx] Too much shared memory claimed for long 
vector length".


Thanks,
- Tom

Re: [og7] vector_length extension part 3: reductions

2018-04-05 Thread Tom de Vries


On 03/02/2018 06:51 PM, Cesar Philippidis wrote:

This patch teaches the nvptx BE how to process vector reductions with
large vector lengths.


Committed test-case exercising large vector length with reductions.

Thanks,
- Tom
[openacc] Add vector-length-128-10.c

2018-04-05  Tom de Vries  

	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-10.c: New test.

---
 .../vector-length-128-10.c | 40 ++
 1 file changed, 40 insertions(+)

diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-10.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-10.c
new file mode 100644
index 000..e46b5cf
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-10.c
@@ -0,0 +1,40 @@
+/* { dg-do run } */
+
+#include 
+
+#define N 1024
+
+unsigned int a[N];
+unsigned int b[N];
+unsigned int c[N];
+unsigned int n = N;
+
+int
+main (void)
+{
+  for (unsigned int i = 0; i < n; ++i)
+{
+  a[i] = i % 3;
+  b[i] = i % 5;
+}
+
+  unsigned int res = 1;
+  unsigned long long res2 = 1;
+#pragma acc parallel vector_length (128) copyin (a,b) reduction (+:res, res2) copy (res, res2)
+  {
+#pragma acc loop vector reduction (+:res, res2)
+for (unsigned int i = 0; i < n; i++)
+  {
+	res += ((a[i] + b[i]) % 2);
+	res2 += ((a[i] + b[i]) % 2);
+  }
+  }
+
+  if (res != 478)
+abort ();
+  if (res2 != 478)
+abort ();
+
+  return 0;
+}
+/* { dg-prune-output "using vector_length \\(32\\), ignoring 128" } */

Re: [C++ Patch] PR 80956 ("[7/8 Regression] ICE with abstract class vector")

2018-04-05 Thread Paolo Carlini


Hi,

On 05/04/2018 15:56, Jason Merrill wrote:

On Thu, Apr 5, 2018 at 8:27 AM, Paolo Carlini  wrote:

Hi,

the main issue is already fixed in trunk but we still ICE on the reduced
testcase attached by Jakub which has a broken std::initializer_list missing
the definition. I think we can handle this case similarly to the existing
check in do_pushtag, which would be also consistent with the plain error we
give for, eg:

namespace std { template  class initializer_list; }

template class std::initializer_list;

However, we still have the option of issuing a fatal_error, like we do in
finish_struct.

How about using complete_type_or_maybe_complain instead of a custom error?
Yes, I was about to send a message about that option, I already had it 
in the audit trail, tested too, then started fiddling with fatal_errors 
and forgot to mention it ;) Anyway, would be the below.


Thanks,
Paolo.

/
Index: cp/call.c
===
--- cp/call.c   (revision 259124)
+++ cp/call.c   (working copy)
@@ -6880,8 +6880,12 @@ convert_like_real (conversion *convs, tree expr, t
if (array == error_mark_node)
  return error_mark_node;
 
-   /* Build up the initializer_list object.  */
-   totype = complete_type (totype);
+   /* Build up the initializer_list object.  Note: fail gracefully
+  if the object cannot be completed because, for example, no
+  definition is provided.  */
+   totype = complete_type_or_maybe_complain (totype, NULL_TREE, complain);
+   if (!totype)
+ return error_mark_node;
field = next_initializable_field (TYPE_FIELDS (totype));
CONSTRUCTOR_APPEND_ELT (vec, field, array);
field = next_initializable_field (DECL_CHAIN (field));
Index: cp/name-lookup.c
===
--- cp/name-lookup.c(revision 259124)
+++ cp/name-lookup.c(working copy)
@@ -6476,8 +6476,8 @@ do_pushtag (tree name, tree type, tag_scope scope)
  && init_list_identifier == DECL_NAME (TYPE_NAME (type))
  && !CLASSTYPE_TEMPLATE_INFO (type))
{
- error ("declaration of std::initializer_list does not match "
-"#include , isn't a template");
+ error ("declaration of % does not match "
+"%<#include %>, isn't a template");
  return error_mark_node;
}
}
Index: testsuite/g++.dg/cpp0x/initlist100.C
===
--- testsuite/g++.dg/cpp0x/initlist100.C(nonexistent)
+++ testsuite/g++.dg/cpp0x/initlist100.C(working copy)
@@ -0,0 +1,10 @@
+// PR c++/80956
+// { dg-do compile { target c++11 } }
+
+namespace std {
+template  class initializer_list;  // { dg-message "declaration" }
+}
+
+template  struct B { B (std::initializer_list); };
+struct C { virtual int foo (); };
+struct D : C {} d { B { D {} } };  // { dg-error "incomplete|no matching" }
Index: testsuite/g++.dg/cpp0x/initlist101.C
===
--- testsuite/g++.dg/cpp0x/initlist101.C(nonexistent)
+++ testsuite/g++.dg/cpp0x/initlist101.C(working copy)
@@ -0,0 +1,8 @@
+// PR c++/80956
+// { dg-do compile { target c++11 } }
+
+#include 
+
+template  struct B { B (std::initializer_list); };
+struct C { virtual int foo (); };
+struct D : C {} d { B { D {} } };  // { dg-error "no matching" }

Re: [og7] vector_length extension part 2: Generalize state propagation and synchronization

2018-04-05 Thread Tom de Vries


On 04/03/2018 05:00 PM, Tom de Vries wrote:

On 03/02/2018 05:55 PM, Cesar Philippidis wrote:

* config/nvptx/nvptx.c (oacc_bcast_partition): Declare.


One last thing: this variable needs to be reset to zero for every function.

Without this reset, we can generated different code for a function 
depending on whether there's another function in front or not.


In the previous commit, I set that variable in nvptx_option_override, 
but as I've found out that's not enough.


This patch does the init in nvptx_set_current_function.

Build x86_64 with nvptx accelerator and reg-tested libgomp.

Committed.

Thanks,
- Tom
[nvptx] Add per-function initialization of oacc_broadcast_partition

2018-04-05  Tom de Vries  

	* config/nvptx/nvptx.c (nvptx_set_current_function): Initialize
	oacc_broadcast_partition.

---
 gcc/config/nvptx/nvptx.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 0b46e13..009ca59 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -5962,6 +5962,7 @@ nvptx_set_current_function (tree fndecl)
 
   gangprivate_shared_hmap.empty ();
   nvptx_previous_fndecl = fndecl;
+  oacc_bcast_partition = 0;
 }
 
 #undef TARGET_OPTION_OVERRIDE

Re: [C++ PATCH] Implement P0961

2018-04-05 Thread Ville Voutilainen

On 5 April 2018 at 16:37, Jason Merrill  wrote:
> On Wed, Apr 4, 2018 at 6:26 PM, Ville Voutilainen
>  wrote:
>> +   tree parm = TREE_VEC_ELT (TREE_VALUE (tparms), 0);
>
> I think you want to use INNERMOST_TEMPLATE_PARMS here rather than
> TREE_VALUE directly.
>
> Please also add a comment quoting the rule you're implementing.

Alright, here's a new patch.
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 746084c..31d9c98 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -7432,7 +7432,27 @@ get_tuple_decomp_init (tree decl, unsigned i)
 
   tree fns = lookup_qualified_name (TREE_TYPE (e), get_id,
 /*type*/false, /*complain*/false);
-  if (fns != error_mark_node)
+  bool use_member_get = false;
+
+  /* To use a member get, member lookup must find at least one
+ declaration that is a function template
+ whose first template parameter is a non-type parameter.  */
+  for (lkp_iterator iter (MAYBE_BASELINK_FUNCTIONS (fns)); iter; ++iter)
+{
+  tree fn = *iter;
+  if (TREE_CODE (fn) == TEMPLATE_DECL)
+	{
+	  tree tparms = DECL_TEMPLATE_PARMS (fn);
+	  tree parm = TREE_VEC_ELT (INNERMOST_TEMPLATE_PARMS (tparms), 0);
+	  if (TREE_CODE (TREE_VALUE (parm)) == PARM_DECL)
+	{
+	  use_member_get = true;
+	  break;
+	}
+	}
+}
+
+  if (use_member_get)
 {
   fns = lookup_template_function (fns, targs);
   return build_new_method_call (e, fns, /*args*/NULL,
diff --git a/gcc/testsuite/g++.dg/cpp1z/decomp10.C b/gcc/testsuite/g++.dg/cpp1z/decomp10.C
index 6ed9272..b4169d3 100644
--- a/gcc/testsuite/g++.dg/cpp1z/decomp10.C
+++ b/gcc/testsuite/g++.dg/cpp1z/decomp10.C
@@ -20,7 +20,7 @@ void f3() { auto [ x ] = a3; }	// { dg-error "get" }
 
 struct A3a { int i,j; int get(); } a3a;
 template<> struct std::tuple_size { enum { value = 1 }; };
-void f3a() { auto [ x ] = a3a; }	// { dg-error "get<0>" }
+void f3a() { auto [ x ] = a3a; }	// { dg-error "get" }
 
 struct A3b { int i,j; } a3b;
 int get(A3b&&);
diff --git a/gcc/testsuite/g++.dg/cpp1z/decomp37.C b/gcc/testsuite/g++.dg/cpp1z/decomp37.C
new file mode 100644
index 000..dc47908
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/decomp37.C
@@ -0,0 +1,62 @@
+// { dg-additional-options -std=c++17 }
+// { dg-do compile }
+
+#include 
+#include 
+#include 
+
+struct X : private std::shared_ptr
+{
+  std::string fun_payload;
+};
+
+template std::string& get(X& x)
+{
+  if constexpr(N==0) return x.fun_payload;
+}
+
+namespace std {
+  template<> class tuple_size : public std::integral_constant {};
+  template<> class tuple_element<0, X> {public: using type = std::string;};
+}
+
+struct X2 : private std::shared_ptr
+{
+  int fun_payload;
+  template  void get();
+};
+
+template int& get(X2& x)
+{
+  if constexpr(N==0) return x.fun_payload;
+}
+
+namespace std {
+  template<> class tuple_size : public std::integral_constant {};
+  template<> class tuple_element<0, X2> {public: using type = int;};
+}
+
+class X3
+{
+  double fun_payload;
+public:
+  template  double& get()
+  {
+if constexpr(N==0) return fun_payload;
+  }
+};
+
+namespace std {
+  template<> class tuple_size : public std::integral_constant {};
+  template<> class tuple_element<0, X3> {public: using type = double;};
+}
+
+int main()
+{
+  X x;
+  auto& [b1] = x;
+  X2 x2;
+  auto& [b2] = x2;
+  X3 x3;
+  auto& [b3] = x3;
+}

Re: [C++ PATCH] Implement P0969

2018-04-05 Thread Jason Merrill

On Thu, Apr 5, 2018 at 9:56 AM, Jakub Jelinek  wrote:
> On Thu, Apr 05, 2018 at 09:53:41AM -0400, Jason Merrill wrote:
>> OK.
>
> Is this something that should go into cxx-status.html?  Is that a DR,
> applying to C++17 too?

Probably, and yes.

>> On Thu, Apr 5, 2018 at 5:06 AM, Ville Voutilainen
>>  wrote:
>> > Tested on Linux-PPC64.
>> >
>> > 2018-04-05  Ville Voutilainen  
>> >
>> > gcc/cp
>> >
>> > Implement P0969
>> > * decl.c (find_decomp_class_base): Check accessibility instead
>> > of declared access, adjust diagnostic.
>> >
>> > testsuite/
>> >
>> > Implement P0969
>> > * g++.dg/cpp1z/decomp4.C: Adjust.
>> > * g++.dg/cpp1z/decomp38.C: New.
>
> Jakub

Re: [C++ Patch] PR 80956 ("[7/8 Regression] ICE with abstract class vector")

2018-04-05 Thread Jason Merrill

On Thu, Apr 5, 2018 at 8:27 AM, Paolo Carlini  wrote:
> Hi,
>
> the main issue is already fixed in trunk but we still ICE on the reduced
> testcase attached by Jakub which has a broken std::initializer_list missing
> the definition. I think we can handle this case similarly to the existing
> check in do_pushtag, which would be also consistent with the plain error we
> give for, eg:
>
> namespace std { template  class initializer_list; }
>
> template class std::initializer_list;
>
> However, we still have the option of issuing a fatal_error, like we do in
> finish_struct.

How about using complete_type_or_maybe_complain instead of a custom error?

Jason

Re: [C++ PATCH] Implement P0969

2018-04-05 Thread Jason Merrill

OK.

On Thu, Apr 5, 2018 at 5:06 AM, Ville Voutilainen
 wrote:
> Tested on Linux-PPC64.
>
> 2018-04-05  Ville Voutilainen  
>
> gcc/cp
>
> Implement P0969
> * decl.c (find_decomp_class_base): Check accessibility instead
> of declared access, adjust diagnostic.
>
> testsuite/
>
> Implement P0969
> * g++.dg/cpp1z/decomp4.C: Adjust.
> * g++.dg/cpp1z/decomp38.C: New.

Re: [C++ Patch] PR 84792 ("[6/7/8 Regression] ICE with broken typedef of a struct")

2018-04-05 Thread Jason Merrill

On Wed, Apr 4, 2018 at 8:48 PM, Paolo Carlini  wrote:
> I'm really happy to report that these 5 ugly lines are causing an actual
> bug. Seriously, not considering the formatting, the problem is that we
> really want to keep 'type' in sync, because we are using it below before
> returning. Note that we don't regress location-wise because either
> explicitly or implicitly we pass input_location anyway. Finishing testing on
> x86_64-linux.

Hmm, I wonder why we need to make the type error_mark_node at all,
given that the problem is with the declarator.

But this patch is OK.

Jason

Re: [C++ PATCH] Implement P0961

2018-04-05 Thread Jason Merrill

On Wed, Apr 4, 2018 at 6:26 PM, Ville Voutilainen
 wrote:
> +   tree parm = TREE_VEC_ELT (TREE_VALUE (tparms), 0);

I think you want to use INNERMOST_TEMPLATE_PARMS here rather than
TREE_VALUE directly.

Please also add a comment quoting the rule you're implementing.

Jason

Re: [PATCH] rs6000: Enable -fasynchronous-unwind-tables by default

2018-04-05 Thread Jakub Jelinek

On Thu, Apr 05, 2018 at 05:08:36AM -0500, Segher Boessenkool wrote:
> On Thu, Apr 05, 2018 at 11:50:38AM +0200, Jakub Jelinek wrote:
> > On Thu, Apr 05, 2018 at 09:45:54AM +, Segher Boessenkool wrote:
> > > To find out where on-entry register values live at any point in a
> > > program, GDB currently tries to parse to parse the executable code.
> > > This does not work very well, for example it gets confused if some
> > > accesses to the stack use the frame pointer (r31) and some use the
> > > stack pointer (r1).  A symptom is that backtraces can be cut short.
> > > 
> > > This patch enables -fasynchronous-unwind-tables by default for rs6000,
> > > which causes us to emit DWARF unwind tables for all functions, solving
> > > these problems.
> > > 
> > > This not do anything for sub-targets without DWARF.
> > > 
> > > It increases executable size, but only modestly, and does not change
> > > memory use, only the disk image.
> > > 
> > > Various other targets already do this (x86, s390, tile*).
> > 
> > aarch64-linux* too (since r258871).
> 
> Ah yes, I forgot.  Aarch does it in its struct default_options; is that
> preferred?  I did it in rs6000_option_init_struct because originally I
> avoided enabling it on some non-DWARF platforms, but the flag is ignored
> elsewhere anyway, and various other things set the flags for everything,
> too, so now I enable it always.

I think either is fine, just note that both x86_64 -m64 and aarch64 turn
-funwind-tables on too by default, aarch64 through the default_options,
x86 through setting f_a_u_t to 2 and then:
  if (opts->x_flag_asynchronous_unwind_tables == 2)
opts->x_flag_unwind_tables
  = opts->x_flag_asynchronous_unwind_tables = 1;
in ix86_option_override_internal.  Note i?86 -m32 doesn't make
-funwind-tables the default on the other side, not sure if it really makes a
difference.  toplev.c process_options has:
  if (flag_non_call_exceptions)
flag_asynchronous_unwind_tables = 1;
  if (flag_asynchronous_unwind_tables)
flag_unwind_tables = 1;
too though.

Jakub

Re: [PATCH] Add -C when using -Wimplicit-fallthrough and --save-temps (PR preprocessor/78497).

2018-04-05 Thread Jakub Jelinek

On Thu, Apr 05, 2018 at 02:51:22PM +0200, Martin Liška wrote:
> On 04/04/2018 09:31 PM, Marek Polacek wrote:
> > On Tue, Apr 03, 2018 at 02:29:37PM +0200, Martin Liška wrote:
> >> Hi.
> >>
> >> This helps the warning with --save-temps. Doing that one needs to preserve 
> >> comments
> >> in preprocessed source file.
> > 
> > Do we really want to only use -C when -Wimplicit-fallthrough is in effect?  
> > I
> > mean, shouldn't we always use -C when -save-temps?
> 
> Why not, Jakub what do you think? Note that it was originally Jakub's idea to 
> do that.

I'd prefer to do that only when we actually care about the comments, it is a
behavior change in any case, and might be undesirable to some people.

Note that we do not care about the comments if -Wimplicit-fallthrough=0
or -Wimplicit-fallthrough=5, but do care for:
-Wimplicit-fallthrough
-Wimplicit-fallthrough=1
-Wimplicit-fallthrough=2
-Wimplicit-fallthrough=3
-Wimplicit-fallthrough=4
-Wextra
-W
(unless -Wno-implicit-fallthrough).  So, it would be desirable to:
1) swap the order, put save-temps to the outer level
2) use
{Wimplicit-fallthrough*:{!Wimplicit-fallthrough=0:{!Wimplicit-fallthrough=5:...}}}
3) verify (including adding testcases) that it doesn't emit comments for the
=0, =5 or -W -Wno-implicit-fallthrough cases, but does for -W etc.

Jakub

Re: [PATCH] [PR c++/85039] no type definitions in builtin offsetof

2018-04-05 Thread Jason Merrill

On Wed, Apr 4, 2018 at 12:25 PM, Alexandre Oliva  wrote:
> On Apr  4, 2018, Jason Merrill  wrote:
>
>> On Tue, Apr 3, 2018 at 11:25 PM, Alexandre Oliva  wrote:
>>> I still think we could attempt to retain the extension as it is, parsing
>>> types introduced in data member initializers within the scope of the
>>> class containing the data member, like we do, but, when the class is
>>> already complete, recording it if not in TYPE_FIELDS, in some additional
>>> field consulted for name mangling purposes and, in order to retain
>>> compatibility, if the type is not a closure or anonymous, also recording
>>> it in the enclosing namespace, so that it is found by lookup as in the
>>> quoted snippet.
>>>
>>> Is that a terrible idea?
>
>> It sounds like a lot of work to support a very questionable pattern.
>
> It's not so much work (the simple patch below does just that, and its
> testing is almost done); I agree it's questionable, and it's limited (it
> never worked in initializers of members of template classes, as the -4
> testcase, so we don't have to worry about retaining temporary
> compatibility with that), but it's there, so I think we'd be better off
> deprecating it, if that's the direction we want to go.  The patch below
> has just the right spot for a deprecation warning, even ;-)
>
> We could recommend users to use a closure that returns the offsetof
> instead of the unadorned offsetof.  That would work portably, but we
> shouldn't make the transformation ourselves: it would change the
> ABI-defined numbering of closure types.
>
>> Perhaps we should just disallow defining a type in offsetof if the
>> current scope is a class.
>
> Even anonymous types?  I suspect this could break a lot of existing
> code, with anonymous types hiding in macros.

It seems unlikely to me that such a use of macros would occur at class
scope; there's no C compatibility issue there.

Jason

Re: [PATCH] Add -C when using -Wimplicit-fallthrough and --save-temps (PR preprocessor/78497).

2018-04-05 Thread Martin Liška

On 04/04/2018 09:31 PM, Marek Polacek wrote:
> On Tue, Apr 03, 2018 at 02:29:37PM +0200, Martin Liška wrote:
>> Hi.
>>
>> This helps the warning with --save-temps. Doing that one needs to preserve 
>> comments
>> in preprocessed source file.
> 
> Do we really want to only use -C when -Wimplicit-fallthrough is in effect?  I
> mean, shouldn't we always use -C when -save-temps?

Why not, Jakub what do you think? Note that it was originally Jakub's idea to 
do that.

Martin

> 
>   Marek
>

[PATCH][RFC] Fix PR85180, find_base_term is exponential

2018-04-05 Thread Richard Biener


This fixes exponential time spent in find_base_term by extending the
existing fix of NULL-ifying VALUE locations during their processing
to keep them NULL-ified during the whole recursion.

As Jakub figured this has the chance of returning NULL prematurely
for the case of the plus/minus case rejecting a found base on the
ground of

if (base != NULL_RTX
&& ((REG_P (tmp1) && REG_POINTER (tmp1))
 || known_base_value_p (base)))
  return base;

so any VALUE visited during a such rejected base will not be
visited again (but we'll get a NULL result).

Quoting him from IRC: still no differences, out of 850mil calls

for an instrumented version.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Alternative (safer in the above regard) variants using a hash-map
to cache the base for a VALUE are quite a bit slower if not implemented
in ways that also look dangerous at this point.  See the PR for
some of those.

Any comments?  Otherwise we decided to go ahead with this tomorrow.

Thanks,
Richard.

2018-04-05  Richard Biener  

PR middle-end/85180
* alias.c (find_base_term): New wrapper around find_base_term
unwinding CSELIB_VAL_PTR changes.
(find_base_term): Do not restore CSELIB_VAL_PTR during the
recursion.

* gcc.dg/pr85180.c: New testcase.

Index: gcc/alias.c
===
--- gcc/alias.c (revision 259082)
+++ gcc/alias.c (working copy)
@@ -1876,7 +1876,8 @@ rtx_equal_for_memref_p (const_rtx x, con
 }
 
 static rtx
-find_base_term (rtx x)
+find_base_term (rtx x, vec _vals)
 {
   cselib_val *val;
   struct elt_loc_list *l, *f;
@@ -1910,7 +1911,7 @@ find_base_term (rtx x)
 case POST_DEC:
 case PRE_MODIFY:
 case POST_MODIFY:
-  return find_base_term (XEXP (x, 0));
+  return find_base_term (XEXP (x, 0), visited_vals);
 
 case ZERO_EXTEND:
 case SIGN_EXTEND:  /* Used for Alpha/NT pointers */
@@ -1921,7 +1922,7 @@ find_base_term (rtx x)
return 0;
 
   {
-   rtx temp = find_base_term (XEXP (x, 0));
+   rtx temp = find_base_term (XEXP (x, 0), visited_vals);
 
if (temp != 0 && CONSTANT_P (temp))
  temp = convert_memory_address (Pmode, temp);
@@ -1940,7 +1941,9 @@ find_base_term (rtx x)
return static_reg_base_value[STACK_POINTER_REGNUM];
 
   f = val->locs;
-  /* Temporarily reset val->locs to avoid infinite recursion.  */
+  /* Reset val->locs to avoid infinite recursion.  */
+  if (f)
+   visited_vals.safe_push (std::make_pair (val, f));
   val->locs = NULL;
 
   for (l = f; l; l = l->next)
@@ -1949,16 +1952,15 @@ find_base_term (rtx x)
&& !CSELIB_VAL_PTR (l->loc)->locs->next
&& CSELIB_VAL_PTR (l->loc)->locs->loc == x)
  continue;
-   else if ((ret = find_base_term (l->loc)) != 0)
+   else if ((ret = find_base_term (l->loc, visited_vals)) != 0)
  break;
 
-  val->locs = f;
   return ret;
 
 case LO_SUM:
   /* The standard form is (lo_sum reg sym) so look only at the
  second operand.  */
-  return find_base_term (XEXP (x, 1));
+  return find_base_term (XEXP (x, 1), visited_vals);
 
 case CONST:
   x = XEXP (x, 0);
@@ -1984,7 +1986,7 @@ find_base_term (rtx x)
   other operand is the base register.  */
 
if (tmp1 == pic_offset_table_rtx && CONSTANT_P (tmp2))
- return find_base_term (tmp2);
+ return find_base_term (tmp2, visited_vals);
 
/* If either operand is known to be a pointer, then prefer it
   to determine the base term.  */
@@ -2001,12 +2003,12 @@ find_base_term (rtx x)
   term is from a pointer or is a named object or a special address
   (like an argument or stack reference), then use it for the
   base term.  */
-   rtx base = find_base_term (tmp1);
+   rtx base = find_base_term (tmp1, visited_vals);
if (base != NULL_RTX
&& ((REG_P (tmp1) && REG_POINTER (tmp1))
 || known_base_value_p (base)))
  return base;
-   base = find_base_term (tmp2);
+   base = find_base_term (tmp2, visited_vals);
if (base != NULL_RTX
&& ((REG_P (tmp2) && REG_POINTER (tmp2))
 || known_base_value_p (base)))
@@ -2020,7 +2022,7 @@ find_base_term (rtx x)
 
 case AND:
   if (CONST_INT_P (XEXP (x, 1)) && INTVAL (XEXP (x, 1)) != 0)
-   return find_base_term (XEXP (x, 0));
+   return find_base_term (XEXP (x, 0), visited_vals);
   return 0;
 
 case SYMBOL_REF:
@@ -2032,6 +2034,19 @@ find_base_term (rtx x)
 }
 }
 
+/* Wrapper around the worker above which removes locs from visited VALUEs
+   to avoid visiting them multiple times.  We unwind that changes here.  */
+
+static rtx
+find_base_term (rtx x)
+{
+  auto_vec visited_vals;
+  rtx res = find_base_term (x, visited_vals);
+

[PATCH][GCC][mid-end] Fix PR85123 incorrect copies

2018-04-05 Thread Tamar Christina

Hi All,

This patch fixes the code generation for copy_blkmode_to_reg by calculating
the bitsize per iteration doing the maximum copy allowed that does not read more
than the amount of bits left to copy.

This fixes the bad code generation reported and also still produces better code
in most cases. For targets that don't support fast unaligned access it defaults
to the old 1 byte copy (MIN alignment).

This produces for the copying of a 3 byte structure:

fun3:
adrpx1, .LANCHOR0
add x1, x1, :lo12:.LANCHOR0
mov x0, 0
sub sp, sp, #16
ldrhw2, [x1, 16]
ldrbw1, [x1, 18]
add sp, sp, 16
bfi x0, x2, 0, 16
bfi x0, x1, 16, 8
ret

whereas before it was producing

fun3:
adrpx0, .LANCHOR0
add x2, x0, :lo12:.LANCHOR0
sub sp, sp, #16
ldrhw1, [x0, #:lo12:.LANCHOR0]
ldrbw0, [x2, 2]
strhw1, [sp, 8]
strbw0, [sp, 10]
ldr w0, [sp, 8]
add sp, sp, 16
ret

Cross compiled on aarch64-none-elf and no issues
Bootstrapped powerpc64-unknown-linux-gnu, x86_64-pc-linux-gnu,
arm-none-linux-gnueabihf, aarch64-none-linux-gnu with no issues.

Regtested aarch64-none-elf, x86_64-pc-linux-gnu, powerpc64-unknown-linux-gnu
and arm-none-linux-gnueabihf and found no issues.

Regression on powerpc (pr63594-2.c) is fixed now.

OK for trunk?

Thanks,
Tamar

gcc/
2018-04-05  Tamar Christina  

PR middle-end/85123
* expr.c (copy_blkmode_to_reg): Fix wrong code gen.

-- 
diff --git a/gcc/expr.c b/gcc/expr.c
index 00660293f72e5441a6421a280b04c57fca2922b8..7daeb8c91d758edf0b3dc37f6927380b6f3df877 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -2749,7 +2749,7 @@ copy_blkmode_to_reg (machine_mode mode_in, tree src)
 {
   int i, n_regs;
   unsigned HOST_WIDE_INT bitpos, xbitpos, padding_correction = 0, bytes;
-  unsigned int bitsize;
+   unsigned int bitsize = 0;
   rtx *dst_words, dst, x, src_word = NULL_RTX, dst_word = NULL_RTX;
   /* No current ABI uses variable-sized modes to pass a BLKmnode type.  */
   fixed_size_mode mode = as_a  (mode_in);
@@ -2782,7 +2782,7 @@ copy_blkmode_to_reg (machine_mode mode_in, tree src)
 
   n_regs = (bytes + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
   dst_words = XALLOCAVEC (rtx, n_regs);
-  bitsize = BITS_PER_WORD;
+
   if (targetm.slow_unaligned_access (word_mode, TYPE_ALIGN (TREE_TYPE (src
 bitsize = MIN (TYPE_ALIGN (TREE_TYPE (src)), BITS_PER_WORD);
 
@@ -2791,6 +2791,17 @@ copy_blkmode_to_reg (machine_mode mode_in, tree src)
bitpos < bytes * BITS_PER_UNIT;
bitpos += bitsize, xbitpos += bitsize)
 {
+  /* Find the largest integer mode that can be used to copy all or as
+	 many bits as possible of the structure.  */
+  opt_scalar_int_mode mode_iter;
+  FOR_EACH_MODE_IN_CLASS (mode_iter, MODE_INT)
+  if (GET_MODE_BITSIZE (mode_iter.require ())
+			<= ((bytes * BITS_PER_UNIT) - bitpos)
+	  && GET_MODE_BITSIZE (mode_iter.require ()) <= BITS_PER_WORD)
+	bitsize = GET_MODE_BITSIZE (mode_iter.require ());
+  else
+	break;
+
   /* We need a new destination pseudo each time xbitpos is
 	 on a word boundary and when xbitpos == padding_correction
 	 (the first time through).  */

[C++ Patch] PR 80956 ("[7/8 Regression] ICE with abstract class vector")

2018-04-05 Thread Paolo Carlini


Hi,

the main issue is already fixed in trunk but we still ICE on the reduced 
testcase attached by Jakub which has a broken std::initializer_list 
missing the definition. I think we can handle this case similarly to the 
existing check in do_pushtag, which would be also consistent with the 
plain error we give for, eg:


namespace std { template  class initializer_list; }

template class std::initializer_list;

However, we still have the option of issuing a fatal_error, like we do 
in finish_struct.


In any case, I'm tweaking for consistency do_pushtag about %< and %>.

Tested x86_64-linux.

Thanks, Paolo.

///

/cp
2018-04-05  Paolo Carlini  

PR c++/80956
* call.c (convert_like_real): Fail gracefully for a broken
std::initializer_list, missing a definition.

* name-lookup.c (do_pushtag): Tweak message, use %< and %>.

/testsuite
2018-04-05  Paolo Carlini  

PR c++/80956
* g++.dg/cpp0x/initlist100.C: New.
* g++.dg/cpp0x/initlist101.C: Likewise.
Index: cp/call.c
===
--- cp/call.c   (revision 259124)
+++ cp/call.c   (working copy)
@@ -6880,8 +6880,19 @@ convert_like_real (conversion *convs, tree expr, t
if (array == error_mark_node)
  return error_mark_node;
 
-   /* Build up the initializer_list object.  */
+   /* Build up the initializer_list object.  Note: fail gracefully
+  if the object cannot be completed because, for example, no
+  definition is provided.  */
totype = complete_type (totype);
+   if (!COMPLETE_TYPE_P (totype))
+ {
+   if (complain & tf_error)
+ error_at (DECL_SOURCE_LOCATION (TYPE_MAIN_DECL (totype)),
+   "declaration of % does not "
+   "match %<#include %>, cannot be "
+   "completed");
+   return error_mark_node;
+ }
field = next_initializable_field (TYPE_FIELDS (totype));
CONSTRUCTOR_APPEND_ELT (vec, field, array);
field = next_initializable_field (DECL_CHAIN (field));
Index: cp/name-lookup.c
===
--- cp/name-lookup.c(revision 259124)
+++ cp/name-lookup.c(working copy)
@@ -6476,8 +6476,8 @@ do_pushtag (tree name, tree type, tag_scope scope)
  && init_list_identifier == DECL_NAME (TYPE_NAME (type))
  && !CLASSTYPE_TEMPLATE_INFO (type))
{
- error ("declaration of std::initializer_list does not match "
-"#include , isn't a template");
+ error ("declaration of % does not match "
+"%<#include %>, isn't a template");
  return error_mark_node;
}
}
Index: testsuite/g++.dg/cpp0x/initlist100.C
===
--- testsuite/g++.dg/cpp0x/initlist100.C(nonexistent)
+++ testsuite/g++.dg/cpp0x/initlist100.C(working copy)
@@ -0,0 +1,10 @@
+// PR c++/80956
+// { dg-do compile { target c++11 } }
+
+namespace std {
+template  class initializer_list;  // { dg-error "declaration of 
.*std::initializer_list.* does not match" }
+}
+
+template  struct B { B (std::initializer_list); };
+struct C { virtual int foo (); };
+struct D : C {} d { B { D {} } };  // { dg-error "no matching" }
Index: testsuite/g++.dg/cpp0x/initlist101.C
===
--- testsuite/g++.dg/cpp0x/initlist101.C(nonexistent)
+++ testsuite/g++.dg/cpp0x/initlist101.C(working copy)
@@ -0,0 +1,8 @@
+// PR c++/80956
+// { dg-do compile { target c++11 } }
+
+#include 
+
+template  struct B { B (std::initializer_list); };
+struct C { virtual int foo (); };
+struct D : C {} d { B { D {} } };  // { dg-error "no matching" }

Re: [PATCH] rs6000: Enable -fasynchronous-unwind-tables by default

2018-04-05 Thread Jakub Jelinek

On Thu, Apr 05, 2018 at 09:45:54AM +, Segher Boessenkool wrote:
> To find out where on-entry register values live at any point in a
> program, GDB currently tries to parse to parse the executable code.
> This does not work very well, for example it gets confused if some
> accesses to the stack use the frame pointer (r31) and some use the
> stack pointer (r1).  A symptom is that backtraces can be cut short.
> 
> This patch enables -fasynchronous-unwind-tables by default for rs6000,
> which causes us to emit DWARF unwind tables for all functions, solving
> these problems.
> 
> This not do anything for sub-targets without DWARF.
> 
> It increases executable size, but only modestly, and does not change
> memory use, only the disk image.
> 
> Various other targets already do this (x86, s390, tile*).

aarch64-linux* too (since r258871).

> Tested on powerpc64-linux {-m32,-m64}.  David, I'd like to commit this
> to current trunk; does that seem too dangerous to you?

If David is ok with it, it is fine for trunk even in stage4.

Jakub

Re: [PATCH 1/3] improve detection of attribute conflicts (PR 81544)

2018-04-05 Thread Jakub Jelinek

On Thu, Apr 05, 2018 at 09:01:02AM +0200, Andreas Krebbel wrote:
> > Wouldn't it be better to just use aligned (8) and aligned (16) instead of
> > aligned (4) and aligned (8)?
> 
> aligned (8) does not trigger the warning anymore.  Neither on s390 nor on 
> x86.  I don't think there
> is a way to ever see this warning on s390.  Values >= 8 would be ok (no 
> warning, no error) and
> values < 8 would trigger the error but not the warning.

Your patch is ok then.

Jakub

Re: [PATCH] rs6000: Enable -fasynchronous-unwind-tables by default

2018-04-05 Thread Segher Boessenkool

On Thu, Apr 05, 2018 at 11:50:38AM +0200, Jakub Jelinek wrote:
> On Thu, Apr 05, 2018 at 09:45:54AM +, Segher Boessenkool wrote:
> > To find out where on-entry register values live at any point in a
> > program, GDB currently tries to parse to parse the executable code.
> > This does not work very well, for example it gets confused if some
> > accesses to the stack use the frame pointer (r31) and some use the
> > stack pointer (r1).  A symptom is that backtraces can be cut short.
> > 
> > This patch enables -fasynchronous-unwind-tables by default for rs6000,
> > which causes us to emit DWARF unwind tables for all functions, solving
> > these problems.
> > 
> > This not do anything for sub-targets without DWARF.
> > 
> > It increases executable size, but only modestly, and does not change
> > memory use, only the disk image.
> > 
> > Various other targets already do this (x86, s390, tile*).
> 
> aarch64-linux* too (since r258871).

Ah yes, I forgot.  Aarch does it in its struct default_options; is that
preferred?  I did it in rs6000_option_init_struct because originally I
avoided enabling it on some non-DWARF platforms, but the flag is ignored
elsewhere anyway, and various other things set the flags for everything,
too, so now I enable it always.

> > Tested on powerpc64-linux {-m32,-m64}.  David, I'd like to commit this
> > to current trunk; does that seem too dangerous to you?
> 
> If David is ok with it, it is fine for trunk even in stage4.

Thanks.


Segher

[PATCH] rs6000: Enable -fasynchronous-unwind-tables by default

2018-04-05 Thread Segher Boessenkool

To find out where on-entry register values live at any point in a
program, GDB currently tries to parse to parse the executable code.
This does not work very well, for example it gets confused if some
accesses to the stack use the frame pointer (r31) and some use the
stack pointer (r1).  A symptom is that backtraces can be cut short.

This patch enables -fasynchronous-unwind-tables by default for rs6000,
which causes us to emit DWARF unwind tables for all functions, solving
these problems.

This not do anything for sub-targets without DWARF.

It increases executable size, but only modestly, and does not change
memory use, only the disk image.

Various other targets already do this (x86, s390, tile*).

Tested on powerpc64-linux {-m32,-m64}.  David, I'd like to commit this
to current trunk; does that seem too dangerous to you?


Segher


2018-04-05  Segher Boessenkool  

* common/config/rs6000/rs6000-common.c (rs6000_option_init_struct):
Enable -fasynchronous-unwind-tables by default.

gcc/testsuite/
* gcc.target/powerpc/dfmode_off.c: Add -fno-asynchronous-unwind-tables.
* gcc.target/powerpc/dimode_off.c: Ditto.
* gcc.target/powerpc/tfmode_off.c: Ditto.
* gcc.target/powerpc/timode_off.c: Ditto.

---
 gcc/common/config/rs6000/rs6000-common.c  | 7 +++
 gcc/testsuite/gcc.target/powerpc/dfmode_off.c | 2 +-
 gcc/testsuite/gcc.target/powerpc/dimode_off.c | 2 +-
 gcc/testsuite/gcc.target/powerpc/tfmode_off.c | 2 +-
 gcc/testsuite/gcc.target/powerpc/timode_off.c | 2 +-
 5 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/gcc/common/config/rs6000/rs6000-common.c 
b/gcc/common/config/rs6000/rs6000-common.c
index 0ddadfd..e45ed5e 100644
--- a/gcc/common/config/rs6000/rs6000-common.c
+++ b/gcc/common/config/rs6000/rs6000-common.c
@@ -49,6 +49,13 @@ rs6000_option_init_struct (struct gcc_options *opts)
   /* Enable section anchors by default.  */
   if (!TARGET_MACHO)
 opts->x_flag_section_anchors = 1;
+
+  /* By default, always emit DWARF-2 unwind info.  This allows debugging
+ without maintaining a stack frame back-chain.  It also allows the
+ debugger to find out where on-entry register values are stored at any
+ point in a function, without having to analyze the executable code (which
+ isn't even possible to do in the general case).  */
+  opts->x_flag_asynchronous_unwind_tables = 1;
 }
 
 /* Implement TARGET_OPTION_DEFAULT_PARAMS.  */
diff --git a/gcc/testsuite/gcc.target/powerpc/dfmode_off.c 
b/gcc/testsuite/gcc.target/powerpc/dfmode_off.c
index 1942f48..b5940cb 100644
--- a/gcc/testsuite/gcc.target/powerpc/dfmode_off.c
+++ b/gcc/testsuite/gcc.target/powerpc/dfmode_off.c
@@ -1,5 +1,5 @@
 /* { dg-do assemble } */
-/* { dg-options "-O2 -fno-align-functions -mtraceback=no -save-temps" } */
+/* { dg-options "-O2 -fno-align-functions -fno-asynchronous-unwind-tables 
-mtraceback=no -save-temps" } */
 
 void w1 (void *x, double y) { *(double *) (x + 32767) = y; }
 void w2 (void *x, double y) { *(double *) (x + 32766) = y; }
diff --git a/gcc/testsuite/gcc.target/powerpc/dimode_off.c 
b/gcc/testsuite/gcc.target/powerpc/dimode_off.c
index 77a1863..19ca40c 100644
--- a/gcc/testsuite/gcc.target/powerpc/dimode_off.c
+++ b/gcc/testsuite/gcc.target/powerpc/dimode_off.c
@@ -1,5 +1,5 @@
 /* { dg-do assemble } */
-/* { dg-options "-O2 -fno-align-functions -mtraceback=no -save-temps" } */
+/* { dg-options "-O2 -fno-align-functions -fno-asynchronous-unwind-tables 
-mtraceback=no -save-temps" } */
 
 void w1 (void *x, long long y) { *(long long *) (x + 32767) = y; }
 void w2 (void *x, long long y) { *(long long *) (x + 32766) = y; }
diff --git a/gcc/testsuite/gcc.target/powerpc/tfmode_off.c 
b/gcc/testsuite/gcc.target/powerpc/tfmode_off.c
index cbb3d75..f19e759 100644
--- a/gcc/testsuite/gcc.target/powerpc/tfmode_off.c
+++ b/gcc/testsuite/gcc.target/powerpc/tfmode_off.c
@@ -2,7 +2,7 @@
 /* { dg-skip-if "" { powerpc-ibm-aix* } } */
 /* { dg-skip-if "no TFmode" { powerpc-*-eabi* } } */
 /* { dg-require-effective-target longdouble128 } */
-/* { dg-options "-O2 -fno-align-functions -mtraceback=no -save-temps" } */
+/* { dg-options "-O2 -fno-align-functions -fno-asynchronous-unwind-tables 
-mtraceback=no -save-temps" } */
 
 typedef float TFmode __attribute__ ((mode (TF)));
 
diff --git a/gcc/testsuite/gcc.target/powerpc/timode_off.c 
b/gcc/testsuite/gcc.target/powerpc/timode_off.c
index efeffa7..b635953 100644
--- a/gcc/testsuite/gcc.target/powerpc/timode_off.c
+++ b/gcc/testsuite/gcc.target/powerpc/timode_off.c
@@ -1,6 +1,6 @@
 /* { dg-do assemble { target { lp64 } } } */
 /* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { 
"-mcpu=power5" } } */
-/* { dg-options "-O2 -fno-align-functions -mtraceback=no -save-temps 
-mcpu=power5" } */
+/* { dg-options "-O2 -fno-align-functions -fno-asynchronous-unwind-tables 
-mtraceback=no -save-temps -mcpu=power5" } */
 
 typedef int TImode __attribute__ ((mode

[C++ PATCH] Implement P0969

2018-04-05 Thread Ville Voutilainen

Tested on Linux-PPC64.

2018-04-05  Ville Voutilainen  

gcc/cp

Implement P0969
* decl.c (find_decomp_class_base): Check accessibility instead
of declared access, adjust diagnostic.

testsuite/

Implement P0969
* g++.dg/cpp1z/decomp4.C: Adjust.
* g++.dg/cpp1z/decomp38.C: New.
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 1a100c8..2cde65b 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -7322,9 +7322,9 @@ find_decomp_class_base (location_t loc, tree type, tree ret)
 	inform (DECL_SOURCE_LOCATION (field), "declared here");
 	return error_mark_node;
   }
-else if (TREE_PRIVATE (field) || TREE_PROTECTED (field))
+else if (!accessible_p (type, field, true))
   {
-	error_at (loc, "cannot decompose non-public member %qD of %qT",
+	error_at (loc, "cannot decompose inaccessible member %qD of %qT",
 		  field, type);
 	inform (DECL_SOURCE_LOCATION (field),
 		TREE_PRIVATE (field)
diff --git a/gcc/testsuite/g++.dg/cpp1z/decomp38.C b/gcc/testsuite/g++.dg/cpp1z/decomp38.C
new file mode 100644
index 000..fc69c02
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/decomp38.C
@@ -0,0 +1,48 @@
+// { dg-additional-options -std=c++17 }
+// { dg-do compile }
+
+class X
+{
+   int a, b;
+   void f()
+   {
+ auto[x,y] = *this;
+   }
+};
+
+class X2
+{
+   int a, b;
+   void f(X2& other)
+   {
+ auto[x,y] = other;
+   }
+};
+
+struct X3
+{
+  friend void foo();
+private:
+  int a;
+};
+
+void foo()
+{
+  X3 x;
+  auto [a] = x;
+}
+
+struct X4
+{
+  int a;
+};
+
+struct X5 : private X4
+{
+  friend void foo2();
+};
+
+void foo2() {
+  X5 x;
+  auto [a] = x;
+}
diff --git a/gcc/testsuite/g++.dg/cpp1z/decomp4.C b/gcc/testsuite/g++.dg/cpp1z/decomp4.C
index e50b882..69b5455 100644
--- a/gcc/testsuite/g++.dg/cpp1z/decomp4.C
+++ b/gcc/testsuite/g++.dg/cpp1z/decomp4.C
@@ -18,10 +18,10 @@ test (A , B , C , D , E , F , G , H , I )
 	// { dg-warning "structured bindings only available with -std=c..17 or -std=gnu..17" "" { target c++14_down } .-1 }
   auto [ k ] { b };			// { dg-error "cannot decompose class type 'B' because it has an anonymous union member" }
 	// { dg-warning "structured bindings only available with -std=c..17 or -std=gnu..17" "" { target c++14_down } .-1 }
-  auto [ l, l2 ] = c;			// { dg-error "cannot decompose non-public member 'C::b' of 'C'" }
+  auto [ l, l2 ] = c;			// { dg-error "cannot decompose inaccessible member 'C::b' of 'C'" }
 	// { dg-warning "structured bindings only available with -std=c..17 or -std=gnu..17" "" { target c++14_down } .-1 }
   auto [ m ] = d;			// { dg-warning "structured bindings only available with -std=c..17 or -std=gnu..17" "" { target c++14_down } }
-  auto [ n ] { e };			// { dg-error "cannot decompose non-public member 'E::a' of 'E'" }
+  auto [ n ] { e };			// { dg-error "cannot decompose inaccessible member 'E::a' of 'E'" }
 	// { dg-warning "structured bindings only available with -std=c..17 or -std=gnu..17" "" { target c++14_down } .-1 }
   auto [ o ] { f };			// { dg-warning "structured bindings only available with -std=c..17 or -std=gnu..17" "" { target c++14_down } }
   auto & [ p ] { g };			// { dg-error "cannot decompose class type 'G': both it and its base class 'F' have non-static data members" }

[nvptx, PR85204] Fix neutering of bb with only cond jump

2018-04-05 Thread Tom de Vries


Hi,

When compiling the test-case in the patch, the following ptx code is 
generated:

...
$L4:
  @ %r91 bra.uni $L24;
  selp.u32 %r95,1,0,%r80;
  st.shared.u32 [__worker_bcast],%r95;
 $L25:
 $L24:
  @ %r92 bra $L25;
...

There's an eternal loop starting at the last insn, and unsurprisingly 
the test-case hangs.


The last insn is a vector neutering branch, which should have been 
inserted after the worker neutering branch (the first insn).


In other words, we want:
...
 $L4:
   @ %r91 bra.uni $L24;
+  @ %r92 bra $L25;
   selp.u32 %r95,1,0,%r80;
   st.shared.u32 [__worker_bcast],%r95;
  $L25:
  $L24:
-  @ %r92 bra $L25;
...

This minimal stage4 patch fixes this problem. [ I filed a PR85223 
"[nvptx] nvptx_single needs rewrite" for a stage1 rewrite of nvptx_single. ]


Build x86_64 with nvptx accelerator, and tested libgomp.

Committed to stage4 trunk.

Thanks,
- Tom
[nvptx] Fix neutering of bb with only cond jump

2018-04-05  Tom de Vries  

	PR target/85204
	* config/nvptx/nvptx.c (nvptx_single): Fix neutering of bb with only
	cond jump.

	* testsuite/libgomp.oacc-c-c++-common/broadcast-1.c: New test.

---
 gcc/config/nvptx/nvptx.c   |  6 ++-
 .../libgomp.oacc-c-c++-common/broadcast-1.c| 49 ++
 2 files changed, 54 insertions(+), 1 deletion(-)

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index b2b150f..a9a3053 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -4048,6 +4048,7 @@ nvptx_single (unsigned mask, basic_block from, basic_block to)
   /* Insert the vector test inside the worker test.  */
   unsigned mode;
   rtx_insn *before = tail;
+  rtx_insn *neuter_start = NULL;
   for (mode = GOMP_DIM_WORKER; mode <= GOMP_DIM_VECTOR; mode++)
 if (GOMP_DIM_MASK (mode) & skip_mask)
   {
@@ -4065,7 +4066,10 @@ nvptx_single (unsigned mask, basic_block from, basic_block to)
 	  br = gen_br_true (pred, label);
 	else
 	  br = gen_br_true_uni (pred, label);
-	emit_insn_before (br, head);
+	if (neuter_start)
+	  neuter_start = emit_insn_after (br, neuter_start);
+	else
+	  neuter_start = emit_insn_before (br, head);
 
 	LABEL_NUSES (label)++;
 	if (tail_branch)
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/broadcast-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/broadcast-1.c
new file mode 100644
index 000..ca0d37b
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/broadcast-1.c
@@ -0,0 +1,49 @@
+/* Ensure that worker-vector state conditional expressions are
+   properly handled by the nvptx backend.  */
+
+#include 
+#include 
+
+
+#define N 1024
+
+int A[N][N] ;
+
+void test(int x)
+{
+#pragma acc parallel  num_gangs(16) num_workers(4) vector_length(32) copyout(A)
+  {
+#pragma acc loop gang
+for(int j=0;j

Re: [PATCH, rs6000] Undefine vector, bool, pixel in xmmintrin.h

2018-04-05 Thread Jakub Jelinek

On Wed, Apr 04, 2018 at 03:47:18PM -0500, Bill Schmidt wrote:
> > If we (for GCC9?) want to create a spot for target C++ tests, we should
> > just add g++.target// directories and add all the needed infrastructure
> > for those.
> > 
> > Can you please revert the powerpc.exp change and move the test to
> > g++.dg/ext/ ?  Thanks.
> 
> Sure, I can -- I just want to point out that there is precedent here.  I 
> noticed

Thanks.

> that gcc.target/s390/s390.exp allows .C suffixes and there is one such
> (compile-only) test in that directory, so I assumed the framework was okay
> for this.

Yes, and apparently aarch64 and arm have a couple of *.C tests too.
Still it doesn't feel right, running C++ tests under make check-gcc rather
than check-g++ is just weird.

I think we should just introduce g++.target/ in GCC9 and move those tests
there, plus any g++.dg/ tests guarded for single targets only.  g++.target
should do the -std=c++{98,11,14,17,2a} cycling etc.

Jakub

Patch ping^2

2018-04-05 Thread Jakub Jelinek

Hi!

I'd like to ping the

http://gcc.gnu.org/ml/gcc-patches/2018-03/msg01244.html
  - PR83157 - improve debug info for x86 setcc peepholes

patch.  Thanks.

Jakub

Re: [PATCH] Fix BIT_FIELD_REF folding (PR middle-end/85195)

2018-04-05 Thread Richard Biener

On Wed, 4 Apr 2018, Jakub Jelinek wrote:

> Hi!
> 
> On the following testcase because of the SCCVN limit we end up with a weird,
> but valid, BIT_FIELD_REF - trying to extract V1TImode type out of a V1TImode
> SSA_NAME, with 128 bits width and offset 0 (just SSA_NAME move would be
> enough).  Not trying to address why we create it, rather fix how we handle
> it.
> 
> The problem is that the BIT_FIELD_REF vector extraction is guarded with:
>  (if (VECTOR_TYPE_P (TREE_TYPE (@0))
>   && (types_match (type, TREE_TYPE (TREE_TYPE (@0)))
>   || (VECTOR_TYPE_P (type)
>   && types_match (TREE_TYPE (type), TREE_TYPE (TREE_TYPE (@0))
> and thus we must handle vector type as result, but assume incorrectly that
> if count is 1, then we must be asking for an element and thus not vector
> type.  For the first hunk we could do what we do in the second hunk, do
> a VIEW_CONVERT_EXPR, but it seems cleaner to just fall through into the
> count > 1 code which builds a CONSTRUCTOR.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

I think I prefer a V_C_E in the first case like in the second.  That
may help relaxing the type constraint above to a size one so we
can handle SImode extracts from a VnSFmode vector for example.

It also shows we are lacking a generic
(BIT_FIELD_REF @0 bitsize-of-@0 integer_zerop) to (VIEW_CONVERT @0)
pattern which would need to be ordered before the constructor one
to make sure it matches first (and it would have handled the
very specific case as well).

So - OK with doing the same in the first hunk as in the second.

Thanks,
Richard.

> 2018-04-04  Jakub Jelinek  
> 
>   PR middle-end/85195
>   * match.pd (BIT_FIELD_REF CONSTRUCTOR@0 @1 @2): If count == 1, only
>   use elt value if type is not vector type, otherwise build CONSTRUCTOR.
>   For n == const_k, use view_convert.
> 
>   * gcc.dg/pr85195.c: New test.
> 
> --- gcc/match.pd.jj   2018-03-28 21:15:00.0 +0200
> +++ gcc/match.pd  2018-04-04 14:35:10.424127155 +0200
> @@ -4648,7 +4648,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>(if (multiple_p (idx, k, ) && multiple_p (n, k, ))
> (if (CONSTRUCTOR_NELTS (ctor) == 0)
>  { build_constructor (type, NULL); }
> - (if (count == 1)
> + (if (count == 1 && !VECTOR_TYPE_P (type))
>(if (elt < CONSTRUCTOR_NELTS (ctor))
> { CONSTRUCTOR_ELT (ctor, elt)->value; }
> { build_zero_cst (type); })
> @@ -4668,7 +4668,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>   (if (CONSTRUCTOR_NELTS (ctor) <= idx / const_k)
>{ build_zero_cst (type); })
>   (if (n == const_k)
> -  { CONSTRUCTOR_ELT (ctor, idx / const_k)->value; })
> +  (view_convert { CONSTRUCTOR_ELT (ctor, idx / const_k)->value; }))
>   (BIT_FIELD_REF { CONSTRUCTOR_ELT (ctor, idx / const_k)->value; }
>  @1 { bitsize_int ((idx % const_k) * width); })
>  
> --- gcc/testsuite/gcc.dg/pr85195.c.jj 2018-04-04 14:38:29.233235494 +0200
> +++ gcc/testsuite/gcc.dg/pr85195.c2018-04-04 14:38:14.696227566 +0200
> @@ -0,0 +1,19 @@
> +/* PR middle-end/85195 */
> +/* { dg-do compile { target int128 } } */
> +/* { dg-options "-Wno-psabi -O -fno-tree-ccp --param=sccvn-max-scc-size=10" 
> } */
> +
> +typedef __int128 V __attribute__ ((vector_size (16)));
> +
> +extern int bar (V);
> +
> +V v;
> +int i;
> +
> +V
> +foo (void)
> +{
> +  do
> +v *= bar (v & i);
> +  while ((V){}[0]);
> +  return v;
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: [PATCH 1/3] improve detection of attribute conflicts (PR 81544)

2018-04-05 Thread Andreas Krebbel

On 04/04/2018 07:34 PM, Jakub Jelinek wrote:
> On Wed, Apr 04, 2018 at 06:51:00PM +0200, Andreas Krebbel wrote:
>>> On targets enforcing a function alignment bigger than 4 bytes this triggers 
>>> an error instead:
>>>
>>> +inline int ATTR ((aligned (4)))
>>> +finline_hot_noret_align (int);  /* { dg-warning "ignoring attribute 
>>> .aligned \\(4\\). because it
>>> conflicts with attribute .aligned \\(8\\)." } */
>>>
>>> gcc/gcc/testsuite/c-c++-common/Wattributes.c:404:1: error: alignment for 
>>> 'finline_hot_noret_align'
>>> must be at least 8^M
>>>
>>
>> diff --git a/gcc/testsuite/c-c++-common/Wattributes.c 
>> b/gcc/testsuite/c-c++-common/Wattributes.c
>> index 902bcb61c30..a260d018dcf 100644
>> --- a/gcc/testsuite/c-c++-common/Wattributes.c
>> +++ b/gcc/testsuite/c-c++-common/Wattributes.c
>> @@ -401,7 +401,8 @@ inline int ATTR ((warn_unused_result))
>>  finline_hot_noret_align (int);  /* { dg-warning "ignoring attribute 
>> .warn_unused_result. because it
>> conflicts with attribute .noreturn." } */
>>
>>  inline int ATTR ((aligned (4)))
>> -finline_hot_noret_align (int);  /* { dg-warning "ignoring attribute 
>> .aligned \\(4\\). because it
>> conflicts with attribute .aligned \\(8\\)." } */
>> +  finline_hot_noret_align (int);  /* { dg-warning "ignoring attribute 
>> .aligned \\(4\\). because it
>> conflicts with attribute .aligned \\(8\\)." "" { target { ! s390*-*-* } } } 
>> */
>> +/* { dg-error "alignment for 'finline_hot_noret_align' must be at least 8" 
>> "" { target s390*-*-* }
>> .-1 } */
>>
>>  inline int ATTR ((aligned (8)))
>>  finline_hot_noret_align (int);
>>
>> OK?
> 
> Wouldn't it be better to just use aligned (8) and aligned (16) instead of
> aligned (4) and aligned (8)?

aligned (8) does not trigger the warning anymore.  Neither on s390 nor on x86.  
I don't think there
is a way to ever see this warning on s390.  Values >= 8 would be ok (no 
warning, no error) and
values < 8 would trigger the error but not the warning.

-Andreas-

> 
>   Jakub
>

60 matches

Mail list logo