Re: [PATCH] c++: Prevent bogus -Wtype-limits warning with NTTP [PR100161]

2021-04-20 Thread Jason Merrill via Gcc-patches

On 4/20/21 8:42 PM, Marek Polacek wrote:

Recently, we made sure that we never call value_dependent_expression_p
on an expression that isn't potential_constant_expression.  That caused
this bogus warning with a non-type template parameter, something that
users don't want to see.

The problem is that in tsubst_copy_and_build/LE_EXPR 't' is "i < n",
which, due to 'i', is not p_c_e, therefore we call t_d_e_p.  But the
type of 'n' isn't dependent, so we think the whole 't' expression is
not dependent.  It seems we need to test both op0 and op1 separately
to suppress this warning.  I use a lambda so as not to repeat the
check.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/11.2?
Think I also want to backport to 10.4, but then the patch can't use
a lambda.


It should be straightforward to rewrite the lambda as a local class; 
let's use that version for all the branches.
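
For reference, a minimal sketch of that rewrite, assuming the 10 branch cannot use a lambda because it must still build as C++98. The names nesting_depth and looks_dependent are hypothetical stand-ins, not the real pt.c code; only the lambda-to-local-class shape is the point.

```cpp
// Hypothetical stand-in for a global counter such as
// processing_template_decl; not the real GCC variable.
static int nesting_depth = 0;

// Hypothetical predicate standing in for the dependence checks.
// It only answers "true" while the counter is raised.
static bool looks_dependent (int value)
{
  return nesting_depth > 0 && value < 0;
}

// C++11 form: a captureless lambda wraps bump / check / restore.
bool either_dependent_lambda (int op0, int op1)
{
  auto dep_p = [](int v) {
    ++nesting_depth;
    bool r = looks_dependent (v);
    --nesting_depth;
    return r;
  };
  return dep_p (op0) || dep_p (op1);
}

// C++98-compatible form: the same body as a static member function
// of a local class, so no lambda is needed.
bool either_dependent_local_class (int op0, int op1)
{
  struct dep
  {
    static bool p (int v)
    {
      ++nesting_depth;
      bool r = looks_dependent (v);
      --nesting_depth;
      return r;
    }
  };
  return dep::p (op0) || dep::p (op1);
}
```

Because the lambda captures nothing, the local class needs no data members, and both forms leave the counter balanced on return.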



gcc/cp/ChangeLog:

PR c++/100161
* pt.c (tsubst_copy_and_build) : Test op0 and
op1 separately for value- or type-dependence.

gcc/testsuite/ChangeLog:

PR c++/100161
* g++.dg/warn/Wtype-limits6.C: New test.
---
  gcc/cp/pt.c   | 24 ++-
  gcc/testsuite/g++.dg/warn/Wtype-limits6.C | 17 
  2 files changed, 32 insertions(+), 9 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/warn/Wtype-limits6.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 7bcbe6dc3ce..8d64fef957d 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -19906,15 +19906,21 @@ tsubst_copy_and_build (tree t,
  case MEMBER_REF:
  case DOTSTAR_EXPR:
{
-   /* If T was type-dependent, suppress warnings that depend on the range
-  of the types involved.  */
-   ++processing_template_decl;
-   const bool was_dep = (potential_constant_expression (t)
- ? value_dependent_expression_p (t)
- : type_dependent_expression_p (t));
-   --processing_template_decl;
-   tree op0 = RECUR (TREE_OPERAND (t, 0));
-   tree op1 = RECUR (TREE_OPERAND (t, 1));
+   /* If either OP0 or OP1 was value- or type-dependent, suppress
+  warnings that depend on the range of the types involved.  */
+   tree op0 = TREE_OPERAND (t, 0);
+   tree op1 = TREE_OPERAND (t, 1);
+   auto dep_p = [](tree t) {
+ ++processing_template_decl;
+ bool r = (potential_constant_expression (t)
+   ? value_dependent_expression_p (t)
+   : type_dependent_expression_p (t));
+ --processing_template_decl;
+ return r;
+   };
+   const bool was_dep = dep_p (op0) || dep_p (op1);
+   op0 = RECUR (op0);
+   op1 = RECUR (op1);
  
   warning_sentinel s1(warn_type_limits, was_dep);
   warning_sentinel s2(warn_div_by_zero, was_dep);
diff --git a/gcc/testsuite/g++.dg/warn/Wtype-limits6.C b/gcc/testsuite/g++.dg/warn/Wtype-limits6.C
new file mode 100644
index 000..9d5886d5323
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wtype-limits6.C
@@ -0,0 +1,17 @@
+// PR c++/100161
+// { dg-additional-options "-Wtype-limits" }
+
+void f(unsigned);
+
+template<unsigned n>
+void g()
+{
+for (unsigned i = 0; i < n; i++) { // { dg-bogus "always false" }
+f(i);
+}
+}
+
+void h()
+{
+g<0>();
+}

base-commit: 5491da23088734d516aae220810f253b9922c98c





Re: [PATCH] c++: Don't allow defining types in enum-base [PR96380]

2021-04-20 Thread Jason Merrill via Gcc-patches

On 4/20/21 3:33 PM, Marek Polacek wrote:

In r11-2064 I made cp_parser_enum_specifier commit to tentative parse
when seeing a '{'.  That still looks like the correct thing to do, but
it caused an ICE-on-invalid as well as accepts-invalid.

When we have something sneaky like this, which is broken in multiple
ways:

   template 
   enum struct c : union enum struct c { e = b, f = a };

we parse the "enum struct c" part (that's OK) and then we see that
we have an enum-base, so we consume ':' and then parse the type-specifier
that follows the :.  "union enum" is clearly invalid, but we're still
parsing tentatively and we parse everything up to the ;, and then
throw away the underlying type.  We parsed everything because we were
tricked into parsing an enum-specifier in an enum-base of another
enum-specifier!  Not good.

Since the grammar for enum-base doesn't allow a defining-type-specifier,
only a type-specifier, we should set type_definition_forbidden_message
which fixes all the problems in this PR.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk (GCC 12)?
This may be too obscure to fix in 11.2, but it looks like a reasonable
change to backport.


OK for trunk, 11.2, and 10.4, since this is a regression from 10.1.


gcc/cp/ChangeLog:

PR c++/96380
* parser.c (cp_parser_enum_specifier): Don't allow defining
types in enum-base.

gcc/testsuite/ChangeLog:

PR c++/96380
* g++.dg/cpp0x/enum_base4.C: New test.
* g++.dg/cpp0x/enum_base5.C: New test.
---
  gcc/cp/parser.c | 4 
  gcc/testsuite/g++.dg/cpp0x/enum_base4.C | 8 
  gcc/testsuite/g++.dg/cpp0x/enum_base5.C | 7 +++
  3 files changed, 19 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/enum_base4.C
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/enum_base5.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 99eccf0c5e4..fba516efa23 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -19942,6 +19942,10 @@ cp_parser_enum_specifier (cp_parser* parser)
/* Consume the `:'.  */
cp_lexer_consume_token (parser->lexer);
  
+  auto tdf
+   = make_temp_override (parser->type_definition_forbidden_message,
+ G_("types may not be defined in enum-base"));
+
/* Parse the type-specifier-seq.  */
cp_parser_type_specifier_seq (parser, CP_PARSER_FLAGS_NONE,
/*is_declaration=*/false,
diff --git a/gcc/testsuite/g++.dg/cpp0x/enum_base4.C b/gcc/testsuite/g++.dg/cpp0x/enum_base4.C
new file mode 100644
index 000..b3015256386
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/enum_base4.C
@@ -0,0 +1,8 @@
+// PR c++/96380
+// { dg-do compile { target c++11 } }
+
+extern const int a, b;
+enum struct c;
+template 
+enum struct c : union enum struct c { e = b, f = a };  // { dg-error "types may not be defined|expected|elaborated-type-specifier" }
+enum class c {};
diff --git a/gcc/testsuite/g++.dg/cpp0x/enum_base5.C b/gcc/testsuite/g++.dg/cpp0x/enum_base5.C
new file mode 100644
index 000..c01e857e612
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/enum_base5.C
@@ -0,0 +1,7 @@
+// PR c++/96380
+// { dg-do compile { target c++11 } }
+
+extern const int a, b;
+enum struct c;
+template 
+enum struct c : union enum struct c { e = b, f = a }; // { dg-error "types may not be defined|expected|elaborated-type-specifier" }

base-commit: 5491da23088734d516aae220810f253b9922c98c





Re: [PATCH] libphobos: Fix build fails for powerpc-linux

2021-04-20 Thread Iain Buclaw via Gcc-patches
Excerpts from Iain Buclaw's message of April 21, 2021 1:35 am:
> Hi,
> 
> This patch addresses the raised issue in the RC thread (haven't seen a
> bugzilla PR for it?)
> 
> As register names are required for darwin, but not accepted by gas
> unless you use `-mregnames', they have been conditionally removed on
> non-darwin targets.
> 
> To avoid duplicating large blocks of almost identical code, the inline
> assembly is now statically generated.
> 
> I've tested this on powerpc64le-linux-gnu.  The build succeeds, and both
> the gdc and libphobos testsuites look fine.
> 
> OK for others if I commit this to mainline and releases/gcc-11?
> 

I've seen a couple of affirmations off the list, so have gone ahead and
committed it.

Iain.


[PATCH] c++: Prevent bogus -Wtype-limits warning with NTTP [PR100161]

2021-04-20 Thread Marek Polacek via Gcc-patches
Recently, we made sure that we never call value_dependent_expression_p
on an expression that isn't potential_constant_expression.  That caused
this bogus warning with a non-type template parameter, something that
users don't want to see.

The problem is that in tsubst_copy_and_build/LE_EXPR 't' is "i < n",
which, due to 'i', is not p_c_e, therefore we call t_d_e_p.  But the
type of 'n' isn't dependent, so we think the whole 't' expression is
not dependent.  It seems we need to test both op0 and op1 separately
to suppress this warning.  I use a lambda so as not to repeat the
check.
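
To make the failure mode concrete, here is a toy model of the selection logic. The Expr struct and its three flags are hypothetical; they only record what the real predicates would answer for "i", "n" and "i < n" as described above.

```cpp
// Toy stand-in for a tree node.  Each flag records the answer the
// corresponding real predicate would give (hypothetical values).
struct Expr
{
  bool potential_constant;  // potential_constant_expression
  bool value_dependent;     // value_dependent_expression_p
  bool type_dependent;      // type_dependent_expression_p
};

// Same selection as in the patch: ask about value-dependence only
// when the expression is a potential constant expression.
bool dep_p (const Expr &e)
{
  return e.potential_constant ? e.value_dependent : e.type_dependent;
}

// Old check: one query on the whole comparison.
bool was_dep_old (const Expr &whole)
{
  return dep_p (whole);
}

// New check: query each operand separately.
bool was_dep_new (const Expr &op0, const Expr &op1)
{
  return dep_p (op0) || dep_p (op1);
}

// 'i' is a loop variable: not a potential constant, not dependent.
const Expr expr_i = { false, false, false };
// 'n' is the non-type template parameter: value-dependent.
const Expr expr_n = { true, true, false };
// 'i < n' is not a potential constant (because of 'i') and its type
// is not dependent, so its value-dependence is never consulted.
const Expr expr_cmp = { false, true, false };
```

With these inputs the old check answers false and the warning fires, while the new check answers true because of 'n' alone.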

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/11.2?
Think I also want to backport to 10.4, but then the patch can't use
a lambda.

gcc/cp/ChangeLog:

PR c++/100161
* pt.c (tsubst_copy_and_build) : Test op0 and
op1 separately for value- or type-dependence.

gcc/testsuite/ChangeLog:

PR c++/100161
* g++.dg/warn/Wtype-limits6.C: New test.
---
 gcc/cp/pt.c   | 24 ++-
 gcc/testsuite/g++.dg/warn/Wtype-limits6.C | 17 
 2 files changed, 32 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/warn/Wtype-limits6.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 7bcbe6dc3ce..8d64fef957d 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -19906,15 +19906,21 @@ tsubst_copy_and_build (tree t,
 case MEMBER_REF:
 case DOTSTAR_EXPR:
   {
-   /* If T was type-dependent, suppress warnings that depend on the range
-  of the types involved.  */
-   ++processing_template_decl;
-   const bool was_dep = (potential_constant_expression (t)
- ? value_dependent_expression_p (t)
- : type_dependent_expression_p (t));
-   --processing_template_decl;
-   tree op0 = RECUR (TREE_OPERAND (t, 0));
-   tree op1 = RECUR (TREE_OPERAND (t, 1));
+   /* If either OP0 or OP1 was value- or type-dependent, suppress
+  warnings that depend on the range of the types involved.  */
+   tree op0 = TREE_OPERAND (t, 0);
+   tree op1 = TREE_OPERAND (t, 1);
+   auto dep_p = [](tree t) {
+ ++processing_template_decl;
+ bool r = (potential_constant_expression (t)
+   ? value_dependent_expression_p (t)
+   : type_dependent_expression_p (t));
+ --processing_template_decl;
+ return r;
+   };
+   const bool was_dep = dep_p (op0) || dep_p (op1);
+   op0 = RECUR (op0);
+   op1 = RECUR (op1);
 
warning_sentinel s1(warn_type_limits, was_dep);
warning_sentinel s2(warn_div_by_zero, was_dep);
diff --git a/gcc/testsuite/g++.dg/warn/Wtype-limits6.C b/gcc/testsuite/g++.dg/warn/Wtype-limits6.C
new file mode 100644
index 000..9d5886d5323
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wtype-limits6.C
@@ -0,0 +1,17 @@
+// PR c++/100161
+// { dg-additional-options "-Wtype-limits" }
+
+void f(unsigned);
+
+template<unsigned n>
+void g()
+{
+for (unsigned i = 0; i < n; i++) { // { dg-bogus "always false" }
+f(i);
+}
+}
+
+void h()
+{
+g<0>();
+}

base-commit: 5491da23088734d516aae220810f253b9922c98c
-- 
2.30.2



Re: [PATCH] or1k: Add mcmodel option to handle large GOTs

2021-04-20 Thread Jeff Law via Gcc-patches



On 4/20/2021 6:12 PM, Stafford Horne wrote:

On Tue, Apr 20, 2021 at 05:20:39PM -0600, Jeff Law wrote:

On 4/18/2021 6:10 PM, Stafford Horne via Gcc-patches wrote:

When building libgeos we get an error with:

  linux-uclibc/9.3.0/crtbeginS.o: in function `__do_global_dtors_aux':
  crtstuff.c:(.text+0x118): relocation truncated to fit: R_OR1K_GOT16 
against symbol `__cxa_finalize' defined in .text section in
  
/home/shorne/work/openrisc/3eb9f9d0f6d8274b2d19753c006bd83f7d536e3c/output/host/or1k-buildroot-linux-uclibc/sysroot/lib/libc.so.

This is caused by GOT code having a limit of 64k.  In OpenRISC this
looks to be the only relocation code pattern to be limited to 64k.

This patch allows specifying a new option -mcmodel=large which can be
used to generate 2 more instructions to construct 32-bit addresses for
up to 4G GOTs.
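
As an illustration of building a 32-bit offset from two 16-bit pieces, here is a sketch of the common "high-adjusted plus signed low" split. That gotha follows exactly this convention is an assumption, and split_ha_lo and combine are hypothetical helper names, not part of the patch.

```cpp
#include <cstdint>

// Two 16-bit immediates that together describe a 32-bit offset.
struct HiLo
{
  std::uint16_t ha;  // high half, adjusted for the sign of the low half
  std::uint16_t lo;  // low half, later treated as a signed immediate
};

// Split OFFSET so that (ha << 16) + sign_extend (lo) == OFFSET.
HiLo split_ha_lo (std::uint32_t offset)
{
  HiLo r;
  // Adding 0x8000 rounds the high half up whenever the low half will
  // be read back as a negative 16-bit value.
  r.ha = static_cast<std::uint16_t> ((offset + 0x8000u) >> 16);
  r.lo = static_cast<std::uint16_t> (offset & 0xffffu);
  return r;
}

// Recombine the two halves the way an add of a shifted high part and
// a sign-extended low part would (arithmetic is modulo 2^32).
std::uint32_t combine (HiLo p)
{
  std::int32_t lo_signed
    = static_cast<std::int32_t> (p.lo ^ 0x8000u) - 0x8000;
  return (static_cast<std::uint32_t> (p.ha) << 16)
	 + static_cast<std::uint32_t> (lo_signed);
}
```

The round trip holds for every 32-bit offset, which is what lets a pair of 16-bit relocations reach beyond the 64k limit of a single one.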

gcc/ChangeLog:

PR 99783
* config/or1k/or1k-opts.h: New file.
* config/or1k/or1k.c (or1k_legitimize_address_1, print_reloc):
Support generating gotha relocations if -mcmodel=large is
specified.
* config/or1k/or1k.h (TARGET_CMODEL_SMALL, TARGET_CMODEL_LARGE):
New macros.
* config/or1k/or1k.opt (mcmodel=): New option.
* doc/invoke.texi (OpenRISC Options): Document mcmodel.

Note you're the port maintainer, so you don't need approval to commit this
to GCC.

Thanks for confirming; I learned that when Segher told me on some previous
patches. :)

For this series Giulio picked it up and tested it.  We found some issues, which
I am going to fix in a V2.  I should commit that version after the binutils
changes are in.


Sounds good.  Thanks.  Given we're making gcc-11 RCs, you'll have to 
decide whether introducing those changes this late in the cycle is 
worth the risk.


jeff



Re: [PATCH] or1k: Add mcmodel option to handle large GOTs

2021-04-20 Thread Stafford Horne via Gcc-patches
On Tue, Apr 20, 2021 at 05:20:39PM -0600, Jeff Law wrote:
> 
> On 4/18/2021 6:10 PM, Stafford Horne via Gcc-patches wrote:
> > When building libgeos we get an error with:
> > 
> >  linux-uclibc/9.3.0/crtbeginS.o: in function `__do_global_dtors_aux':
> >  crtstuff.c:(.text+0x118): relocation truncated to fit: R_OR1K_GOT16 
> > against symbol `__cxa_finalize' defined in .text section in
> >  
> > /home/shorne/work/openrisc/3eb9f9d0f6d8274b2d19753c006bd83f7d536e3c/output/host/or1k-buildroot-linux-uclibc/sysroot/lib/libc.so.
> > 
> > This is caused by GOT code having a limit of 64k.  In OpenRISC this
> > looks to be the only relocation code pattern to be limited to 64k.
> > 
> > This patch allows specifying a new option -mcmodel=large which can be
> > used to generate 2 more instructions to construct 32-bit addresses for
> > up to 4G GOTs.
> > 
> > gcc/ChangeLog:
> > 
> > PR 99783
> > * config/or1k/or1k-opts.h: New file.
> > * config/or1k/or1k.c (or1k_legitimize_address_1, print_reloc):
> > Support generating gotha relocations if -mcmodel=large is
> > specified.
> > * config/or1k/or1k.h (TARGET_CMODEL_SMALL, TARGET_CMODEL_LARGE):
> > New macros.
> > * config/or1k/or1k.opt (mcmodel=): New option.
> > * doc/invoke.texi (OpenRISC Options): Document mcmodel.
> 
> Note you're the port maintainer, so you don't need approval to commit this
> to GCC.

Thanks for confirming; I learned that when Segher told me on some previous
patches. :)

For this series Giulio picked it up and tested it.  We found some issues, which
I am going to fix in a V2.  I should commit that version after the binutils
changes are in.

-Stafford


[PATCH] libphobos: Fix build fails for powerpc-linux

2021-04-20 Thread Iain Buclaw via Gcc-patches
Hi,

This patch addresses the raised issue in the RC thread (haven't seen a
bugzilla PR for it?)

As register names are required for darwin, but not accepted by gas
unless you use `-mregnames', they have been conditionally removed on
non-darwin targets.

To avoid duplicating large blocks of almost identical code, the inline
assembly is now statically generated.
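
The string construction can be sketched outside D as follows. This is only an illustration in C++ of how the instruction text is assembled; the patch itself does it at compile time with static foreach, and store_insn and save_sequence are hypothetical names.

```cpp
#include <string>
#include <vector>

// Build one store instruction.  The "r" prefix is emitted only when
// DARWIN is true, mirroring the conditional register names.
std::string store_insn (const std::string &mnemonic, bool darwin, int reg)
{
  const std::string regname = darwin ? "r" : "";
  return mnemonic + " " + regname + std::to_string (reg) + ", %0";
}

// Build the whole sequence for the 19 saved registers 13 .. 31.
std::vector<std::string> save_sequence (const std::string &mnemonic,
					bool darwin)
{
  std::vector<std::string> insns;
  for (int i = 0; i < 19; ++i)
    insns.push_back (store_insn (mnemonic, darwin, 13 + i));
  return insns;
}
```

Generating the text from the loop index means the register number and the array slot cannot drift apart the way hand-written lines can.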

I've tested this on powerpc64le-linux-gnu.  The build succeeds, and both
the gdc and libphobos testsuites look fine.

OK for others if I commit this to mainline and releases/gcc-11?

Regards,
Iain.

---
libphobos/ChangeLog:

* libdruntime/core/thread/osthread.d (callWithStackShell): Statically
generate PPC and PPC64 asm implementations, and conditionally remove
PPC register names on non-Darwin targets.
---
 libphobos/libdruntime/core/thread/osthread.d | 68 +++-
 1 file changed, 24 insertions(+), 44 deletions(-)

diff --git a/libphobos/libdruntime/core/thread/osthread.d b/libphobos/libdruntime/core/thread/osthread.d
index 31764e69691..9fcd30e50fb 100644
--- a/libphobos/libdruntime/core/thread/osthread.d
+++ b/libphobos/libdruntime/core/thread/osthread.d
@@ -1444,55 +1444,35 @@ in (fn)
 else version (PPC)
 {
 void*[19] regs = void;
-asm pure nothrow @nogc
-{
-"stw r13, %0" : "=m" (regs[ 0]);
-"stw r14, %0" : "=m" (regs[ 1]);
-"stw r15, %0" : "=m" (regs[ 2]);
-"stw r16, %0" : "=m" (regs[ 3]);
-"stw r17, %0" : "=m" (regs[ 4]);
-"stw r18, %0" : "=m" (regs[ 5]);
-"stw r19, %0" : "=m" (regs[ 6]);
-"stw r20, %0" : "=m" (regs[ 7]);
-"stw r21, %0" : "=m" (regs[ 9]);
-"stw r22, %0" : "=m" (regs[ 9]);
-"stw r23, %0" : "=m" (regs[10]);
-"stw r24, %0" : "=m" (regs[11]);
-"stw r25, %0" : "=m" (regs[12]);
-"stw r26, %0" : "=m" (regs[13]);
-"stw r27, %0" : "=m" (regs[14]);
-"stw r28, %0" : "=m" (regs[15]);
-"stw r29, %0" : "=m" (regs[16]);
-"stw r30, %0" : "=m" (regs[17]);
-"stw r31, %0" : "=m" (regs[18]);
-}
+version (Darwin)
+enum regname = "r";
+else
+enum regname = "";
+static foreach (i; 0 .. regs.length)
+{{
+enum int j = 13 + i; // source register
+asm pure nothrow @nogc
+{
+"stw "~regname~j.stringof~", %0" : "=m" (regs[i]);
+}
+}}
 sp = cast(void*)&regs[0];
 }
 else version (PPC64)
 {
 void*[19] regs = void;
-asm pure nothrow @nogc
-{
-"std r13, %0" : "=m" (regs[ 0]);
-"std r14, %0" : "=m" (regs[ 1]);
-"std r15, %0" : "=m" (regs[ 2]);
-"std r16, %0" : "=m" (regs[ 3]);
-"std r17, %0" : "=m" (regs[ 4]);
-"std r18, %0" : "=m" (regs[ 5]);
-"std r19, %0" : "=m" (regs[ 6]);
-"std r20, %0" : "=m" (regs[ 7]);
-"std r21, %0" : "=m" (regs[ 8]);
-"std r22, %0" : "=m" (regs[ 9]);
-"std r23, %0" : "=m" (regs[10]);
-"std r24, %0" : "=m" (regs[11]);
-"std r25, %0" : "=m" (regs[12]);
-"std r26, %0" : "=m" (regs[13]);
-"std r27, %0" : "=m" (regs[14]);
-"std r28, %0" : "=m" (regs[15]);
-"std r29, %0" : "=m" (regs[16]);
-"std r30, %0" : "=m" (regs[17]);
-"std r31, %0" : "=m" (regs[18]);
-}
+version (Darwin)
+enum regname = "r";
+else
+enum regname = "";
+static foreach (i; 0 .. regs.length)
+{{
+enum int j = 13 + i; // source register
+asm pure nothrow @nogc
+{
+"std "~regname~j.stringof~", %0" : "=m" (regs[i]);
+}
+}}
 sp = cast(void*)&regs[0];
 }
 else
-- 
2.27.0



Re: [PATCH] varasm: Two SECTION_RETAIN fixes [PR100130]

2021-04-20 Thread Jeff Law via Gcc-patches



On 4/19/2021 3:21 AM, Richard Sandiford via Gcc-patches wrote:

switch_to_section warns if we try to output a retain decl in a
section without a retain flag, or if we try to output a non-retain
decl in a section with a retain flag.  However, the warning only
applied if we were trying to “switch” to the current section.
This works if all decls that use a section are generated consecutively,
but not if there is an unrelated decl in between.

This patch makes the check unconditional, but suppresses the warning
if we're writing the section's named.decl (i.e. the decl from which
the section name and original flags were derived).

Also, the warning didn't fire for -fsection-anchors, for two reasons:
we allowed retain and non-retain to be put into the same block,
and we didn't pass a decl to switch_to_section.

Although these are arguably separate bugs, it isn't easy to fix them
independently without temporarily regressing -fsection-anchors targets.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Richard


gcc/
PR middle-end/100130
* varasm.c (get_block_for_decl): Make sure that any use of the
retain attribute matches the section's retain flag.
(switch_to_section): Check for retain mismatches even when
changing sections, but do not warn if the given decl is the
section's named.decl.
(output_object_block): Pass the first decl in the block (if any)
to switch_to_section.

gcc/testsuite/
PR middle-end/100130
* c-c++-common/attr-retain-10.c: New test.
* c-c++-common/attr-retain-11.c: Likewise.


OK

Jeff


Re: [PATCH] or1k: Add mcmodel option to handle large GOTs

2021-04-20 Thread Jeff Law via Gcc-patches



On 4/18/2021 6:10 PM, Stafford Horne via Gcc-patches wrote:

When building libgeos we get an error with:

 linux-uclibc/9.3.0/crtbeginS.o: in function `__do_global_dtors_aux':
 crtstuff.c:(.text+0x118): relocation truncated to fit: R_OR1K_GOT16 
against symbol `__cxa_finalize' defined in .text section in
 
/home/shorne/work/openrisc/3eb9f9d0f6d8274b2d19753c006bd83f7d536e3c/output/host/or1k-buildroot-linux-uclibc/sysroot/lib/libc.so.

This is caused by GOT code having a limit of 64k.  In OpenRISC this
looks to be the only relocation code pattern to be limited to 64k.

This patch allows specifying a new option -mcmodel=large which can be
used to generate 2 more instructions to construct 32-bit addresses for
up to 4G GOTs.

gcc/ChangeLog:

PR 99783
* config/or1k/or1k-opts.h: New file.
* config/or1k/or1k.c (or1k_legitimize_address_1, print_reloc):
Support generating gotha relocations if -mcmodel=large is
specified.
* config/or1k/or1k.h (TARGET_CMODEL_SMALL, TARGET_CMODEL_LARGE):
New macros.
* config/or1k/or1k.opt (mcmodel=): New option.
* doc/invoke.texi (OpenRISC Options): Document mcmodel.


Note you're the port maintainer, so you don't need approval to commit 
this to GCC.


Jeff



Re: [PATCH] PR tree-optimization/100081 - Limit depth of logical expression windback.

2021-04-20 Thread Jeff Law via Gcc-patches



On 4/19/2021 1:40 AM, Jakub Jelinek via Gcc-patches wrote:

On Sun, Apr 18, 2021 at 08:11:21PM -0400, Andrew MacLeod via Gcc-patches wrote:

--- a/gcc/gimple-range-gori.cc
+++ b/gcc/gimple-range-gori.cc
@@ -29,6 +29,36 @@ along with GCC; see the file COPYING3.  If not see
  #include "gimple-pretty-print.h"
  #include "gimple-range.h"
  
+// Limit the nested depth thru logical expressions which GORI will build
+// def chains.
+#define LOGICAL_LIMIT  6

Such limits should be in params.def so that users can override them.


+// Return TRUE if GS is a logical && or || expression.
+
+static inline bool
+is_gimple_logical_p (const gimple *gs)
+{
+  // Look for boolean and/or condition.
+  if (gimple_code (gs) == GIMPLE_ASSIGN)

   if (is_gimple_assign (gs))

is the normal spelling of this check.

But more importantly, isn't 6 too low for logicals, and wouldn't it be
better to put a cap not on the number of seen logicals, but on how many
SSA_NAMEs are handled in a query?  Because it is easy to construct
testcases where the logical limit would not trigger but the operands
of comparisons used in logicals would be thousands of arithmetic/bitwise
statements etc.  And it isn't just artificial, we have various examples
of generated tests in bugzilla that typically can't be compiled in
reasonable time at -O2 and need to use -O1 and have huge basic blocks with
very deep chains.


FWIW, the DOM code which tries to do similar things has a 4 level 
recursion limit which seemed to catch the vast majority of cases. That 
translates into ~8 operands most of the time.    So Andrew's check seems 
to be in the right ballpark (it's doing something slightly different, 
but I think it's close enough to be comparable).
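
A generic sketch of what a recursion-depth cap does to a walk over operand chains is below. It is not the DOM or GORI code; Node and visit are hypothetical, and the limit is just a parameter.

```cpp
// Toy expression node: a leaf has both operands null.
struct Node
{
  const Node *lhs;
  const Node *rhs;
};

// Count the nodes looked at when descent stops at LIMIT levels.
// A node at the limit is still inspected, but its operands are not.
int visit (const Node *n, int depth, int limit)
{
  if (!n)
    return 0;
  if (depth >= limit)
    return 1;
  return 1
	 + visit (n->lhs, depth + 1, limit)
	 + visit (n->rhs, depth + 1, limit);
}
```

With a cap, the cost of one query is bounded by the limit rather than by the length of the chain, which is the property both limits are after.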



jeff




Re: [PATCH] testsuite: Fix gcc.dg/vect/bb-slp-39.c on IBM Z

2021-04-20 Thread Jeff Law via Gcc-patches



On 4/20/2021 2:17 AM, Stefan Schulze Frielinghaus via Gcc-patches wrote:

On IBM Z the aliasing stores are realized through one element vector
instructions, if no cost model for vectorization is used which is the
default according to vect.exp.  Fixed by changing the number of times
the pattern must be found in the dump.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/bb-slp-39.c: Change number of times the pattern
must match for target IBM Z only.


OK

jeff



Re: [PATCH] testsuite: Xfail gcc.dg/vect/pr71264.c on IBM Z

2021-04-20 Thread Jeff Law via Gcc-patches



On 4/20/2021 7:21 AM, Stefan Schulze Frielinghaus via Gcc-patches wrote:

The test fails for targets with V4QImode support which is the case for
IBM Z.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/pr71264.c: Xfail on IBM Z due to V4QImode support.


OK

jeff




Re: [Patch][GCC12] Fortran/OpenMP: Add 'omp depobj' and 'depend(mutexinoutset:'

2021-04-20 Thread Tobias Burnus

I have now updated the patch and intend to commit it tomorrow, unless
there are further comments.

On 17.03.21 19:29, Jakub Jelinek wrote:

On Wed, Mar 17, 2021 at 07:19:29PM +0100, Tobias Burnus wrote:

@@ -1831,6 +1852,7 @@ show_omp_node (int level, gfc_code *c)
+case EXEC_OMP_DEPOBJ: name = "DEPBOBJ"; break;

s/DEPBOBJ/DEPOBJ/

+  || omp_clauses->depobj->ts.kind != 2*gfc_index_integer_kind

Formatting (several times), I think we should use 2 * gfc_index_integer_kind

Fixed.

@@ -2545,6 +2545,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
+  if (POINTER_TYPE_P (TREE_TYPE (decl)))
+decl = build_fold_indirect_ref (decl);

I'm a little bit worried about this, are you sure it won't affect anything
but depobj?

I have added an 'n->u.depend_op == OMP_DEPEND_DEPOBJ' to be sure.

+  case OMP_DEPEND_MUTEXINOUTSET: k = GOMP_DEPEND_MUTEXINOUTSET; break;
+  case OMP_DEPEND_DEPOBJ: k = GOMP_DEPEND_MUTEXINOUTSET; break;

Can depobj_update be OMP_DEPEND_DEPOBJ ?


'update' can't, but 'depend' can (though only in OpenMP 5.1); moved the
testcase, fixed the check and removed the 'case ...DEPOBJ:' line for now.


Otherwise LGTM.


Attached is the full patch.

Tobias

Fortran/OpenMP: Add 'omp depobj' and 'depend(mutexinoutset:'

gcc/fortran/ChangeLog:

	* dump-parse-tree.c (show_omp_namelist): Handle depobj + mutexinoutset
	in the depend clause.
	(show_omp_clauses, show_omp_node, show_code_node): Handle depobj.
	* gfortran.h (enum gfc_statement): Add ST_OMP_DEPOBJ.
	(enum gfc_omp_depend_op): Add OMP_DEPEND_UNSET,
	OMP_DEPEND_MUTEXINOUTSET and OMP_DEPEND_DEPOBJ.
	(gfc_omp_clauses): Add destroy, depobj_update and depobj.
	(enum gfc_exec_op): Add EXEC_OMP_DEPOBJ.
	* match.h (gfc_match_omp_depobj): Match 'omp depobj'.
	* openmp.c (gfc_match_omp_clauses): Add depobj + mutexinoutset
	to depend clause.
	(gfc_match_omp_depobj, resolve_omp_clauses, gfc_resolve_omp_directive):
	Handle 'omp depobj'.
	* parse.c (decode_omp_directive, next_statement, gfc_ascii_statement):
	Likewise.
	* resolve.c (gfc_resolve_code): Likewise.
	* st.c (gfc_free_statement): Likewise.
	* trans-openmp.c (gfc_trans_omp_clauses): Handle depobj + mutexinoutset
	in the depend clause.
	(gfc_trans_omp_depobj, gfc_trans_omp_directive): Handle EXEC_OMP_DEPOBJ.
	* trans.c (trans_code): Likewise.

libgomp/ChangeLog:

	* testsuite/libgomp.fortran/depobj-1.f90: New test.

gcc/testsuite/ChangeLog:

	* gfortran.dg/gomp/depobj-1.f90: New test.
	* gfortran.dg/gomp/depobj-2.f90: New test.

 gcc/fortran/dump-parse-tree.c  |  33 
 gcc/fortran/gfortran.h |  12 ++-
 gcc/fortran/match.h|   1 +
 gcc/fortran/openmp.c   | 113 +
 gcc/fortran/parse.c|   6 +-
 gcc/fortran/resolve.c  |   1 +
 gcc/fortran/st.c   |   1 +
 gcc/fortran/trans-openmp.c |  68 +++
 gcc/fortran/trans.c|   1 +
 gcc/testsuite/gfortran.dg/gomp/depobj-1.f90|  25 ++
 gcc/testsuite/gfortran.dg/gomp/depobj-2.f90|  33 
 libgomp/testsuite/libgomp.fortran/depobj-1.f90 | 113 +
 12 files changed, 402 insertions(+), 5 deletions(-)

diff --git a/gcc/fortran/dump-parse-tree.c b/gcc/fortran/dump-parse-tree.c
index 059d8421bb5..b50265ac742 100644
--- a/gcc/fortran/dump-parse-tree.c
+++ b/gcc/fortran/dump-parse-tree.c
@@ -1332,6 +1332,10 @@ show_omp_namelist (int list_type, gfc_omp_namelist *n)
 	  case OMP_DEPEND_IN: fputs ("in:", dumpfile); break;
 	  case OMP_DEPEND_OUT: fputs ("out:", dumpfile); break;
 	  case OMP_DEPEND_INOUT: fputs ("inout:", dumpfile); break;
+	  case OMP_DEPEND_DEPOBJ: fputs ("depobj:", dumpfile); break;
+	  case OMP_DEPEND_MUTEXINOUTSET:
+	fputs ("mutexinoutset:", dumpfile);
+	break;
 	  case OMP_DEPEND_SINK_FIRST:
 	fputs ("sink:", dumpfile);
 	while (1)
@@ -1754,10 +1758,27 @@ show_omp_clauses (gfc_omp_clauses *omp_clauses)
   show_expr (omp_clauses->if_exprs[i]);
   fputc (')', dumpfile);
 }
+  if (omp_clauses->destroy)
+fputs (" DESTROY", dumpfile);
   if (omp_clauses->depend_source)
 fputs (" DEPEND(source)", dumpfile);
   if (omp_clauses->capture)
 fputs (" CAPTURE", dumpfile);
+  if (omp_clauses->depobj_update != OMP_DEPEND_UNSET)
+{
+  const char *deptype;
+  fputs (" UPDATE(", dumpfile);
+  switch (omp_clauses->depobj_update)
+	{
+	case OMP_DEPEND_IN: deptype = "IN"; break;
+	case OMP_DEPEND_OUT: deptype = "OUT"; break;
+	case OMP_DEPEND_INOUT: deptype = "INOUT"; break;
+	case OMP_DEPEND_MUTEXINOUTSET: deptype = "MUTEXINOUTSET"; break;
+	default: gcc_unreachable ();
+	}
+  fputs (deptype, 

[GCC 11] Fix PR ada/99360

2021-04-20 Thread Eric Botcazou
Hi,

I'd like to apply the attached stopgap fix for PR ada/99360 on the 11 branch, 
as the real fix is really stage #1 material.  This is a regression there.

Bootstrapped/regtested on x86-64/Linux, OK for the 11 branch?


2021-04-20  Eric Botcazou  
Bob Duff  

PR ada/99360
* exp_ch6.adb (Might_Have_Tasks): Return False when the type is the
class-wide type of a predefined iterator type.

-- 
Eric Botcazou
diff --git a/gcc/ada/exp_ch6.adb b/gcc/ada/exp_ch6.adb
index 2cd40e42f47..c1dc371786d 100644
--- a/gcc/ada/exp_ch6.adb
+++ b/gcc/ada/exp_ch6.adb
@@ -9616,7 +9616,15 @@ package body Exp_Ch6 is
 and then not No_Run_Time_Mode
 and then (Has_Task (Typ)
 or else (Is_Class_Wide_Type (Typ)
-   and then Is_Limited_Record (Typ)));
+   and then Is_Limited_Record (Typ)))
+
+--  Predefined iterator types do not contain tasks, even when
+--  class-wide.
+
+and then not (In_Predefined_Unit (Typ)
+and then Chars (Typ) in
+  Name_Find ("Tforward_iteratorC") |
+  Name_Find ("Treversible_iteratorC"));
end Might_Have_Tasks;
 



[PATCH] c++: Don't allow defining types in enum-base [PR96380]

2021-04-20 Thread Marek Polacek via Gcc-patches
In r11-2064 I made cp_parser_enum_specifier commit to tentative parse
when seeing a '{'.  That still looks like the correct thing to do, but
it caused an ICE-on-invalid as well as accepts-invalid.

When we have something sneaky like this, which is broken in multiple
ways:

  template 
  enum struct c : union enum struct c { e = b, f = a };

we parse the "enum struct c" part (that's OK) and then we see that
we have an enum-base, so we consume ':' and then parse the type-specifier
that follows the :.  "union enum" is clearly invalid, but we're still
parsing tentatively and we parse everything up to the ;, and then
throw away the underlying type.  We parsed everything because we were
tricked into parsing an enum-specifier in an enum-base of another
enum-specifier!  Not good.

Since the grammar for enum-base doesn't allow a defining-type-specifier,
only a type-specifier, we should set type_definition_forbidden_message
which fixes all the problems in this PR.
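
The scoped-override idea can be sketched as below. This is a hypothetical RAII helper in the spirit of make_temp_override, not GCC's own definition, and toy_parser is an invented stand-in for the parser state.

```cpp
// Hypothetical RAII helper: set a variable for the lifetime of the
// guard object and restore the previous value on destruction.
template <typename T>
class scoped_override
{
  T &slot_;
  T saved_;

public:
  scoped_override (T &slot, T value) : slot_ (slot), saved_ (slot)
  {
    slot_ = value;
  }
  ~scoped_override () { slot_ = saved_; }

  scoped_override (const scoped_override &) = delete;
  scoped_override &operator= (const scoped_override &) = delete;
};

// Invented stand-in for the one parser field that matters here.
struct toy_parser
{
  const char *type_definition_forbidden_message = nullptr;
};

// While the guard is alive, any attempt to define a type would see a
// non-null message; the previous value comes back on every exit path.
const char *message_while_parsing_enum_base (toy_parser &p)
{
  scoped_override<const char *> guard
    (p.type_definition_forbidden_message,
     "types may not be defined in enum-base");
  return p.type_definition_forbidden_message;
}
```

Tying the restore to a destructor keeps the message from leaking past the enum-base, including on early returns.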

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk (GCC 12)?
This may be too obscure to fix in 11.2, but it looks like a reasonable
change to backport.

gcc/cp/ChangeLog:

PR c++/96380
* parser.c (cp_parser_enum_specifier): Don't allow defining
types in enum-base.

gcc/testsuite/ChangeLog:

PR c++/96380
* g++.dg/cpp0x/enum_base4.C: New test.
* g++.dg/cpp0x/enum_base5.C: New test.
---
 gcc/cp/parser.c | 4 
 gcc/testsuite/g++.dg/cpp0x/enum_base4.C | 8 
 gcc/testsuite/g++.dg/cpp0x/enum_base5.C | 7 +++
 3 files changed, 19 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/enum_base4.C
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/enum_base5.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 99eccf0c5e4..fba516efa23 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -19942,6 +19942,10 @@ cp_parser_enum_specifier (cp_parser* parser)
   /* Consume the `:'.  */
   cp_lexer_consume_token (parser->lexer);
 
+  auto tdf
+   = make_temp_override (parser->type_definition_forbidden_message,
+ G_("types may not be defined in enum-base"));
+
   /* Parse the type-specifier-seq.  */
   cp_parser_type_specifier_seq (parser, CP_PARSER_FLAGS_NONE,
/*is_declaration=*/false,
diff --git a/gcc/testsuite/g++.dg/cpp0x/enum_base4.C b/gcc/testsuite/g++.dg/cpp0x/enum_base4.C
new file mode 100644
index 000..b3015256386
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/enum_base4.C
@@ -0,0 +1,8 @@
+// PR c++/96380
+// { dg-do compile { target c++11 } }
+
+extern const int a, b;
+enum struct c;
+template 
+enum struct c : union enum struct c { e = b, f = a };  // { dg-error "types may not be defined|expected|elaborated-type-specifier" }
+enum class c {};
diff --git a/gcc/testsuite/g++.dg/cpp0x/enum_base5.C b/gcc/testsuite/g++.dg/cpp0x/enum_base5.C
new file mode 100644
index 000..c01e857e612
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/enum_base5.C
@@ -0,0 +1,7 @@
+// PR c++/96380
+// { dg-do compile { target c++11 } }
+
+extern const int a, b;
+enum struct c;
+template 
+enum struct c : union enum struct c { e = b, f = a }; // { dg-error "types may not be defined|expected|elaborated-type-specifier" }

base-commit: 5491da23088734d516aae220810f253b9922c98c
-- 
2.30.2



Re: [PATCH] Bump LTO_major_version to 11.

2021-04-20 Thread Martin Liška
On 4/20/21 2:46 PM, Richard Biener wrote:
> OK.  Can you somehow arrange for trunk to pick up LTO_major from GCC
> major automagically then?

I have a pretty nice solution for it where I extended (and simplified)
the existing gcov-iov.c generator. With that change, we can remove gcc/version.[ch].

Using the patch, the following version.h is generated:

#ifndef VERSION_H
#define VERSION_H

/* Generated automatically by genversion.  */

#define GCC_major_version 12

/* The complete version string, assembled from several pieces.
BASEVER, DATESTAMP, DEVPHASE, and REVISION are defined by the
Makefile.  */

#define version_string "12.0.0 20210420 (experimental)"
#define pkgversion_string "(GCC) "

/* This is the location of the online document giving instructions for
reporting bugs.  If you distribute a modified version of GCC,
please configure with --with-bugurl pointing to a document giving
instructions for reporting bugs to you, not us.  (You are of course
welcome to forward us bugs reported to you, if you determine that
they are not bugs in your modifications.)  */

#define bug_report_url "<https://gcc.gnu.org/bugs/>"

#define GCOV_VERSION ((gcov_unsigned_t)0x42323020)  /* B20  */

#endif /* VERSION_H */
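
As a side note on the generated GCOV_VERSION value, the sketch below unpacks
the 32-bit tag into the four ASCII characters shown in the comment next to it.
The helper name decode_version_tag is hypothetical and not part of the patch;
it only illustrates how 0x42323020 corresponds to "B20 ".

```cpp
#include <cstdint>
#include <string>

// Unpack a 32-bit tag into four characters, most significant byte first.
std::string
decode_version_tag (std::uint32_t tag)
{
  std::string text;
  for (int shift = 24; shift >= 0; shift -= 8)
    text.push_back (static_cast<char> ((tag >> shift) & 0xff));
  return text;
}
```

Calling decode_version_tag (0x42323020u) yields the bytes 0x42, 0x32, 0x30 and
0x20, that is 'B', '2', '0' and a space.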

Ready for master?
Thanks,
Martin
From 86d1946449de8aec5e71030f6c7c48010e83cd67 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Tue, 20 Apr 2021 16:56:07 +0200
Subject: [PATCH] Use genversion to generate version.h.

c++tools/ChangeLog:

	* Makefile.in: Include also ../gcc folder.

gcc/ChangeLog:

	* Makefile.in: Rename gcov-iov to genversion and depend
	on version.h (instead of gcov-iov.h).
	* gcov-io.h: Include version.h instead of gcov-iov.h.
	* gengtype-state.c (read_state_version): Likewise.
	* gcov-iov.c: Moved to...
	* genversion.c: ...here.
	* lto-streamer.h (LTO_major_version): Define it with
	GCC_major_version.
	* version.c: Removed.
	* version.h: Removed.

libgcc/ChangeLog:

	* libgcov-driver.c (gcov_version): Use different name that does
	not clash with newly introduced macro.
---
 c++tools/Makefile.in |  4 +--
 gcc/Makefile.in  | 53 +++-
 gcc/gcov-io.h|  6 +---
 gcc/gengtype-state.c |  6 ++--
 gcc/{gcov-iov.c => genversion.c} | 42 -
 gcc/lto-streamer.h   |  3 +-
 gcc/version.c| 35 -
 gcc/version.h|  6 
 libgcc/libgcov-driver.c  |  4 +--
 9 files changed, 55 insertions(+), 104 deletions(-)
 rename gcc/{gcov-iov.c => genversion.c} (59%)
 delete mode 100644 gcc/version.c
 delete mode 100644 gcc/version.h

diff --git a/c++tools/Makefile.in b/c++tools/Makefile.in
index 9700706206d..afc87d07ad9 100644
--- a/c++tools/Makefile.in
+++ b/c++tools/Makefile.in
@@ -89,9 +89,9 @@ all::g++-mapper-server$(exeext)
 
 MAPPER.O := server.o resolver.o
 CODYLIB = ../libcody/libcody.a
-CXXINC += -I$(srcdir)/../libcody -I$(srcdir)/../include -I$(srcdir)/../gcc -I.
+CXXINC += -I$(srcdir)/../libcody -I$(srcdir)/../include -I$(srcdir)/../gcc -I. -I../gcc
 g++-mapper-server$(exeext): $(MAPPER.O) $(CODYLIB)
-	+$(CXX) $(LDFLAGS) $(PIEFLAG) -o $@ $^ $(VERSION.O) $(LIBIBERTY) $(NETLIBS)
+	+$(CXX) $(LDFLAGS) $(PIEFLAG) -o $@ $^ $(LIBIBERTY) $(NETLIBS)
 
 # copy to gcc dir so tests there can run
 all::../gcc/g++-mapper-server$(exeext)
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 8a5fb3fd99c..e5d07fb98b0 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -960,7 +960,7 @@ BASIC_BLOCK_H = basic-block.h $(PREDICT_H) $(VEC_H) $(FUNCTION_H) \
 GIMPLE_H = gimple.h gimple.def gsstruct.def $(VEC_H) \
 	$(GGC_H) $(BASIC_BLOCK_H) $(TREE_H) tree-ssa-operands.h \
 	tree-ssa-alias.h $(INTERNAL_FN_H) $(HASH_TABLE_H) is-a.h
-GCOV_IO_H = gcov-io.h gcov-iov.h auto-host.h gcov-counter.def
+GCOV_IO_H = gcov-io.h version.h auto-host.h gcov-counter.def
 RECOG_H = recog.h
 EMIT_RTL_H = emit-rtl.h
 FLAGS_H = flags.h flag-types.h $(OPTIONS_H)
@@ -1711,7 +1711,7 @@ OBJS-libcommon = diagnostic.o diagnostic-color.o diagnostic-show-locus.o \
 	edit-context.o \
 	pretty-print.o intl.o \
 	sbitmap.o \
-	vec.o input.o version.o hash-table.o ggc-none.o memory-block.o \
+	vec.o input.o hash-table.o ggc-none.o memory-block.o \
 	selftest.o selftest-diagnostic.o sort.o
 
 # Objects in libcommon-target.a, used by drivers and by the core
@@ -1763,7 +1763,7 @@ MOSTLYCLEANFILES = insn-flags.h insn-config.h insn-codes.h \
  $(EXTRA_PROGRAMS) gcc-cross$(exeext) \
  $(SPECS) collect2$(exeext) gcc-ar$(exeext) gcc-nm$(exeext) \
  gcc-ranlib$(exeext) \
- gcov-iov$(build_exeext) gcov$(exeext) gcov-dump$(exeext) \
+ genversion$(build_exeext) gcov$(exeext) gcov-dump$(exeext) \
  gcov-tool$(exeect) \
  gengtype$(exeext) *.[0-9][0-9].* *.[si] *-checksum.c libbackend.a \
  libcommon-target.a libcommon.a libgcc.mk perf.data
@@ -2144,7 +2144,7 @@ checksum-options:
 # Build libgcc.a.
 
 libgcc-support: libgcc.mvars stmp

Re: [PATCH v10] Practical improvement to libgcc complex divide

2021-04-20 Thread Joseph Myers
On Wed, 7 Apr 2021, Patrick McGehearty via Gcc-patches wrote:

> +   macro_name = XALLOCAVEC (char, name_len
> ++ sizeof ("__LIBGCC__EXCESS_PRECISION__"));
> sprintf (macro_name, "__LIBGCC_%s_EXCESS_PRECISION__", name);
> builtin_define_with_int_value (macro_name, excess_precision);
> +
> +   char val_name[64];
> +
> +   macro_name = XALLOCAVEC (char, name_len
> ++ sizeof ("__LIBGCC_EPSILON__"));
> +   sprintf (macro_name, "__LIBGCC_%s_EPSILON__", name);
> +   sprintf (val_name, "__%s_EPSILON__", float_h_prefix);
> +   builtin_define_with_value (macro_name, val_name, 0);
> +
> +   macro_name = XALLOCAVEC (char, name_len + sizeof ("__LIBGCC_MAX__"));
> +   sprintf (macro_name, "__LIBGCC_%s_MAX__", name);
> +   sprintf (val_name, "__%s_MAX__", float_h_prefix);
> +   builtin_define_with_value (macro_name, val_name, 0);
> +
> +   macro_name = XALLOCAVEC (char, name_len + sizeof ("__LIBGCC_MIN__"));
> +   sprintf (macro_name, "__LIBGCC_%s_MIN__", name);
> +   sprintf (val_name, "__%s_MIN__", float_h_prefix);
> +   builtin_define_with_value (macro_name, val_name, 0);

I think there's an off-by-one error calculating the allocation sizes for 
these three macro names.  Note that the code just above uses

  sizeof ("__LIBGCC__EXCESS_PRECISION__")

with two underscores between "__LIBGCC" and "EXCESS_PRECISION__", 
reflecting that the macro name being constructed has both those 
underscores (around the %s expansion of size name_len).  I think the three 
sizeof calls in the three subsequent allocations likewise need to have 
both those underscores present in their arguments.
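
To make the size arithmetic concrete, here is a small sketch that compares the
two sizeof expressions for the EPSILON case.  The function names are
hypothetical, and std::string stands in for the XALLOCAVEC buffer; only the
format string and the two literals come from the discussion above.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Bytes needed for "__LIBGCC_<name>_EPSILON__" including the terminating
// NUL: the literal carries both underscores that surround the %s expansion.
std::size_t
needed_size (const char *name)
{
  return std::strlen (name) + sizeof ("__LIBGCC__EPSILON__");
}

// Size as computed in the quoted patch: one underscore is missing from the
// literal, so the result is one byte short.
std::size_t
patch_size (const char *name)
{
  return std::strlen (name) + sizeof ("__LIBGCC_EPSILON__");
}

// Build the macro name in a buffer of exactly needed_size bytes.
std::string
build_macro_name (const char *name)
{
  std::string buf (needed_size (name), '\0');
  std::snprintf (&buf[0], buf.size (), "__LIBGCC_%s_EPSILON__", name);
  buf.resize (std::strlen (buf.c_str ()));
  return buf;
}
```

For name "TF" the formatted string has 21 characters, so 22 bytes are needed;
the corrected expression gives 22, while the expression from the patch gives 21.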

> diff --git a/libgcc/config/rs6000/_divkc3.c b/libgcc/config/rs6000/_divkc3.c
> index d261f40..f7fa47f 100644
> --- a/libgcc/config/rs6000/_divkc3.c
> +++ b/libgcc/config/rs6000/_divkc3.c
> @@ -37,29 +37,115 @@ see the files COPYING3 and COPYING.RUNTIME respectively. 
>  If not, see
>  #define __divkc3 __divkc3_sw
>  #endif
>  
> +#define RBIG   (__LIBGCC_TF_MAX__ / 2)
> +#define RMIN   (__LIBGCC_TF_MIN__)
> +#define RMIN2  (__LIBGCC_TF_EPSILON__)
> +#define RMINSCAL (1 / __LIBGCC_TF_EPSILON__)
> +#define RMAX2  (RBIG * RMIN2)

This file includes quad-float128.h, which does some remapping from TF to 
KF depending on __LONG_DOUBLE_IEEE128__.

I think you probably need to have a similar __LONG_DOUBLE_IEEE128__ 
conditional here.  If __LONG_DOUBLE_IEEE128__ is not defined, use 
__LIBGCC_KF_* macros instead of __LIBGCC_TF_*; if __LONG_DOUBLE_IEEE128__ 
is defined, use __LIBGCC_TF_* as above.  (Unless the powerpc maintainers 
say otherwise.)

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH] x86: Move OPTION_MASK_* to i386-common.h

2021-04-20 Thread H.J. Lu via Gcc-patches
Move OPTION_MASK_* to i386-common.h so that they can be used in x86
backend.

* common/config/i386/i386-common.c (OPTION_MASK_*): Move to ...
* common/config/i386/i386-common.h: Here.  New file.
* config/i386/i386.h: Include common/config/i386/i386-common.h.
---
 gcc/common/config/i386/i386-common.c | 297 -
 gcc/common/config/i386/i386-common.h | 315 +++
 gcc/config/i386/i386.h   |   1 +
 3 files changed, 316 insertions(+), 297 deletions(-)
 create mode 100644 gcc/common/config/i386/i386-common.h

diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
index 1e6c1590ac4..37ff47bd676 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -29,303 +29,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "opts.h"
 #include "flags.h"
 
-/* Define a set of ISAs which are available when a given ISA is
-   enabled.  MMX and SSE ISAs are handled separately.  */
-
-#define OPTION_MASK_ISA_MMX_SET OPTION_MASK_ISA_MMX
-#define OPTION_MASK_ISA_3DNOW_SET \
-  (OPTION_MASK_ISA_3DNOW | OPTION_MASK_ISA_MMX_SET)
-#define OPTION_MASK_ISA_3DNOW_A_SET \
-  (OPTION_MASK_ISA_3DNOW_A | OPTION_MASK_ISA_3DNOW_SET)
-
-#define OPTION_MASK_ISA_SSE_SET OPTION_MASK_ISA_SSE
-#define OPTION_MASK_ISA_SSE2_SET \
-  (OPTION_MASK_ISA_SSE2 | OPTION_MASK_ISA_SSE_SET)
-#define OPTION_MASK_ISA_SSE3_SET \
-  (OPTION_MASK_ISA_SSE3 | OPTION_MASK_ISA_SSE2_SET)
-#define OPTION_MASK_ISA_SSSE3_SET \
-  (OPTION_MASK_ISA_SSSE3 | OPTION_MASK_ISA_SSE3_SET)
-#define OPTION_MASK_ISA_SSE4_1_SET \
-  (OPTION_MASK_ISA_SSE4_1 | OPTION_MASK_ISA_SSSE3_SET)
-#define OPTION_MASK_ISA_SSE4_2_SET \
-  (OPTION_MASK_ISA_SSE4_2 | OPTION_MASK_ISA_SSE4_1_SET)
-#define OPTION_MASK_ISA_AVX_SET \
-  (OPTION_MASK_ISA_AVX | OPTION_MASK_ISA_SSE4_2_SET \
-   | OPTION_MASK_ISA_XSAVE_SET)
-#define OPTION_MASK_ISA_FMA_SET \
-  (OPTION_MASK_ISA_FMA | OPTION_MASK_ISA_AVX_SET)
-#define OPTION_MASK_ISA_AVX2_SET \
-  (OPTION_MASK_ISA_AVX2 | OPTION_MASK_ISA_AVX_SET)
-#define OPTION_MASK_ISA_FXSR_SET OPTION_MASK_ISA_FXSR
-#define OPTION_MASK_ISA_XSAVE_SET OPTION_MASK_ISA_XSAVE
-#define OPTION_MASK_ISA_XSAVEOPT_SET \
-  (OPTION_MASK_ISA_XSAVEOPT | OPTION_MASK_ISA_XSAVE_SET)
-#define OPTION_MASK_ISA_AVX512F_SET \
-  (OPTION_MASK_ISA_AVX512F | OPTION_MASK_ISA_AVX2_SET)
-#define OPTION_MASK_ISA_AVX512CD_SET \
-  (OPTION_MASK_ISA_AVX512CD | OPTION_MASK_ISA_AVX512F_SET)
-#define OPTION_MASK_ISA_AVX512PF_SET \
-  (OPTION_MASK_ISA_AVX512PF | OPTION_MASK_ISA_AVX512F_SET)
-#define OPTION_MASK_ISA_AVX512ER_SET \
-  (OPTION_MASK_ISA_AVX512ER | OPTION_MASK_ISA_AVX512F_SET)
-#define OPTION_MASK_ISA_AVX512DQ_SET \
-  (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512F_SET)
-#define OPTION_MASK_ISA_AVX512BW_SET \
-  (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512F_SET)
-#define OPTION_MASK_ISA_AVX512VL_SET \
-  (OPTION_MASK_ISA_AVX512VL | OPTION_MASK_ISA_AVX512F_SET)
-#define OPTION_MASK_ISA_AVX512IFMA_SET \
-  (OPTION_MASK_ISA_AVX512IFMA | OPTION_MASK_ISA_AVX512F_SET)
-#define OPTION_MASK_ISA_AVX512VBMI_SET \
-  (OPTION_MASK_ISA_AVX512VBMI | OPTION_MASK_ISA_AVX512BW_SET)
-#define OPTION_MASK_ISA2_AVX5124FMAPS_SET OPTION_MASK_ISA2_AVX5124FMAPS
-#define OPTION_MASK_ISA2_AVX5124VNNIW_SET OPTION_MASK_ISA2_AVX5124VNNIW
-#define OPTION_MASK_ISA_AVX512VBMI2_SET \
-  (OPTION_MASK_ISA_AVX512VBMI2 | OPTION_MASK_ISA_AVX512F_SET)
-#define OPTION_MASK_ISA_AVX512VNNI_SET \
-  (OPTION_MASK_ISA_AVX512VNNI | OPTION_MASK_ISA_AVX512F_SET)
-#define OPTION_MASK_ISA2_AVXVNNI_SET OPTION_MASK_ISA2_AVXVNNI
-#define OPTION_MASK_ISA_AVX512VPOPCNTDQ_SET \
-  (OPTION_MASK_ISA_AVX512VPOPCNTDQ | OPTION_MASK_ISA_AVX512F_SET)
-#define OPTION_MASK_ISA_AVX512BITALG_SET \
-  (OPTION_MASK_ISA_AVX512BITALG | OPTION_MASK_ISA_AVX512F_SET)
-#define OPTION_MASK_ISA2_AVX512BF16_SET OPTION_MASK_ISA2_AVX512BF16
-#define OPTION_MASK_ISA_RTM_SET OPTION_MASK_ISA_RTM
-#define OPTION_MASK_ISA_PRFCHW_SET OPTION_MASK_ISA_PRFCHW
-#define OPTION_MASK_ISA_RDSEED_SET OPTION_MASK_ISA_RDSEED
-#define OPTION_MASK_ISA_ADX_SET OPTION_MASK_ISA_ADX
-#define OPTION_MASK_ISA_PREFETCHWT1_SET OPTION_MASK_ISA_PREFETCHWT1
-#define OPTION_MASK_ISA_CLFLUSHOPT_SET OPTION_MASK_ISA_CLFLUSHOPT
-#define OPTION_MASK_ISA_XSAVES_SET \
-  (OPTION_MASK_ISA_XSAVES | OPTION_MASK_ISA_XSAVE_SET)
-#define OPTION_MASK_ISA_XSAVEC_SET \
-  (OPTION_MASK_ISA_XSAVEC | OPTION_MASK_ISA_XSAVE_SET)
-#define OPTION_MASK_ISA_CLWB_SET OPTION_MASK_ISA_CLWB
-#define OPTION_MASK_ISA2_AVX512VP2INTERSECT_SET OPTION_MASK_ISA2_AVX512VP2INTERSECT
-#define OPTION_MASK_ISA2_AMX_TILE_SET OPTION_MASK_ISA2_AMX_TILE
-#define OPTION_MASK_ISA2_AMX_INT8_SET OPTION_MASK_ISA2_AMX_INT8
-#define OPTION_MASK_ISA2_AMX_BF16_SET OPTION_MASK_ISA2_AMX_BF16
-
-/* SSE4 includes both SSE4.1 and SSE4.2. -msse4 should be the same
-   as -msse4.2.  */
-#define OPTION_MASK_ISA_SSE4_SET OPTION_MASK_ISA_SSE4_2_SET
-
-#define OPTION_MASK_ISA_SSE4A_SET \
-  

[PATCH] x86: Add general_regs_only function attribute

2021-04-20 Thread H.J. Lu via Gcc-patches
commit 87c753ac241f25d222d46ba1ac66ceba89d6a200
Author: H.J. Lu 
Date:   Fri Aug 21 09:42:49 2020 -0700

x86: Add target("general-regs-only") function attribute

is incomplete since it is impossible to call integer intrinsics from
a function with general-regs-only target attribute.

1. Add general_regs_only function attribute to inform the compiler that
functions use only general purpose registers.  When making inlining
decisions on such functions, non-GPR compiler options are excluded.
2. Add general_regs_only attribute to x86 intrinsics which use only
general purpose registers.

gcc/

PR target/99744
* config/i386/i386-options.c (ix86_attribute_table): Add
general_regs_only.
* config/i386/i386.c (ix86_can_inline_p): Exclude non-integer
target options if callee has general_regs_only attribute.
* config/i386/adxintrin.h: Add general_regs_only attribute to
intrinsics which use only general purpose registers.
* config/i386/bmiintrin.h: Likewise.
* config/i386/bmi2intrin.h: Likewise.
* config/i386/cetintrin.h: Likewise.
* config/i386/cldemoteintrin.h: Likewise.
* config/i386/clflushoptintrin.h: Likewise.
* config/i386/clwbintrin.h: Likewise.
* config/i386/clzerointrin.h: Likewise.
* config/i386/enqcmdintrin.h: Likewise.
* config/i386/fxsrintrin.h: Likewise.
* config/i386/hresetintrin.h: Likewise.
* config/i386/ia32intrin.h: Likewise.
* config/i386/lwpintrin.h: Likewise.
* config/i386/lzcntintrin.h: Likewise.
* config/i386/movdirintrin.h: Likewise.
* config/i386/mwaitxintrin.h: Likewise.
* config/i386/pconfigintrin.h: Likewise.
* config/i386/pkuintrin.h: Likewise.
* config/i386/popcntintrin.h: Likewise.
* config/i386/rdseedintrin.h: Likewise.
* config/i386/rtmintrin.h: Likewise.
* config/i386/serializeintrin.h: Likewise.
* config/i386/sgxintrin.h: Likewise.
* config/i386/tbmintrin.h: Likewise.
* config/i386/tsxldtrkintrin.h: Likewise.
* config/i386/uintrintrin.h: Likewise.
* config/i386/waitpkgintrin.h: Likewise.
* config/i386/wbnoinvdintrin.h: Likewise.
* config/i386/x86gprintrin.h: Likewise.
* config/i386/xsavecintrin.h: Likewise.
* config/i386/xsaveintrin.h: Likewise.
* config/i386/xsaveoptintrin.h: Likewise.
* config/i386/xsavesintrin.h: Likewise.
* config/i386/xtestintrin.h: Likewise.
* doc/extend.texi: Document general_regs_only function attribute.

gcc/testsuite/

PR target/99744
* gcc.target/i386/pr99744-3.c: New test.
* gcc.target/i386/pr99744-4.c: Likewise.
---
 gcc/config/i386/adxintrin.h   |  18 +-
 gcc/config/i386/bmi2intrin.h  |  24 +-
 gcc/config/i386/bmiintrin.h   |  92 --
 gcc/config/i386/cetintrin.h   |  33 +-
 gcc/config/i386/cldemoteintrin.h  |   3 +-
 gcc/config/i386/clflushoptintrin.h|   3 +-
 gcc/config/i386/clwbintrin.h  |   3 +-
 gcc/config/i386/clzerointrin.h|   4 +-
 gcc/config/i386/enqcmdintrin.h|   6 +-
 gcc/config/i386/fxsrintrin.h  |  12 +-
 gcc/config/i386/hresetintrin.h|   3 +-
 gcc/config/i386/i386-options.c|   2 +
 gcc/config/i386/i386.c|  29 +-
 gcc/config/i386/ia32intrin.h  |  94 --
 gcc/config/i386/lwpintrin.h   |  24 +-
 gcc/config/i386/lzcntintrin.h |  20 +-
 gcc/config/i386/movdirintrin.h|   9 +-
 gcc/config/i386/mwaitxintrin.h|   8 +-
 gcc/config/i386/pconfigintrin.h   |   3 +-
 gcc/config/i386/pkuintrin.h   |   6 +-
 gcc/config/i386/popcntintrin.h|   8 +-
 gcc/config/i386/rdseedintrin.h|   9 +-
 gcc/config/i386/rtmintrin.h   |   9 +-
 gcc/config/i386/serializeintrin.h |   8 +-
 gcc/config/i386/sgxintrin.h   |   9 +-
 gcc/config/i386/tbmintrin.h   |  80 +++--
 gcc/config/i386/tsxldtrkintrin.h  |   6 +-
 gcc/config/i386/uintrintrin.h |  12 +-
 gcc/config/i386/waitpkgintrin.h   |   9 +-
 gcc/config/i386/wbnoinvdintrin.h  |   3 +-
 gcc/config/i386/x86gprintrin.h|  45 ++-
 gcc/config/i386/xsavecintrin.h|   6 +-
 gcc/config/i386/xsaveintrin.h |  18 +-
 gcc/config/i386/xsaveoptintrin.h  |   6 +-
 gcc/config/i386/xsavesintrin.h|  12 +-
 gcc/config/i386/xtestintrin.h |   3 +-
 gcc/doc/extend.texi   |   5 +
 gcc/testsuite/gcc.target/i386/pr99744-3.c |  13 +
 gcc/testsuite/gcc.target/i386/pr99744-4.c | 357 ++
 39 files changed, 831 insertions(+), 183 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-3.c
 create mode 100644 

Re: [wwwdocs] Add znver3 support to changes.html

2021-04-20 Thread Gerald Pfeifer
Hi Martin,

On Tue, 20 Apr 2021, Martin Jambor wrote:
> I'd like to ping the following, since we already have an RC.

sorry for not getting to this via list right away.

> > +  GCC 10.3 supports AMD CPUs based on znver3 core
> > +  through -march=znver3.

I believe "based on the ... core" will be better, and (just a 
recommendation, totally your call) I would use "via" over "through" 
to make this more direct.

This applies to both parts of the original patch and both are fine. :)

Thank you,
Gerald


Re: [wwwdocs] Add znver3 support to changes.html

2021-04-20 Thread Martin Jambor
Hi,

I'd like to ping the following, since we already have an RC.

Thanks,

On Tue, Apr 13 2021, Martin Jambor wrote:
> Hi,
>
> Martin Liška correctly observed that the newly added support for AMD
> znver3 in GCC 11 and 10.3 is not reflected in the changes.html files.
>
> Would the following be OK?
>
> Thanks,
>
> Martin
>
>
> diff --git a/htdocs/gcc-10/changes.html b/htdocs/gcc-10/changes.html
> index d5166879..d9971ffb 100644
> --- a/htdocs/gcc-10/changes.html
> +++ b/htdocs/gcc-10/changes.html
> @@ -1144,6 +1144,12 @@ are not listed here).
>makes the code specific to 512-bit SVE.
>  
>  
> +x86-64
> +
> +  GCC 10.3 supports AMD CPUs based on znver3 core
> +  through -march=znver3.
> +  
> +
>  
>  
>  
> diff --git a/htdocs/gcc-11/changes.html b/htdocs/gcc-11/changes.html
> index a7fa4e1b..97a622f4 100644
> --- a/htdocs/gcc-11/changes.html
> +++ b/htdocs/gcc-11/changes.html
> @@ -634,6 +634,9 @@ a work-in-progress.
>  The switch enables the CLDEMOTE, PTWRITE, WAITPKG, SERIALIZE, KEYLOCKER,
>  AVX-VNNI, and HRESET ISA extensions.
>
> +  GCC now supports AMD CPUs based on znver3 core
> +  through -march=znver3.
> +  
>  
>  
>  


[PATCH] rs6000: Fix cpu selection w/ isel (PR100108)

2021-04-20 Thread Segher Boessenkool
There are various non-IBM CPUs with isel as well, so it is easiest if we
just don't consider that flag here (it is not needed).

2021-04-20  Segher Boessenkool  

PR target/100108
* config/rs6000/rs6000.c (rs6000_machine_from_flags): Do not consider
OPTION_MASK_ISEL.
---
Committed to trunk and 11.  Will do 10 in a week or so.


Segher


 gcc/config/rs6000/rs6000.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 48b8efd732b2..844fee88cf3c 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -5765,7 +5765,7 @@ rs6000_machine_from_flags (void)
   HOST_WIDE_INT flags = rs6000_isa_flags;
 
   /* Disable the flags that should never influence the .machine selection.  */
-  flags &= ~(OPTION_MASK_PPC_GFXOPT | OPTION_MASK_PPC_GPOPT);
+  flags &= ~(OPTION_MASK_PPC_GFXOPT | OPTION_MASK_PPC_GPOPT | OPTION_MASK_ISEL);
 
   if ((flags & (ISA_3_1_MASKS_SERVER & ~ISA_3_0_MASKS_SERVER)) != 0)
 return "power10";
-- 
1.8.3.1



Re: [PATCH][libgomp, nvptx] Fix hang in gomp_team_barrier_wait_end

2021-04-20 Thread Alexander Monakov via Gcc-patches
Hello Tom,

Thank you for the investigation and the detailed writeup. It was difficult for
me to infer the internal API contracts here (and still is), sorry about the
mistake.

Most importantly: does GCN handle this, and if yes, how? I think the solution
should be the same for config/gcn and config/nvptx (I guess this is a question
for Andrew).

Some comments inline below:

On Tue, 20 Apr 2021, Tom de Vries wrote:

> Hi,
> 
> Consider the following omp fragment.
> ...
>   #pragma omp target
>   #pragma omp parallel num_threads (2)
>   #pragma omp task
> ;
> ...
> 
> This hangs at -O0 for nvptx.
> 
> Investigating the behaviour gives us the following trace of events:
> - both threads execute GOMP_task, where they:
>   - deposit a task, and
>   - execute gomp_team_barrier_wake
> - thread 1 executes gomp_team_barrier_wait_end and, not being the last thread,
>   proceeds to wait at the team barrier

Shouldn't it try to handle deposited tasks before suspending on the barrier?

I guess you are describing what the code does, I'm just commenting that I'm
confused why it behaves so.

> - thread 0 executes gomp_team_barrier_wait_end and, being the last thread, it
>   calls gomp_barrier_handle_tasks, where it:
>   - executes both tasks and marks the team barrier done
>   - executes a gomp_team_barrier_wake which wakes up thread 1
> - thread 1 exits the team barrier

Up to this point it looks reasonable.

> - thread 0 returns from gomp_barrier_handle_tasks and goes to wait at
>   the team barrier.

At this point the code should realize that the team barrier was already released
and not attempt to wait on it again. Maybe by inspecting the generation counter?

I may be wrong though, I don't understand the overall flow well enough yet.
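
To illustrate what I mean by inspecting the generation counter, here is a toy
model (not libgomp's gomp_barrier_t; the type and function names are
hypothetical, and the real barrier advances its generation differently).  A
waiter that remembers the generation it saw returns immediately if the barrier
has already been released, without needing the other thread to show up again.

```cpp
#include <atomic>

// Hypothetical barrier state: only the generation counter is modelled.
struct toy_barrier
{
  std::atomic<unsigned> generation{0};
};

// Mark the barrier as released by advancing the generation.
void
release (toy_barrier &bar)
{
  bar.generation.fetch_add (1, std::memory_order_release);
}

// Wait until the generation differs from SEEN.  Returns the number of
// polls that found the barrier still closed; zero means it was already
// released when the wait started.
unsigned
wait_for_release (toy_barrier &bar, unsigned seen)
{
  unsigned polls = 0;
  while (bar.generation.load (std::memory_order_acquire) == seen)
    ++polls;
  return polls;
}
```

In this model, a thread that calls release before it waits never blocks, which
is the property the nvptx bar.sync based wait does not have.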

> - thread 0 hangs.
> 
> To understand why there is a hang here, it's good to understand how things
> are set up for nvptx.  The libgomp/config/nvptx/bar.c implementation is
> a copy of the libgomp/config/linux/bar.c implementation, with uses of both
> futex_wake and do_wait replaced with uses of nvptx insn bar.sync:
> ...
>   if (bar->total > 1)
> asm ("bar.sync 1, %0;" : : "r" (32 * bar->total));
> ...
> 
> The point where thread 0 goes to wait at the team barrier, corresponds in
> the linux implementation with a do_wait.  In the linux case, the call to
> do_wait doesn't hang, because it's waiting for bar->generation to become
> a certain value, and if bar->generation already has that value, it just
> proceeds, without any need for coordination with other threads.
> 
> In the nvptx case, the bar.sync waits until thread 1 joins it in the same
> logical barrier, which never happens: thread 1 is lingering in the
> thread pool at the thread pool barrier (using a different logical barrier),
> waiting to join a new team.
> 
> The easiest way to fix this is to revert to the posix implementation for
> bar.{c,h}.
> 
> Another way would be to revert to the linux implementation for bar.{c,h},
> and implement the primitives futex_wait and do_wait using nvptx insns.

I don't think implementing futex_wait is possible on nvptx.

Alexander

> This patch instead implements a minimal fix (which makes the implementation
> deviate further from the linux one).
> 
> The hang was only observed in gomp_team_barrier_wait_end, but we propagate the
> fix to its twin gomp_team_barrier_wait_cancel_end as well.
> 
> The fix is based on the assumptions that at the point of the fix, after the
> call to gomp_barrier_handle_tasks:
> - all tasks are done
>   (an assert is added to check this), and consequently:
> - the executing thread is the only thread left in the team barrier
>   (so it's accurate to set nthreads to 1).
> 
> Tested libgomp on x86_64 with nvptx accelerator.
> 
> Any comments?
> 
> Thanks,
> - Tom
> 
> [libgomp, nvptx] Fix hang in gomp_team_barrier_wait_end
> 
> libgomp/ChangeLog:
> 
> 2021-04-20  Tom de Vries  
> 
>   PR target/99555
>   * config/nvptx/bar.c (gomp_team_barrier_wait_end)
>   (gomp_team_barrier_wait_cancel_end): Don't try to sync with team threads
>   that have left the team barrier.
>   * testsuite/libgomp.c-c++-common/task-detach-6.c: Remove nvptx-specific
>   workarounds.
>   * testsuite/libgomp.c/pr99555-1.c: Same.
>   * testsuite/libgomp.fortran/task-detach-6.f90: Same.
> 
> ---
>  libgomp/config/nvptx/bar.c | 32 
> --
>  .../testsuite/libgomp.c-c++-common/task-detach-6.c |  8 --
>  libgomp/testsuite/libgomp.c/pr99555-1.c|  8 --
>  .../testsuite/libgomp.fortran/task-detach-6.f90| 12 
>  4 files changed, 24 insertions(+), 36 deletions(-)
> 
> diff --git a/libgomp/config/nvptx/bar.c b/libgomp/config/nvptx/bar.c
> index c5c2fa8829b..058a8d4d5ca 100644
> --- a/libgomp/config/nvptx/bar.c
> +++ b/libgomp/config/nvptx/bar.c
> @@ -78,6 +78,7 @@ void
>  gomp_team_barrier_wait_end (gomp_barrier_t *bar, gomp_barrier_state_t state)
>  {
>unsigned int generation, gen;
> +  

Re: [PATCH] Support the new ("v0") mangling scheme in rust-demangle.

2021-04-20 Thread Andreas Schwab
On Nov 13 2020, Nikhil Benesch via Gcc-patches wrote:

> +static void
> +demangle_const_char (struct rust_demangler *rdm)
> +{
> +  size_t hex_len;
> +  uint64_t value;
> +
> +  hex_len = parse_hex_nibbles (rdm, &value);
> +
> +  if (hex_len == 0 || hex_len > 8)
> +{
> +  rdm->errored = 1;
> +  return;
> +}
> +
> +  /* Match Rust's character "debug" output as best as we can. */
> +  PRINT ("'");
> +  if (value == '\t')
> +PRINT ("\\t");
> +  else if (value == '\r')
> +PRINT ("\\r");
> +  else if (value == '\n')
> +PRINT ("\\n");
> +  else if (value > ' ' && value < '~')
> +/* Rust also considers many non-ASCII codepoints to be printable, but
> +   that logic is not easily ported to C. */
> +print_str (rdm, (char *) &value, 1);

* rust-demangle.c (demangle_const_char): Properly print the
character value.

diff --git a/libiberty/rust-demangle.c b/libiberty/rust-demangle.c
index 6fd8f6a4db0..449941b56dc 100644
--- a/libiberty/rust-demangle.c
+++ b/libiberty/rust-demangle.c
@@ -1253,9 +1253,12 @@ demangle_const_char (struct rust_demangler *rdm)
   else if (value == '\n')
 PRINT ("\\n");
   else if (value > ' ' && value < '~')
-/* Rust also considers many non-ASCII codepoints to be printable, but
-   that logic is not easily ported to C. */
-print_str (rdm, (char *) &value, 1);
+{
+  /* Rust also considers many non-ASCII codepoints to be printable, but
+that logic is not easily ported to C. */
+  char c = value;
+  print_str (rdm, &c, 1);
+}
   else
 {
   PRINT ("\\u{");
-- 
2.31.1
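
A short sketch of why the cast matters, assuming nothing beyond standard C++:
reading the first byte of a 64-bit object through a char pointer gives the
least significant byte only on little-endian hosts, whereas converting the
value to char first is independent of byte order.  The function names are
hypothetical.

```cpp
#include <cstdint>

// Convert by value: always yields the low-order byte of VALUE.
char
low_byte_by_value (std::uint64_t value)
{
  return static_cast<char> (value);
}

// Read the first byte in memory, as the pointer cast in the old code did.
// On a big-endian host this is the most significant byte.
char
first_byte_in_memory (std::uint64_t value)
{
  return *reinterpret_cast<const char *> (&value);
}

// True if the first byte in memory is the least significant one.
bool
host_is_little_endian ()
{
  return first_byte_in_memory (1) == 1;
}
```

For a value such as 0x41, low_byte_by_value returns 'A' everywhere, while
first_byte_in_memory returns 'A' only where host_is_little_endian is true.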

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [PATCH] [libstdc++] Refactor/cleanup of atomic wait implementation

2021-04-20 Thread Jonathan Wakely via Gcc-patches

On 20/04/21 15:25 +0100, Jonathan Wakely wrote:

On 20/04/21 12:41 +0100, Jonathan Wakely wrote:

On 20/04/21 12:04 +0100, Jonathan Wakely wrote:

On 19/04/21 12:23 -0700, Thomas Rodgers wrote:

From: Thomas Rodgers 

This patch addresses jwakely's feedback from 2021-04-15.

This is a substantial rewrite of the atomic wait/notify (and timed wait
counterparts) implementation.

The previous __platform_wait looped on EINTR; however, this behavior is
not required by the standard. A new _GLIBCXX_HAVE_PLATFORM_WAIT macro
now controls whether wait/notify are implemented using a
platform-specific primitive or with a platform-agnostic mutex/condvar.
This patch only supplies a definition for Linux futexes. A future update
could add support for __ulock_wait/wake on Darwin, for instance.

The members of __waiters were lifted to a new base class. The members
are now arranged such that overall sizeof(__waiters_base) fits in two
cache lines (on platforms with at least 64 byte cache lines). The
definition will also use destructive_interference_size for this if it
is available.

The __waiters type is now specific to untimed waits. Timed waits have a
corresponding __timed_waiters type. Much of the code has been moved from
the previous __atomic_wait() free function to the __waiter_base template
and a __waiter derived type is provided to implement the un-timed wait
operations. A similar change has been made to the timed wait
implementation.

The __atomic_spin code has been extended to take a spin policy which is
invoked after the initial busy wait loop. The default policy is to
return from the spin. The timed wait code adds a timed backoff spinning
policy. The code from  which implements this_thread::sleep_for,
sleep_until has been moved to a new  header


The commit msg wasn't updated for the latest round of changes
(this_thread_sleep, __waiters_pool_base etc).


which allows the thread sleep code to be consumed without pulling in the
whole of .

The entry points into the wait/notify code have been restructured to
support either -
* Testing the current value of the atomic stored at the given address
 and waiting on a notification.
* Applying a predicate to determine if the wait was satisfied.
The entry points were renamed to make it clear that the wait and wake
operations operate on addresses. The first variant takes the expected
value and a function which returns the current value that should be used
in comparison operations, these operations are named with a _v suffix
(e.g. 'value'). All atomic<_Tp> wait/notify operations use the first
variant. Barriers, latches and semaphores use the predicate variant.
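
The two styles of entry point described above can be sketched roughly as
follows.  This is a simplified busy-polling model with hypothetical names
(toy_wait_v, toy_wait, wait_while_equal), not the libstdc++ implementation:
it omits the address that keys the waiter table and never blocks on a futex or
condition variable.

```cpp
#include <atomic>

// Value variant: poll until the value returned by CURRENT_VALUE no longer
// compares equal to OLD.  Returns how many polls saw the old value.
template <typename T, typename ValueFn>
unsigned
toy_wait_v (T old, ValueFn current_value)
{
  unsigned polls = 0;
  while (current_value () == old)
    ++polls;
  return polls;
}

// Predicate variant: poll until SATISFIED returns true.  Returns how many
// polls found the predicate still false.
template <typename Pred>
unsigned
toy_wait (Pred satisfied)
{
  unsigned polls = 0;
  while (!satisfied ())
    ++polls;
  return polls;
}

// Example caller in the style of atomic<int>::wait (old), built on the
// value variant.
inline unsigned
wait_while_equal (const std::atomic<int> &a, int old)
{
  return toy_wait_v (old,
		     [&a] { return a.load (std::memory_order_acquire); });
}
```

In this model an atomic-style wait passes the old value plus a loader, while
a latch or semaphore would pass a predicate over its own counter.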

This change also centralizes what it means to compare values for the
purposes of atomic::wait rather than scattering through individual
predicates.

This change also centralizes the repetitive code which adjusts for
different user supplied clocks (this should be moved elsewhere
and all such adjustments should use a common implementation).

This change also removes the hashing of the pointer and uses
the pointer value directly for indexing into the waiters table.

libstdc++-v3/ChangeLog:
* include/Makefile.am: Add new  header.


The name needs updating to correspond to the latest version of the
patch.


* include/Makefile.in: Regenerate.
* include/bits/atomic_base.h: Adjust all calls
to __atomic_wait/__atomic_notify for new call signatures.
* include/bits/atomic_wait.h: Extensive rewrite.
* include/bits/atomic_timed_wait.h: Likewise.
* include/bits/semaphore_base.h: Adjust all calls
to __atomic_wait/__atomic_notify for new call signatures.
* include/bits/this_thread_sleep.h: New file.
* include/std/atomic: Likewise.
* include/std/barrier: Likewise.
* include/std/latch: Likewise.


include/std/thread is missing from the changelog entry. You can use
the 'git gcc-verify' alias to check your commit log will be accepted
by the server-side hook:

'gcc-verify' is aliased to '!f() { "`git rev-parse --show-toplevel`/contrib/gcc-changelog/git_check_commit.py" $@; } ; f'



* testsuite/29_atomics/atomic/wait_notify/bool.cc: Simplify
test.
* testsuite/29_atomics/atomic/wait_notify/generic.cc: Likewise.
* testsuite/29_atomics/atomic/wait_notify/pointers.cc: Likewise.
* testsuite/29_atomics/atomic_flag/wait_notify.cc: Likewise.
* testsuite/29_atomics/atomic_float/wait_notify.cc: Likewise.
* testsuite/29_atomics/atomic_integral/wait_notify.cc: Likewise.
* testsuite/29_atomics/atomic_ref/wait_notify.cc: Likewise.



-struct __timed_waiters : __waiters
+struct __timed_waiters : __waiter_pool_base


Should this be __timed_waiter_pool for consistency with
__waiter_pool_base and __waiter_pool?



-inline void
-__thread_relax() noexcept
-{
-#if defined __i386__ || defined __x86_64__
-  __builtin_ia32_pause();
-#elif defined _GLIBCXX_USE_SCHED_YIELD
-  

Re: [PATCH] [libstdc++] Refactor/cleanup of atomic wait implementation

2021-04-20 Thread Jonathan Wakely via Gcc-patches

On 20/04/21 12:41 +0100, Jonathan Wakely wrote:

On 20/04/21 12:04 +0100, Jonathan Wakely wrote:

On 19/04/21 12:23 -0700, Thomas Rodgers wrote:

From: Thomas Rodgers 

This patch addresses jwakely's feedback from 2021-04-15.

This is a substantial rewrite of the atomic wait/notify (and timed wait
counterparts) implementation.

The previous __platform_wait looped on EINTR; however, this behavior is
not required by the standard. A new _GLIBCXX_HAVE_PLATFORM_WAIT macro
now controls whether wait/notify are implemented using a
platform-specific primitive or with a platform-agnostic mutex/condvar.
This patch only supplies a definition for Linux futexes. A future update
could add support for __ulock_wait/wake on Darwin, for instance.

The members of __waiters were lifted to a new base class. The members
are now arranged such that overall sizeof(__waiters_base) fits in two
cache lines (on platforms with at least 64 byte cache lines). The
definition will also use destructive_interference_size for this if it
is available.
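
To make the layout constraint above concrete, here is a minimal sketch. It
is not the code from the patch: the struct and member names are
hypothetical, and the 64-byte fallback is an assumption for targets where
the library does not provide destructive_interference_size.

```cpp
#include <atomic>
#include <cstddef>
#include <new>

// Hypothetical names throughout; this is not the libstdc++ definition.
#ifdef __cpp_lib_hardware_interference_size
inline constexpr std::size_t cache_line
  = std::hardware_destructive_interference_size;
#else
inline constexpr std::size_t cache_line = 64; // assumed fallback value
#endif

struct waiter_pool_sketch
{
  // Counter updated by waiting threads, placed on its own cache line.
  alignas(cache_line) std::atomic<int> waiting{0};
  // Word updated by notifying threads, placed on a separate cache line.
  alignas(cache_line) std::atomic<unsigned> version{0};
};

// The two members occupy exactly two cache lines.
static_assert(alignof(waiter_pool_sketch) == cache_line);
static_assert(sizeof(waiter_pool_sketch) == 2 * cache_line);
```

Keeping the two words on separate lines means a thread that only bumps the
waiter count does not invalidate the line that notifiers write to.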

The __waiters type is now specific to untimed waits. Timed waits have a
corresponding __timed_waiters type. Much of the code has been moved from
the previous __atomic_wait() free function to the __waiter_base template
and a __waiter derived type is provided to implement the un-timed wait
operations. A similar change has been made to the timed wait
implementation.

The __atomic_spin code has been extended to take a spin policy which is
invoked after the initial busy wait loop. The default policy is to
return from the spin. The timed wait code adds a timed backoff spinning
policy. The code from  which implements this_thread::sleep_for,
sleep_until has been moved to a new  header


The commit msg wasn't updated for the latest round of changes
(this_thread_sleep, __waiters_pool_base etc).


which allows the thread sleep code to be consumed without pulling in the
whole of .

The entry points into the wait/notify code have been restructured to
support either -
* Testing the current value of the atomic stored at the given address
  and waiting on a notification.
* Applying a predicate to determine if the wait was satisfied.
The entry points were renamed to make it clear that the wait and wake
operations operate on addresses. The first variant takes the expected
value and a function which returns the current value that should be used
in comparison operations, these operations are named with a _v suffix
(e.g. 'value'). All atomic<_Tp> wait/notify operations use the first
variant. Barriers, latches and semaphores use the predicate variant.
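
The two kinds of entry point described above can be sketched as follows.
This is only an illustration built on the platform agnostic mutex/condvar
approach; every name in it is hypothetical and none of it is the
interface from the patch.

```cpp
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for one entry of the waiter table.
struct waiter_sketch
{
  std::mutex mtx;
  std::condition_variable cv;
};

// Predicate variant: block until pred() returns true.
template<typename Pred>
  void
  wait_until(waiter_sketch& w, Pred pred)
  {
    std::unique_lock<std::mutex> lock(w.mtx);
    w.cv.wait(lock, pred);
  }

// Value variant: block while the current value, obtained by calling vfn(),
// still compares equal to old.
template<typename T, typename ValFn>
  void
  wait_while_equal(waiter_sketch& w, const T& old, ValFn vfn)
  {
    wait_until(w, [&] { return !(vfn() == old); });
  }

// Wake every waiter; call this after the value has been changed.
inline void
notify_all(waiter_sketch& w)
{
  // Taking the lock once means a waiter cannot miss the wake-up between
  // checking its predicate and going to sleep.
  { std::lock_guard<std::mutex> lock(w.mtx); }
  w.cv.notify_all();
}
```

In this sketch the value variant is just the predicate variant with the
comparison supplied for the caller, which is one way to keep the meaning
of "equal" in a single place.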

This change also centralizes what it means to compare values for the
purposes of atomic::wait rather than scattering through individual
predicates.

This change also centralizes the repetitive code which adjusts for
different user supplied clocks (this should be moved elsewhere
and all such adjustments should use a common implementation).

This change also removes the hashing of the pointer and uses
the pointer value directly for indexing into the waiters table.
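
As an illustration of indexing by pointer value, a minimal sketch follows.
The table size and the shift are assumptions made for the example; the
message above does not state either, and the names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical table size; not taken from the patch.
inline constexpr std::size_t table_size = 16;

// Map an address to a slot without hashing: drop the two low bits, which
// are zero for objects aligned to at least 4 bytes, then reduce modulo
// the table size.
inline std::size_t
slot_for(const void* addr) noexcept
{
  const auto key = reinterpret_cast<std::uintptr_t>(addr);
  return static_cast<std::size_t>((key >> 2) % table_size);
}
```

With this mapping, adjacent 4-byte objects land in adjacent slots, so
neighbouring atomics do not all contend on the same table entry.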

libstdc++-v3/ChangeLog:
* include/Makefile.am: Add new  header.


The name needs updating to correspond to the latest version of the
patch.


* include/Makefile.in: Regenerate.
* include/bits/atomic_base.h: Adjust all calls
to __atomic_wait/__atomic_notify for new call signatures.
* include/bits/atomic_wait.h: Extensive rewrite.
* include/bits/atomic_timed_wait.h: Likewise.
* include/bits/semaphore_base.h: Adjust all calls
to __atomic_wait/__atomic_notify for new call signatures.
* include/bits/this_thread_sleep.h: New file.
* include/std/atomic: Likewise.
* include/std/barrier: Likewise.
* include/std/latch: Likewise.


include/std/thread is missing from the changelog entry. You can use
the 'git gcc-verify' alias to check your commit log will be accepted
by the server-side hook:

'gcc-verify' is aliased to '!f() { "`git rev-parse 
--show-toplevel`/contrib/gcc-changelog/git_check_commit.py" $@; } ; f'



* testsuite/29_atomics/atomic/wait_notify/bool.cc: Simplify
test.
* testsuite/29_atomics/atomic/wait_notify/generic.cc: Likewise.
* testsuite/29_atomics/atomic/wait_notify/pointers.cc: Likewise.
* testsuite/29_atomics/atomic_flag/wait_notify.cc: Likewise.
* testsuite/29_atomics/atomic_float/wait_notify.cc: Likewise.
* testsuite/29_atomics/atomic_integral/wait_notify.cc: Likewise.
* testsuite/29_atomics/atomic_ref/wait_notify.cc: Likewise.



-struct __timed_waiters : __waiters
+struct __timed_waiters : __waiter_pool_base


Should this be __timed_waiter_pool for consistency with
__waiter_pool_base and __waiter_pool?



-inline void
-__thread_relax() noexcept
-{
-#if defined __i386__ || defined __x86_64__
-  __builtin_ia32_pause();
-#elif defined _GLIBCXX_USE_SCHED_YIELD
-  __gthread_yield();
-#endif
-}
+template
+  struct 

[PATCH] Fix PR88085

2021-04-20 Thread Andreas Krebbel via Gcc-patches
With the current handling of decl alignments it is impossible to
reduce the alignment requirement as part of a variable declaration.

This change has been proposed by Richard in the PR. It fixes the
align-1.c testcase on IBM Z.

Bootstrapped on x86_64 and s390x. No regressions.

Ok for mainline?

gcc/ChangeLog:

PR middle-end/88085
* emit-rtl.c (set_mem_attributes_minus_bitpos): Use the user
alignment if there are no pre-existing mem attrs.
---
 gcc/emit-rtl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index 07e908624a0..fc12fa927da 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -2124,7 +2124,7 @@ set_mem_attributes_minus_bitpos (rtx ref, tree t, int 
objectp,
   unsigned int diff_align = known_alignment (obj_bitpos - bitpos);
   if (diff_align != 0)
obj_align = MIN (obj_align, diff_align);
-  attrs.align = MAX (attrs.align, obj_align);
+  attrs.align = refattrs ? MAX (refattrs->align, obj_align) : obj_align;
 }
 
   poly_uint64 const_size;
-- 
2.30.2



Re: [PATCH] libstdc++: Implement P2259R1 changes [PR95983]

2021-04-20 Thread Patrick Palka via Gcc-patches
On Tue, 20 Apr 2021, Jonathan Wakely wrote:

> On 19/04/21 22:25 -0400, Patrick Palka via Libstdc++ wrote:
> > This implements the wording changes of P2259R1 "Repairing input range
> > adaptors and counted_iterator", which resolves LWG 3283, 3289 and 3408.
> > 
> > The wording changes are relatively straightforward, but they require
> > some boilerplate to implement: the changes to make a type alias
> > "conditionally present" in some iterator class are realized by defining
> > a base class template and an appropriately constrained partial
> > specialization thereof that contains the type alias, and having the
> > iterator class derive from this base class.  Sometimes the relevant
> > condition depends on members from the iterator class, but because a
> > class is incomplete when instantiating its bases, the constraints on
> > the partial specialization can't use anything from the iterator class.
> > This patch works around this by hoisting these members out to the
> > enclosing scope (e.g. transform_view::_Iterator::_Base is hoisted out
> > to transform_view::_Base so that transform_view::__iter_cat can use it).
> > 
> > This patch also implements the proposed resolution of LWG 3291 to rename
> > iota_view::iterator_category to iota_view::iterator_concept, which was
> > previously problematic due to LWG 3408.
> > 
> > Tested on x86_64-pc-linux-gnu.
> > 
> > libstdc++-v3/ChangeLog:
> > 
> > PR libstdc++/95983
> > * include/bits/stl_iterator.h (__detail::__move_iter_cat):
> > Define.
> > (move_iterator): Derive from the above in C++20 in order to
> > conditionally define iterator_category as per P2259.
> > (move_iterator::__base_cat): No longer used, so remove.
> > (move_iterator::iterator_category): Remove in C++20.
> > (__detail::__common_iter_use_postfix_proxy): Define.
> > (common_iterator::_Proxy): Rename to ...
> > (common_iterator::__arrow_proxy): ... this.
> > (common_iterator::__postfix_proxy): Define as per P2259.
> > (common_iterator::operator->): Adjust.
> > (common_iterator::operator++): Adjust as per P2259.
> > (iterator_traits::_S_iter_cat): Define.
> > (iterator_traits::iterator_category): Change as
> > per P2259.
> > (__detail::__counted_iter_value_type): Define.
> > (__detail::__counted_iter_concept): Define.
> > (__detail::__counted_iter_cat): Define.
> > (counted_iterator): Derive from the above three classes in order
> > to conditionally define value_type, iterator_concept and
> > iterator_category respectively as per P2259.
> > (counted_iterator::operator->): Define as per P2259.
> > (incrementable_traits): Remove as per P2259.
> > (iterator_traits): Adjust as per P2259.
> > * include/std/ranges (__detail::__iota_view_iter_cat): Define.
> > (iota_view::_Iterator): Derive from the above in order to
> > conditionally define iterator_category as per P2259.
> > (iota_view::_S_iter_cat): Rename to ...
> > (iota_view::_S_iter_concept): ... this.
> > (iota_view::iterator_concept): Use it to apply LWG 3291 changes.
> > (iota_view::iterator_category): Remove.
> > (__detail::__filter_view_iter_cat): Define.
> > (filter_view::_Iterator): Derive from the above in order to
> > conditionally define iterator_category as per P2259.
> > (filter_view::_Iterator): Move to struct __filter_view_iter_cat.
> > (filter_view::_Iterator::iterator_category): Remove.
> > (transform_view::_Base): Define.
> > (transform_view::__iter_cat): Define.
> > (transform_view::_Iterator): Derive from the above in order to
> > conditionally define iterator_category as per P2259.
> > (transform_view::_Iterator::_Base): Just alias
> > transform_view::_Base.
> > (transform_view::_Iterator::_S_iter_cat): Move to struct
> > transform_view::__iter_cat.
> > (transform_view::_Iterator::iterator_category): Remove.
> > (transform_view::_Sentinel::_Base): Just alias
> > transform_view::_Base.
> > (join_view::_Base): Define.
> > (join_view::_Outer_iter): Define.
> > (join_view::_Inner_iter): Define.
> > (join_view::_S_ref_is_glvalue): Define.
> > (join_view::__iter_cat): Define.
> > (join_view::_Iterator): Derive from it in order to conditionally
> > define iterator_category as per P2259.
> > (join_view::_Iterator::_Base): Just alias join_view::_Base.
> > (join_view::_Iterator::_S_ref_is_glvalue): Just alias
> > join_view::_S_ref_is_glvalue.
> > (join_view::_Iterator::_S_iter_cat): Move to struct
> > transform_view::__iter_cat.
> > (join_view::_Iterator::_Outer_iter): Just alias
> > join_view::_Outer_iter.
> > (join_view::_Iterator::_Inner_iter): Just alias
> > join_view::_Inner_iter.
> > (join_view::_Iterator::iterator_category): Remove.
> > (join_view::_Sentinel::_Base): Just alias join_view::_Base.
> > (__detail::__split_view_outer_iter_cat): Define.
> > 

Re: [PATCH] [libstdc++] Refactor/cleanup of atomic wait implementation

2021-04-20 Thread Jonathan Wakely via Gcc-patches

On 19/04/21 12:23 -0700, Thomas Rodgers wrote:

+#if __cpp_lib_atomic_wait
+  struct __atomic_semaphore
+  {
+static constexpr ptrdiff_t _S_max = __gnu_cxx::__int_traits::__max;
+explicit __atomic_semaphore(__detail::__platform_wait_t __count) noexcept
+  : _M_counter(__count)
{
-  static_assert(std::is_integral_v<_Tp>);
-  static_assert(__gnu_cxx::__int_traits<_Tp>::__max
- <= __gnu_cxx::__int_traits::__max);
-  static constexpr ptrdiff_t _S_max = __gnu_cxx::__int_traits<_Tp>::__max;
+  __glibcxx_assert(__count >= 0 && __count <= _S_max);
+}

-  explicit __atomic_semaphore(_Tp __count) noexcept
-   : _M_counter(__count)
+__atomic_semaphore(const __atomic_semaphore&) = delete;
+__atomic_semaphore& operator=(const __atomic_semaphore&) = delete;
+
+static _GLIBCXX_ALWAYS_INLINE bool
+_S_do_try_acquire(__detail::__platform_wait_t* __counter,
+ __detail::__platform_wait_t& __old) noexcept
+{
+  if (__old == 0)
+   return false;
+
+  return __atomic_impl::compare_exchange_strong(__counter,
+   __old, __old - 1,
+   memory_order::acquire,
+   memory_order::release);


This violates the compare_exchange precondition:

Preconditions: The failure argument is neither memory_order::release nor 
memory_order::acq_rel.


Should this be relaxed? I don't think a failed try_acquire has to
synchronize, does it?
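
For illustration, a minimal sketch of a try_acquire that satisfies the
precondition, assuming a relaxed failure order is acceptable (the question
above is still open). It uses std::atomic directly rather than the
__atomic_impl helpers, and the class name is hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical counting semaphore; not the libstdc++ implementation.
class semaphore_sketch
{
  std::atomic<std::ptrdiff_t> counter_;

public:
  explicit semaphore_sketch(std::ptrdiff_t count) noexcept
  : counter_(count)
  { assert(count >= 0); }

  semaphore_sketch(const semaphore_sketch&) = delete;
  semaphore_sketch& operator=(const semaphore_sketch&) = delete;

  // Single attempt: fails if the counter is zero or another thread changed
  // it between the load and the exchange.
  bool
  try_acquire() noexcept
  {
    auto old = counter_.load(std::memory_order_relaxed);
    if (old == 0)
      return false;
    // Success order is acquire. The failure order must not be release or
    // acq_rel; relaxed is used here on the assumption that a failed
    // attempt does not need to synchronize.
    return counter_.compare_exchange_strong(old, old - 1,
					    std::memory_order_acquire,
					    std::memory_order_relaxed);
  }

  void
  release() noexcept
  { counter_.fetch_add(1, std::memory_order_release); }
};
```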




Re: [PATCH] [libstdc++] Refactor/cleanup of atomic wait implementation

2021-04-20 Thread Jonathan Wakely via Gcc-patches

On 19/04/21 12:23 -0700, Thomas Rodgers wrote:

+struct __timed_backoff_spin_policy
+{
+  __wait_clock_t::time_point _M_deadline;
+  __wait_clock_t::time_point _M_t0;
+
+  template
+   __timed_backoff_spin_policy(chrono::time_point<_Clock, _Dur>
+ __deadline = _Clock::time_point::max(),
+   chrono::time_point<_Clock, _Dur>
+ __t0 = _Clock::now()) noexcept
+ : _M_deadline(__to_wait_clock(__deadline))
+ , _M_t0(__to_wait_clock(__t0))
+   { }
+
+  bool
+  operator()() const noexcept
  {
-   static_assert(sizeof(__timed_waiters) == sizeof(__waiters));
-   return static_cast<__timed_waiters&>(__waiters::_S_for(__t));
+   using namespace literals::chrono_literals;
+   auto __now = __wait_clock_t::now();
+   if (_M_deadline <= __now)
+ return false;
+
+   auto __elapsed = __now - _M_t0;
+   if (__elapsed > 128ms)
+ {
+   this_thread::sleep_for(64ms);
+ }
+   else if (__elapsed > 64us)
+ {
+   this_thread::sleep_for(__elapsed / 2);
+ }
+   else if (__elapsed > 4us)
+ {
+   __thread_yield();
+ }
+   else
+ return false;


Ah, the reason for some of the time outs I'm seeing is that this
function doesn't return anything!

/home/jwakely/gcc/12/include/c++/12.0.0/bits/atomic_timed_wait.h: In member 
function 'bool std::__detail::__timed_backoff_spin_policy::operator()() const':
/home/jwakely/gcc/12/include/c++/12.0.0/bits/atomic_timed_wait.h:259:7: 
warning: control reaches end of non-void function [-Wreturn-type]
  259 |   }
  |   ^

Should it return true if it waited?
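
A sketch of the policy with a return on every path, assuming the answer to
the question above is yes (true means "waited, spin again"). The names are
hypothetical and std::chrono::steady_clock stands in for __wait_clock_t;
the thresholds are the ones from the quoted patch.

```cpp
#include <chrono>
#include <thread>

enum class backoff_step { stop, yield, short_sleep, long_sleep };

// Pure decision function, so the thresholds can be checked without
// actually sleeping.
inline backoff_step
classify(std::chrono::nanoseconds elapsed, bool deadline_passed) noexcept
{
  using namespace std::chrono_literals;
  if (deadline_passed)
    return backoff_step::stop;
  if (elapsed > 128ms)
    return backoff_step::long_sleep;
  if (elapsed > 64us)
    return backoff_step::short_sleep;
  if (elapsed > 4us)
    return backoff_step::yield;
  return backoff_step::stop;
}

struct backoff_policy_sketch
{
  using clock = std::chrono::steady_clock;
  clock::time_point deadline;
  clock::time_point t0;

  // Returns true if it waited and the caller should spin again, false if
  // the caller should stop spinning.
  bool
  operator()() const
  {
    using namespace std::chrono_literals;
    const auto now = clock::now();
    const auto elapsed = now - t0;
    switch (classify(elapsed, deadline <= now))
      {
      case backoff_step::long_sleep:
	std::this_thread::sleep_for(64ms);
	return true;
      case backoff_step::short_sleep:
	std::this_thread::sleep_for(elapsed / 2);
	return true;
      case backoff_step::yield:
	std::this_thread::yield();
	return true;
      case backoff_step::stop:
	break;
      }
    return false;
  }
};
```

Splitting the decision from the waiting is a choice made for this sketch
only; it keeps every branch of operator() ending in an explicit return.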



Re: [PATCH] [libstdc++] Refactor/cleanup of atomic wait implementation

2021-04-20 Thread Jonathan Wakely via Gcc-patches

On 20/04/21 14:20 +0100, Jonathan Wakely wrote:

On 20/04/21 13:02 +0100, Jonathan Wakely wrote:

On 19/04/21 12:23 -0700, Thomas Rodgers wrote:

+   template
+ static bool
+ _S_do_spin_v(__platform_wait_t* __addr,
+  const _Up& __old, _ValFn __vfn,
+  __platform_wait_t& __val,
+  _Spin __spin = _Spin{ })
+ {
+   auto const __pred = [=]
+ { return __atomic_compare(__old, __vfn()); };


This doesn't compile, there are 28 FAILs in 29_atomics/*

FAIL: 29_atomics/atomic_integral/cons/value_init.cc (test for excess errors)

It needs to be qualified as __detail::__atomic_compare.


Ah no, the problem is that atomic_flag::wait uses it, but it tries to
compare a bool to atomic_flag::__atomic_flag_data_type, which isn't
the same.


And this on solaris:

FAIL: 29_atomics/atomic_integral/cons/value_init.cc (test for excess errors)
Excess errors:
/export/home/jwakely/build/sparc-sun-solaris2.11/libstdc++-v3/include/bits/atomic_wait.h:263:
 error: '_M_addr' was not declared in this scope; did you mean '__addr'?
/export/home/jwakely/build/sparc-sun-solaris2.11/libstdc++-v3/include/bits/atomic_wait.h:263:
 error: argument 1 of '__atomic_load' must be a non-void pointer type

UNRESOLVED: 29_atomics/atomic_integral/cons/value_init.cc compilation failed to 
produce executable

Just a typo, but I don't think we can push this to gcc-11 at this
late stage.



[PATCH] testsuite: Xfail gcc.dg/vect/pr71264.c on IBM Z

2021-04-20 Thread Stefan Schulze Frielinghaus via Gcc-patches
The test fails for targets with V4QImode support which is the case for
IBM Z.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/pr71264.c: Xfail on IBM Z due to V4QImode support.

Ok for mainline?

---
 gcc/testsuite/gcc.dg/vect/pr71264.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/vect/pr71264.c 
b/gcc/testsuite/gcc.dg/vect/pr71264.c
index 5f6407a2411..dc849bf2797 100644
--- a/gcc/testsuite/gcc.dg/vect/pr71264.c
+++ b/gcc/testsuite/gcc.dg/vect/pr71264.c
@@ -19,5 +19,5 @@ void test(uint8_t *ptr, uint8_t *mask)
 }
 }
 
-/* { dg-final { scan-tree-dump "vectorized 1 loops in function" "vect" { xfail 
sparc*-*-* } } } */
+/* { dg-final { scan-tree-dump "vectorized 1 loops in function" "vect" { xfail 
s390*-*-* sparc*-*-* } } } */
 
-- 
2.23.0



Re: [PATCH] [libstdc++] Refactor/cleanup of atomic wait implementation

2021-04-20 Thread Jonathan Wakely via Gcc-patches

On 20/04/21 13:02 +0100, Jonathan Wakely wrote:

On 19/04/21 12:23 -0700, Thomas Rodgers wrote:

+   template
+ static bool
+ _S_do_spin_v(__platform_wait_t* __addr,
+  const _Up& __old, _ValFn __vfn,
+  __platform_wait_t& __val,
+  _Spin __spin = _Spin{ })
+ {
+   auto const __pred = [=]
+ { return __atomic_compare(__old, __vfn()); };


This doesn't compile, there are 28 FAILs in 29_atomics/*

FAIL: 29_atomics/atomic_integral/cons/value_init.cc (test for excess errors)

It needs to be qualified as __detail::__atomic_compare.


Ah no, the problem is that atomic_flag::wait uses it, but it tries to
compare a bool to atomic_flag::__atomic_flag_data_type, which isn't
the same.




Re: [PATCH] Bump LTO_major_version to 11.

2021-04-20 Thread Richard Biener via Gcc-patches
On Tue, Apr 20, 2021 at 11:57 AM Martin Liška  wrote:
>
> It seems we bumped LTO_major_version last time 2 years ago.
>
> Right now, the following is seen when one links a GCC 10.2.x LTO object file:
> $ gcc a.o
>
> lto1: fatal error: bytecode stream in file ‘a.o’ generated with LTO version 
> 9.2 instead of the expected 9.0
>
>
> I suggest bumping LTO_major_version for releases/gcc-11 branch.
> Can we please align it with a GCC release (version 11)? For the future, if 
> e.g. GCC 12 consumes LTO
> bytecode from GCC 11, we can leave LTO_major_version. Once e.g. GCC 13 needs 
> bumping,
> I would then change it to 13.
>
> What do you think about the suggested scheme?
> Ready for gcc-11 branch?

OK.  Can you somehow arrange for trunk to pick up LTO_major from GCC
major automagically then?

Thanks,
Richard.

> Martin
>
> gcc/ChangeLog:
>
> * lto-streamer.h (LTO_major_version): Bump to 11.
> ---
>  gcc/lto-streamer.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h
> index 5c7cd84d46f..7a7be80dab8 100644
> --- a/gcc/lto-streamer.h
> +++ b/gcc/lto-streamer.h
> @@ -120,7 +120,7 @@ along with GCC; see the file COPYING3.  If not see
>   String are represented in the table as pairs, a length in ULEB128
>   form followed by the data for the string.  */
>
> -#define LTO_major_version 9
> +#define LTO_major_version 11
>  #define LTO_minor_version 0
>
>  typedef unsigned char  lto_decl_flags_t;
> --
> 2.31.1
>


[pushed] c++: unexpanded pack in enum in lambda [PR100109]

2021-04-20 Thread Jason Merrill via Gcc-patches
Another construct we need to look inside.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

PR c++/100109
* pt.c (find_parameter_packs_r): Look into enum initializers.

gcc/testsuite/ChangeLog:

PR c++/100109
* g++.dg/cpp0x/lambda/lambda-variadic14.C: New test.
---
 gcc/cp/pt.c   | 8 +++-
 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-variadic14.C | 8 
 2 files changed, 15 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-variadic14.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 19fdafa4c43..7bcbe6dc3ce 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -4066,12 +4066,18 @@ find_parameter_packs_r (tree *tp, int *walk_subtrees, 
void* data)
   return NULL_TREE;
 
 case TAG_DEFN:
-  /* Local class, need to look through the whole definition.  */
   t = TREE_TYPE (t);
   if (CLASS_TYPE_P (t))
+   /* Local class, need to look through the whole definition.  */
for (tree bb : BINFO_BASE_BINFOS (TYPE_BINFO (t)))
  cp_walk_tree (&BINFO_TYPE (bb), &find_parameter_packs_r,
ppd, ppd->visited);
+  else
+   /* Enum, look at the values.  */
+   for (tree l = TYPE_VALUES (t); l; l = TREE_CHAIN (l))
+ cp_walk_tree (&DECL_INITIAL (TREE_VALUE (l)),
+   &find_parameter_packs_r,
+   ppd, ppd->visited);
   return NULL_TREE;
 
 case FUNCTION_TYPE:
diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-variadic14.C 
b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-variadic14.C
new file mode 100644
index 000..185aa0332e7
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-variadic14.C
@@ -0,0 +1,8 @@
+// PR c++/100109
+// { dg-do compile { target c++11 } }
+
+template <int... E>
+void f() {
+  [] { enum e { e = E }; };// { dg-error "not expanded" }
+}
+template void f<>();

base-commit: f1a68574b1f8d2961d3a676dbcf0cc24b6368578
-- 
2.27.0



[GCC 12] [PATCH] x86: Add -mmwait for -mgeneral-regs-only

2021-04-20 Thread H.J. Lu via Gcc-patches
Add -mmwait so that the MWAIT and MONITOR intrinsics can be used with
-mgeneral-regs-only and make -msse3 to imply -mmwait.

gcc/

* config.gcc: Install mwaitintrin.h for i[34567]86-*-* and
x86_64-*-* targets.
* common/config/i386/i386-common.c (OPTION_MASK_ISA2_MWAIT_SET):
New.
(OPTION_MASK_ISA2_MWAIT_UNSET): Likewise.
(ix86_handle_option): Handle -mmwait.
* config/i386/i386-builtins.c (ix86_init_mmx_sse_builtins):
Replace OPTION_MASK_ISA_SSE3 with OPTION_MASK_ISA2_MWAIT on
__builtin_ia32_monitor and __builtin_ia32_mwait.
* config/i386/i386-options.c (isa2_opts): Add -mmwait.
(ix86_valid_target_attribute_inner_p): Likewise.
(ix86_option_override_internal): Enable mwait/monitor
instructions for -msse3.
* config/i386/i386.h (TARGET_MWAIT): New.
(TARGET_MWAIT_P): Likewise.
* config/i386/i386.opt: Add -mmwait.
* config/i386/mwaitintrin.h: New file.
* config/i386/pmmintrin.h: Include <mwaitintrin.h>.
* config/i386/sse.md (sse3_mwait): Replace TARGET_SSE3 with
TARGET_MWAIT.
(@sse3_monitor_): Likewise.
* config/i386/x86gprintrin.h: Include <mwaitintrin.h>.
* doc/extend.texi: Document mwait target attribute.
* doc/invoke.texi: Document -mmwait.

gcc/testsuite/

* gcc.target/i386/monitor-2.c: New test.
---
 gcc/common/config/i386/i386-common.c  | 15 +++
 gcc/config.gcc|  6 ++-
 gcc/config/i386/i386-builtins.c   |  4 +-
 gcc/config/i386/i386-options.c|  7 +++
 gcc/config/i386/i386.h|  2 +
 gcc/config/i386/i386.opt  |  4 ++
 gcc/config/i386/mwaitintrin.h | 52 +++
 gcc/config/i386/pmmintrin.h   | 13 +-
 gcc/config/i386/sse.md|  4 +-
 gcc/config/i386/x86gprintrin.h|  2 +
 gcc/doc/extend.texi   |  5 +++
 gcc/doc/invoke.texi   |  8 +++-
 gcc/testsuite/gcc.target/i386/monitor-2.c | 27 
 13 files changed, 130 insertions(+), 19 deletions(-)
 create mode 100644 gcc/config/i386/mwaitintrin.h
 create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c

diff --git a/gcc/common/config/i386/i386-common.c 
b/gcc/common/config/i386/i386-common.c
index 1e6c1590ac4..af3f38755fa 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -150,6 +150,7 @@ along with GCC; see the file COPYING3.  If not see
 #define OPTION_MASK_ISA_F16C_SET \
   (OPTION_MASK_ISA_F16C | OPTION_MASK_ISA_AVX_SET)
 #define OPTION_MASK_ISA2_MWAITX_SET OPTION_MASK_ISA2_MWAITX
+#define OPTION_MASK_ISA2_MWAIT_SET OPTION_MASK_ISA2_MWAIT
 #define OPTION_MASK_ISA2_CLZERO_SET OPTION_MASK_ISA2_CLZERO
 #define OPTION_MASK_ISA_PKU_SET OPTION_MASK_ISA_PKU
 #define OPTION_MASK_ISA2_RDPID_SET OPTION_MASK_ISA2_RDPID
@@ -245,6 +246,7 @@ along with GCC; see the file COPYING3.  If not see
 #define OPTION_MASK_ISA_XSAVES_UNSET OPTION_MASK_ISA_XSAVES
 #define OPTION_MASK_ISA_CLWB_UNSET OPTION_MASK_ISA_CLWB
 #define OPTION_MASK_ISA2_MWAITX_UNSET OPTION_MASK_ISA2_MWAITX
+#define OPTION_MASK_ISA2_MWAIT_UNSET OPTION_MASK_ISA2_MWAIT
 #define OPTION_MASK_ISA2_CLZERO_UNSET OPTION_MASK_ISA2_CLZERO
 #define OPTION_MASK_ISA_PKU_UNSET OPTION_MASK_ISA_PKU
 #define OPTION_MASK_ISA2_RDPID_UNSET OPTION_MASK_ISA2_RDPID
@@ -1546,6 +1548,19 @@ ix86_handle_option (struct gcc_options *opts,
}
   return true;
 
+case OPT_mmwait:
+  if (value)
+   {
+ opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_MWAIT_SET;
+ opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_SET;
+   }
+  else
+   {
+ opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_MWAIT_UNSET;
+ opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_UNSET;
+   }
+  return true;
+
 case OPT_mclzero:
   if (value)
{
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 357b0bed067..a020e0808c9 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -414,7 +414,8 @@ i[34567]86-*-*)
   avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h
   tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h
   amxbf16intrin.h x86gprintrin.h uintrintrin.h
-  hresetintrin.h keylockerintrin.h avxvnniintrin.h"
+  hresetintrin.h keylockerintrin.h avxvnniintrin.h
+  mwaitintrin.h"
;;
 x86_64-*-*)
cpu_type=i386
@@ -451,7 +452,8 @@ x86_64-*-*)
   avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h
   tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h
   amxbf16intrin.h x86gprintrin.h uintrintrin.h
-  hresetintrin.h keylockerintrin.h avxvnniintrin.h"
+  hresetintrin.h keylockerintrin.h avxvnniintrin.h
+  

Re: [PATCH] [libstdc++] Refactor/cleanup of atomic wait implementation

2021-04-20 Thread Jonathan Wakely via Gcc-patches

On 19/04/21 12:23 -0700, Thomas Rodgers wrote:

+   template
+ static bool
+ _S_do_spin_v(__platform_wait_t* __addr,
+  const _Up& __old, _ValFn __vfn,
+  __platform_wait_t& __val,
+  _Spin __spin = _Spin{ })
+ {
+   auto const __pred = [=]
+ { return __atomic_compare(__old, __vfn()); };


This doesn't compile, there are 28 FAILs in 29_atomics/*

FAIL: 29_atomics/atomic_integral/cons/value_init.cc (test for excess errors)

It needs to be qualified as __detail::__atomic_compare.

I was hoping to push this to trunk and gcc-11 for the gcc-11 release,
but I'm a bit concerned now.




[committed] libstdc++: Define __cpp_lib_to_chars for C++20 [PR 100146]

2021-04-20 Thread Jonathan Wakely via Gcc-patches
This defines the feature test macro when uselocale is available, because
the floating-point std::from_chars support currently depends on that.

Co-authored-by: Jakub Jelinek 

libstdc++-v3/ChangeLog:

PR libstdc++/100146
* include/std/charconv (__cpp_lib_to_chars): Define
conditionally.
* include/std/version (__cpp_lib_to_chars): Likewise.
* testsuite/20_util/from_chars/4.cc: Only check feature test
macro, not _GLIBCXX_HAVE_USELOCALE.
* testsuite/20_util/from_chars/5.cc: Likewise.
* testsuite/20_util/from_chars/6.cc: Likewise.
* testsuite/20_util/to_chars/long_double.cc: Likewise.

Tested x86_64-linux. Committed to trunk. I'll also backport to gcc-11.
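
For users, the practical effect is that the macro can now be tested before
relying on the floating-point overloads. A minimal sketch, with a
hypothetical helper name and a strtod fallback that is not part of this
patch:

```cpp
#include <charconv>
#include <cstdlib>
#include <string>
#include <string_view>
#include <system_error>

// Hypothetical helper: parse the whole of s as a double.
inline bool
parse_double(std::string_view s, double& out)
{
#if defined(__cpp_lib_to_chars) && __cpp_lib_to_chars >= 201611L
  // Floating-point std::from_chars is available.
  const char* last = s.data() + s.size();
  auto [ptr, ec] = std::from_chars(s.data(), last, out);
  return ec == std::errc{} && ptr == last;
#else
  // Fallback: strtod needs a null-terminated string and, unlike
  // from_chars, depends on the locale and skips leading whitespace.
  std::string tmp(s);
  char* end = nullptr;
  out = std::strtod(tmp.c_str(), &end);
  return !tmp.empty() && end == tmp.c_str() + tmp.size();
#endif
}
```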


commit f1a68574b1f8d2961d3a676dbcf0cc24b6368578
Author: Jonathan Wakely 
Date:   Tue Apr 20 12:46:11 2021

libstdc++: Define __cpp_lib_to_chars for C++20 [PR 100146]

This defines the feature test macro when uselocale is available, because
the floating-point std::from_chars support currently depends on that.

Co-authored-by: Jakub Jelinek 

libstdc++-v3/ChangeLog:

PR libstdc++/100146
* include/std/charconv (__cpp_lib_to_chars): Define
conditionally.
* include/std/version (__cpp_lib_to_chars): Likewise.
* testsuite/20_util/from_chars/4.cc: Only check feature test
macro, not _GLIBCXX_HAVE_USELOCALE.
* testsuite/20_util/from_chars/5.cc: Likewise.
* testsuite/20_util/from_chars/6.cc: Likewise.
* testsuite/20_util/to_chars/long_double.cc: Likewise.

diff --git a/libstdc++-v3/include/std/charconv 
b/libstdc++-v3/include/std/charconv
index 851b1e5cac2..6e407f31e30 100644
--- a/libstdc++-v3/include/std/charconv
+++ b/libstdc++-v3/include/std/charconv
@@ -44,8 +44,9 @@
 #include  // for std::errc
 #include 
 
-// FIXME: Define when floating point is supported:
-// #define __cpp_lib_to_chars 201611L
+#if _GLIBCXX_HAVE_USELOCALE
+# define __cpp_lib_to_chars 201611L
+#endif
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
diff --git a/libstdc++-v3/include/std/version b/libstdc++-v3/include/std/version
index cb25148fca5..d9f6a3c3dfd 100644
--- a/libstdc++-v3/include/std/version
+++ b/libstdc++-v3/include/std/version
@@ -161,7 +161,9 @@
 #endif
 #define __cpp_lib_shared_ptr_weak_type 201606
 #define __cpp_lib_string_view 201803L
-// #define __cpp_lib_to_chars 201611L
+#if _GLIBCXX_HAVE_USELOCALE
+# define __cpp_lib_to_chars 201611L
+#endif
 #define __cpp_lib_unordered_map_try_emplace 201411
 #define __cpp_lib_variant 201606L
 #endif
diff --git a/libstdc++-v3/testsuite/20_util/from_chars/4.cc 
b/libstdc++-v3/testsuite/20_util/from_chars/4.cc
index 1482b5dfb6e..7cee4dcbc23 100644
--- a/libstdc++-v3/testsuite/20_util/from_chars/4.cc
+++ b/libstdc++-v3/testsuite/20_util/from_chars/4.cc
@@ -28,9 +28,7 @@
 
 // Test std::from_chars floating-point conversions.
 
-// As of July 2020 __cpp_lib_to_chars is not defined, but std::from_chars
-// works for floating-point types when _GLIBCXX_HAVE_USELOCALE is defined.
-#if __cpp_lib_to_chars >= 201611L || _GLIBCXX_HAVE_USELOCALE
+#if __cpp_lib_to_chars >= 201611L
 void
 test01()
 {
@@ -363,7 +361,7 @@ test06()
 int
 main()
 {
-#if __cpp_lib_to_chars >= 201611L || _GLIBCXX_HAVE_USELOCALE
+#if __cpp_lib_to_chars >= 201611L
   test01();
   test02();
   test03();
diff --git a/libstdc++-v3/testsuite/20_util/from_chars/5.cc 
b/libstdc++-v3/testsuite/20_util/from_chars/5.cc
index dd5f9229f68..fa86ab7383c 100644
--- a/libstdc++-v3/testsuite/20_util/from_chars/5.cc
+++ b/libstdc++-v3/testsuite/20_util/from_chars/5.cc
@@ -25,9 +25,7 @@
 
 // Test std::from_chars error handling.
 
-// As of July 2020 __cpp_lib_to_chars is not defined, but std::from_chars
-// works for floating-point types when _GLIBCXX_HAVE_USELOCALE is defined.
-#if __cpp_lib_to_chars >= 201611L || _GLIBCXX_HAVE_USELOCALE
+#if __cpp_lib_to_chars >= 201611L
 void
 test01()
 {
@@ -160,7 +158,7 @@ test04()
 int
 main()
 {
-#if __cpp_lib_to_chars >= 201611L || _GLIBCXX_HAVE_USELOCALE
+#if __cpp_lib_to_chars >= 201611L
   test01();
   test02();
   test03();
diff --git a/libstdc++-v3/testsuite/20_util/from_chars/6.cc 
b/libstdc++-v3/testsuite/20_util/from_chars/6.cc
index 58e0f1fc932..139a1811db0 100644
--- a/libstdc++-v3/testsuite/20_util/from_chars/6.cc
+++ b/libstdc++-v3/testsuite/20_util/from_chars/6.cc
@@ -27,7 +27,7 @@
 void
 test01()
 {
-#if __cpp_lib_to_chars >= 201611L || _GLIBCXX_HAVE_USELOCALE
+#if __cpp_lib_to_chars >= 201611L
 #if _GLIBCXX_USE_C99_FENV_TR1
   double d;
   std::fesetround(FE_DOWNWARD);
diff --git a/libstdc++-v3/testsuite/20_util/to_chars/long_double.cc 
b/libstdc++-v3/testsuite/20_util/to_chars/long_double.cc
index 22d42067d65..8cf45ad5e94 100644
--- a/libstdc++-v3/testsuite/20_util/to_chars/long_double.cc
+++ b/libstdc++-v3/testsuite/20_util/to_chars/long_double.cc
@@ -69,7 +69,7 @@ test01()
   // Verifies correctness of the hexadecimal 

[Patch, fortran] PR84119 - Type parameter inquiry for PDT returns array instead of scalar

2021-04-20 Thread Paul Richard Thomas via Gcc-patches
Hi All,

This is another PDT warm-up patch before tackling the real beast: PR82649.

As the contributor wrote in the PR, "The F08 standard clearly distinguishes
between type parameter definition statements and component definition
statements. See R425, R431, R435, and in particular see Note 6.7 which says
'It [array%a, for example] is scalar even if designator is an array.' "
gfortran was not making this distinction. The patch realises the fix by
lifting the code used for inquiry part references into a new function and
calling for PDT parameters and inquiry references. The arrayspec lbound is
used for 'start' now, rather than unity. In principle this should remove
the need to suppress bound checking. However, since this would be confusing
for the user to say the least of it, the suppression has been retained.

Bootstraps and regtests on FC33/x86_64. OK for 12- and 11-branches?

Cheers

Paul

Fortran: Make PDT LEN and KIND expressions always scalar [PR84119].

2021-04-20  Paul Thomas  

gcc/fortran
PR fortran/84119
* resolve.c (reset_array_ref_to_scalar): New function.
(gfc_resolve_ref): Call it for PDT kind and len expressions.
Code for inquiry refs. moved to new function and replaced by a
call to it.

gcc/testsuite/
PR fortran/84119
* gfortran.dg/pdt_32.f03: New test.
* gfortran.dg/pdt_20.f03: Correct the third test to be against
a scalar instead of an array.
diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index dd4b26680e0..1571fa9d70c 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -5254,12 +5254,46 @@ gfc_resolve_substring_charlen (gfc_expr *e)
 }
 
 
+/* Convert an array reference to an array element so that PDT KIND and LEN
+   or inquiry references are always scalar.  */
+
+static void
+reset_array_ref_to_scalar (gfc_expr *expr, gfc_ref *array_ref)
+{
+  gfc_expr *unity = gfc_get_int_expr (gfc_default_integer_kind, NULL, 1);
+  int dim;
+
+  array_ref->u.ar.type = AR_ELEMENT;
+  expr->rank = 0;
+  /* Suppress the runtime bounds check.  */
+  expr->no_bounds_check = 1;
+  for (dim = 0; dim < array_ref->u.ar.dimen; dim++)
+    {
+      array_ref->u.ar.dimen_type[dim] = DIMEN_ELEMENT;
+      if (array_ref->u.ar.start[dim])
+	gfc_free_expr (array_ref->u.ar.start[dim]);
+
+      if (array_ref->u.ar.as && array_ref->u.ar.as->lower[dim])
+	array_ref->u.ar.start[dim]
+			= gfc_copy_expr (array_ref->u.ar.as->lower[dim]);
+      else
+	array_ref->u.ar.start[dim] = gfc_copy_expr (unity);
+
+      if (array_ref->u.ar.end[dim])
+	gfc_free_expr (array_ref->u.ar.end[dim]);
+      if (array_ref->u.ar.stride[dim])
+	gfc_free_expr (array_ref->u.ar.stride[dim]);
+    }
+  gfc_free_expr (unity);
+}
+
+
 /* Resolve subtype references.  */
 
 bool
 gfc_resolve_ref (gfc_expr *expr)
 {
-  int current_part_dimension, n_components, seen_part_dimension, dim;
+  int current_part_dimension, n_components, seen_part_dimension;
   gfc_ref *ref, **prev, *array_ref;
   bool equal_length;
 
@@ -5365,6 +5399,14 @@ gfc_resolve_ref (gfc_expr *expr)
 		}
 	}
 
+	  /* The F08 standard distinguishes between type parameter definition
+	 statements and component definition statements. See R425, R431,
+	 R435, and in particular see Note 6.7 which says "It [array%a, for
+	 example] is scalar even if designator is an array."  */
+	  if (array_ref && (ref->u.c.component->attr.pdt_kind
+			|| ref->u.c.component->attr.pdt_len))
+	reset_array_ref_to_scalar (expr, array_ref);
+
 	  n_components++;
 	  break;
 
@@ -5375,27 +5417,7 @@ gfc_resolve_ref (gfc_expr *expr)
 	  /* Implement requirement in note 9.7 of F2018 that the result of the
 	 LEN inquiry be a scalar.  */
 	  if (ref->u.i == INQUIRY_LEN && array_ref && expr->ts.deferred)
-	{
-	  array_ref->u.ar.type = AR_ELEMENT;
-	  expr->rank = 0;
-	  /* INQUIRY_LEN is not evaluated from the rest of the expr
-		 but directly from the string length. This means that setting
-		 the array indices to one does not matter but might trigger
-		 a runtime bounds error. Suppress the check.  */
-	  expr->no_bounds_check = 1;
-	  for (dim = 0; dim < array_ref->u.ar.dimen; dim++)
-		{
-		  array_ref->u.ar.dimen_type[dim] = DIMEN_ELEMENT;
-		  if (array_ref->u.ar.start[dim])
-		gfc_free_expr (array_ref->u.ar.start[dim]);
-		  array_ref->u.ar.start[dim]
-			= gfc_get_int_expr (gfc_default_integer_kind, NULL, 1);
-		  if (array_ref->u.ar.end[dim])
-		gfc_free_expr (array_ref->u.ar.end[dim]);
-		  if (array_ref->u.ar.stride[dim])
-		gfc_free_expr (array_ref->u.ar.stride[dim]);
-		}
-	}
+	reset_array_ref_to_scalar (expr, array_ref);
 	  break;
 	}
 
diff --git a/gcc/testsuite/gfortran.dg/pdt_20.f03 b/gcc/testsuite/gfortran.dg/pdt_20.f03
index b712ed59dbb..3aa9b2e086b 100644
--- a/gcc/testsuite/gfortran.dg/pdt_20.f03
+++ b/gcc/testsuite/gfortran.dg/pdt_20.f03
@@ -16,5 +16,5 @@ program p
allocate (t2(3) :: x)! Used to segfault in trans-array.c.
if (x%b .ne. 3) STOP 1
if (x%b .ne. size (x%r, 1)) 

Re: [PATCH] [libstdc++] Refactor/cleanup of atomic wait implementation

2021-04-20 Thread Jonathan Wakely via Gcc-patches

On 20/04/21 12:04 +0100, Jonathan Wakely wrote:

On 19/04/21 12:23 -0700, Thomas Rodgers wrote:

From: Thomas Rodgers 

This patch addresses jwakely's feedback from 2021-04-15.

This is a substantial rewrite of the atomic wait/notify (and timed wait
counterparts) implementation.

The previous __platform_wait looped on EINTR; however, this behavior is
not required by the standard. A new _GLIBCXX_HAVE_PLATFORM_WAIT macro
now controls whether wait/notify are implemented using a
platform-specific primitive or with a platform-agnostic mutex/condvar.
This patch only supplies a definition for Linux futexes. A future update
could add support for __ulock_wait/wake on Darwin, for instance.

The members of __waiters were lifted to a new base class. The members
are now arranged such that overall sizeof(__waiters_base) fits in two
cache lines (on platforms with at least 64 byte cache lines). The
definition will also use destructive_interference_size for this if it
is available.
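
As a rough illustration of the two-cache-line layout described above, here
is a minimal sketch. The names waiter_pool_sketch, waiter_count, version and
line_size are hypothetical and are not taken from the patch; the fallback
value of 64 is an assumption for platforms where the library does not
provide std::hardware_destructive_interference_size.

```cpp
#include <cstddef>
#include <new>

// Use the library's constant when it is available, otherwise assume
// 64-byte cache lines.
#ifdef __cpp_lib_hardware_interference_size
inline constexpr std::size_t line_size
  = std::hardware_destructive_interference_size;
#else
inline constexpr std::size_t line_size = 64;
#endif

// Hypothetical layout: each member starts its own cache line, so threads
// that update one member do not invalidate the line holding the other.
struct waiter_pool_sketch
{
  alignas(line_size) unsigned waiter_count = 0; // updated by waiting threads
  alignas(line_size) unsigned version = 0;      // updated by notifying threads
};
```

With two members aligned this way the whole object occupies exactly two
cache lines.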

The __waiters type is now specific to untimed waits. Timed waits have a
corresponding __timed_waiters type. Much of the code has been moved from
the previous __atomic_wait() free function to the __waiter_base template
and a __waiter derived type is provided to implement the un-timed wait
operations. A similar change has been made to the timed wait
implementation.

The __atomic_spin code has been extended to take a spin policy which is
invoked after the initial busy wait loop. The default policy is to
return from the spin. The timed wait code adds a timed backoff spinning
policy. The code from  which implements this_thread::sleep_for,
sleep_until has been moved to a new  header


The commit msg wasn't updated for the latest round of changes
(this_thread_sleep, __waiters_pool_base etc).


which allows the thread sleep code to be consumed without pulling in the
whole of .
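
The spin policy idea described above can be sketched roughly as follows.
This is only an illustration under assumptions: spin_until,
default_spin_policy and bounded_spin_policy are hypothetical names, the
spin count of 16 is arbitrary, and a simple retry budget stands in for the
timed backoff policy.

```cpp
// Default policy: stop spinning as soon as the initial busy-wait is over.
struct default_spin_policy
{
  constexpr bool operator() () const noexcept { return false; }
};

// Illustrative stand-in for a backoff policy: allow a fixed number of
// extra attempts (a real timed policy would consult a clock instead).
struct bounded_spin_policy
{
  int budget;
  constexpr bool operator() () noexcept { return budget-- > 0; }
};

// Busy-wait on pred for spin_count iterations, then keep retrying for as
// long as the policy allows.  Returns true if pred became true.
template<typename Pred, typename Policy = default_spin_policy>
constexpr bool
spin_until (Pred pred, Policy policy = Policy{}, int spin_count = 16)
{
  for (int i = 0; i < spin_count; ++i)
    if (pred ())
      return true;

  while (policy ())
    if (pred ())
      return true;

  return false;
}
```

A caller that gets false back would fall through to the blocking wait.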

The entry points into the wait/notify code have been restructured to
support either -
 * Testing the current value of the atomic stored at the given address
   and waiting on a notification.
 * Applying a predicate to determine if the wait was satisfied.
The entry points were renamed to make it clear that the wait and wake
operations operate on addresses. The first variant takes the expected
value and a function which returns the current value that should be used
in comparison operations; these operations are named with a _v suffix
(for 'value'). All atomic<_Tp> wait/notify operations use the first
variant. Barriers, latches and semaphores use the predicate variant.
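
The two kinds of entry point can be sketched as below. The names wait_until
and wait_while_equal are hypothetical, both functions simply poll rather
than block, and building the value variant on top of the predicate variant
is a simplification made for this example, not necessarily how the patch
structures it.

```cpp
// Predicate variant: return once pred() holds.  A real implementation
// would block between checks; this sketch just polls.
template<typename Pred>
constexpr void
wait_until (Pred pred)
{
  while (!pred ())
    { }
}

// Value variant: wait for as long as the value reported by current() still
// compares equal to old.  The comparison lives in this one place.
template<typename T, typename ValFn>
constexpr void
wait_while_equal (const T &old, ValFn current)
{
  wait_until ([&] { return !(current () == old); });
}
```

An atomic-style caller would pass something like a lambda that loads the
atomic as current, while a latch-style caller would pass its own predicate
to wait_until.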

This change also centralizes what it means to compare values for the
purposes of atomic::wait rather than scattering it through individual
predicates.

This change also centralizes the repetitive code which adjusts for
different user supplied clocks (this should be moved elsewhere
and all such adjustments should use a common implementation).

This change also removes the hashing of the pointer and uses
the pointer value directly for indexing into the waiters table.
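
Indexing by pointer value instead of by a hash might look like the sketch
below. The table size of 16, the shift by two bits and the function names
are assumptions chosen for illustration, not values confirmed by this
message.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed number of entries in the waiters table.
inline constexpr std::size_t table_size = 16;

// Map the numeric value of an address to a table slot.  The low two bits
// are dropped on the assumption that waited-on objects are at least
// 4-byte aligned, so those bits carry no information.
constexpr std::size_t
bucket_for_bits (std::uintptr_t addr_bits)
{
  return (addr_bits >> 2) % table_size;
}

// Convenience wrapper taking the address itself.
inline std::size_t
bucket_for (const void *addr)
{
  return bucket_for_bits (reinterpret_cast<std::uintptr_t> (addr));
}
```

Neighbouring 4-byte objects land in neighbouring slots, and addresses that
differ by a multiple of 64 bytes share a slot.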

libstdc++-v3/ChangeLog:
* include/Makefile.am: Add new  header.


The name needs updating to correspond to the latest version of the
patch.


* include/Makefile.in: Regenerate.
* include/bits/atomic_base.h: Adjust all calls
to __atomic_wait/__atomic_notify for new call signatures.
* include/bits/atomic_wait.h: Extensive rewrite.
* include/bits/atomic_timed_wait.h: Likewise.
* include/bits/semaphore_base.h: Adjust all calls
to __atomic_wait/__atomic_notify for new call signatures.
* include/bits/this_thread_sleep.h: New file.
* include/std/atomic: Likewise.
* include/std/barrier: Likewise.
* include/std/latch: Likewise.


include/std/thread is missing from the changelog entry. You can use
the 'git gcc-verify' alias to check your commit log will be accepted
by the server-side hook:

'gcc-verify' is aliased to '!f() { "`git rev-parse --show-toplevel`/contrib/gcc-changelog/git_check_commit.py" $@; } ; f'



* testsuite/29_atomics/atomic/wait_notify/bool.cc: Simplify
test.
* testsuite/29_atomics/atomic/wait_notify/generic.cc: Likewise.
* testsuite/29_atomics/atomic/wait_notify/pointers.cc: Likewise.
* testsuite/29_atomics/atomic_flag/wait_notify.cc: Likewise.
* testsuite/29_atomics/atomic_float/wait_notify.cc: Likewise.
* testsuite/29_atomics/atomic_integral/wait_notify.cc: Likewise.
* testsuite/29_atomics/atomic_ref/wait_notify.cc: Likewise.



-struct __timed_waiters : __waiters
+struct __timed_waiters : __waiter_pool_base


Should this be __timed_waiter_pool for consistency with
__waiter_pool_base and __waiter_pool?



-inline void
-__thread_relax() noexcept
-{
-#if defined __i386__ || defined __x86_64__
-  __builtin_ia32_pause();
-#elif defined _GLIBCXX_USE_SCHED_YIELD
-  __gthread_yield();
-#endif
-}
+template
+  struct __waiter_base
+  {
+   using 

[PATCH][libgomp, nvptx] Fix hang in gomp_team_barrier_wait_end

2021-04-20 Thread Tom de Vries
Hi,

Consider the following omp fragment.
...
  #pragma omp target
  #pragma omp parallel num_threads (2)
  #pragma omp task
    ;
...

This hangs at -O0 for nvptx.

Investigating the behaviour gives us the following trace of events:
- both threads execute GOMP_task, where they:
  - deposit a task, and
  - execute gomp_team_barrier_wake
- thread 1 executes gomp_team_barrier_wait_end and, not being the last thread,
  proceeds to wait at the team barrier
- thread 0 executes gomp_team_barrier_wait_end and, being the last thread, it
  calls gomp_barrier_handle_tasks, where it:
  - executes both tasks and marks the team barrier done
  - executes a gomp_team_barrier_wake which wakes up thread 1
- thread 1 exits the team barrier
- thread 0 returns from gomp_barrier_handle_tasks and goes to wait at
  the team barrier.
- thread 0 hangs.

To understand why there is a hang here, it's good to understand how things
are set up for nvptx.  The libgomp/config/nvptx/bar.c implementation is
a copy of the libgomp/config/linux/bar.c implementation, with uses of both
futex_wake and do_wait replaced with uses of nvptx insn bar.sync:
...
  if (bar->total > 1)
    asm ("bar.sync 1, %0;" : : "r" (32 * bar->total));
...

The point where thread 0 goes to wait at the team barrier corresponds, in
the linux implementation, to a do_wait.  In the linux case, the call to
do_wait doesn't hang, because it's waiting for bar->generation to become
a certain value, and if bar->generation already has that value, it just
proceeds, without any need for coordination with other threads.

In the nvptx case, the bar.sync waits until thread 1 joins it in the same
logical barrier, which never happens: thread 1 is lingering in the
thread pool at the thread pool barrier (using a different logical barrier),
waiting to join a new team.
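
The difference between the two waiting styles described above can be
reduced to the following sketch. It is a deliberately simplified model, not
libgomp code: both function names are hypothetical, and each function only
answers whether a thread arriving in the given situation would have to
block.

```cpp
// Value-based wait (the linux do_wait style as described above): the thread
// only blocks if the generation does not yet have the awaited value.
constexpr bool
value_wait_would_block (unsigned generation, unsigned awaited)
{
  return generation != awaited;
}

// Rendezvous-style wait (the bar.sync style): the thread blocks until the
// expected number of threads has arrived at the same logical barrier, no
// matter what any shared value says.
constexpr bool
rendezvous_would_block (unsigned arrived, unsigned expected)
{
  return arrived < expected;
}
```

In the trace above, thread 0 arrives after the generation has already been
advanced, so a value-based wait passes straight through, whereas a
rendezvous expecting two threads sees only one and blocks.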

The easiest way to fix this is to revert to the posix implementation for
bar.{c,h}.

Another way would be to revert to the linux implementation for bar.{c,h},
and implement the primitives futex_wait and do_wait using nvptx insns.

This patch instead implements a minimal fix (which makes the implementation
deviate further from the linux one).

The hang was only observed in gomp_team_barrier_wait_end, but we propagate the
fix to its twin gomp_team_barrier_wait_cancel_end as well.

The fix is based on the assumptions that at the point of the fix, after the
call to gomp_barrier_handle_tasks:
- all tasks are done
  (an assert is added to check this), and consequently:
- the executing thread is the only thread left in the team barrier
  (so it's accurate to set nthreads to 1).
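
As a sketch of the decision the fix makes, under the assumptions listed
above: the function names below are hypothetical and the barrier state is
reduced to two booleans, so this models the logic rather than reproducing
the patch.

```cpp
// Number of team threads the final bar.sync should expect.  If this thread
// was the last to arrive and ran the pending tasks itself, every other
// thread has already been woken and has left, so only this thread remains.
constexpr unsigned
threads_to_sync (unsigned total, bool was_last, bool handled_tasks)
{
  return (was_last && handled_tasks) ? 1u : total;
}

// bar.sync is only issued when there is more than one thread to wait for.
constexpr bool
issues_bar_sync (unsigned nthreads)
{
  return nthreads > 1;
}

// Operand passed to bar.sync, following the 32 * bar->total expression
// quoted earlier in this message.
constexpr unsigned
bar_sync_operand (unsigned nthreads)
{
  return 32 * nthreads;
}
```

For the two-thread example, the last thread ends up with a count of 1 and
skips bar.sync, which is what avoids the hang.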

Tested libgomp on x86_64 with nvptx accelerator.

Any comments?

Thanks,
- Tom

[libgomp, nvptx] Fix hang in gomp_team_barrier_wait_end

libgomp/ChangeLog:

2021-04-20  Tom de Vries  

PR target/99555
* config/nvptx/bar.c (gomp_team_barrier_wait_end)
(gomp_team_barrier_wait_cancel_end): Don't try to sync with team threads
that have left the team barrier.
* testsuite/libgomp.c-c++-common/task-detach-6.c: Remove nvptx-specific
workarounds.
* testsuite/libgomp.c/pr99555-1.c: Same.
* testsuite/libgomp.fortran/task-detach-6.f90: Same.

---
 libgomp/config/nvptx/bar.c | 32 --
 .../testsuite/libgomp.c-c++-common/task-detach-6.c |  8 --
 libgomp/testsuite/libgomp.c/pr99555-1.c|  8 --
 .../testsuite/libgomp.fortran/task-detach-6.f90| 12 
 4 files changed, 24 insertions(+), 36 deletions(-)

diff --git a/libgomp/config/nvptx/bar.c b/libgomp/config/nvptx/bar.c
index c5c2fa8829b..058a8d4d5ca 100644
--- a/libgomp/config/nvptx/bar.c
+++ b/libgomp/config/nvptx/bar.c
@@ -78,6 +78,7 @@ void
 gomp_team_barrier_wait_end (gomp_barrier_t *bar, gomp_barrier_state_t state)
 {
   unsigned int generation, gen;
+  unsigned int nthreads = bar->total;
 
   if (__builtin_expect (state & BAR_WAS_LAST, 0))
 {
@@ -90,6 +91,15 @@ gomp_team_barrier_wait_end (gomp_barrier_t *bar, gomp_barrier_state_t state)
   if (__builtin_expect (team->task_count, 0))
{
  gomp_barrier_handle_tasks (state);
+ /* Assert that all tasks have been handled.  */
+ if (team->task_count != 0)
+   __builtin_abort ();
+ /* In gomp_barrier_handle_tasks, the team barrier has been marked
+as done, and all pending threads woken up.  So this is now the
+last and only thread in the barrier.  Adjust nthreads to
+reflect the new situation, to make sure we don't hang
+indefinitely at the bar.sync below.  */
+ nthreads = 1;
  state &= ~BAR_WAS_LAST;
}
   else
@@ -97,8 +107,8 @@ gomp_team_barrier_wait_end (gomp_barrier_t *bar, gomp_barrier_state_t state)
  state &= ~BAR_CANCELLED;
  state += BAR_INCR - BAR_WAS_LAST;
  __atomic_store_n (&bar->generation, state, MEMMODEL_RELEASE);
- if (bar->total > 1)
-   asm ("bar.sync 1, 

Re: [Patch, fortran] PR100110 - Parameterized Derived Types, problems with global variable

2021-04-20 Thread Paul Richard Thomas via Gcc-patches
Hi Tobias,

That was entirely accidental. I should have been more careful about
checking the timing of the merge. When I last checked the number of P1s
seemed to indicate that there was a while before it would happen.

Apologies to all.

Paul


On Tue, 20 Apr 2021 at 11:07, Tobias Burnus  wrote:

> Answer: Because my 'git pull' somehow got stuck – and showed an old trunk.
>
> Your patch just went in before the merge – thus it was on mainline GCC
> 11 and is now
> on mainline GCC 12 + GCC 11 branch ...
>
> Sorry for the confusion.
>
> Tobias
>
> On 20.04.21 11:58, Tobias Burnus wrote:
> > Hi Paul,
> >
> > is there a reason why you did not apply the patch to mainline ('master')
> > but only to GCC 11 ('releases/gcc-11')?
> >
> > While GCC 11 is okay, I had expected it to be (only) on mainline!
> >
> > Tobias
> >
> > On 20.04.21 10:55, Paul Richard Thomas wrote:
> >> Hi Tobias,
> >>
> >> Thanks. Commit r11-8255-g67378cd63d62bf0c69e966d1d202a1e586550a68.
> >>
> >> By the way, I did check that there were no problems with pdt_26.f03
> >> reported by valgrind, given the decrease in the malloc count.
> >>
> >> Cheers
> >>
> >> Paul
> >>
> >>
> >> On Mon, 19 Apr 2021 at 14:08, Tobias Burnus wrote:
> >>
> >> Hi Paul,
> >>
> >> On 19.04.21 14:40, Paul Richard Thomas via Gcc-patches wrote:
> >> > I was just about to announce that I will only do backports and
> >> > regressions, while I finally attack the fundamental problem with the
> >> > representation of Parameterized Derived Types. Then this PR came up
> >> > that was such clear low hanging fruit that I decided to fix it right
> >> > away.
> >> >
> >> > Regtests on FC33/x86_64 - OK for mainline?
> >>
> >> LGTM.
> >>
> >> Thanks,
> >>
> >> Tobias
> >>
> >> -
> >> Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634
> >> München Registergericht München HRB 106955, Geschäftsführer:
> >> Thomas Heurung, Frank Thürauf
> >>
> >>
> >>
> >> --
> >> "If you can't explain it simply, you don't understand it well enough"
> >> - Albert Einstein


-- 
"If you can't explain it simply, you don't understand it well enough" -
Albert Einstein


Re: [PATCH] libstdc++: Update ppc64le baseline_symbols.txt

2021-04-20 Thread Jonathan Wakely via Gcc-patches

On 19/04/21 15:07 +0200, Jakub Jelinek wrote:

On Fri, Apr 16, 2021 at 08:48:12PM +0200, Jakub Jelinek via Gcc-patches wrote:

Tested on powerpc64{,le}-linux now (-m32/-m64 on be) and while the first
patch works fine, the second one unfortunately doesn't on either be or le,
so more work is needed there.  Thus, I'm withdrawing the second
* config/abi/post/powerpc64-linux-gnu/baseline_symbols.txt: Update.
only patch.


Here are the needed changes to make it work.
For symbols with _LDBL_ substring in version name we already have code to
ignore those if no such symbols appear (but it is slightly incorrect, see
below).
So, this patch does the same thing for symbol versions with _IEEE128_
substring.
The previously incorrectly handled case is that in addition to
FUNC:_ZNKSt17__gnu_cxx_ieee1287num_getIcSt19istreambuf_iteratorIcSt11char_traitsIcEEE14_M_extract_intImEES4_S4_S4_RSt8ios_baseRSt12_Ios_IostateRT_@@GLIBCXX_IEEE128_3.4.29
or
OBJECT:12:_ZTSu9__ieee128@@CXXABI_IEEE128_1.3.13
cases we also have the
OBJECT:0:CXXABI_IEEE128_1.3.13
OBJECT:0:GLIBCXX_IEEE128_3.4.29
cases, which have empty version_name and the name is in that case the
symbol version.  Those need to be ignored too.
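
The classification described above can be sketched as follows. This is not
the testsuite code: symbol_sketch and the two functions are hypothetical
names, and the sketch only decides whether a symbol belongs to the _IEEE128_
group; it does not model the rest of the comparison.

```cpp
#include <string_view>

// Hypothetical, reduced view of a baseline symbol entry.
struct symbol_sketch
{
  std::string_view name;         // symbol name
  std::string_view version_name; // symbol version, empty for version nodes
};

constexpr bool
mentions_ieee128 (std::string_view s)
{
  return s.find ("_IEEE128_") != std::string_view::npos;
}

// An ordinary symbol is matched through its version name.  A version node
// such as OBJECT:0:GLIBCXX_IEEE128_3.4.29 has an empty version name and
// carries the version in its name, so the name is checked instead.
constexpr bool
is_ieee128_symbol (const symbol_sketch &s)
{
  if (!s.version_name.empty ())
    return mentions_ieee128 (s.version_name);
  return mentions_ieee128 (s.name);
}
```

Both kinds of entry quoted above are classified as IEEE128 symbols, while a
symbol whose version is plain CXXABI_1.3.13 is not.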

Tested on {powerpc64le,powerpc64,x86_64}-linux, ok for trunk?


OK for trunk and gcc-11, thanks.



Re: [PATCH] libstdc++: Implement P2259R1 changes [PR95983]

2021-04-20 Thread Jonathan Wakely via Gcc-patches

On 19/04/21 22:25 -0400, Patrick Palka via Libstdc++ wrote:

This implements the wording changes of P2259R1 "Repairing input range
adaptors and counted_iterator", which resolves LWG 3283, 3289 and 3408.

The wording changes are relatively straightforward, but they require
some boilerplate to implement: the changes to make a type alias
"conditionally present" in some iterator class are realized by defining
a base class template and an appropriately constrained partial
specialization thereof that contains the type alias, and having the
iterator class derive from this base class.  Sometimes the relevant
condition depend on members from the iterator class, but because a
class is incomplete when instantiating its bases, the constraints on
the partial specialization can't use anything from the iterator class.
This patch works around this by hoisting these members out to the
enclosing scope (e.g. transform_view::_Iterator::_Base is hoisted out
to transform_view::_Base so that transform_view::__iter_cat can use it).

This patch also implements the proposed resolution of LWG 3291 to rename
iota_view::iterator_category to iota_view::iterator_concept, which was
previously problematic due to LWG 3408.

Tested on x86_64-pc-linux-gnu.

libstdc++-v3/ChangeLog:

PR libstdc++/95983
* include/bits/stl_iterator.h (__detail::__move_iter_cat):
Define.
(move_iterator): Derive from the above in C++20 in order to
conditionally define iterator_category as per P2259.
(move_iterator::__base_cat): No longer used, so remove.
(move_iterator::iterator_category): Remove in C++20.
(__detail::__common_iter_use_postfix_proxy): Define.
(common_iterator::_Proxy): Rename to ...
(common_iterator::__arrow_proxy): ... this.
(common_iterator::__postfix_proxy): Define as per P2259.
(common_iterator::operator->): Adjust.
(common_iterator::operator++): Adjust as per P2259.
(iterator_traits::_S_iter_cat): Define.
(iterator_traits::iterator_category): Change as
per P2259.
(__detail::__counted_iter_value_type): Define.
(__detail::__counted_iter_concept): Define.
(__detail::__counted_iter_cat): Define.
(counted_iterator): Derive from the above three classes in order
to conditionally define value_type, iterator_concept and
iterator_category respectively as per P2259.
(counted_iterator::operator->): Define as per P2259.
(incrementable_traits): Remove as per P2259.
(iterator_traits): Adjust as per P2259.
* include/std/ranges (__detail::__iota_view_iter_cat): Define.
(iota_view::_Iterator): Derive from the above in order to
conditionally define iterator_category as per P2259.
(iota_view::_S_iter_cat): Rename to ...
(iota_view::_S_iter_concept): ... this.
(iota_view::iterator_concept): Use it to apply LWG 3291 changes.
(iota_view::iterator_category): Remove.
(__detail::__filter_view_iter_cat): Define.
(filter_view::_Iterator): Derive from the above in order to
conditionally define iterator_category as per P2259.
(filter_view::_Iterator): Move to struct __filter_view_iter_cat.
(filter_view::_Iterator::iterator_category): Remove.
(transform_view::_Base): Define.
(transform_view::__iter_cat): Define.
(transform_view::_Iterator): Derive from the above in order to
conditionally define iterator_category as per P2259.
(transform_view::_Iterator::_Base): Just alias
transform_view::_Base.
(transform_view::_Iterator::_S_iter_cat): Move to struct
transform_view::__iter_cat.
(transform_view::_Iterator::iterator_category): Remove.
(transform_view::_Sentinel::_Base): Just alias
transform_view::_Base.
(join_view::_Base): Define.
(join_view::_Outer_iter): Define.
(join_view::_Inner_iter): Define.
(join_view::_S_ref_is_glvalue): Define.
(join_view::__iter_cat): Define.
(join_view::_Iterator): Derive from it in order to conditionally
define iterator_category as per P2259.
(join_view::_Iterator::_Base): Just alias join_view::_Base.
(join_view::_Iterator::_S_ref_is_glvalue): Just alias
join_view::_S_ref_is_glvalue.
(join_view::_Iterator::_S_iter_cat): Move to struct
transform_view::__iter_cat.
(join_view::_Iterator::_Outer_iter): Just alias
join_view::_Outer_iter.
(join_view::_Iterator::_Inner_iter): Just alias
join_view::_Inner_iter.
(join_view::_Iterator::iterator_category): Remove.
(join_view::_Sentinel::_Base): Just alias join_view::_Base.
(__detail::__split_view_outer_iter_cat): Define.
(__detail::__split_view_inner_iter_cat): Define.
(split_view::_Base): Define.
(split_view::_Outer_iter): Derive from 

Re: [Patch, fortran] PR100110 - Parameterized Derived Types, problems with global variable

2021-04-20 Thread Tobias Burnus

Answer: Because my 'git pull' somehow got stuck – and showed an old trunk.

Your patch just went in before the merge – thus it was on mainline GCC
11 and is now
on mainline GCC 12 + GCC 11 branch ...

Sorry for the confusion.

Tobias

On 20.04.21 11:58, Tobias Burnus wrote:

Hi Paul,

is there a reason why you did not apply the patch to mainline ('master')
but only to GCC 11 ('releases/gcc-11')?

While GCC 11 is okay, I had expected it to be (only) on mainline!

Tobias

On 20.04.21 10:55, Paul Richard Thomas wrote:

Hi Tobias,

Thanks. Commit r11-8255-g67378cd63d62bf0c69e966d1d202a1e586550a68.

By the way, I did check that there were no problems with pdt_26.f03
reported by valgrind, given the decrease in the malloc count.

Cheers

Paul


On Mon, 19 Apr 2021 at 14:08, Tobias Burnus <tob...@codesourcery.com> wrote:

Hi Paul,

On 19.04.21 14:40, Paul Richard Thomas via Gcc-patches wrote:
> I was just about to announce that I will only do backports and
> regressions, while I finally attack the fundamental problem with the
> representation of Parameterized Derived Types. Then this PR came up
> that was such clear low hanging fruit that I decided to fix it right
> away.
>
> Regtests on FC33/x86_64 - OK for mainline?

LGTM.

Thanks,

Tobias




--
"If you can't explain it simply, you don't understand it well enough"
- Albert Einstein



Re: [Patch, fortran] PR100110 - Parameterized Derived Types, problems with global variable

2021-04-20 Thread Jakub Jelinek via Gcc-patches
On Tue, Apr 20, 2021 at 11:58:32AM +0200, Tobias Burnus wrote:
> is there a reason why you did not apply the patch to mainline ('master')
> but only to GCC 11 ('releases/gcc-11')?
> 
> While GCC 11 is okay, I had expected it to be (only) on mainline!

r11-8255 is before the branchpoint, so is both in GCC 11 and trunk.
r11-8256 was the last revision on both, followed by r12-0 on trunk
and r11-8257 on releases/gcc-11.

Jakub
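The revision numbering described above can be sketched as a small lookup. This is only an illustration of the three facts stated in the message (r11-8256 is the last revision on both, r12-* is on trunk, r11-8257 onward is on releases/gcc-11); the `Where` enum and the `where_is` helper are hypothetical names and not part of any GCC tooling.

```cpp
#include <cassert>

// Hypothetical classification of a revision name r<series>-<n>.
enum class Where { Both, TrunkOnly, Gcc11Only, Unknown };

// Last revision shared by trunk and releases/gcc-11, per the message above.
constexpr int kLastSharedR11 = 8256;

inline Where where_is(int series, int n) {
  if (n < 0)
    return Where::Unknown;
  if (series == 11)
    return n <= kLastSharedR11 ? Where::Both : Where::Gcc11Only;
  if (series == 12)
    return Where::TrunkOnly;
  return Where::Unknown;  // Other series are outside the scope of this sketch.
}

// Spot-check the revisions named in the message.
inline void check_examples() {
  assert(where_is(11, 8255) == Where::Both);
  assert(where_is(11, 8257) == Where::Gcc11Only);
  assert(where_is(12, 0) == Where::TrunkOnly);
}
```

With this reading, r11-8255 is classified as being on both branches, which matches the answer given to Tobias.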



Re: [Patch, fortran] PR100110 - Parameterized Derived Types, problems with global variable

2021-04-20 Thread Tobias Burnus

Hi Paul,

is there a reason why you did not apply the patch to mainline ('master')
but only to GCC 11 ('releases/gcc-11')?

While GCC 11 is okay, I had expected it to be (only) on mainline!

Tobias

On 20.04.21 10:55, Paul Richard Thomas wrote:

Hi Tobias,

Thanks. Commit r11-8255-g67378cd63d62bf0c69e966d1d202a1e586550a68.

By the way, I did check that there were no problems with pdt_26.f03
reported by valgrind, given the decrease in the malloc count.

Cheers

Paul


On Mon, 19 Apr 2021 at 14:08, Tobias Burnus <tob...@codesourcery.com> wrote:

Hi Paul,

On 19.04.21 14:40, Paul Richard Thomas via Gcc-patches wrote:
> I was just about to announce that I will only do backports and regressions,
> while I finally attack the fundamental problem with the representation of
> Parameterized Derived Types. Then this PR came up that was such clear low
> hanging fruit that I decided to fix it right away.
>
> Regtests on FC33/x86_64 - OK for mainline?

LGTM.

Thanks,

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634
München Registergericht München HRB 106955, Geschäftsführer:
Thomas Heurung, Frank Thürauf



--
"If you can't explain it simply, you don't understand it well enough"
- Albert Einstein



[PATCH] Bump LTO_major_version to 11.

2021-04-20 Thread Martin Liška
It seems we last bumped LTO_major_version 2 years ago.

Right now, the following is seen when one links a GCC 10.2.x LTO object file:
$ gcc a.o

lto1: fatal error: bytecode stream in file ‘a.o’ generated with LTO version 9.2 instead of the expected 9.0


I suggest bumping LTO_major_version for the releases/gcc-11 branch.
Can we please align it with a GCC release (version 11)? In the future, if e.g.
GCC 12 can consume LTO bytecode from GCC 11, we can leave LTO_major_version
unchanged. Once e.g. GCC 13 needs a bump, I would then change it to 13.

What do you think about the suggested scheme?
Ready for gcc-11 branch?
Martin
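A minimal sketch of the kind of check that produces the error quoted above, assuming the reader insists on an exact match of both the major and the minor number. The constants stand in for the LTO_major_version and LTO_minor_version macros after the bump; `check_lto_version` is a hypothetical name, and the message text omits the file name for brevity.

```cpp
#include <cassert>
#include <string>

// Stand-ins for the LTO_major_version / LTO_minor_version macros after the bump.
constexpr int kLtoMajor = 11;
constexpr int kLtoMinor = 0;

// Hypothetical reader-side check: returns an empty string when the stream
// version matches exactly, otherwise a diagnostic in the style quoted above.
inline std::string check_lto_version(int stream_major, int stream_minor) {
  assert(stream_major >= 0 && stream_minor >= 0);
  if (stream_major == kLtoMajor && stream_minor == kLtoMinor)
    return "";
  return "bytecode stream generated with LTO version " +
         std::to_string(stream_major) + "." + std::to_string(stream_minor) +
         " instead of the expected " + std::to_string(kLtoMajor) + "." +
         std::to_string(kLtoMinor);
}
```

Under the suggested scheme, the major number only moves when the bytecode format actually changes, so a later release that can still read GCC 11 objects would keep `kLtoMajor` at 11.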

gcc/ChangeLog:

* lto-streamer.h (LTO_major_version): Bump to 11.
---
 gcc/lto-streamer.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h
index 5c7cd84d46f..7a7be80dab8 100644
--- a/gcc/lto-streamer.h
+++ b/gcc/lto-streamer.h
@@ -120,7 +120,7 @@ along with GCC; see the file COPYING3.  If not see
  String are represented in the table as pairs, a length in ULEB128
  form followed by the data for the string.  */
 
-#define LTO_major_version 9
+#define LTO_major_version 11
 #define LTO_minor_version 0
 
 typedef unsigned char  lto_decl_flags_t;
-- 
2.31.1



[PATCH][ADA] Bump version to 12.

2021-04-20 Thread Martin Liška
Pushed as obvious to master.

Martin

gcc/ada/ChangeLog:

* gnatvsn.ads: Bump Library_Version to 12.
---
 gcc/ada/gnatvsn.ads | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/ada/gnatvsn.ads b/gcc/ada/gnatvsn.ads
index bbfa9f58d76..c0a77bc8c59 100644
--- a/gcc/ada/gnatvsn.ads
+++ b/gcc/ada/gnatvsn.ads
@@ -32,7 +32,7 @@ package Gnatvsn is
--  Static string identifying this version, that can be used as an argument
--  to e.g. pragma Ident.
 
-   Library_Version : constant String := "11";
+   Library_Version : constant String := "12";
--  Library version. It needs to be updated whenever the major version
--  number is changed.
--
-- 
2.31.1



[wwwdocs] Remove part about coordination with FSF.

2021-04-20 Thread Martin Liška
Pushed after Jakub's confirmation. We do not coordinate major releases with the FSF.

Martin

---
 htdocs/releasing.html | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/htdocs/releasing.html b/htdocs/releasing.html
index d51d6fc7..cb7bb5fe 100644
--- a/htdocs/releasing.html
+++ b/htdocs/releasing.html
@@ -121,9 +121,6 @@ to point to the online-documentation for the branch.
 and info-...@gnu.org mailing lists.
 
 Make sure that most mirrors have retrieved the new release already.
-
-In the case of a major release, coordinate announcements with the
-FSF.
 
 
 
-- 
2.31.1



Re: move selftests into their own files?

2021-04-20 Thread Aldy Hernandez via Gcc-patches
FWIW, I agree with Martin.  Source files with lots of tests make
finding things difficult.  For example, value-range.cc has 400 lines
of tests (which I wrote ;-)).  Even though these tests have proven
invaluable in finding bugs, they make navigating the source file
cumbersome.

Aldy

On Tue, Apr 20, 2021 at 1:27 AM Martin Sebor via Gcc-patches
 wrote:
>
> On 4/19/21 3:13 PM, Koning, Paul wrote:
> >
> >
> >> On Apr 19, 2021, at 4:50 PM, Martin Sebor via Gcc-patches 
> >>  wrote:
> >>
> >> On 4/19/21 2:03 PM, David Malcolm wrote:
> >>> On Mon, 2021-04-19 at 13:47 -0600, Martin Sebor via Gcc-patches wrote:
> >>>> The selftests at the end of many source files are only rarely read
> >>>> or modified, but they contribute to the size/complexity of the files
> >>>> and make moving within the rest of the code more difficult.
> >>>>
> >>> FWIW I prefer having the tests in the same file as the code they test.
> >>>> Would anyone be opposed to moving any of them into new files of their
> >>>> own? E.g., those in tree.c to tree-tests.c, etc.?  I would be happy
> >>>> to do this for a subset of these, with the goal of eventually moving
> >>>> all of them and adding new ones accordingly.
> >>> Having the selftests in the same source file as the thing they test
> >>> allows for the selftest to use "static" declarations and anonymous
> >>> namespaces from that file.  For example, the selftests in diagnostic-
> >>> show-locus.c make use of various things declared in an anonymous
> >>> namespace in that file.  If I had to move the selftests to a different
> >>> file, I'd have to expose these interfaces, which I don't want to do.
> >>
> >> I was actually thinking of just #including each foo-tests.c file
> >> to bring in the code right where it is now, so this shouldn't be
> >> a problem.  Would that work for you?
> >>
> >> Martin
> >
> > How does that help with the problem you said needs to be solved?  If having
> > self test code be part of the compilation unit makes modifying things more
> > difficult, it doesn't matter whether that code is in the compilation unit
> > due to being in the main source file, or due to being a #include.
>
> The self tests make the sources bigger and so harder to move around
> in and difficult to find just the calls to tested functions made
> from elsewhere in the file or from other parts of the compiler (i.e.,
> not tests).  They are only rarely relevant when reading or changing
> the file.
>
> Keeping them separate from the code they exercise will be helpful
> to me and I assumed to others as well.  But I wouldn't want to make
> some common tasks difficult, so if you or someone else has one that
> would be made so by it, I won't pursue it.  Do you?
>
> Martin
>
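The #include idea discussed in the quoted thread can be sketched in a single block. The file names foo.cc and foo-tests.cc, and every identifier below, are hypothetical; the point is only that tests textually included into the same translation unit can still reach helpers in an anonymous namespace.

```cpp
#include <cassert>

// --- foo.cc (hypothetical file) ---
namespace {
// Internal helper with internal linkage; other translation units cannot see it.
int clamp_to_byte(int v) { return v < 0 ? 0 : (v > 255 ? 255 : v); }
}  // namespace

int scaled_byte(int v) { return clamp_to_byte(v) * 2; }

// Under the proposal, everything below would live in foo-tests.cc and be
// brought in at this point with:  #include "foo-tests.cc"
namespace selftest {
// Compiled as part of foo.cc's translation unit, so it can call
// clamp_to_byte without that helper being exported.
void foo_cc_tests() {
  assert(clamp_to_byte(-5) == 0);
  assert(clamp_to_byte(300) == 255);
  assert(clamp_to_byte(7) == 7);
}
}  // namespace selftest
```

This keeps the "static" and anonymous-namespace access that was raised as a concern, while the test code itself sits in a separate file for navigation purposes. It does not reduce what the compiler sees, which is the objection raised in the thread.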



Re: [PATCH] [libstdc++] Refactor/cleanup of atomic wait implementation

2021-04-20 Thread Jonathan Wakely via Gcc-patches

On 19/04/21 12:23 -0700, Thomas Rodgers wrote:

  namespace __detail
  {
-using __platform_wait_t = int;
-
-constexpr auto __atomic_spin_count_1 = 16;
-constexpr auto __atomic_spin_count_2 = 12;
-
-template
-  inline constexpr bool __platform_wait_uses_type
#ifdef _GLIBCXX_HAVE_LINUX_FUTEX
-   = is_same_v, __platform_wait_t>;
+using __platform_wait_t = int;
#else
-   = false;
+using __platform_wait_t = uint64_t;
+#endif
+static constexpr size_t __platform_wait_alignment
+ = alignof(__platform_wait_t);


The value of this constant can't be alignof(__platform_wait_t). As
discussed, a futex always needs 4-byte alignment, but on at least one
target that GCC supports, alignof(int) == 2.


+  } // namespace __detail
+
+  template
+inline constexpr bool __platform_wait_uses_type
+#ifdef _GLIBCXX_HAVE_PLATFORM_WAIT
+  = is_scalar_v<_Tp>
+   && ((sizeof(_Tp) == sizeof(__detail::__platform_wait_t))
+   && (alignof(_Tp*) >= alignof(__detail::__platform_wait_t)));


Now that we have the __platform_wait_alignment it should be used here
(so that when we fix the constant, this gets fixed too).
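The two review comments above can be illustrated together. The sketch below is not the libstdc++ code: the names are de-uglified stand-ins, the value 4 is an assumption taken from the statement that a futex always needs 4-byte alignment, and the trait checks the alignment of `T` itself rather than reproducing the expression in the quoted patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

namespace sketch {
using platform_wait_t = int;  // futex word type, as in the quoted patch

// Assumption: hard-code the 4-byte futex requirement instead of relying on
// alignof(int), which the review notes can be 2 on at least one target.
inline constexpr std::size_t platform_wait_alignment = 4;

// The trait reuses the constant, so fixing the constant fixes the trait too.
template <typename T>
inline constexpr bool platform_wait_uses_type =
    std::is_scalar_v<T> && sizeof(T) == sizeof(platform_wait_t) &&
    alignof(T) >= platform_wait_alignment;

// A wait word declared with the shared constant.
struct waiter_word {
  alignas(platform_wait_alignment) platform_wait_t value = 0;
};
static_assert(alignof(waiter_word) >= platform_wait_alignment,
              "wait word must satisfy the futex alignment");

// Run-time counterpart of the trait: verify an object's address.
inline void assert_waitable(const void* p) {
  assert(reinterpret_cast<std::uintptr_t>(p) % platform_wait_alignment == 0);
}
}  // namespace sketch
```

Because both the trait and the member declaration refer to one constant, a target-specific correction only has to be made in a single place.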


+#else
+  = false;
#endif

+  namespace __detail
+  {
#ifdef _GLIBCXX_HAVE_LINUX_FUTEX
+#define _GLIBCXX_HAVE_PLATFORM_WAIT 1
enum class __futex_wait_flags : int
{
#ifdef _GLIBCXX_HAVE_LINUX_FUTEX_PRIVATE






+
+ static __waiter_type&
+ _S_for(const void* __addr)
+ {
+   static_assert(sizeof(__waiter_type) == sizeof(__waiter_pool_base));
+   auto& res = __waiter_pool_base::_S_for(__addr);
+   return reinterpret_cast<__waiter_type&>(res);
+ }


Nit: this is still indented as if it were a function template.


  : _M_a(__expected) { }
@@ -73,8 +73,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
_GLIBCXX_ALWAYS_INLINE void
wait() const noexcept
{
-  auto const __old = __atomic_impl::load(&_M_a, memory_order::acquire);
-  std::__atomic_wait(&_M_a, __old, [this] { return this->try_wait(); });
+  auto const __pred = [this] { return this->try_wait(); };
+  std::__atomic_wait_address(&_M_a, __pred);
}

_GLIBCXX_ALWAYS_INLINE void
@@ -85,7 +85,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
}

  private:
-alignas(__alignof__(ptrdiff_t)) ptrdiff_t _M_a;
+alignas(__alignof__(__detail::__platform_wait_t)) __detail::__platform_wait_t _M_a;


This should use the new constant too.





RE: [PATCH] Adjust guality xfails for aarch64*-*-*

2021-04-20 Thread Kyrylo Tkachov via Gcc-patches


> -Original Message-
> From: Richard Sandiford 
> Sent: 19 April 2021 20:34
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Earnshaw ; Kyrylo Tkachov
> 
> Subject: [PATCH] Adjust guality xfails for aarch64*-*-*
> 
> This patch gives clean guality.exp test results for aarch64-linux-gnu
> with modern (top-of-tree) gdb.
> 
> For people using older gdbs, it will trade one set of noisy results for
> another set.  I still think it's better to have the xfails based on
> one “clean” and “modern” run rather than have FAILs and XPASSes for
> all runs.
> 
> It's hard to tell which of these results are aarch64-specific and
> which aren't.  If other target maintainers want to do something similar,
> and are prepared to assume the same gdb version, then it should become
> clearer over time which ones are target-specific and which aren't.
> 
> There are no new skips here, so changes in test results will still
> show up as XPASSes.
> 
> I've not analysed the failures or filed PRs for them.  In some
> ways the guality directory itself seems like the best place to
> start looking for xfails, if someone's interested in working
> in this area.
> 
> Tested on aarch64-linux-gnu.  I'd like to install this once the
> earlier no-opts/any-opts patch is in, but please let me know
> if you think it's a bad idea.

It's okay with me.
Thanks,
Kyrill

> 
> Richard
> 
> 
> gcc/testsuite/
>   * gcc.dg/guality/example.c: Update aarch64*-*-* xfails.
>   * gcc.dg/guality/guality.c: Likewise.
>   * gcc.dg/guality/inline-params.c: Likewise.
>   * gcc.dg/guality/loop-1.c: Likewise.
>   * gcc.dg/guality/pr36728-1.c: Likewise.
>   * gcc.dg/guality/pr36728-2.c: Likewise.
>   * gcc.dg/guality/pr36728-3.c: Likewise.
>   * gcc.dg/guality/pr41447-1.c: Likewise.
>   * gcc.dg/guality/pr54200.c:  Likewise.
>   * gcc.dg/guality/pr54519-1.c: Likewise.
>   * gcc.dg/guality/pr54519-2.c: Likewise.
>   * gcc.dg/guality/pr54519-3.c: Likewise.
>   * gcc.dg/guality/pr54519-4.c: Likewise.
>   * gcc.dg/guality/pr54519-5.c: Likewise.
>   * gcc.dg/guality/pr54519-6.c: Likewise.
>   * gcc.dg/guality/pr54693-2.c: Likewise.
>   * gcc.dg/guality/pr56154-1.c: Likewise.
>   * gcc.dg/guality/pr59776.c: Likewise.
>   * gcc.dg/guality/pr68860-1.c: Likewise.
>   * gcc.dg/guality/pr68860-2.c: Likewise.
>   * gcc.dg/guality/pr90074.c: Likewise.
>   * gcc.dg/guality/pr90716.c: Likewise.
>   * gcc.dg/guality/sra-1.c: Likewise.
> ---
>  gcc/testsuite/gcc.dg/guality/example.c   |  3 +-
>  gcc/testsuite/gcc.dg/guality/guality.c   |  2 +-
>  gcc/testsuite/gcc.dg/guality/inline-params.c |  2 +-
>  gcc/testsuite/gcc.dg/guality/loop-1.c|  2 +-
>  gcc/testsuite/gcc.dg/guality/pr36728-1.c |  2 +-
>  gcc/testsuite/gcc.dg/guality/pr36728-2.c | 30 ++--
>  gcc/testsuite/gcc.dg/guality/pr36728-3.c |  2 +-
>  gcc/testsuite/gcc.dg/guality/pr41447-1.c |  1 +
>  gcc/testsuite/gcc.dg/guality/pr54200.c   |  2 +-
>  gcc/testsuite/gcc.dg/guality/pr54519-1.c |  8 +++---
>  gcc/testsuite/gcc.dg/guality/pr54519-2.c |  2 +-
>  gcc/testsuite/gcc.dg/guality/pr54519-3.c |  8 +++---
>  gcc/testsuite/gcc.dg/guality/pr54519-4.c |  2 +-
>  gcc/testsuite/gcc.dg/guality/pr54519-5.c |  2 +-
>  gcc/testsuite/gcc.dg/guality/pr54519-6.c |  2 +-
>  gcc/testsuite/gcc.dg/guality/pr54693-2.c |  2 +-
>  gcc/testsuite/gcc.dg/guality/pr56154-1.c |  2 +-
>  gcc/testsuite/gcc.dg/guality/pr59776.c   | 12 
>  gcc/testsuite/gcc.dg/guality/pr68860-1.c |  2 +-
>  gcc/testsuite/gcc.dg/guality/pr68860-2.c |  2 +-
>  gcc/testsuite/gcc.dg/guality/pr90074.c   |  4 +--
>  gcc/testsuite/gcc.dg/guality/pr90716.c   |  2 +-
>  gcc/testsuite/gcc.dg/guality/sra-1.c |  8 +++---
>  23 files changed, 53 insertions(+), 51 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/guality/example.c
> b/gcc/testsuite/gcc.dg/guality/example.c
> index 26d25c28590..6f1c017a253 100644
> --- a/gcc/testsuite/gcc.dg/guality/example.c
> +++ b/gcc/testsuite/gcc.dg/guality/example.c
> @@ -1,5 +1,6 @@
> -/* { dg-do run { xfail *-*-* } } */
> +/* { dg-do run { xfail { ! aarch64*-*-* } } } */
>  /* { dg-options "-g" } */
> +/* { dg-xfail-run-if "" aarch64*-*-* "*" { "-O[01g]" } } */
> 
>  #define GUALITY_DONT_FORCE_LIVE_AFTER -1
> 
> diff --git a/gcc/testsuite/gcc.dg/guality/guality.c
> b/gcc/testsuite/gcc.dg/guality/guality.c
> index db015e6a558..a4de5646fc7 100644
> --- a/gcc/testsuite/gcc.dg/guality/guality.c
> +++ b/gcc/testsuite/gcc.dg/guality/guality.c
> @@ -1,4 +1,4 @@
> -/* { dg-do run { xfail *-*-* } } */
> +/* { dg-do run { xfail { ! aarch64*-*-* } } } */
>  /* { dg-options "-g" } */
>  /* { dg-require-effective-target alloca } */
> 
> diff --git a/gcc/testsuite/gcc.dg/guality/inline-params.c
> b/gcc/testsuite/gcc.dg/guality/inline-params.c
> index f4c5f15094c..6be240a28d2 100644
> --- a/gcc/testsuite/gcc.dg/guality/inline-params.c
> +++ 

[PATCH][DOC] Document ranger-logical-depth in invoke.texi

2021-04-20 Thread Martin Liška
Pushed as obvious to master (and releases/gcc-11 branch).

Martin

gcc/ChangeLog:

* doc/invoke.texi: Document new param.
---
 gcc/doc/invoke.texi | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 8b70fdf580d..a1b50406048 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -14364,6 +14364,10 @@ The parameter is used only in GIMPLE FE.
 The maximum number of 'after supernode' exploded nodes within the analyzer
 per supernode, before terminating analysis.
 
+@item ranger-logical-depth
+Maximum depth of logical expression evaluation ranger will look through
+when evaluating outgoing edge ranges.
+
 @item openacc-kernels
 Specify mode of OpenACC `kernels' constructs handling.
 With @option{--param=openacc-kernels=decompose}, OpenACC `kernels'
-- 
2.31.1



Re: [Patch, fortran] PR100110 - Parameterized Derived Types, problems with global variable

2021-04-20 Thread Paul Richard Thomas via Gcc-patches
Hi Tobias,

Thanks. Commit r11-8255-g67378cd63d62bf0c69e966d1d202a1e586550a68.

By the way, I did check that there were no problems with pdt_26.f03
reported by valgrind, given the decrease in the malloc count.

Cheers

Paul


On Mon, 19 Apr 2021 at 14:08, Tobias Burnus  wrote:

> Hi Paul,
>
> On 19.04.21 14:40, Paul Richard Thomas via Gcc-patches wrote:
> > I was just about to announce that I will only do backports and regressions,
> > while I finally attack the fundamental problem with the representation of
> > Parameterized Derived Types. Then this PR came up that was such clear low
> > hanging fruit that I decided to fix it right away.
> >
> > Regtests on FC33/x86_64 - OK for mainline?
>
> LGTM.
>
> Thanks,
>
> Tobias
>
> -
> Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München
> Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank
> Thürauf
>


-- 
"If you can't explain it simply, you don't understand it well enough" -
Albert Einstein


[PATCH] testsuite: Fix gcc.dg/vect/bb-slp-39.c on IBM Z

2021-04-20 Thread Stefan Schulze Frielinghaus via Gcc-patches
On IBM Z the aliasing stores are realized through one-element vector
instructions when no cost model for vectorization is used, which is the
default according to vect.exp.  Fixed by changing the number of times
the pattern must be found in the dump.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/bb-slp-39.c: Change number of times the pattern
must match for target IBM Z only.

Ok for mainline?

---
 gcc/testsuite/gcc.dg/vect/bb-slp-39.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-39.c 
b/gcc/testsuite/gcc.dg/vect/bb-slp-39.c
index 255bb1095dc..ee596cfa08b 100644
--- a/gcc/testsuite/gcc.dg/vect/bb-slp-39.c
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-39.c
@@ -16,4 +16,5 @@ void foo (double *p)
 }
 
 /* See that we vectorize three SLP instances.  */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 3 "slp2" } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 3 "slp2" { target { ! s390*-*-* } } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 5 "slp2" { target {   s390*-*-* } } } } */
-- 
2.23.0



Re: [PATCH] testsuite: Fix up gcc.target/s390/zero-scratch-regs-1.c

2021-04-20 Thread Andreas Krebbel via Gcc-patches
On 4/20/21 9:17 AM, Stefan Schulze Frielinghaus wrote:
> Depending on whether GCC is configured using --with-mode=zarch or not,
> instructions for the 31-bit target are generated either for ESA or
> z/Architecture.  For the sake of simplicity and robustness, test only for
> the latter by manually adding the option -mzarch.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/zero-scratch-regs-1.c: Force test to run for
>   z/Architecture only.
> 
> Ok for mainline?

Ok. Thanks!

Andreas

> 
> ---
>  .../gcc.target/s390/zero-scratch-regs-1.c | 95 ---
>  1 file changed, 40 insertions(+), 55 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/s390/zero-scratch-regs-1.c 
> b/gcc/testsuite/gcc.target/s390/zero-scratch-regs-1.c
> index c394c4b69e7..1c02c0c4e51 100644
> --- a/gcc/testsuite/gcc.target/s390/zero-scratch-regs-1.c
> +++ b/gcc/testsuite/gcc.target/s390/zero-scratch-regs-1.c
> @@ -1,65 +1,50 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -fzero-call-used-regs=all -march=z13" } */
> +/* { dg-options "-O2 -fzero-call-used-regs=all -march=z13 -mzarch" } */
>  
>  /* Ensure that all call clobbered GPRs, FPRs, and VRs are zeroed and all call
> saved registers are kept. */
>  
>  void foo (void) { }
>  
> -/* { dg-final { scan-assembler-times "lhi\t" 6 { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lhi\t%r0,0" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lhi\t%r1,0" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lhi\t%r2,0" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lhi\t%r3,0" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lhi\t%r4,0" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lhi\t%r5,0" { target { ! lp64 } } } } */
> +/* { dg-final { scan-assembler-times "lghi\t" 6 } } */
> +/* { dg-final { scan-assembler "lghi\t%r0,0" } } */
> +/* { dg-final { scan-assembler "lghi\t%r1,0" } } */
> +/* { dg-final { scan-assembler "lghi\t%r2,0" } } */
> +/* { dg-final { scan-assembler "lghi\t%r3,0" } } */
> +/* { dg-final { scan-assembler "lghi\t%r4,0" } } */
> +/* { dg-final { scan-assembler "lghi\t%r5,0" } } */
>  
> -/* { dg-final { scan-assembler-times "lzdr\t" 14 { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lzdr\t%f0" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lzdr\t%f1" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lzdr\t%f2" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lzdr\t%f3" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lzdr\t%f5" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lzdr\t%f7" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lzdr\t%f8" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lzdr\t%f9" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lzdr\t%f10" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lzdr\t%f11" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lzdr\t%f12" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lzdr\t%f13" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lzdr\t%f14" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lzdr\t%f15" { target { ! lp64 } } } } */
> -
> -/* { dg-final { scan-assembler-times "lghi\t" 6 { target { lp64 } } } } */
> -/* { dg-final { scan-assembler "lghi\t%r0,0" { target { lp64 } } } } */
> -/* { dg-final { scan-assembler "lghi\t%r1,0" { target { lp64 } } } } */
> -/* { dg-final { scan-assembler "lghi\t%r2,0" { target { lp64 } } } } */
> -/* { dg-final { scan-assembler "lghi\t%r3,0" { target { lp64 } } } } */
> -/* { dg-final { scan-assembler "lghi\t%r4,0" { target { lp64 } } } } */
> -/* { dg-final { scan-assembler "lghi\t%r5,0" { target { lp64 } } } } */
> -
> -/* { dg-final { scan-assembler-times "vzero\t" 24 { target { lp64 } } } } */
> -/* { dg-final { scan-assembler "vzero\t%v0" { target { lp64 } } } } */
> -/* { dg-final { scan-assembler "vzero\t%v1" { target { lp64 } } } } */
> -/* { dg-final { scan-assembler "vzero\t%v2" { target { lp64 } } } } */
> -/* { dg-final { scan-assembler "vzero\t%v3" { target { lp64 } } } } */
> +/* { dg-final { scan-assembler-times "vzero\t" 30 { target { ! lp64 } } } } */
> +/* { dg-final { scan-assembler-times "vzero\t" 24 { target {   lp64 } } } } */
> +/* { dg-final { scan-assembler "vzero\t%v0" } } */
> +/* { dg-final { scan-assembler "vzero\t%v1" } } */
> +/* { dg-final { scan-assembler "vzero\t%v2" } } */
> +/* { dg-final { scan-assembler "vzero\t%v3" } } */
>  /* { dg-final { scan-assembler "vzero\t%v4" { target { lp64 } } } } */
> -/* { dg-final { scan-assembler "vzero\t%v5" { target { lp64 } } } } */
> +/* { dg-final { scan-assembler "vzero\t%v5" } } */
>  /* { dg-final { scan-assembler "vzero\t%v6" { target { lp64 } } } } */
> -/* { dg-final { scan-assembler "vzero\t%v7" { target { lp64 } } } } */
> -/* { dg-final { 

[PATCH] testsuite: Fix up gcc.target/s390/zero-scratch-regs-1.c

2021-04-20 Thread Stefan Schulze Frielinghaus via Gcc-patches
Depending on whether GCC is configured using --with-mode=zarch or not,
instructions for the 31-bit target are generated either for ESA or
z/Architecture.  For the sake of simplicity and robustness, test only for
the latter by manually adding the option -mzarch.

gcc/testsuite/ChangeLog:

* gcc.target/s390/zero-scratch-regs-1.c: Force test to run for
z/Architecture only.

Ok for mainline?

---
 .../gcc.target/s390/zero-scratch-regs-1.c | 95 ---
 1 file changed, 40 insertions(+), 55 deletions(-)

diff --git a/gcc/testsuite/gcc.target/s390/zero-scratch-regs-1.c 
b/gcc/testsuite/gcc.target/s390/zero-scratch-regs-1.c
index c394c4b69e7..1c02c0c4e51 100644
--- a/gcc/testsuite/gcc.target/s390/zero-scratch-regs-1.c
+++ b/gcc/testsuite/gcc.target/s390/zero-scratch-regs-1.c
@@ -1,65 +1,50 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fzero-call-used-regs=all -march=z13" } */
+/* { dg-options "-O2 -fzero-call-used-regs=all -march=z13 -mzarch" } */
 
 /* Ensure that all call clobbered GPRs, FPRs, and VRs are zeroed and all call
saved registers are kept. */
 
 void foo (void) { }
 
-/* { dg-final { scan-assembler-times "lhi\t" 6 { target { ! lp64 } } } } */
-/* { dg-final { scan-assembler "lhi\t%r0,0" { target { ! lp64 } } } } */
-/* { dg-final { scan-assembler "lhi\t%r1,0" { target { ! lp64 } } } } */
-/* { dg-final { scan-assembler "lhi\t%r2,0" { target { ! lp64 } } } } */
-/* { dg-final { scan-assembler "lhi\t%r3,0" { target { ! lp64 } } } } */
-/* { dg-final { scan-assembler "lhi\t%r4,0" { target { ! lp64 } } } } */
-/* { dg-final { scan-assembler "lhi\t%r5,0" { target { ! lp64 } } } } */
+/* { dg-final { scan-assembler-times "lghi\t" 6 } } */
+/* { dg-final { scan-assembler "lghi\t%r0,0" } } */
+/* { dg-final { scan-assembler "lghi\t%r1,0" } } */
+/* { dg-final { scan-assembler "lghi\t%r2,0" } } */
+/* { dg-final { scan-assembler "lghi\t%r3,0" } } */
+/* { dg-final { scan-assembler "lghi\t%r4,0" } } */
+/* { dg-final { scan-assembler "lghi\t%r5,0" } } */
 
-/* { dg-final { scan-assembler-times "lzdr\t" 14 { target { ! lp64 } } } } */
-/* { dg-final { scan-assembler "lzdr\t%f0" { target { ! lp64 } } } } */
-/* { dg-final { scan-assembler "lzdr\t%f1" { target { ! lp64 } } } } */
-/* { dg-final { scan-assembler "lzdr\t%f2" { target { ! lp64 } } } } */
-/* { dg-final { scan-assembler "lzdr\t%f3" { target { ! lp64 } } } } */
-/* { dg-final { scan-assembler "lzdr\t%f5" { target { ! lp64 } } } } */
-/* { dg-final { scan-assembler "lzdr\t%f7" { target { ! lp64 } } } } */
-/* { dg-final { scan-assembler "lzdr\t%f8" { target { ! lp64 } } } } */
-/* { dg-final { scan-assembler "lzdr\t%f9" { target { ! lp64 } } } } */
-/* { dg-final { scan-assembler "lzdr\t%f10" { target { ! lp64 } } } } */
-/* { dg-final { scan-assembler "lzdr\t%f11" { target { ! lp64 } } } } */
-/* { dg-final { scan-assembler "lzdr\t%f12" { target { ! lp64 } } } } */
-/* { dg-final { scan-assembler "lzdr\t%f13" { target { ! lp64 } } } } */
-/* { dg-final { scan-assembler "lzdr\t%f14" { target { ! lp64 } } } } */
-/* { dg-final { scan-assembler "lzdr\t%f15" { target { ! lp64 } } } } */
-
-/* { dg-final { scan-assembler-times "lghi\t" 6 { target { lp64 } } } } */
-/* { dg-final { scan-assembler "lghi\t%r0,0" { target { lp64 } } } } */
-/* { dg-final { scan-assembler "lghi\t%r1,0" { target { lp64 } } } } */
-/* { dg-final { scan-assembler "lghi\t%r2,0" { target { lp64 } } } } */
-/* { dg-final { scan-assembler "lghi\t%r3,0" { target { lp64 } } } } */
-/* { dg-final { scan-assembler "lghi\t%r4,0" { target { lp64 } } } } */
-/* { dg-final { scan-assembler "lghi\t%r5,0" { target { lp64 } } } } */
-
-/* { dg-final { scan-assembler-times "vzero\t" 24 { target { lp64 } } } } */
-/* { dg-final { scan-assembler "vzero\t%v0" { target { lp64 } } } } */
-/* { dg-final { scan-assembler "vzero\t%v1" { target { lp64 } } } } */
-/* { dg-final { scan-assembler "vzero\t%v2" { target { lp64 } } } } */
-/* { dg-final { scan-assembler "vzero\t%v3" { target { lp64 } } } } */
+/* { dg-final { scan-assembler-times "vzero\t" 30 { target { ! lp64 } } } } */
+/* { dg-final { scan-assembler-times "vzero\t" 24 { target {   lp64 } } } } */
+/* { dg-final { scan-assembler "vzero\t%v0" } } */
+/* { dg-final { scan-assembler "vzero\t%v1" } } */
+/* { dg-final { scan-assembler "vzero\t%v2" } } */
+/* { dg-final { scan-assembler "vzero\t%v3" } } */
 /* { dg-final { scan-assembler "vzero\t%v4" { target { lp64 } } } } */
-/* { dg-final { scan-assembler "vzero\t%v5" { target { lp64 } } } } */
+/* { dg-final { scan-assembler "vzero\t%v5" } } */
 /* { dg-final { scan-assembler "vzero\t%v6" { target { lp64 } } } } */
-/* { dg-final { scan-assembler "vzero\t%v7" { target { lp64 } } } } */
-/* { dg-final { scan-assembler "vzero\t%v16" { target { lp64 } } } } */
-/* { dg-final { scan-assembler "vzero\t%v17" { target { lp64 } } } } */
-/* { dg-final { scan-assembler "vzero\t%v18" { target { lp64 } } } } */
-/* { dg-final { scan-assembler "vzero\t%v19" { target { 

Re: [PATCH] s390/testsuite: Fix oscbreak-1.c.

2021-04-20 Thread Andreas Krebbel via Gcc-patches
On 4/16/21 3:59 PM, Robin Dapp wrote:
> Hi,
> 
> checking for an osc break is somewhat brittle especially with many
> passes potentially introducing new insns and moving them around.
> Therefore, only run the test with -O1 -fschedule-insns in order to limit
> the influence of other passes.

Yeah, that's because of the very limited analysis we do in the backend to
detect such cases. In fact we probably would want to have an OSC break in
many of them as well.

For me the testcase appears to work with -O2 on all the -march levels. I think
-O2 would be preferred because that's what is most frequently used.

> 
> Is it OK?

Yes, either with -O2 or the options you have proposed if -O2 doesn't work out
for you.

Thanks!

Andreas

> 
> Regards
>   Robin
> 
> --
> 
> gcc/testsuite/ChangeLog:
> 
> * gcc.target/s390/oscbreak-1.c: Compile with -O1
>   -fschedule-insns.
> 



Re: Fix Fortran rounding issues, PR fortran/96983.

2021-04-20 Thread Richard Biener via Gcc-patches
On Mon, Apr 19, 2021 at 9:40 PM Michael Meissner via Fortran
 wrote:
>
> Fix Fortran rounding issues, PR fortran/96983.
>
> I was looking at Fortran PR 96983, which fails on the PowerPC when trying to
> run the test PR96711.F90.  The compiler ICEs because the PowerPC does not have
> a floating point type with a type precision of 128.  The reason is that the
> PowerPC has 3 different 128 bit floating point types (__float128/_Float128,
> __ibm128, and long double).  Currently long double uses the IBM extended 
> double
> type, but we would like to switch to using IEEE 128-bit long doubles in the
> future.
>
> In order to prevent the compiler from converting explicit __ibm128 types to
> long double when long double uses the IEEE 128-bit representation, we have set
> up the precision for __ibm128 to be 128, long double to be 127, and
> __float128/_Float128 to be 126.
>
> Originally, I was trying to see if for Fortran, I could change the precision
> of long double to be 128 (Fortran doesn't access __ibm128), but it quickly
> became hard to get the changes to work.
>
> I looked at the Fortran code in build_round_expr, and I came to the conclusion
> that there is no reason to promote the floating point type.  You can just do a
> normal round of the value using the current floating point format and then
> convert it to the integer type.  We don't have an appropriate built-in
> function that provides the equivalent of llround for 128-bit integer types.
>
> This patch fixes the compiler crash.
>
> However, while with this patch, the PowerPC compiler will not crash when
> building the test case, it will not run on the current default installation.
> The failure is because the test is explicitly expecting 128-bit floating point
> to handle 10384593717069655257060992658440192_16 (i.e. 2**113).
>
> By default, the PowerPC uses IBM extended double for 128-bit floating
> point.  The IBM extended double format is a pair of doubles that provides more
> mantissa bits but does not grow the exponent range.  The value in the test is
> fine for IEEE 128-bit floating point, but it is too large for the PowerPC
> extended double setup.
>
> I have built the following tests with this patch:
>
>* I have built a bootstrap compiler on a little endian power9 Linux system
>  with the default long double format (IBM extended double).  The
>  pr96711.f90 test builds, but it does not run due to the range of the
>  real*16 exponent.  There were no other regressions in the C/C++/Fortran
>  tests.
>
>* I have built a bootstrap compiler on a little endian power9 Linux system
>  with the default long double format set to IEEE 128-bit.  I used the
>  Advance Toolchain 14.0-2 to provide IEEE 128-bit support.  The compiler
>  was configured to build power9 code by default, so the test generated
>  native power9 IEEE 128-bit instructions.  The pr96711.f90 test builds and
>  runs correctly in this setup.
>
>* I have built a bootstrap compiler on a big endian power8 Linux system
>  with the default long double format (IBM extended double).  Like the
>  first case, the pr96711.f90 test does not crash the compiler, but the
>  test fails due to the range of the real*16 exponent.  There were no
>  other regressions in the C/C++/Fortran tests.
>
>* I built a bootstrap compiler on my x86_64 laptop.  There were no
>  regressions in the tests.
>
>
> Can I check this change into the GCC trunk?
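
A minimal sketch of the failure described in the quoted message, not taken
from the patch: it models a lookup for a floating point type wide enough for
a 128-bit integer result, using the PowerPC precisions given above (127 for
long double, 126 for __float128/_Float128).  The struct, the table, and the
function names are hypothetical; the 32/64 entries for float and double and
the "at least this many bits" search rule are assumptions for illustration,
and __ibm128 is left out because the message says Fortran doesn't access it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a floating point type: a name plus the
   "precision" value the compiler assigns to it.  */
struct fp_type
{
  const char *name;
  int precision;
};

/* Candidate types, smallest first.  The 126 and 127 values come from the
   quoted message; the 32 and 64 entries are illustrative assumptions.  */
static const struct fp_type candidates[] = {
  { "float", 32 },
  { "double", 64 },
  { "__float128/_Float128", 126 },
  { "long double", 127 },
};

/* Return the first candidate with at least MIN_PRECISION bits, or NULL
   if there is none.  */
const struct fp_type *
find_fp_type (int min_precision)
{
  size_t n = sizeof candidates / sizeof candidates[0];
  for (size_t i = 0; i < n; i++)
    if (candidates[i].precision >= min_precision)
      return &candidates[i];
  return NULL;
}

/* A 64-bit result finds a type; a 128-bit result finds nothing.  */
void
check_lookup (void)
{
  assert (find_fp_type (64) != NULL);
  assert (find_fp_type (128) == NULL);
}
```

With this table, asking for 128 bits finds nothing, which is the situation
the patch avoids by no longer searching for such a type.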

The patch looks fine technically and is definitely an improvement since the
intermediate conversion looks odd.  But it might be that the standard
requires such a dance, though the preceding cases handled don't seem to
care.  I'm thinking of a FP format where round(1.6) == 3 because of lack
of precision but using an intermediate FP format with higher precision
would "correctly" compute 2.

Of course the current code doesn't handle this correctly for the
llroundf case either.
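
The round-then-convert shape that the patch description argues for can be
sketched in plain C as follows.  This is only an illustration under stated
assumptions: round_half_away is a hypothetical, dependency-free stand-in for
the C library's round(), round_to_int128 is a made-up name, and __int128 is a
GCC/Clang extension available on 64-bit targets.  It does not settle the
precision concern raised above; it only shows that neither a wider floating
point type nor an llround-style built-in is needed to produce a 128-bit
integer result.

```c
#include <assert.h>

/* Hypothetical stand-in for round(): nearest integer, ties away from
   zero, computed entirely in double.  */
double
round_half_away (double x)
{
  /* NaN and values of magnitude >= 2**52 are returned unchanged: every
     finite double that large is already an integer.  */
  if (x != x || x >= 0x1p52 || x <= -0x1p52)
    return x;

  double t = (double) (long long) x;	/* truncate toward zero */
  double frac = x - t;			/* exact, |frac| < 1 */
  if (frac >= 0.5)
    t += 1.0;
  else if (frac <= -0.5)
    t -= 1.0;
  return t;
}

/* Round in the argument's own format, then convert to the integer type.  */
__int128
round_to_int128 (double x)
{
  return (__int128) round_half_away (x);
}

void
check_round (void)
{
  assert (round_to_int128 (1.6) == 2);
  assert (round_to_int128 (2.5) == 3);
  assert (round_to_int128 (-2.5) == -3);
  /* 2**100 does not fit in long long but converts exactly to __int128.  */
  assert (round_to_int128 (0x1p100) == ((__int128) 1 << 100));
}
```

The last assertion is the case where an llround-style function would not
help, since its result type is only long long.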

Richard.

> I've not contributed to the Fortran front end before.  If the maintainers like
> the patch, can somebody point out if I need to do additional things to commit
> the patch?
>
> gcc/fortran/
> 2021-04-19  Michael Meissner  
>
> PR gfortran/96983
> * trans-intrinsic.c (build_round_expr): If int type is larger than
> long long, do the round and convert to the integer type.  Do not
> try to find a floating point type the exact size of the integer
> type.
> ---
>  gcc/fortran/trans-intrinsic.c | 26 ++++++++------------------
>  1 file changed, 8 insertions(+), 18 deletions(-)
>
> diff --git a/gcc/fortran/trans-intrinsic.c b/gcc/fortran/trans-intrinsic.c
> index 5e53d1162fa..cceef8f34ac 100644
> --- a/gcc/fortran/trans-intrinsic.c
> +++ b/gcc/fortran/trans-intrinsic.c
> @@ -386,30 +386,20 @@ build_round_expr (tree arg, tree restype)
>    argprec = TYPE_PRECISION (argtype);
>    resprec = TYPE_PRECISION (restype);
>
> -  /* Depending on the type of the result, choose